Issue #16487 has been updated by ahorek (Pavel Rosický).


> Do you have any practical applications whose performance is significantly improved by the SIMD hacks? I'm unsure about coderange_scan, but it is difficult for me to imagine an application that String#strip is a bottleneck.

I agree, String#strip probably won't be a bottleneck; it was just easy to implement as an example.

There's a real use case for coderange_scan: https://github.com/rubyjs/mini_racer/pull/128
https://github.com/rails/rails/blob/2ae9e5da734e85bc5afaa15089171f1e996bd306/activesupport/lib/active_support/core_ext/string/multibyte.rb#L48
https://github.com/rails/rails/blob/98a57aa5f610bc66af31af409c72173cdeeb3c9e/actionview/lib/action_view/template/handlers/erb.rb#L75
https://github.com/mikel/mail/blob/6bc16b4bce4fe280b19523c939b14a30e32a8ba4/lib/mail/fields/unstructured_field.rb#L28
etc.
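
To make the coderange_scan case concrete, here is a minimal sketch of the core idea: checking whether a buffer is 7-bit ASCII 16 bytes at a time. This is not the code from either pull request. `ascii_only_p` is a hypothetical helper name, and the sketch assumes a GCC/Clang-style compiler that defines `__SSE2__`; it falls back to a plain loop elsewhere.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper, not Ruby's coderange_scan: returns 1 if every byte
 * in buf[0..len) is 7-bit ASCII, 0 otherwise. */
static int
ascii_only_p(const unsigned char *buf, size_t len)
{
    size_t i = 0;

#if defined(__SSE2__)
    /* Process 16 bytes per iteration. _mm_movemask_epi8 gathers the top bit
     * of each byte, so a non-zero mask means at least one byte is >= 0x80. */
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        if (_mm_movemask_epi8(chunk) != 0) return 0;
    }
#endif

    /* Scalar tail (and the whole input when SSE2 is not available). */
    for (; i < len; i++) {
        if (buf[i] & 0x80) return 0;
    }
    return 1;
}
```

A real coderange scan also has to validate multibyte sequences once a non-ASCII byte is found; the sketch only covers the ASCII fast path.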

The Steam hardware survey shows that nearly every x86 CPU in use supports at least SSE4.2: https://store.steampowered.com/hwsurvey/
SSE2 100.00%
SSE3 100.00%
SSSE3 98.47%
SSE4.1 97.70%
SSE4.2 96.99%
AVX 92.79%
AVX2 74.63%
AVX512CD 0.16%

In fact, AVX was introduced in 2011, so requiring it costs very little portability. Some Linux distributions have already dropped support for old processors and use more aggressive flags by default. https://clearlinux.org/news-blogs/smart-not-enough

Even PHP, which was mentioned, has some functions optimized this way. Of course, it has to be carefully decided what's worth optimizing and what's not, but this is one of many opportunities to improve performance.

Here's also a very well written example:
https://dev.to/wunk/fast-array-reversal-with-simd-j3p

> it would need to be dynamic if we want most users to benefit from it.

SSE2 is a hard requirement for x86_64 CPUs. If you need a portable package, this is the baseline. I don't think dynamic loading is a solution. For example, the compiler won't generate AVX instructions from regular C code unless you build for that target, even if your processor supports them. You have to recompile for your platform; that's the pain of all C programs.
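
As a small illustration of why a recompile is needed, the sketch below reports which instruction set the file was compiled for. It relies on the predefined macros that GCC and Clang set from the `-m...` / `-march=...` flags; `compiled_simd_level` is a hypothetical helper name, not something that exists in Ruby.

```c
/* Hypothetical helper: reports which instruction set this translation unit
 * was compiled for. The answer is fixed at build time by the compiler flags
 * and does not depend on the CPU the binary later runs on. */
static const char *
compiled_simd_level(void)
{
#if defined(__AVX2__)
    return "avx2";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE4_2__)
    return "sse4.2";
#elif defined(__SSE2__)
    return "sse2";
#else
    return "scalar";
#endif
}
```

A default x86_64 build would normally take the `sse2` branch, while the same source built with `-mavx2` would take the `avx2` one.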

> Introducing SIMD will make maintainability worse.

That's definitely a true and valid concern. If there's a good library that makes things simpler (without sacrificing performance), that would be great.

----------------------------------------
Misc #16487: Potential for SIMD usage in ruby-core
https://bugs.ruby-lang.org/issues/16487#change-83744

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Assignee:

----------------------------------------
### Context

There are several ruby core methods that could be optimized with the use of SIMD instructions.

I experimented a bit on `coderange_scan` https://github.com/Shopify/ruby/pull/2, and Pavel Rosický experimented on `String#strip` https://github.com/ruby/ruby/pull/2815.

### Problem

The downside of SIMD instructions is that they are not universally available.
So it means maintaining several versions of the same code, and switching them either statically or dynamically.

And since most Ruby users use precompiled binaries from repositories and such, it would need to be dynamic if we want most users to benefit from it.
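
As a rough sketch of what dynamic switching could look like (an assumption for illustration, not code from either pull request), the snippet below keeps a scalar and an AVX2 version of the same ASCII check and picks one at run time. It assumes GCC or Clang on x86, where `__builtin_cpu_supports` and the `target` function attribute are available; all function names are hypothetical.

```c
#include <stddef.h>

typedef int (*ascii_only_fn)(const unsigned char *, size_t);

/* Portable version: returns 1 if all bytes are 7-bit ASCII. */
static int
ascii_only_scalar(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (buf[i] & 0x80) return 0;
    }
    return 1;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>

/* AVX2 version: compiled with AVX2 enabled for this function only, so the
 * rest of the binary still runs on baseline CPUs. */
__attribute__((target("avx2")))
static int
ascii_only_avx2(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    for (; i + 32 <= len; i += 32) {
        __m256i chunk = _mm256_loadu_si256((const __m256i *)(buf + i));
        /* Non-zero mask: some byte in the chunk has its top bit set. */
        if (_mm256_movemask_epi8(chunk) != 0) return 0;
    }
    for (; i < len; i++) {
        if (buf[i] & 0x80) return 0;
    }
    return 1;
}
#endif

/* Pick an implementation based on what the running CPU supports. */
static ascii_only_fn
select_ascii_only(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return ascii_only_avx2;
#endif
    return ascii_only_scalar;
}

/* Entry point: resolves the implementation on first call. The unsynchronized
 * caching is a simplification; every thread would compute the same pointer. */
static int
ascii_only_dispatch(const unsigned char *buf, size_t len)
{
    static ascii_only_fn impl = NULL;
    if (!impl) impl = select_ascii_only();
    return impl(buf, len);
}
```

This is the "complexified codebase" cost in miniature: two copies of the routine, plus the selection logic, for every function optimized this way.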

So it's not exactly "free speed", as it means a complexified codebase.

### Question

So the question is whether ruby-core is open to patches using SIMD instructions, and if so, under which conditions.

cc @shyouhei





-- 

https://bugs.ruby-lang.org/
