ARM NEON C++ Cheat Sheet

Newer ARM processors have their own flavor of SIMD instructions called NEON. My little Android application Arashi uses NEON heavily to speed up its particle simulation.

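Here is a table explaining some of the NEON intrinsics it uses. These q-suffixed f32 variants come from the arm_neon.h header and work on four 32-bit floats at once (type float32x4_t), so the pseudocode column applies to each of the four lanes: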

[table caption="C++ NEON functions" width="500" colwidth="20|100|50" colalign="left|left"]
NEON,Explanation,Pseudocode
vdupq_n_f32(a),Broadcast a scalar to all four lanes,{a\, a\, a\, a}
vsubq_f32(a\, b),Subtract,a - b
vaddq_f32(a\, b),Add,a + b
vmulq_f32(a\, b),Multiply,a * b
vmlaq_f32(a\, b\, c),Multiply and add,a + (b * c)
vmlsq_f32(a\, b\, c),Multiply and subtract,a - (b * c)
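vrsqrteq_f32(a),Reciprocal square root estimate (approximate),~ 1 / sqrt(a)
vcgtq_f32(a\, b),Compare greater than (returns a uint32x4_t mask),a > b ? 0xFFFFFFFF : 0
vcltq_f32(a\, b),Compare less than (returns a uint32x4_t mask),a < b ? 0xFFFFFFFF : 0
vbslq_f32(mask\, a\, b),Bitwise select by mask,each bit: mask ? a : b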
vminq_f32(a\, b),Get minimum,a < b ? a : b
vmaxq_f32(a\, b),Get maximum,a > b ? a : b
[/table]
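
To show how a few of these fit together, here is a minimal sketch of a particle update step. It is not the code from Arashi: the function name `step_particles` and its parameters are made up for illustration, and it assumes positions and velocities are stored in plain float arrays. It also uses `vld1q_f32` and `vst1q_f32` (load and store four floats), which are real NEON intrinsics that are not in the table above. When NEON is not available at compile time, only the scalar loop runs, so the same file still builds on other CPUs.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

// Hypothetical helper (not from Arashi): move each particle by vel * dt,
// then stop and clamp every particle that went past max_pos.
void step_particles(float* pos, float* vel, std::size_t count,
                    float dt, float max_pos)
{
    assert(pos != nullptr && vel != nullptr);
    std::size_t i = 0;

#if HAVE_NEON
    // Broadcast the scalar constants to all four lanes once.
    const float32x4_t vdt   = vdupq_n_f32(dt);
    const float32x4_t vmax  = vdupq_n_f32(max_pos);
    const float32x4_t vzero = vdupq_n_f32(0.0f);

    // Process four particles per iteration.
    for (; i + 4 <= count; i += 4) {
        float32x4_t p = vld1q_f32(pos + i);
        float32x4_t v = vld1q_f32(vel + i);

        p = vmlaq_f32(p, v, vdt);                  // p + (v * dt)

        // Lanes where p > max_pos become 0xFFFFFFFF, all others 0.
        const uint32x4_t outside = vcgtq_f32(p, vmax);

        v = vbslq_f32(outside, vzero, v);          // zero velocity where outside
        p = vminq_f32(p, vmax);                    // clamp position to max_pos

        vst1q_f32(pos + i, p);
        vst1q_f32(vel + i, v);
    }
#endif

    // Scalar loop: handles the leftover particles, or all of them without NEON.
    for (; i < count; ++i) {
        float p = pos[i] + vel[i] * dt;
        if (p > max_pos) {
            vel[i] = 0.0f;
            p = max_pos;
        }
        pos[i] = p;
    }
}
```

Note that the compare result is never used as a number. It only serves as the mask for `vbslq_f32`, which is why the all-bits-set value matters. Calling `step_particles(pos, vel, count, dt, max_pos)` once per frame would be the intended usage in this sketch.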