[openal] [PATCH V2] Add some mixer SSE2/4.1 optimisations

Timothy Arceri t_arceri at yahoo.com.au
Thu Jun 5 18:55:57 EDT 2014


On Tue, 2014-06-03 at 17:33 +1000, Timothy Arceri wrote:

> +
> +        frac4 = _mm_add_epi32(frac4, increment4);
> +        pos4 = _mm_add_epi32(pos4, _mm_srli_epi32(frac4, FRACTIONBITS));
> +        frac4 = _mm_and_si128(frac4, fracMask4);
> +

After playing around with an AVX version of the optimisation, I'm
starting to think my logic may be wrong. Is it correct to use
const __m128i increment4 = _mm_set1_epi32(increment*4); to jump frac4
forward by four steps at once? Or does the mask need to be applied after
each single step, meaning I can't just multiply the increment by 4?

It seems to work OK with the SSE version, but using
_mm_set1_epi32(increment*8) for AVX seems to cause problems. I believe
pos ends up wrong, which finally locks up the test benchmark and my
system.
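
One way to look at the question above is to check the arithmetic in plain
scalar C, outside of any intrinsics. The sketch below compares N masked
steps of increment against a single step of increment*N, for every
possible starting frac. FRACTIONBITS is set to 14 here purely as an
assumed value for illustration, and step(), batched_step_matches() and
check_common_cases() are hypothetical helper names, not part of the
patch. It only covers the case where frac + increment*N fits in 32 bits,
and it says nothing about how the vector code loads or uses pos.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the real constant comes from the mixer headers. */
#define FRACTIONBITS 14
#define FRACTIONMASK ((1u << FRACTIONBITS) - 1u)

/* One fixed-point step: add to frac, carry whole samples into pos,
 * then keep only the fractional remainder. Mirrors the three lines
 * of the quoted hunk for a single lane. */
static void step(uint32_t *pos, uint32_t *frac, uint32_t add)
{
    *frac += add;
    *pos  += *frac >> FRACTIONBITS;
    *frac &= FRACTIONMASK;
}

/* Returns 1 if one step of increment*lanes lands on the same (pos, frac)
 * as `lanes` masked steps of increment, for every starting frac.
 * Assumes frac + increment*lanes does not overflow 32 bits. */
static int batched_step_matches(uint32_t increment, uint32_t lanes)
{
    for (uint32_t start = 0; start <= FRACTIONMASK; start++) {
        uint32_t pos_a = 0, frac_a = start;
        uint32_t pos_b = 0, frac_b = start;

        for (uint32_t i = 0; i < lanes; i++)
            step(&pos_a, &frac_a, increment);
        step(&pos_b, &frac_b, increment * lanes);

        if (pos_a != pos_b || frac_a != frac_b)
            return 0;
    }
    return 1;
}

/* A few spot checks for strides of 4 (SSE width) and 8 (AVX width). */
static void check_common_cases(void)
{
    assert(batched_step_matches(1u << FRACTIONBITS, 4)); /* pitch of 1.0 */
    assert(batched_step_matches(1u << FRACTIONBITS, 8));
    assert(batched_step_matches(12345u, 4));             /* pitch below 1.0 */
    assert(batched_step_matches(12345u, 8));
}
```

If these checks hold, it would suggest that multiplying the increment is
fine for the frac/pos arithmetic itself as long as nothing overflows, in
which case a stride of 8 would need looking at elsewhere in the loop.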


