[openal] [PATCH V2] Add some mixer SSE2/4.1 optimisations
Timothy Arceri
t_arceri at yahoo.com.au
Thu Jun 5 18:55:57 EDT 2014
On Tue, 2014-06-03 at 17:33 +1000, Timothy Arceri wrote:
> +
> + frac4 = _mm_add_epi32(frac4, increment4);
> + pos4 = _mm_add_epi32(pos4, _mm_srli_epi32(frac4, FRACTIONBITS));
> + frac4 = _mm_and_si128(frac4, fracMask4);
> +
After playing around with an AVX version of the optimisation, I'm
starting to think my logic may be wrong. Is using
const __m128i increment4 = _mm_set1_epi32(increment*4);
to jump the value of frac4 forward correct? Or does the mask need to be
applied between each iteration, meaning I can't just multiply by 4?
It seems to work OK with the SSE version, but using
_mm_set1_epi32(increment*8) for AVX causes problems: I believe pos ends
up wrong, which eventually locks up the test benchmark and my system.
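
One way to check the question above is a scalar comparison of the two update orders. The sketch below is only an illustration, not the mixer code: FRACTIONBITS is assumed to be 14 here, and advance_stepwise, advance_batched and batched_matches_stepwise are hypothetical helper names. It compares masking after every step against adding increment*steps once and masking once.

```c
#include <stdint.h>

/* Assumed value, for illustration only. */
#define FRACTIONBITS 14
#define FRACTIONONE  (1u << FRACTIONBITS)
#define FRACTIONMASK (FRACTIONONE - 1u)

/* Reference: advance one sample at a time, masking after every step. */
static void advance_stepwise(uint32_t *pos, uint32_t *frac,
                             uint32_t increment, unsigned steps)
{
    unsigned i;
    for (i = 0; i < steps; i++) {
        *frac += increment;
        *pos  += *frac >> FRACTIONBITS;  /* carry whole samples into pos */
        *frac &= FRACTIONMASK;           /* keep only the fractional bits */
    }
}

/* Batched: add increment*steps once, then carry and mask once.
 * This mirrors one lane of the vector update in the patch. */
static void advance_batched(uint32_t *pos, uint32_t *frac,
                            uint32_t increment, unsigned steps)
{
    *frac += increment * steps;          /* may wrap if the product is too large */
    *pos  += *frac >> FRACTIONBITS;
    *frac &= FRACTIONMASK;
}

/* Returns 1 if both methods agree for every possible starting frac. */
static int batched_matches_stepwise(uint32_t increment, unsigned steps)
{
    uint32_t f;
    for (f = 0; f < FRACTIONONE; f++) {
        uint32_t p1 = 0, f1 = f, p2 = 0, f2 = f;
        advance_stepwise(&p1, &f1, increment, steps);
        advance_batched(&p2, &f2, increment, steps);
        if (p1 != p2 || f1 != f2)
            return 0;
    }
    return 1;
}
```

Under these assumptions the two orders agree whenever frac + increment*steps fits in 32 bits, because the bits removed by the mask are exactly the ones already carried into pos. They diverge only once the product wraps, so multiplying by 4 or 8 is not wrong in itself for increments well below that limit. Separately, the 256-bit integer intrinsics such as _mm256_add_epi32 belong to AVX2 rather than AVX, which may be worth ruling out when the 8-wide version misbehaves.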