[openal] [PATCH V4] Add some mixer SSE2/SSE4.1/AVX optimisations

Chris Robinson chris.kcat at gmail.com
Sat Jun 7 04:53:51 EDT 2014

On 06/06/2014 01:18 AM, Timothy Arceri wrote:
> +        frac4_1 = _mm_add_epi32(frac4_1, increment4_8);
> +        pos4_1 = _mm_add_epi32(pos4_1, _mm_srli_epi32(frac4_1, FRACTIONBITS));
> +        frac4_1 = _mm_and_si128(frac4_1, fracMask4);
> +
> +        frac4_2 = _mm_add_epi32(frac4_1, increment4_4);
> +        pos4_2 = _mm_add_epi32(pos4_2, _mm_srli_epi32(frac4_2, FRACTIONBITS));
> +        frac4_2 = _mm_and_si128(frac4_2, fracMask4);

This is incorrect, I think. You're adding increment*8 to frac4_1 to
increment it 8 times, then you add the whole number portion to the
position and mask it out to retain only the fraction. You're then
computing frac4_2 by adding increment*4 to frac4_1, but by that point
frac4_1 is just the fractional component of the previous result. Its
whole number portion has already gone into pos4_1, so pos4_2 never
receives it.

I think what you should do is:

frac4_2 = _mm_add_epi32(frac4_2, increment4_8);

to increment frac4_2 8 times from where it was, similar to how frac4_1 
was incremented.
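
To illustrate the difference, here is a minimal scalar sketch that models
the two groups of four lanes with plain arrays instead of SSE registers.
It is not the patch code: count_mismatches() and its parameters are
hypothetical names, and FRACTIONBITS = 14 is an assumed value chosen for
the example. It steps both groups forward and compares the second group
against a directly computed reference position.

```c
#include <stdint.h>

#define FRACTIONBITS 14                      /* assumed for this sketch */
#define FRACTIONMASK ((1u << FRACTIONBITS) - 1u)
#define LANES 4

/* Hypothetical helper: advances two groups of four lanes by 8 samples
 * per iteration and returns how many (pos, frac) pairs in the second
 * group disagree with a reference computed directly from the sample
 * index. use_fix selects the suggested update for the second group. */
static int count_mismatches(uint32_t increment, uint32_t start_frac,
                            int iters, int use_fix)
{
    uint32_t pos1[LANES], frac1[LANES], pos2[LANES], frac2[LANES];
    int i, n, bad = 0;

    /* Lane i of group 1 starts at sample i, lane i of group 2 at i+4. */
    for (i = 0; i < LANES; i++) {
        uint32_t t1 = start_frac + increment * (uint32_t)i;
        uint32_t t2 = start_frac + increment * (uint32_t)(i + LANES);
        pos1[i] = t1 >> FRACTIONBITS;  frac1[i] = t1 & FRACTIONMASK;
        pos2[i] = t2 >> FRACTIONBITS;  frac2[i] = t2 & FRACTIONMASK;
    }

    for (n = 1; n <= iters; n++) {
        for (i = 0; i < LANES; i++) {
            uint64_t total;

            /* Group 1: step by increment*8, carry whole part into pos. */
            frac1[i] += increment * 8;
            pos1[i]  += frac1[i] >> FRACTIONBITS;
            frac1[i] &= FRACTIONMASK;

            if (use_fix) {
                /* Suggested: step group 2 from its own previous value. */
                frac2[i] += increment * 8;
            } else {
                /* As in the patch: start from the already-masked frac1,
                 * so group 1's whole part never reaches pos2. */
                frac2[i] = frac1[i] + increment * 4;
            }
            pos2[i]  += frac2[i] >> FRACTIONBITS;
            frac2[i] &= FRACTIONMASK;

            /* Reference for sample index 8*n + i + 4. */
            total = start_frac +
                    (uint64_t)increment * (uint64_t)(8 * n + i + LANES);
            if (pos2[i] != (uint32_t)(total >> FRACTIONBITS) ||
                frac2[i] != (uint32_t)(total & FRACTIONMASK))
                bad++;
        }
    }
    return bad;
}
```

With an increment of 0x6000 (1.5 under the assumed FRACTIONBITS), the
patch-style update drifts from the reference on the first iteration,
while the frac4_2 += increment*8 form stays in step.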
