[openal] [PATCH V4] Add some mixer SSE2/SSE4.1/AVX optimisations
Chris Robinson
chris.kcat at gmail.com
Sat Jun 7 04:53:51 EDT 2014
On 06/06/2014 01:18 AM, Timothy Arceri wrote:
> + frac4_1 = _mm_add_epi32(frac4_1, increment4_8);
> + pos4_1 = _mm_add_epi32(pos4_1, _mm_srli_epi32(frac4_1, FRACTIONBITS));
> + frac4_1 = _mm_and_si128(frac4_1, fracMask4);
> +
> + frac4_2 = _mm_add_epi32(frac4_1, increment4_4);
> + pos4_2 = _mm_add_epi32(pos4_2, _mm_srli_epi32(frac4_2, FRACTIONBITS));
> + frac4_2 = _mm_and_si128(frac4_2, fracMask4);
This is incorrect, I think. You add increment*8 to frac4_1 to advance it
by 8 samples, add the whole-number portion to the position, and mask it
out to retain only the fraction. You then compute frac4_2 by adding
increment*4 to frac4_1, but by that point frac4_1 holds just the
fractional component of the previous result, so frac4_2 is no longer
advanced from its own previous value.
I think what you should do is:
    frac4_2 = _mm_add_epi32(frac4_2, increment4_8);
to increment frac4_2 8 times from where it was, similar to how frac4_1
was incremented.
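To make the difference concrete, here is a minimal scalar C sketch that models the two groups of four lanes with plain arrays instead of SSE registers. It is only an illustration under assumptions: FRACTIONBITS is taken to be 14, and the helper names (reference_step, advance_group, advance_second_from_first, lanes_track_reference) are hypothetical and not taken from the patch or from OpenAL Soft.

```c
#include <assert.h>
#include <stdint.h>

#define FRACTIONBITS 14                      /* assumed value */
#define FRACTIONONE  (1u << FRACTIONBITS)
#define FRACTIONMASK (FRACTIONONE - 1u)

/* Scalar reference: advance one sample by a single increment. */
static void reference_step(uint32_t *pos, uint32_t *frac, uint32_t increment)
{
    *frac += increment;
    *pos  += *frac >> FRACTIONBITS;
    *frac &= FRACTIONMASK;
}

/* Suggested form: each lane is advanced from its own previous fraction. */
static void advance_group(uint32_t pos[4], uint32_t frac[4], uint32_t step)
{
    for (int i = 0; i < 4; i++) {
        frac[i] += step;
        pos[i]  += frac[i] >> FRACTIONBITS;
        frac[i] &= FRACTIONMASK;
    }
}

/* Model of the quoted patch: the second group's fraction is rebuilt from
 * the already-masked fraction of the first group. */
static void advance_second_from_first(uint32_t pos2[4], uint32_t frac2[4],
                                      const uint32_t frac1[4], uint32_t step)
{
    for (int i = 0; i < 4; i++) {
        frac2[i] = frac1[i] + step;
        pos2[i] += frac2[i] >> FRACTIONBITS;
        frac2[i] &= FRACTIONMASK;
    }
}

/* Returns 1 if the eight lanes match the scalar reference for `iterations`
 * blocks of eight samples, 0 otherwise. use_fix selects the suggested form. */
static int lanes_track_reference(uint32_t increment, int iterations, int use_fix)
{
    uint32_t pos1[4], frac1[4], pos2[4], frac2[4];
    uint32_t rpos = 0, rfrac = 0;
    int i, n;

    assert(increment <= (UINT32_MAX >> 4)); /* keep increment*8 + frac in range */

    /* Lane i starts at sample i: group 1 holds 0..3, group 2 holds 4..7. */
    for (i = 0; i < 8; i++) {
        if (i < 4) { pos1[i] = rpos; frac1[i] = rfrac; }
        else       { pos2[i - 4] = rpos; frac2[i - 4] = rfrac; }
        reference_step(&rpos, &rfrac, increment);
    }

    /* The reference now sits at sample 8. */
    for (n = 0; n < iterations; n++) {
        advance_group(pos1, frac1, increment * 8);
        if (use_fix)
            advance_group(pos2, frac2, increment * 8);
        else
            advance_second_from_first(pos2, frac2, frac1, increment * 4);

        /* Compare each lane with the reference, one sample at a time. */
        for (i = 0; i < 8; i++) {
            uint32_t p = (i < 4) ? pos1[i] : pos2[i - 4];
            uint32_t f = (i < 4) ? frac1[i] : frac2[i - 4];
            if (p != rpos || f != rfrac)
                return 0;
            reference_step(&rpos, &rfrac, increment);
        }
    }
    return 1;
}
```

In this model, calling lanes_track_reference with use_fix set to 1 keeps all eight lanes in step with the reference, while use_fix set to 0 lets the second group's position fall behind as soon as the increment is one whole sample or more.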