[openal] [PATCH V4] Add some mixer SSE2/SSE4.1/AVX optimisations

Timothy Arceri t_arceri at yahoo.com.au
Sat Jun 7 06:06:53 EDT 2014


On Sat, 2014-06-07 at 01:53 -0700, Chris Robinson wrote:
> On 06/06/2014 01:18 AM, Timothy Arceri wrote:
> > +        frac4_1 = _mm_add_epi32(frac4_1, increment4_8);
> > +        pos4_1 = _mm_add_epi32(pos4_1, _mm_srli_epi32(frac4_1, FRACTIONBITS));
> > +        frac4_1 = _mm_and_si128(frac4_1, fracMask4);
> > +
> > +        frac4_2 = _mm_add_epi32(frac4_1, increment4_4);
> > +        pos4_2 = _mm_add_epi32(pos4_2, _mm_srli_epi32(frac4_2, FRACTIONBITS));
> > +        frac4_2 = _mm_and_si128(frac4_2, fracMask4);
> 
> This is incorrect, I think. You're adding increment*8 to frac4_1 to 
> increment it 8 times, then you add the whole number portion to the 
> position and mask it out to retain only the fraction. You're then adding 
> increment*4 to frac4_1, but that's just the fractional component of the 
> previous result.
> 
> I think what you should do is
> 
> frac4_2 = _mm_add_epi32(frac4_2, increment4_8);
> 
> to increment frac4_2 8 times from where it was, similar to how frac4_1 
> was incremented.

Yes, the patch I sent out had a few errors slip in as a result of all my
trial and error. I've just sent a new patch. It still needs some tidying
up, but it applies to head and hasn't caused any freezing so far.
I'm just about to profile it to see how it compares to the other
resamplers.
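
For anyone following the thread, here is a rough scalar sketch of the
problem Chris described. It is not the code from either patch. Each array
element stands in for one lane of the SSE vectors (lanes 0-3 for
pos4_1/frac4_1, lanes 4-7 for pos4_2/frac4_2). FRACTIONBITS = 14 and the
helper names advance() and lanes_match_reference() are assumptions made
for this illustration only. With buggy set, lanes 4-7 are rebuilt from the
already-masked fraction of lanes 0-3 plus increment*4, as in the V4 hunk;
otherwise every lane advances by increment*8 from its own fraction.

```c
#include <assert.h>
#include <stdint.h>

#define FRACTIONBITS 14u                       /* assumed for this sketch */
#define FRACTIONMASK ((1u << FRACTIONBITS) - 1u)

/* One lane of the update: add steps*increment to the fraction, carry the
 * whole-sample part into pos, then keep only the fractional bits. */
static void advance(uint32_t *pos, uint32_t *frac,
                    uint32_t increment, uint32_t steps)
{
    /* The 32-bit fraction must not overflow before the carry is taken. */
    assert(steps != 0 && increment <= (UINT32_MAX - FRACTIONMASK) / steps);
    *frac += increment * steps;
    *pos  += *frac >> FRACTIONBITS;
    *frac &= FRACTIONMASK;
}

/* Returns 1 if all eight lanes stay in step with a reference that walks
 * one sample at a time, 0 on the first mismatch. */
static int lanes_match_reference(uint32_t increment, uint32_t iterations,
                                 int buggy)
{
    uint32_t pos[8], frac[8];
    uint32_t ref_pos = 0, ref_frac = 0;
    uint32_t lane, it;

    /* Seed lane k at sample offset k. */
    for (lane = 0; lane < 8; lane++) {
        pos[lane] = 0;
        frac[lane] = 0;
        if (lane > 0)
            advance(&pos[lane], &frac[lane], increment, lane);
    }

    for (it = 0; it < iterations; it++) {
        /* Lanes 0..7 should hold samples it*8 .. it*8+7. */
        for (lane = 0; lane < 8; lane++) {
            if (pos[lane] != ref_pos || frac[lane] != ref_frac)
                return 0;
            advance(&ref_pos, &ref_frac, increment, 1);
        }

        /* Lanes 0-3: advance 8 samples from their own fraction. */
        for (lane = 0; lane < 4; lane++)
            advance(&pos[lane], &frac[lane], increment, 8);

        for (lane = 4; lane < 8; lane++) {
            if (buggy) {
                /* V4 behaviour: start from the masked fraction of the
                 * first group, so the whole samples it carried are lost. */
                frac[lane] = frac[lane - 4];
                advance(&pos[lane], &frac[lane], increment, 4);
            } else {
                /* Suggested fix: advance 8 samples from own fraction. */
                advance(&pos[lane], &frac[lane], increment, 8);
            }
        }
    }
    return 1;
}
```

Calling lanes_match_reference(24000, 64, 0) should return 1, while the
same call with buggy = 1 returns 0: the fractions of lanes 4-7 still come
out right, but their positions fall behind by whatever whole samples
lanes 0-3 had already carried out of the fraction.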
