[openal] [PATCH V2] Add some mixer SSE2/4.1 optimisations

Chris Robinson chris.kcat at gmail.com
Fri Jun 6 01:07:26 EDT 2014


On 06/05/2014 03:55 PM, Timothy Arceri wrote:
> After playing around with an AVX version of the optimisation I'm
> starting to think maybe my logic is wrong. Is using
>   const __m128i increment4 = _mm_set1_epi32(increment*4);
> to jump the value of frac4 forward correct? Or does the mask need to be
> applied between each iteration, meaning I can't just multiply by 4?

I can't see any reason why it would be wrong. It works when I do

DataPosFrac += increment*DstBufferSize;
DataPosInt  += DataPosFrac>>FRACTIONBITS;
DataPosFrac &= FRACTIONMASK;

which can be a jump of up to 1024x the increment. It also appears to
work with the SSE linear resamplers, which do 4x. So I don't see why 8x
wouldn't also work with AVX if you're doing 8 samples at a time.
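
As a rough illustration of why the jump size should not matter, here is a
minimal scalar sketch comparing one-sample-at-a-time stepping against a
single bulk jump. The struct, the function names, and the FRACTIONBITS value
of 14 are assumptions made for this example, not the library's actual
definitions. It also assumes increment*count plus the starting fraction fits
in 32 bits.

```c
#include <stdint.h>

/* Assumed values for illustration only; the real definitions live in the
 * OpenAL Soft headers. */
#define FRACTIONBITS 14
#define FRACTIONMASK ((1u << FRACTIONBITS) - 1u)

/* Hypothetical pair of integer sample position and fixed-point fraction. */
typedef struct {
    uint32_t pos;
    uint32_t frac;
} ResamplePos;

/* Advance `count` samples one at a time, carrying into pos and masking
 * frac after every single step. */
ResamplePos advance_stepwise(ResamplePos p, uint32_t increment, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        p.frac += increment;
        p.pos  += p.frac >> FRACTIONBITS;
        p.frac &= FRACTIONMASK;
    }
    return p;
}

/* Advance `count` samples in one jump: add increment*count, then carry
 * and mask once. Valid while the sum does not overflow 32 bits. */
ResamplePos advance_bulk(ResamplePos p, uint32_t increment, uint32_t count)
{
    p.frac += increment * count;
    p.pos  += p.frac >> FRACTIONBITS;
    p.frac &= FRACTIONMASK;
    return p;
}

/* Returns 1 when both positions are identical, 0 otherwise. */
int same_position(ResamplePos a, ResamplePos b)
{
    return a.pos == b.pos && a.frac == b.frac;
}
```

Both functions add the same total to the combined value
(pos << FRACTIONBITS) + frac, so they end at the same place whether count is
4, 8, or 1024. Masking only moves whole-sample carries out of frac and into
pos; it never discards anything that a later step would need.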

