[openal] [PATCH V4] Add some mixer SSE2/SSE4.1/AVX optimisations

Timothy Arceri t_arceri at yahoo.com.au
Fri Jun 6 04:21:43 EDT 2014


On Fri, 2014-06-06 at 18:18 +1000, Timothy Arceri wrote:
> When SSE4.1 is enabled, these changes can reduce CPU time spent in Resample_lerp32 by up to 43% during the OpenArena benchmark of the Phoronix Test Suite.
> 
> V4: fix obvious init copy-and-paste error in AVX code (still freezing)
> V3: WIP AVX optimisation (currently causes the benchmark to freeze); set caps flags individually, without nested ifs, assuming that previous extensions are available (just makes the code a little nicer to read).
> V2: removed obsolete UpdatePositions change, moved InitiatePositionArrays to a common location

Also, I should point out that the AVX code doesn't always freeze. The
benchmark does three passes, and it usually freezes during one of
those passes.
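
For anyone skimming the V3 note about caps flags, here is a rough sketch of
what "set individually, without nested ifs" could look like. This is not the
code from the patch: the CAP_* names, detect_caps() and its three parameters
are hypothetical, and the has_* arguments stand in for whatever feature bits
the real detection code reads from the CPU.

```c
/* Hypothetical capability flags, named for illustration only. */
enum {
    CAP_SSE2   = 1 << 0,
    CAP_SSE4_1 = 1 << 1,
    CAP_AVX    = 1 << 2
};

/*
 * Each has_* argument is nonzero when the corresponding feature bit was
 * reported by the CPU. Every flag is tested on its own; none of the checks
 * is nested inside the check for an earlier extension.
 */
static int detect_caps(int has_sse2, int has_sse4_1, int has_avx)
{
    int caps = 0;

    if (has_sse2)
        caps |= CAP_SSE2;
    if (has_sse4_1)
        caps |= CAP_SSE4_1;
    if (has_avx)
        caps |= CAP_AVX;

    return caps;
}
```

The flat form is easier to read, but it relies on the assumption that a CPU
reporting a later extension also supports the earlier ones.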


