[openal] [PATCH V4] Add some mixer SSE2/SSE4.1/AVX optimisations

Timothy Arceri t_arceri at yahoo.com.au
Fri Jun 6 04:21:43 EDT 2014

On Fri, 2014-06-06 at 18:18 +1000, Timothy Arceri wrote:
> When SSE4.1 is enabled these changes can reduce CPU time spent in Resample_lerp32 by up to 43% during the OpenArena benchmark of the Phoronix Test Suite.
> V4: fix obvious init copy-and-paste error in AVX code (still freezing)
> V3: WIP AVX optimisation (currently causes benchmark to freeze); set caps flags individually rather than in nested ifs that assume the previous extensions are available (just makes the code a little nicer to read).
> V2: removed obsolete UpdatePositions change, moved InitiatePositionArrays to a common location

Also, I should point out that the AVX code doesn't always freeze. The
benchmark does three passes, and it will usually freeze during one of
those passes.
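
For anyone following along who hasn't looked at the resampler: below is a
minimal scalar sketch of the kind of linear-interpolation loop that a
function like Resample_lerp32 performs. It is not the code from the patch.
The names (resample_lerp_scalar, FRAC_BITS and friends) and the choice of
14 fractional bits are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-point layout: 14 fractional bits (an assumption). */
#define FRAC_BITS 14
#define FRAC_ONE  (1u << FRAC_BITS)
#define FRAC_MASK (FRAC_ONE - 1u)

/* Linear interpolation between a and b, with mu in [0, 1). */
static float lerp(float a, float b, float mu)
{
    return a + (b - a) * mu;
}

/*
 * Scalar reference loop. 'frac' is the starting fractional position and
 * 'step' is the fixed-point increment per output sample.
 * 'src' must hold at least one sample past the last position read.
 */
static void resample_lerp_scalar(const float *src, uint32_t frac,
                                 uint32_t step, float *dst, size_t count)
{
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        dst[i] = lerp(src[pos], src[pos + 1], frac * (1.0f / FRAC_ONE));
        frac += step;               /* advance the fixed-point position   */
        pos  += frac >> FRAC_BITS;  /* carry whole samples into the index */
        frac &= FRAC_MASK;          /* keep only the fractional part      */
    }
}
```

Each output sample depends on the running position, which is why a SIMD
version would generally need the positions and fractions for several
outputs worked out ahead of time before it can interpolate them together.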

