[openal] Feedback on OpenAL Soft's x86 build

Chris Robinson chris.kcat at gmail.com
Thu Jun 21 18:31:32 EDT 2018


Hello,

I'm looking for feedback regarding the 32-bit x86-based builds for 
OpenAL Soft. Currently the build scripts use the compiler's default 
codegen flags, with whatever optimizations CMake passes in. For GCC (and
I believe Clang), this typically means targeting a 586 or 686 CPU with an
x87 FPU, without MMX or SSE. The SSE-enhanced functions are still built and
selected at runtime as necessary, but code that's not written 
specifically using SSE intrinsics still uses the x87 instruction set.
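
To illustrate the split described above, here is a minimal sketch of
runtime selection between a plain C routine and an SSE one. It is not
OpenAL Soft's actual code: apply_gain_c, apply_gain_sse, and
select_apply_gain are hypothetical names, and the sketch assumes GCC or
Clang on x86 for the target attribute and __builtin_cpu_supports.

```c
#include <stddef.h>

#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
#include <xmmintrin.h>
#define X86_GNUC 1
#endif

/* Plain C version: built with the compiler's default codegen, which is
 * x87 math on a default 32-bit GCC build. */
static void apply_gain_c(float *dst, const float *src, float gain, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
        dst[i] = src[i] * gain;
}

#ifdef X86_GNUC
/* SSE version: the target attribute lets this one function use SSE
 * even when the rest of the file is built without -msse. */
__attribute__((target("sse")))
static void apply_gain_sse(float *dst, const float *src, float gain, size_t n)
{
    const __m128 g = _mm_set1_ps(gain);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(dst + i, _mm_mul_ps(_mm_loadu_ps(src + i), g));
    for (; i < n; ++i) /* leftover samples */
        dst[i] = src[i] * gain;
}
#endif

typedef void (*gain_fn)(float *, const float *, float, size_t);

/* Pick the SSE routine only if the CPU reports SSE support at runtime. */
gain_fn select_apply_gain(void)
{
#ifdef X86_GNUC
    if (__builtin_cpu_supports("sse"))
        return apply_gain_sse;
#endif
    return apply_gain_c;
}
```

A caller would fetch the pointer once, for example
`gain_fn f = select_apply_gain();`, and then call `f(dst, src, gain, n)`.
Anything that does not go through such a pointer still runs with the
default codegen.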

This presents some problems related to performance. Particularly with
the biquad filters, reverb, and other kinds of decaying feedback loops,
x87 processing produces denormal (ridiculously small) numbers, which
are extremely slow for the FPU to deal with. SSE and SSE2 make it
possible to disable denormal numbers and treat them as 0 instead, but
this only applies to the SSE unit, so the calculations done on the x87
unit are still subject to denormals.
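
As a rough illustration of both points, the sketch below first counts
how many steps a decaying value takes to become denormal, then sets the
flush-to-zero (FTZ) and denormals-are-zero (DAZ) bits in the SSE control
register. The function names are made up for this example, and the
second function only does anything when the file is built with SSE
enabled.

```c
#include <math.h>

#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE_CSR 1
#endif

/* Repeatedly scale a value the way a decaying feedback loop would, and
 * return the number of steps until it becomes denormal (subnormal).
 * Returns -1 if it reaches zero (or another class) without passing
 * through a denormal, as happens when flush-to-zero is active. The
 * volatile forces each result to be stored as a real float, so x87
 * excess precision does not hide the denormal. */
int steps_until_subnormal(float start, float decay)
{
    volatile float y = start;
    int steps = 0;
    while (fpclassify(y) == FP_NORMAL) {
        y = y * decay;
        ++steps;
    }
    return fpclassify(y) == FP_SUBNORMAL ? steps : -1;
}

/* Set FTZ (bit 15) and DAZ (bit 6) in MXCSR. This affects SSE math
 * only; x87 instructions ignore MXCSR. Returns 1 if the bits were set,
 * 0 if this file was built without SSE support.
 * Assumption: the CPU supports DAZ. Some very early SSE CPUs do not,
 * and setting the bit there faults. */
int enable_sse_flush_to_zero(void)
{
#ifdef HAVE_SSE_CSR
    unsigned int csr = _mm_getcsr();
    csr |= 0x8000u; /* FTZ: denormal results become 0 */
    csr |= 0x0040u; /* DAZ: denormal inputs are read as 0 */
    _mm_setcsr(csr);
    return 1;
#else
    return 0;
#endif
}
```

With default floating-point settings, steps_until_subnormal(1.0f, 0.5f)
should return 127, since 2^-126 is the smallest normal float. After
enable_sse_flush_to_zero(), a build that does its float math on the SSE
unit would be expected to return -1 instead, while an x87 build keeps
producing the denormal.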

It wouldn't be hard to make a build option to generate SSE or SSE2
instructions instead of x87 for 32-bit (in fact, you should be able to
just pass -DCMAKE_C_FLAGS="-msse2 -mfpmath=sse" to cmake), but given the
potential performance implications and CPU requirements, I would like
feedback on a couple of questions.
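
One way to check which floating-point codegen a given build actually
ended up with is to look at the compiler's predefined macros. The
sketch below is only an illustration (float_codegen is a made-up name).
It relies on __SSE_MATH__ and __SSE2_MATH__, which GCC and Clang define
when scalar math goes through the SSE unit (as with -mfpmath=sse), and
on MSVC's _M_IX86_FP.

```c
/* Report which unit the compiler uses for scalar floating-point math
 * in this translation unit, judged from predefined macros only. */
const char *float_codegen(void)
{
#if defined(__SSE2_MATH__)
    return "sse2";   /* GCC/Clang: float and double math on SSE */
#elif defined(__SSE_MATH__)
    return "sse";    /* GCC/Clang: float math on SSE, double on x87 */
#elif defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return "sse2";   /* MSVC: x64, or 32-bit with /arch:SSE2 or higher */
#elif defined(_M_IX86_FP) && _M_IX86_FP == 1
    return "sse";    /* MSVC: 32-bit with /arch:SSE */
#elif defined(__i386__) || defined(_M_IX86)
    return "x87";    /* 32-bit x86 without SSE math */
#else
    return "non-x86";
#endif
}
```

Printing the result from a small test program in a 32-bit build, once
with and once without the extra flags, would show whether the option
took effect.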

First, should x86 builds default to generating SSE or SSE2 instead of
x87 math, with an option to go back to the slower x87 processing? As far
as I'm aware, MSVC defaults to using SSE2 codegen on 32-bit, and x86-64
CPUs must support SSE2 (so SSE2 predates x86-64, and you'd have to have
a really old CPU to lack SSE or SSE2). But if you're targeting a 32-bit
x86 system, maybe you'd expect a really old CPU lacking SSE? In that
case, I wonder whether the performance impact of running filters and
effects on x87 is a big detriment. And in either case, would it be a
problem to require setting an option when you build the library?

Second, should the prebuilt 32-bit Windows DLLs require SSE/SSE2? If
you're using them to build a 32-bit app, the first question applies (are
you really targeting CPUs so old they don't have SSE or SSE2?). If you
just use them to get the DLLs for preexisting 32-bit apps on a
non-ancient system, you almost certainly have SSE2 support.

Any feedback you can give would be helpful in figuring out the better 
option. Thanks!

