[openal] Questions about OpenAL Soft's Resampler
Chris Robinson
chris.kcat at gmail.com
Tue Nov 7 21:17:55 EST 2017
On 11/07/2017 11:10 AM, Ethan Lee wrote:
> Hey there!
>
> I'm currently working on a reimplementation of Microsoft's XACT runtime,
> called FACT...
>
> https://github.com/flibitijibibo/FACT
Sounds interesting.
> ... but I also have to factor in pitch changes, which SDL_AudioStream
> doesn't account for (it only expects one input frequency for the
> duration of its existence).
>
> Normally I'd just use OpenAL for this sort of thing, but part of FACT's
> job is mimicking XAudio2 accurately, which unfortunately makes the two
> incompatible in really subtle ways (and fixing them would mean ripping
> up the library for what is essentially one exact use case and nothing
> else). I'd still like to use OpenAL's ideas though, but to keep the
> permissive license intact I'd have to, at most, look at papers/documents
> that explain how it's done, rather than just using the code directly.
>
> TL;DR: How does OpenAL's resampler work with adjustable pitch changes,
> and are there any resources online that were used as references for the
> resampler? I'm not too concerned about the resampling function itself
> (we'll probably just use a linear resampler), but I'm a lot more
> concerned about stuff like weird step sizes, fractions of samples,
> padding buffer sizes (for both resampling and possibly for output...?),
> handling wildly fluctuating pitches, little details like that.
For OpenAL Soft, the general idea is that each source maintains a
fixed-point offset, split into two 32-bit ints (an integer sample
position and a fractional part; could probably use a single 64-bit int
these days), with 12 bits of fractional precision. A stepping value is
calculated in the same 12-bit fixed-point format based on the input
sample rate, output sample rate, and the pitch multiplier (so for
example, a 22050 Hz input sample rate with a 44100 Hz output sample rate
has an inherent 0.5 bias factor in the stepping value just to play at
the correct speed, which is multiplied with the pitch to get the desired
pitch shift).
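As a rough illustration of the stepping calculation described above,
here is a minimal C sketch. It is not OpenAL Soft's actual code: the
names FRAC_BITS, FRAC_ONE and compute_step are hypothetical, and the
rounding and clamping choices are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 12                  /* 12 bits of fractional precision */
#define FRAC_ONE  (1u << FRAC_BITS)   /* 4096 represents a step of 1.0 */

/* Hypothetical helper: converts rates and pitch into a fixed-point step.
 * The in_rate/out_rate ratio is the inherent bias factor; pitch scales it. */
static uint32_t compute_step(uint32_t in_rate, uint32_t out_rate, double pitch)
{
    assert(out_rate > 0 && pitch > 0.0);

    double step = pitch * (double)in_rate / (double)out_rate * FRAC_ONE;

    /* Assumption: clamp so the conversion to uint32_t stays defined. */
    if (step > (double)UINT32_MAX)
        step = (double)UINT32_MAX;

    uint32_t fixed = (uint32_t)(step + 0.5);  /* round to nearest */
    return fixed ? fixed : 1;                 /* never let the offset stall */
}
```

With these definitions, 22050 Hz input on a 44100 Hz output at pitch
1.0 gives a step of 2048 (0.5 in fixed point), and raising the pitch to
2.0 gives 4096 (one input sample per output sample).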
The actual resampling is essentially a FIR filter. An output sample is
generated by filtering the input samples around the source's fixed-point
offset using some method. Then the source's offset is incremented by the
stepping value and it goes to the next output sample. This is repeated
until the end of the input samples is reached or the output is
filled.
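The loop described above might be sketched as follows, using the point
method (drop the fraction) as the simplest possible "filter". The
ResampleState struct and resample_point function are hypothetical names
for this example, not OpenAL Soft's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAC_BITS 12
#define FRAC_ONE  (1u << FRAC_BITS)
#define FRAC_MASK (FRAC_ONE - 1u)

/* Hypothetical per-source state: the offset split into two 32-bit ints. */
typedef struct {
    uint32_t pos;   /* whole input samples */
    uint32_t frac;  /* fractional part, 0 .. FRAC_ONE-1 */
} ResampleState;

/* Produces up to out_len samples; returns how many were written.
 * Stops early when the offset reaches the end of the input. */
static size_t resample_point(ResampleState *st, uint32_t step,
                             const float *in, size_t in_len,
                             float *out, size_t out_len)
{
    assert(step > 0);
    size_t n = 0;
    while (n < out_len && st->pos < in_len) {
        out[n++] = in[st->pos];            /* point: ignore the fraction */

        /* Advance the fixed-point offset by the stepping value, carrying
         * whole samples out of the fraction. A 64-bit sum avoids overflow. */
        uint64_t sum = (uint64_t)st->frac + step;
        st->pos += (uint32_t)(sum >> FRAC_BITS);
        st->frac = (uint32_t)(sum & FRAC_MASK);
    }
    return n;
}
```

Because the state lives outside the function, the caller can invoke it
once per mixing block and the offset carries over between calls.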
In regards to the resamplers themselves, point and linear are the
easiest, where the former just drops the fractional offset and the
latter uses the two samples around the current offset with the
fractional component being the blend factor between them, but they also
have a fair bit of noise. Because of how simple they are though, changing the
pitch is accomplished by merely calculating a new stepping value. The
higher order resamplers make use of precalculated, rate-specific filter
tables to improve quality, in which case changing the pitch needs to
calculate the new stepping value and recalculate the table offset for
the new rate (not difficult if done right, but something to be aware of).
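For the linear case, a minimal sketch under the same assumptions
(12-bit fraction, float input) could look like the code below. The
names LinearState and resample_linear are hypothetical. It reads
in[pos] and in[pos + 1], so it stops one sample before the end of the
input unless the caller supplies padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAC_BITS 12
#define FRAC_ONE  (1u << FRAC_BITS)
#define FRAC_MASK (FRAC_ONE - 1u)

typedef struct {
    uint32_t pos;   /* whole input samples */
    uint32_t frac;  /* fractional part, 0 .. FRAC_ONE-1 */
} LinearState;

/* Blends the two samples around the offset, using the fraction as the
 * blend factor. Returns the number of output samples written. */
static size_t resample_linear(LinearState *st, uint32_t step,
                              const float *in, size_t in_len,
                              float *out, size_t out_len)
{
    assert(step > 0);
    size_t n = 0;
    while (n < out_len && (size_t)st->pos + 1 < in_len) {
        float mu = (float)st->frac * (1.0f / FRAC_ONE);
        float a = in[st->pos];
        float b = in[st->pos + 1];
        out[n++] = a + (b - a) * mu;

        uint64_t sum = (uint64_t)st->frac + step;
        st->pos += (uint32_t)(sum >> FRAC_BITS);
        st->frac = (uint32_t)(sum & FRAC_MASK);
    }
    return n;
}
```

A pitch change between calls is then just a different step argument;
the state (pos, frac) is untouched, so playback continues from the same
fractional offset at the new rate.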
As a notable detail, OpenAL Soft "loads" samples from the source input
into some reusable scratch memory and supplies that as the resampler
input. This helps with resamplers that want to use future samples (such
as linear or the higher order sinc resamplers) by guaranteeing
0-amplitude samples after the end of the input with no risk of
overrunning. This also allows for perfect looping and multi-buffer
queues by concatenating samples into a continuous stream for the
resampler to act on, ensuring no glitches when crossing boundaries. It's
also used for deinterleaving multichannel streams and converting samples
to floats, as the resamplers can then assume single-channel float32
input (to avoid having multiple versions of the same resampler for
different channel configurations or sample types).
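The "load" step could be sketched roughly like this. It is a simplified
assumption-laden example, not OpenAL Soft's implementation: the function
name load_scratch is hypothetical, the input is assumed to be
interleaved 16-bit PCM in a single buffer, and multi-buffer queues are
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies `count` frames of one channel, starting at frame `start`, from
 * interleaved int16 input into a single-channel float scratch buffer.
 * Past the end of the input it either wraps around (looping) or writes
 * 0-amplitude samples, so the resampler can read ahead safely. */
static void load_scratch(const int16_t *src, size_t src_frames,
                         size_t channels, size_t chan,
                         size_t start, int looping,
                         float *scratch, size_t count)
{
    assert(channels > 0 && chan < channels);
    for (size_t i = 0; i < count; i++) {
        size_t frame = start + i;
        if (frame >= src_frames) {
            if (!looping || src_frames == 0) {
                scratch[i] = 0.0f;         /* silence after the end */
                continue;
            }
            frame %= src_frames;           /* continue from the loop start */
        }
        /* Deinterleave and convert to float in one pass. */
        scratch[i] = (float)src[frame * channels + chan] / 32768.0f;
    }
}
```

The caller would request a few more frames than the resampler is
expected to consume, so that methods which look at future samples
always have valid data to read.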
Hope that helps with understanding it. If you still need something
clarified, feel free to ask.