[openal] Questions about OpenAL Soft's Resampler

Thu Nov 9 18:54:49 EST 2017

This was super helpful, thanks!

I spent a couple days toying around with this and got as far as 
implementing the fixed-point offsets and writing a quick linear 
resampler using those offsets. I also have it designed pretty much as 
described, but without the stereo deinterleaving work done yet.

I think all that's left before this works is padding. How are padding 
sizes calculated, and how is padding adjusted for changing pitches? For 
now I just have an arbitrary block of memory to store padding...

https://github.com/flibitijibibo/FACT/blob/a55aaace54f91e0ba22738c0274c823b4c6c8216/src/FACT_platform_sdl2.c#L111

... but I assume that has to be different based on the ratio. Plugging 
the padding length in shouldn't be hard after that, I think:

https://github.com/flibitijibibo/FACT/blob/a55aaace54f91e0ba22738c0274c823b4c6c8216/src/FACT_platform_sdl2.c#L227

(Right now I'm being lazy and getting data lengths by converting the 
step size to a float64. Once my brain figures out how to do it without 
getting floats involved, that should be fixed too... but I figured I may 
as well do that after padding is done.)
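
For reference, here is a minimal sketch of how those data lengths could be computed with integers only. It assumes a 12-bit fractional step (the precision mentioned in the quoted message below); `FIXED_BITS`, `outputs_available`, and `inputs_needed` are illustrative names, not taken from FACT or OpenAL Soft, and any extra padding samples the resampler wants are not counted.

```c
#include <assert.h>
#include <stdint.h>

#define FIXED_BITS 12u /* assumed fractional precision */

/* How many output samples can be produced while the read position
 * (frac + k * step, in fixed point) stays inside input_samples. */
static uint64_t outputs_available(uint64_t input_samples,
                                  uint32_t frac, uint32_t step)
{
    assert(step > 0);
    assert(frac < (1u << FIXED_BITS));
    if (input_samples == 0)
        return 0;
    uint64_t span = (input_samples << FIXED_BITS) - frac;
    return (span + step - 1) / step; /* integer ceiling division */
}

/* How many input samples are touched when producing output_samples
 * outputs, not counting any padding the resampler reads past them. */
static uint64_t inputs_needed(uint64_t output_samples,
                              uint32_t frac, uint32_t step)
{
    if (output_samples == 0)
        return 0;
    uint64_t last = (uint64_t)frac + (output_samples - 1) * step;
    return (last >> FIXED_BITS) + 1;
}
```

With a step of 0.5 (2048), four input samples yield eight outputs, and eight outputs need four inputs, so the two helpers round-trip in that case.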

Lastly, and this is probably super picky, but when converting from float 
to fixed, is it better to round or ceil? Right now this helper is just 
used to calculate the stepping size, and plain rounding seems to work 
fine...

https://github.com/flibitijibibo/FACT/blob/a55aaace54f91e0ba22738c0274c823b4c6c8216/src/FACT_platform_sdl2.c#L73

... but I feel like I'm missing something important.
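
To make the question concrete, a round-to-nearest helper might look like the sketch below. It assumes 12 fractional bits, and `FIXED_BITS`, `FIXED_ONE`, and `double_to_fixed` are hypothetical names rather than the ones in the linked file. Rounding keeps the conversion error within half a fixed-point unit in either direction, while ceil always errs upward. The clamp to 1 is an added guard (an assumption, not something from the thread) so a tiny ratio can never produce a step of zero.

```c
#include <assert.h>
#include <stdint.h>

#define FIXED_BITS 12u                /* assumed fractional precision */
#define FIXED_ONE  (1u << FIXED_BITS) /* 1.0 in fixed point */

/* Hypothetical helper: convert a non-negative ratio to fixed point,
 * rounding to the nearest representable value. */
static uint32_t double_to_fixed(double value)
{
    /* The integer part has 20 bits available in a 32-bit result. */
    assert(value >= 0.0 && value < 1048575.0);
    uint32_t fixed = (uint32_t)(value * FIXED_ONE + 0.5);
    /* A step of 0 would never advance the read position. */
    return fixed > 0 ? fixed : 1;
}
```

For example, 1/3 converts to 1365 with rounding, where ceil would give 1366.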

-Ethan

On 11/07/2017 09:17 PM, Chris Robinson wrote:
> On 11/07/2017 11:10 AM, Ethan Lee wrote:
>> Hey there!
>>
>> I'm currently working on a reimplementation of Microsoft's XACT 
>> runtime, called FACT...
>>
>> https://github.com/flibitijibibo/FACT
>
> Sounds interesting.
>
>> ... but I also have to factor in pitch changes, which SDL_AudioStream 
>> doesn't account for (it only expects one input frequency for the 
>> duration of its existence).
>>
>> Normally I'd just use OpenAL for this sort of thing, but part of 
>> FACT's job is mimicking XAudio2 accurately, which unfortunately makes 
>> the two incompatible in really subtle ways (and fixing them would 
>> mean ripping up the library for what is essentially one exact use 
>> case and nothing else). I'd still like to use OpenAL's ideas though, 
>> but to keep the permissive license intact I'd have to, at most, look 
>> at papers/documents that explain how it's done, rather than just 
>> using the code directly.
>>
>> TL;DR: How does OpenAL's resampler work with adjustable pitch 
>> changes, and are there any resources online that were used as 
>> references for the resampler? I'm not too concerned about the 
>> resampling function itself (we'll probably just use a linear 
>> resampler), but I'm a lot more concerned about stuff like weird step 
>> sizes, fractions of samples, padding buffer sizes (for both 
>> resampling and possibly for output...?), handling wildly fluctuating 
>> pitches, little details like that.
>
> For OpenAL Soft, the general idea is that each source maintains a 
> fixed-point offset, split into two 32-bit ints (could probably use a 
> 64-bit int these days), with 12 bits of precision. A stepping value is 
> calculated in the same 12-bit fixed-point format based on the input 
> sample rate, output sample rate, and the pitch multiple (so for 
> example, a 22050 Hz input sample rate with a 44100 Hz output sample rate 
> has an inherent 0.5 bias factor in the stepping value just to play at 
> the correct speed, which is multiplied with the pitch to get the 
> desired pitch shift).
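
The stepping value described above can be sketched roughly as follows. This is an illustration under assumptions, not OpenAL Soft's code: it uses a single 64-bit offset (the alternative mentioned above), and `FRAC_BITS`, `compute_step`, `offset_sample`, and `offset_frac` are made-up names.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 12u                /* 12 bits of fractional precision */
#define FRAC_ONE  (1u << FRAC_BITS)
#define FRAC_MASK (FRAC_ONE - 1u)

/* Step per output sample: rate ratio times pitch, in fixed point. */
static uint32_t compute_step(uint32_t in_rate, uint32_t out_rate, double pitch)
{
    assert(in_rate > 0 && out_rate > 0 && pitch > 0.0);
    double ratio = pitch * (double)in_rate / (double)out_rate;
    assert(ratio < 1048575.0); /* must fit in 32 bits after scaling */
    uint32_t step = (uint32_t)(ratio * FRAC_ONE + 0.5);
    return step > 0 ? step : 1; /* never let the offset stall */
}

/* Whole-sample index of a fixed-point offset. */
static uint64_t offset_sample(uint64_t offset)
{
    return offset >> FRAC_BITS;
}

/* Fractional part of a fixed-point offset, in 1/4096ths of a sample. */
static uint32_t offset_frac(uint64_t offset)
{
    return (uint32_t)(offset & FRAC_MASK);
}
```

With 22050 Hz input, 44100 Hz output, and a pitch of 1.0, the step comes out as 2048, which is 0.5 in this format.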
>
> The actual resampling is essentially a FIR filter. An output sample is 
> generated by filtering the input samples around the source's 
> fixed-point offset using some method. Then the source's offset is 
> incremented by the stepping value and it goes to the next output 
> sample. This is repeated until the until the end of the input samples 
> is reached or the output is filled.
>
> In regards to the resamplers themselves, point and linear are the 
> easiest, where the former just drops the fractional offset and the 
> latter uses the two samples around the current offset with the 
> fractional component being a blend factor between them, but they also have a 
> fair bit of noise. Because of how simple they are though, changing the 
> pitch is accomplished by merely calculating a new stepping value. The 
> higher order resamplers make use of precalculated, rate-specific 
> filter tables to improve quality, in which case changing the pitch 
> needs to calculate the new stepping value and recalculate the table 
> offset for the new rate (not difficult if done right, but something to 
> be aware of).
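
A minimal sketch of the linear case, assuming single-channel float input with at least one padding sample after the real data. `resample_linear` and its parameters are hypothetical names, and this is not OpenAL Soft's implementation. The offset is kept by the caller, so a pitch change between calls only means passing a different step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAC_BITS 12u
#define FRAC_ONE  (1u << FRAC_BITS)
#define FRAC_MASK (FRAC_ONE - 1u)

/* `in` must hold in_len samples plus at least one trailing padding
 * sample, because the blend reads in[pos + 1]. Returns the number of
 * output samples written; *offset is advanced past them. */
static size_t resample_linear(const float *in, size_t in_len,
                              uint64_t *offset, uint32_t step,
                              float *out, size_t out_cap)
{
    assert(step > 0);
    size_t produced = 0;
    while (produced < out_cap) {
        uint64_t pos = *offset >> FRAC_BITS;
        if (pos >= in_len)
            break; /* ran out of input */
        /* Fractional part of the offset becomes the blend factor. */
        float mu = (float)(*offset & FRAC_MASK) * (1.0f / FRAC_ONE);
        float a = in[pos];
        float b = in[pos + 1];
        out[produced++] = a + (b - a) * mu;
        *offset += step;
    }
    return produced;
}
```

The point resampler would be the same loop with `out[produced++] = in[pos];`, dropping the fractional part entirely.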
>
> As a notable detail, OpenAL Soft "loads" samples from the source input 
> into some reusable scratch memory and supplies that as the resampler 
> input. This helps with resamplers that want to use future samples 
> (such as linear or the higher order sinc resamplers) by guaranteeing 
> 0-amplitude samples after the end of the input with no risk of 
> overrunning. This also allows for perfect looping and multi-buffer 
> queues by concatenating samples into a continuous stream for the 
> resampler to act on, ensuring no glitches when crossing boundaries. 
> It's also used for deinterleaving multichannel streams and converting 
> samples to floats, as the resamplers can then assume single-channel 
> float32 input (to avoid having multiple versions of the same resampler 
> for different channel configurations or sample types).
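
The "load" step could look something like the sketch below, which pulls one channel out of an interleaved stream, converts it to float, and zero-fills the rest of the scratch buffer as padding. The interleaved 16-bit input format and the name `load_channel` are assumptions for illustration. Concatenating looped or queued buffers into the same scratch memory is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy one channel of interleaved int16 frames into dst as float32,
 * then fill the remainder of dst with silence so a resampler can read
 * past the end of the real data safely. */
static void load_channel(const int16_t *src, size_t frames,
                         size_t channels, size_t channel,
                         float *dst, size_t dst_len)
{
    assert(channel < channels);
    assert(frames <= dst_len);
    size_t i;
    for (i = 0; i < frames; i++)
        dst[i] = (float)src[i * channels + channel] * (1.0f / 32768.0f);
    for (; i < dst_len; i++)
        dst[i] = 0.0f; /* 0-amplitude padding */
}
```

Making `dst_len` larger than `frames` by however many future samples the chosen resampler reads (one, for linear) is what provides the padding.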
>
>
> Hope that helps with understanding it. If you still need something 
> clarified, feel free to ask.