[openal] Panning, Ambisonics, and HRTF [also, TEEM]

Mon Sep 15 06:42:30 EDT 2014

> -----Original Message-----
> From: Chris Robinson [mailto:chris.kcat at gmail.com]
> Sent: 15 September 2014 02:41
> To: openal mailing list; Richard Furse
> Subject: Re: Panning, Ambisonics, and HRTF [also, TEEM]
> 
> [...]
> 
> It is neat. I'm kinda sad there aren't more ambisonic audio files (.amb)
> available, since even a first-order 4-channel encoding can give a pretty
> good surround sound response. But I suppose one of the big issues there,
> aside from general support, is file size. Since .amb files are stored as
> uncompressed wav, even a few minutes of audio can easily be 100+MB.
> Would be nice to see a variation of the format that's compressed and
> stored as FLAC (if not Vorbis or Opus, though perhaps the lossy nature
> of those codecs would be too much of a problem).

Yep, it's been suggested. We've implemented a simple lossless encoder in the TEEM SDK, which works fine, but lossy is trickier because there's the channel hierarchy and phase to worry about.

> I'm a little confused on .amb files themselves though, since I've read
> about two different channel orderings available. One being the
> WXYZRST... and the other being WYZXVTR... (aka ACN, Ambisonic Channel
> Number). There doesn't seem to be a way to tell which is used. 

The WAV extension predates ACN, so these files should always use FuMa channel ordering (WXYZRSTUV...).
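
For anyone who needs to hand FuMa-ordered channels to an ACN-ordered decoder, here is a minimal Python sketch of the reordering, based only on the two letter sequences quoted above. The names `ACN_FROM_FUMA` and `fuma_to_acn_order` are made up for illustration. It assumes a full-sphere set of 1, 4, or 9 channels, and it changes channel order only: the per-channel gain differences between FuMa and ACN-based conventions are deliberately not handled.

```python
# FuMa order: W X Y Z R S T U V
# ACN order:  W Y Z X V T R S U
# Entry k is the FuMa index of the channel that sits at ACN position k.
ACN_FROM_FUMA = [0, 2, 3, 1, 8, 6, 4, 5, 7]


def fuma_to_acn_order(frame):
    """Reorder one frame of FuMa-ordered samples into ACN order.

    Only the channel order changes; gain/normalisation differences
    between the conventions are NOT handled here.
    """
    n = len(frame)
    if n not in (1, 4, 9):
        raise ValueError("expected a full-sphere set of 1, 4, or 9 channels")
    return [frame[i] for i in ACN_FROM_FUMA[:n]]


# Use channel letters instead of samples to make the mapping visible.
print("".join(fuma_to_acn_order(list("WXYZ"))))
print("".join(fuma_to_acn_order(list("WXYZRSTUV"))))
```

Because both orderings keep each order's channels grouped together, the first `n` entries of the table are enough for a lower-order file.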

> Also,
> there's a bit of uncertainty with the GUIDs:
> http://dream.cs.bath.ac.uk/researchdev/wave-ex/bformat.html
> First, it says the B-Format integer PCM GUID is
>   {00000001-0721-11d3-8644-C8C1CA000000}
> and then, after explaining the GUID struct, says it gets written as
>   {0x00000001,0x0000,0x0010,{0x80,0x00,0x00,0xaa,0x00,0x38,0x9b,0x71}}
> which is a completely different value. I've seen both GUIDs used, but I
> have no idea what the difference is between them.

Not sure about this. Maybe one of these is for UHJ? But you can rely on libsndfile to sort this out.
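
To see that the two quoted notations really are different values, and not the same GUID written two ways, they can be compared with Python's standard `uuid` module. This is only a sketch: `guid_from_struct` is a hypothetical helper, and the field values are copied from the quote above. For what it's worth, the second value coincides with the generic PCM SubFormat GUID that ordinary WAVE_FORMAT_EXTENSIBLE files use (KSDATAFORMAT_SUBTYPE_PCM).

```python
import uuid


def guid_from_struct(data1, data2, data3, data4):
    """Build a UUID from a C-style GUID struct {Data1, Data2, Data3, Data4[8]}."""
    raw = (
        data1.to_bytes(4, "big")
        + data2.to_bytes(2, "big")
        + data3.to_bytes(2, "big")
        + bytes(data4)
    )
    return uuid.UUID(bytes=raw)


# First form quoted from the page (string notation).
bformat_pcm = uuid.UUID("{00000001-0721-11d3-8644-C8C1CA000000}")

# Second form quoted from the page (struct notation).
struct_form = guid_from_struct(
    0x00000001, 0x0000, 0x0010,
    [0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71],
)

print(bformat_pcm)
print(struct_form)
print(bformat_pcm == struct_form)

# In the file the first three fields are stored little-endian,
# so this is the byte sequence to look for in a hex dump.
print(bformat_pcm.bytes_le.hex())
```

A reader that wants to accept both could compare the SubFormat bytes against `bytes_le` of each value.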

> [...] 
> Out of curiosity, how do the different order coefficients relate? For
> instance, can I mix second- or third-order input with a first-order
> decoder, and simply treat the missing decoder coefficients as 0? And the
> same for a first-order input with a second- or third-order decoder
> (treat the missing inputs as 0)? I really hope that's the case, but I
> fear it won't be...
> [...]

Your first scenario is completely fine. The second is okay.

The orders are strictly hierarchical, each order building on the ones before. You can discard higher orders if your decoder can't/doesn't use them; you'll just get a blurrier image (as if they were never there). This is equivalent to setting the "missing" decoder coefficients to zero as you describe.
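
A small Python sketch of that equivalence, treating a decoder as a plain matrix with one row of gains per speaker. The gain values and the frame are arbitrary numbers chosen for illustration, not a real decoder design, and `decode` is a made-up helper name.

```python
def decode(matrix, frame):
    """Apply a decoder matrix (one row per speaker) to one ambisonic frame."""
    return [sum(g * s for g, s in zip(row, frame)) for row in matrix]


# Hypothetical first-order gains for two speakers, channels W X Y Z.
first_order = [
    [0.7, 0.5, 0.5, 0.0],
    [0.7, 0.5, -0.5, 0.0],
]

# The same decoder written as a second-order matrix, with the five
# second-order coefficients (R S T U V) set to zero.
zero_padded = [row + [0.0] * 5 for row in first_order]

# An arbitrary second-order frame: W X Y Z R S T U V.
frame = [1.0, 0.3, -0.2, 0.1, 0.05, -0.04, 0.03, 0.02, -0.01]

truncated_out = decode(first_order, frame[:4])  # discard the higher orders
zeroed_out = decode(zero_padded, frame)         # zero the decoder coefficients

print(truncated_out == zeroed_out)
```

Either way the second-order channels contribute nothing to the speaker feeds, which is why the two approaches give the same output.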

Feeding lower-order material to a higher-order decoder and padding with zeroes is more subtle: telling a decoder that the soundfield components are zero is not the same as telling it that they are unknown, so you'll get slightly different results depending on the decoder and/or frequency band. But this isn't a large issue. (See our "TOA First Order Injector" and "TOA Harpex Upsampler" VST plugins for a serious treatment of this issue, or you can decode using different order decoders.) IMHO you don't need to worry about this.
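
The plain zero-padding approach can be sketched in Python as below. This implements only the "components are zero" interpretation, not the upsampling done by the plugins mentioned above. It assumes full-sphere sets, where order N has (N+1)^2 channels, and an ordering that keeps each order's channels grouped (true of both FuMa and ACN), so the padding can simply go at the end. The function names are made up for illustration.

```python
def channels_for_order(order):
    """Number of channels in a full-sphere ambisonic set of the given order."""
    return (order + 1) ** 2


def pad_to_order(frame, order):
    """Zero-pad a lower-order frame to the channel count of a higher order."""
    target = channels_for_order(order)
    if len(frame) > target:
        raise ValueError("frame has more channels than the target order")
    return list(frame) + [0.0] * (target - len(frame))


first_order_frame = [1.0, 0.3, -0.2, 0.1]  # W X Y Z

second = pad_to_order(first_order_frame, 2)
third = pad_to_order(first_order_frame, 3)

print(len(second), len(third))
print(second[:4] == first_order_frame)
```

The original four channels are left untouched, so a decoder that ignores the padded channels gives the same result as a first-order decode.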

Best wishes,

--Richard