[openal] AL_SOFT_UHJ proposal
Chris Robinson
chris.kcat at gmail.com
Sun Apr 3 21:00:20 EDT 2022
One of the last two extensions before the next version, as long as there are
no big concerns or issues. As mentioned here:
https://openal.org/pipermail/openal/2021-December/000812.html
this is adding support for 2-, 3-, and 4-channel UHJ buffer formats, and a
Super Stereo processing mode for sources.
Feedback is welcome.
-------------- next part --------------
Name
AL_SOFT_UHJ
Contributors
Chris Robinson
Contact
Chris Robinson (chris.kcat 'at' gmail.com)
Status
In progress
Dependencies
This extension is for OpenAL 1.1.
This extension requires AL_EXT_BFORMAT.
Overview
This extension adds support for UHJ channel formats and a Super Stereo
(a.k.a. Stereo Enhance) processor. UHJ is a method of encoding surround
sound from a first-order B-Format signal into a stereo-compatible signal.
Such signals can be played as normal stereo (with more stable and wider
stereo imaging than pan-pot mixing) or decoded back to surround sound,
which makes it a decent choice where 3+ channel surround sound isn't
available or desirable. When decoded, a UHJ signal behaves like B-Format,
which allows it to be rotated through AL_EXT_BFORMAT's source orientation
property as with its B-Format formats.
The standard equation for decoding UHJ to B-Format is:
S = Left + Right
D = Left - Right
W = 0.981532*S + 0.197484*j(0.828331*D + 0.767820*T)
X = 0.418496*S - j(0.828331*D + 0.767820*T)
Y = 0.795968*D - 0.676392*T + j(0.186633*S)
Z = 1.023332*Q
where j is a wide-band +90 degree phase shift. 2-channel UHJ excludes the
T and Q input channels, and 3-channel excludes the Q input channel. Be
aware that the resulting W, X, Y, and Z signals are 3dB louder than their
FuMa counterparts, and the implementation should account for that to
properly balance it against other sounds.
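The matrix part of the standard equation can be sketched in C as below. This is only an illustration, not the reference implementation: the names BFormatFrame and uhj_decode_frame are hypothetical, and the wide-band phase shift j() is assumed to have been applied already by a separate filter, whose outputs are passed in as j_dt and j_s. The 3dB level compensation is also left out.

```c
/* One frame of first-order B-Format output (hypothetical type). */
typedef struct { float w, x, y, z; } BFormatFrame;

/*
 * Applies the standard UHJ decode matrix to a single frame.
 *
 * left, right, t, q: UHJ input samples. Pass t = q = 0 for 2-channel
 *                    UHJ, and q = 0 for 3-channel UHJ.
 * j_dt: the matching sample of j(0.828331*D + 0.767820*T), produced by a
 *       +90 degree phase-shift filter that is NOT implemented here.
 * j_s:  the matching sample of j(S). Since the phase shift is linear,
 *       j(0.186633*S) is taken as 0.186633*j(S).
 */
static BFormatFrame uhj_decode_frame(float left, float right, float t,
                                     float q, float j_dt, float j_s)
{
    const float s = left + right;
    const float d = left - right;
    BFormatFrame out;

    out.w = 0.981532f*s + 0.197484f*j_dt;
    out.x = 0.418496f*s - j_dt;
    out.y = 0.795968f*d - 0.676392f*t + 0.186633f*j_s;
    out.z = 1.023332f*q;
    return out;
}
```

A real decoder would run this per sample over whole blocks, with the phase-shift filter's delay matched on the unshifted terms.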
An alternative equation for decoding 2-channel-only UHJ is:
S = Left + Right
D = Left - Right
W = 0.981532*S + j(0.163582*D)
X = 0.418496*S - j(0.828331*D)
Y = 0.762956*D + j(0.384230*S)
Which equation to use depends on the implementation and user preferences.
It's relevant to note that the standard equation is reversible with the
standard encoding equations, meaning decoding UHJ to B-Format and then
encoding B-Format to UHJ results in the original UHJ signal, even for
2-channel.
One additional note for decoding 2-channel UHJ is the resulting B-Format
signal should pass through alternate shelf filters for frequency-dependent
processing. For the standard equation, suitable shelf filters are given
as:
W: LF = 0.661, HF = 1.000
X/Y: LF = 1.293, HF = 1.000
And for the alternative equation, suitable shelf filters are given as:
W: LF = 0.646, HF = 1.000
X/Y: LF = 1.263, HF = 1.000
3- and 4-channel UHJ should use the normal shelf filters for B-Format.
Super Stereo is a technique for processing a plain (non-UHJ) stereo signal
to derive a B-Format signal. It's backed by the same functionality as UHJ
decoding, making it an easy addition on top of UHJ support. Super Stereo
has a variable width control, allowing the stereo soundfield to encompass
more or less around the listener while maintaining a stable center image
(a more naive virtual speaker approach would cause the center image to
collapse as the soundfield widens). As this derives a B-Format signal like
UHJ, it similarly allows such sources to be rotated through the source
orientation property.
There are various forms of Super Stereo, with varying equations, but a
good option is:
S = Left + Right
D = Left - Right
W = 0.6098637*S - j(0.6896511*w*D)
X = 0.8624776*S + j(0.7626955*w*D)
Y = 1.6822415*w*D - j(0.2156194*S)
where w is a variable width control, in the range [0...0.7]. As with UHJ,
the resulting W, X, and Y signals are 3dB louder than their FuMa
counterparts (no Z signal is produced). The normal shelf filters for
playing B-Format should apply.
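As a rough sketch, the Super Stereo equations above could be applied per frame as follows. The names SuperStereoFrame, clamp_width, and super_stereo_frame are hypothetical, the phase-shifted samples j(D) and j(S) are assumed to come from a separate +90 degree phase-shift filter, and the width is assumed to be constant over the filter's length so that it can be factored out of j().

```c
/* Horizontal B-Format frame produced by Super Stereo (hypothetical type). */
typedef struct { float w, x, y; } SuperStereoFrame;

/* Clamps the width factor to the [0, 0.7] range given for these equations. */
static float clamp_width(float width)
{
    if (width < 0.0f) return 0.0f;
    if (width > 0.7f) return 0.7f;
    return width;
}

/*
 * Applies the Super Stereo matrix to a single stereo frame.
 *
 * j_d and j_s are the matching samples of j(D) and j(S), produced by a
 * phase-shift filter that is NOT implemented here. Factoring the width
 * and coefficients out of j() relies on the filter being linear and the
 * width not changing over time.
 */
static SuperStereoFrame super_stereo_frame(float left, float right,
                                           float width, float j_d, float j_s)
{
    const float s = left + right;
    const float d = left - right;
    const float w = clamp_width(width);
    SuperStereoFrame out;

    out.w = 0.6098637f*s - 0.6896511f*w*j_d;
    out.x = 0.8624776f*s + 0.7626955f*w*j_d;
    out.y = 1.6822415f*w*d - 0.2156194f*j_s;
    return out;
}
```

With a width of 0, only the S-derived terms remain, so the D (side) content drops out entirely.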
Issues
Q: 3- and 4-channel UHJ weren't widely, if ever, used, in part due to the
extra channels not being stereo-compatible (players need to be aware to
drop them if not decoding them) and making it more practical and
efficient to use B-Format directly. Why include them here?
A: UHJ is a hierarchical system, where 3-channel is a subset of 4-channel,
and 2-channel is a subset of 3-channel. There's little extra work
necessary to support them, and there are techniques for getting 3- and
4-channel UHJ into a stereo-compatible stream, so having the option is
not a bad idea.
Q: Why include Super Stereo here as it's not strictly UHJ?
A: Super Stereo is built on the same structure as UHJ, utilizing phase
shift filters to generate a B-Format signal from pre-existing stereo
content. Given the similarity in functionality, it provides a good
option for handling stereo content. Additionally, even in the hardware
space it's not uncommon for UHJ decoders to have Super Stereo
capabilities, so it makes sense to have it here too.
Q: Super Stereo seems to have a width factor limit of 0.7, but the
AL_SUPER_STEREO_WIDTH_SOFT attribute goes up to 1.0. Why?
A: For flexibility of implementation. If a method is developed that allows
using wider factors, an arbitrary 0.7 limit would be unnecessary. There
is some precedent for this with the source's AL_PITCH property being
any finite non-negative value, but an implementation internally clamps
to its own limits.
New Procedures and Functions
None.
New Tokens
Accepted by the <format> parameter of alBufferData:
AL_FORMAT_UHJ2CHN8_SOFT 0x19A2
AL_FORMAT_UHJ2CHN16_SOFT 0x19A3
AL_FORMAT_UHJ2CHN_FLOAT32_SOFT 0x19A4
AL_FORMAT_UHJ3CHN8_SOFT 0x19A5
AL_FORMAT_UHJ3CHN16_SOFT 0x19A6
AL_FORMAT_UHJ3CHN_FLOAT32_SOFT 0x19A7
AL_FORMAT_UHJ4CHN8_SOFT 0x19A8
AL_FORMAT_UHJ4CHN16_SOFT 0x19A9
AL_FORMAT_UHJ4CHN_FLOAT32_SOFT 0x19AA
Accepted by the <param> parameter of alSourcei, alSourceiv, alGetSourcei,
and alGetSourceiv:
AL_STEREO_MODE_SOFT 0x19B0
Accepted by the <param> parameter of alSourcef, alSourcefv, alGetSourcef,
and alGetSourcefv:
AL_SUPER_STEREO_WIDTH_SOFT 0x19B1
Accepted by the <value> parameter of alSourcei and alSourceiv for
AL_STEREO_MODE_SOFT:
AL_NORMAL_SOFT 0x0000
AL_SUPER_STEREO_SOFT 0x0001
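For applications building against headers that predate this extension, the tokens above could be declared locally, along these lines. The AL_SOFT_UHJ guard macro and the uhj_format_channels helper are assumptions made for illustration; only the token names and values come from the list above.

```c
/* Fallback declarations; the guard macro name is an assumption. */
#ifndef AL_SOFT_UHJ
#define AL_SOFT_UHJ 1
#define AL_FORMAT_UHJ2CHN8_SOFT         0x19A2
#define AL_FORMAT_UHJ2CHN16_SOFT        0x19A3
#define AL_FORMAT_UHJ2CHN_FLOAT32_SOFT  0x19A4
#define AL_FORMAT_UHJ3CHN8_SOFT         0x19A5
#define AL_FORMAT_UHJ3CHN16_SOFT        0x19A6
#define AL_FORMAT_UHJ3CHN_FLOAT32_SOFT  0x19A7
#define AL_FORMAT_UHJ4CHN8_SOFT         0x19A8
#define AL_FORMAT_UHJ4CHN16_SOFT        0x19A9
#define AL_FORMAT_UHJ4CHN_FLOAT32_SOFT  0x19AA
#define AL_STEREO_MODE_SOFT             0x19B0
#define AL_SUPER_STEREO_WIDTH_SOFT      0x19B1
#define AL_NORMAL_SOFT                  0x0000
#define AL_SUPER_STEREO_SOFT            0x0001
#endif

/*
 * Hypothetical helper: returns the channel count (2, 3, or 4) of a UHJ
 * format token, or 0 if the token is not one of the UHJ formats. It
 * relies on the nine format values being contiguous, three per channel
 * count, as listed above.
 */
static int uhj_format_channels(int format)
{
    if (format < AL_FORMAT_UHJ2CHN8_SOFT ||
        format > AL_FORMAT_UHJ4CHN_FLOAT32_SOFT)
        return 0;
    return 2 + (format - AL_FORMAT_UHJ2CHN8_SOFT) / 3;
}
```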
Additions to Specification
UHJ Buffer Formats
The formats AL_FORMAT_UHJ2CHN8_SOFT, AL_FORMAT_UHJ2CHN16_SOFT,
AL_FORMAT_UHJ2CHN_FLOAT32_SOFT, AL_FORMAT_UHJ3CHN8_SOFT,
AL_FORMAT_UHJ3CHN16_SOFT, AL_FORMAT_UHJ3CHN_FLOAT32_SOFT,
AL_FORMAT_UHJ4CHN8_SOFT, AL_FORMAT_UHJ4CHN16_SOFT, and
AL_FORMAT_UHJ4CHN_FLOAT32_SOFT may be used for buffering data to functions
like alBufferData.
8-bit data is expressed as an unsigned value over the range 0 to 255, 128
being an audio output level of zero.
16-bit data is expressed as a signed value over the range -32768 to 32767,
0 being an audio output level of zero. Byte order for 16-bit values is
determined by the native format of the CPU.
32-bit float data is expressed as a signed value with the normalized range
-1.0 to +1.0, 0.0 being an audio output level of zero. Byte order for
32-bit values is determined by the native format of the CPU.
These formats are interleaved, with UHJ2 having the left and right samples
in order, UHJ3 having the left, right, and T samples in order, and UHJ4
having left, right, T, and Q samples in order.
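To illustrate the interleaved layout, the following sketch splits 16-bit UHJ data into one float array per channel. The function names are hypothetical, and scaling by 1/32768 is just one common way to normalize 16-bit samples, not something this extension mandates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of complete frames held in size_bytes of interleaved data. */
static size_t uhj_frame_count(size_t size_bytes, int channels,
                              size_t bytes_per_sample)
{
    assert(channels >= 2 && channels <= 4);
    return size_bytes / ((size_t)channels * bytes_per_sample);
}

/*
 * Splits interleaved 16-bit UHJ samples into per-channel planes.
 * planes[0] receives Left, planes[1] Right, planes[2] T (UHJ3 and UHJ4),
 * and planes[3] Q (UHJ4 only). Each plane must hold `frames` floats.
 */
static void uhj_deinterleave16(const int16_t *src, size_t frames,
                               int channels, float *planes[])
{
    assert(channels >= 2 && channels <= 4);
    for (size_t i = 0; i < frames; ++i) {
        for (int c = 0; c < channels; ++c) {
            const int16_t v = src[i*(size_t)channels + (size_t)c];
            planes[c][i] = (float)v / 32768.0f;
        }
    }
}
```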
UHJ formats are decoded and played according to the rules of BFORMAT
buffer formats. When played, such formats may be oriented according to the
source's AL_ORIENTATION and AL_SOURCE_RELATIVE properties.
Super Stereo Processing
When playing Stereo formats, a source may opt to enable Super Stereo
processing with the AL_STEREO_MODE_SOFT attribute.
Name                 Signature  Values                Default
-------------------  ---------  --------------------  --------------
AL_STEREO_MODE_SOFT  i,iv       AL_NORMAL_SOFT,       AL_NORMAL_SOFT
                                AL_SUPER_STEREO_SOFT
Description: When AL_STEREO_MODE_SOFT is set to AL_NORMAL_SOFT, Stereo
formats are processed and mixed as normal for multi-channel formats. When
set to AL_SUPER_STEREO_SOFT, Stereo formats are processed with a Super
Stereo (sometimes called Stereo Enhance) algorithm. In this mode, the
stereo sound is converted to B-Format using the width factor specified by
AL_SUPER_STEREO_WIDTH_SOFT, and is treated as a B-Format source which may
be oriented with this source's AL_ORIENTATION and AL_SOURCE_RELATIVE
properties.
This attribute cannot be changed while the source is in an AL_PLAYING or
AL_PAUSED state, and it has no effect when the source is not playing a
STEREO format.
Name                        Signature  Values        Default
--------------------------  ---------  ------------  -------
AL_SUPER_STEREO_WIDTH_SOFT  f,fv       [0.0f, 1.0f]  I/D
Description: The width factor for the resulting soundfield using Super
Stereo processing. The default value is implementation-defined, with a
suggested value that provides good quality for a wide range of stereo
content. An implementation may internally clamp the maximum value
depending on the limits imposed by the selected algorithm.
Has no effect when AL_STEREO_MODE_SOFT is not AL_SUPER_STEREO_SOFT or the
source is not playing a STEREO format.
Errors
An AL_INVALID_OPERATION error is generated if an attempt is made to set
AL_STEREO_MODE_SOFT on a source while it's in an AL_PLAYING or AL_PAUSED
state.
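The error rule above might be checked inside an implementation roughly as follows. This is a stand-alone model rather than real OpenAL code: the SKETCH_ constants mirror the values of AL_NO_ERROR, AL_INVALID_OPERATION, and the source state tokens from the OpenAL 1.1 headers, and stereo_mode_set_error is a hypothetical name.

```c
/* Local copies of OpenAL 1.1 values, prefixed to avoid clashing with al.h. */
enum {
    SKETCH_AL_NO_ERROR          = 0,
    SKETCH_AL_INITIAL           = 0x1011,
    SKETCH_AL_PLAYING           = 0x1012,
    SKETCH_AL_PAUSED            = 0x1013,
    SKETCH_AL_STOPPED           = 0x1014,
    SKETCH_AL_INVALID_OPERATION = 0xA004
};

/*
 * Returns the error that an attempt to set AL_STEREO_MODE_SOFT would
 * generate for a source currently in source_state. Only the state rule
 * from this section is modeled; validation of the value itself is not.
 */
static int stereo_mode_set_error(int source_state)
{
    if (source_state == SKETCH_AL_PLAYING ||
        source_state == SKETCH_AL_PAUSED)
        return SKETCH_AL_INVALID_OPERATION;
    return SKETCH_AL_NO_ERROR;
}
```

From the application side, this means the mode should be selected while the source is in the initial or stopped state, before playback starts.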