FrostByte Unveiled: Crafting Planewaves in a 3D Audio World
30/09/2021
With the development of FrostByte, I sought to further my learning within the field of audio programming and spatial audio by tackling Ambisonic plugin development. Whilst this felt like a big shift in complexity, the previous research conducted on mid-side processing provided a strong backbone, solidifying a basic intuition for constructing channel matrix algorithms.
To develop FrostByte, I worked through academic literature on various Ambisonic encoding and decoding methods to understand the underlying theory, then applied this theory by constructing Ambisonic Digital Signal Processing (DSP) algorithms within several basic plugins. Along the way, I deepened my understanding of scalable and efficient code design through more sophisticated object-oriented programming (OOP) concepts, and furthered my knowledge of the JUCE framework to build more professional and aesthetically designed Graphical User Interfaces (GUIs).
Design Brief
Functionality
Digital Signal Processing
- Conditional statements in the AudioProcessor that allow the use of one DSP decoding/output algorithm at a time
- Mono planewave encoding to FOA AmbiX DSP algorithm
- FOA AmbiX to Blumlein mid-side stereo decoding DSP algorithm
- FOA AmbiX to Blumlein crossed pair stereo decoding algorithm
- Additional FOA AmbiX output option
Parameters
- A parameter which is assigned the current state of the decode menu to trigger the correct conditional statement for the matched decode DSP algorithm
- Parameters to contain and apply the values from the azimuth and elevation sliders to the respective DSP algorithms in the mono planewave encode state
- Parameters to contain and apply the values from the width and elevation sliders to the respective DSP algorithms in the stereo planewave encode state
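The conditional selection described in the lists above could be sketched roughly as follows. This is only an illustration: the names `DecodeMode`, `modeFromMenuIndex` and `outputChannelsFor` are hypothetical and not taken from the FrostByte source, and the menu index order is assumed to match the order of the menu described below (mid-side first as the default).

```cpp
// Hypothetical names for illustration only; not taken from the FrostByte source.
// Assumed menu order: mid-side (default), crossed pair, AmbiX output.
enum class DecodeMode { MidSide = 0, CrossedPair = 1, AmbiXOut = 2 };

// Translate the decode menu's current index (as held by a parameter)
// into a mode. Unknown indices fall back to the default mid-side decode.
inline DecodeMode modeFromMenuIndex (int menuIndex)
{
    switch (menuIndex)
    {
        case 1:  return DecodeMode::CrossedPair;
        case 2:  return DecodeMode::AmbiXOut;
        default: return DecodeMode::MidSide;
    }
}

// Number of output channels written by each mode: the two stereo
// decodes write 2 channels, the AmbiX output passes all 4 FOA channels.
inline int outputChannelsFor (DecodeMode mode)
{
    switch (mode)
    {
        case DecodeMode::MidSide:     return 2;
        case DecodeMode::CrossedPair: return 2;
        case DecodeMode::AmbiXOut:    return 4;
    }
    return 0; // unreachable for valid enum values
}
```

In a processing callback, the mode would be read once per block and used in a single `switch`, so that only one decode/output algorithm runs at a time.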
Features
Menus
- One independently addressable drop down menu from which the user can select the decode/output options available.
- The decode menu defaults to the mid-side decode and provides a drop down when clicked from which the crossed pair decode and an AmbiX output mode can be selected.
Mono encoding
- Two addressable rotary sliders which link to the azimuth and elevation parameters in the audio engine thread.
- Units displayed in degrees (°) to provide a more intuitive representation of the rotation angle.
- A range of 180° of rotation left and right on the azimuth plane
- A range of 180° of rotation up and down on the elevation plane
- A visual top-down representation of the mono encoded planewave on the surface of the Ambisonic isotropic sphere.
Ambisonic Mono Planewave Encoding
Planewave encoding represents a sound source at an infinite distance, emulating a straight wavefront arriving from a defined direction. In practice, this allows direct control over the azimuth and elevation placement of a point source along the surface of the isotropic Ambisonic sphere.
I found the following planewave encoding method in Lossius and Anderson's (2014) paper, which elaborates upon the encoding calculations used by their ATK Reaper plugin suite. The following parameters are considered:
- Gain – g
- Azimuth – θ
- Elevation – ϕ
- Directness – γ
These parameters are incorporated in the following matrix for FuMa B-Format encoding:
In the equations provided in Lossius and Anderson's paper, the azimuth of the source is represented as θ and the elevation as ϕ. These are the values I allow the user to alter in the planewave encoding plugin, making it possible to change the placement of a point source across the surface of the Ambisonic isotropic sphere.
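As a rough sketch of how gain, azimuth and elevation might be applied per sample, the block below uses the standard first-order FuMa planewave equations (W = g/√2, X = g·cosθ·cosϕ, Y = g·sinθ·cosϕ, Z = g·sinϕ). It is an assumption that this matches the paper's matrix in the fully direct case; the directness parameter γ is deliberately left out because its exact form is not reproduced here. The struct and function names are hypothetical. Since the design brief targets AmbiX, a first-order FuMa to AmbiX conversion (ACN channel order, SN3D normalisation) is included as well.

```cpp
#include <cmath>

// Hypothetical types for illustration; not taken from the FrostByte source.
struct FuMaFrame  { float w, x, y, z; };             // FuMa channel order W, X, Y, Z
struct AmbiXFrame { float acn0, acn1, acn2, acn3; }; // ACN order W, Y, Z, X

inline float degreesToRadians (float degrees)
{
    return degrees * 3.14159265358979f / 180.0f;
}

// Standard first-order FuMa planewave encode of one mono sample.
// Azimuth is positive towards the left, elevation positive upwards.
// Assumption: the directness term (gamma) is not modelled.
inline FuMaFrame encodePlanewaveFuMa (float sample, float gain,
                                      float azimuthDegrees, float elevationDegrees)
{
    const float theta = degreesToRadians (azimuthDegrees);
    const float phi   = degreesToRadians (elevationDegrees);
    const float s     = gain * sample;

    return { s / std::sqrt (2.0f),
             s * std::cos (theta) * std::cos (phi),
             s * std::sin (theta) * std::cos (phi),
             s * std::sin (phi) };
}

// First-order FuMa to AmbiX: reorder to ACN (W, Y, Z, X) and undo the
// 1/sqrt(2) weighting on W to reach SN3D normalisation.
inline AmbiXFrame fuMaToAmbiX (const FuMaFrame& f)
{
    return { f.w * std::sqrt (2.0f), f.y, f.z, f.x };
}
```

For example, a unit sample encoded at 0° azimuth and 0° elevation ends up entirely in W and X, while 90° azimuth moves the directional energy into Y.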
Ambisonic Pantophonic Stereo Decoding
Using only 2 output channels and a gain matrix, stereo decoding methods unfortunately cannot provide a lossless conversion from the periphonic soundfield representation possible in the Ambisonic domain. Only spatialisation on the horizontal plane is represented, as the Z component in FuMa (the ACN 2 channel in AmbiX) is not incorporated in the decoding matrix. This type of spatial representation is also known as a pantophonic soundfield. Note, however, that this type of decoding is highly compatible with stereo speaker playback.
Blumlein Stereo Pantophonic Decode
One such stereo decoding method is the mid-side decode, in which a mid (mono) signal and a side signal are extracted from the FOA channels and passed through the mid-side matrix. In FuMa notation, the W and X components are combined in a calculation to provide the mid signal (this includes a weighting of 1/√2 for the W component), and the Y component is extracted as the side signal. The side signal is then added to the mid signal to provide the left channel information, and subtracted from the mid signal to provide the right channel information. Here is this decoding matrix illustrated as pseudo code:
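A minimal C++ sketch of the steps described above follows. The text does not state how W and X are combined beyond the 1/√2 weighting on W, so a plain sum is assumed here, and no output scaling is applied; both are assumptions rather than the plugin's confirmed matrix. The names `StereoSample` and `decodeMidSideFuMa` are hypothetical.

```cpp
#include <cmath>

// Hypothetical type for illustration; not taken from the FrostByte source.
struct StereoSample { float left, right; };

// Mid-side decode of one FuMa FOA frame. Z is ignored, so the result
// is horizontal-only (pantophonic).
// Assumption: mid is a plain sum of the weighted W and the X component.
inline StereoSample decodeMidSideFuMa (float w, float x, float y)
{
    const float mid  = w * (1.0f / std::sqrt (2.0f)) + x; // W weighted by 1/sqrt(2), plus X
    const float side = y;                                 // Y is the side signal

    return { mid + side,   // left  = mid + side
             mid - side }; // right = mid - side
}
```

With this layout, a source panned to the left (positive Y) raises the left channel and lowers the right by the same amount, while a centred source (Y = 0) produces identical left and right outputs.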
This decode method provides a very literal pantophonic representation of the Ambisonic soundfield.
Blumlein Crossed Pair Decode
To provide additional stereo decoding functionality, I also looked into the Blumlein Crossed Pair or X/Y decode for FOA. The matrix for this decode operates in a slightly different manner compared to the mid-side decode, using only the X and Y components in FuMa notation. The X component carries the front-back information of the sound, whilst Y accounts for its lateral placement. Here is this decoding matrix illustrated as pseudo code:
I believe this type of decoding method would be more appropriate for the static placement of stereo sources, whilst the mid-side decode seems more favourable for creating dynamic lateral movement for mono sources.
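One way to sketch this in C++ is the textbook form of a Blumlein pair: two virtual figure-of-eight microphones aimed at ±45°, giving L = (X + Y)/√2 and R = (X − Y)/√2. This is assumed to correspond to the decode described above, but it is not confirmed to be the exact matrix used in the plugin; the names `StereoPair` and `decodeCrossedPairFuMa` are hypothetical.

```cpp
#include <cmath>

// Hypothetical type for illustration; not taken from the FrostByte source.
struct StereoPair { float left, right; };

// Textbook Blumlein crossed pair from FuMa X and Y only (W and Z unused).
// Each output is a virtual figure-of-eight aimed 45 degrees off centre.
inline StereoPair decodeCrossedPairFuMa (float x, float y)
{
    const float k = 1.0f / std::sqrt (2.0f); // cos(45 deg) = sin(45 deg)

    return { k * (x + y),   // left:  figure-of-eight at +45 degrees
             k * (x - y) }; // right: figure-of-eight at -45 degrees
}
```

A source encoded at 45° to the left (X = Y = cos 45°) lands fully in the left channel and is nulled in the right, which is the defining behaviour of the crossed pair.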