Audio Overview
Game Boy audio is sometimes called “8-bit”. The term refers not to the bit depth of the generated sound, but to sound capabilities typical of 8-bit consoles. Like much of its contemporary hardware, the Game Boy produces sound generated by simple digital circuits.
Architecture
The Game Boy has four sound generation units, called channels 1 through 4, notated “CH1”, “CH2”, etc. Unlike some other sound chips, such as the C64’s SID or the Atari 5200’s POKEY, each sound channel is specialized in a way largely different from the other channels.
Each channel generates an electronic signal; these signals are then mixed into two output channels (for stereo: one for the left ear, one for the right ear), which are individually amplified and then output either to the headphone jack or to the speaker [1].
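The mixing step can be sketched as follows. This is only an illustration: the per-side routing flags and the averaging used for the speaker are assumptions made for the example, not a description of the actual registers or analog circuitry.

```python
def mix_stereo(samples, route_left, route_right):
    """Sum the channel samples routed to each side.

    samples: one sample value per channel (CH1..CH4).
    route_left / route_right: one boolean per channel (hypothetical flags).
    """
    left = sum(s for s, on in zip(samples, route_left) if on)
    right = sum(s for s, on in zip(samples, route_right) if on)
    return left, right


def speaker_mono(left, right):
    # Assumed model of the speaker path: both sides merged into one signal.
    return (left + right) / 2


left, right = mix_stereo(
    [1.0, 0.5, 0.0, -0.25],        # CH1..CH4 samples (made-up values)
    [True, True, False, False],    # channels heard on the left
    [False, True, False, True],    # channels heard on the right
)
```

Amplification is left out of the sketch; it would scale `left` and `right` independently before output.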
Channels 1 and 2, the “pulse channels”, produce pulse width modulated waves with 4 fixed pulse width settings. Channel 3, the “wave” channel, produces arbitrary user-supplied waves. Channel 4 is the “noise” channel, producing a pseudo-random wave.
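As a rough illustration of what “4 fixed pulse width settings” means, the sketch below builds one period of an idealized pulse wave for each setting. The duty ratios used (12.5%, 25%, 50%, 75%) are the commonly documented ones; the 8-step resolution and the position of the high part within the period are simplifications, and `pulse_wave` is a hypothetical helper.

```python
# Fraction of each period spent "high", one entry per setting.
DUTY_RATIOS = (0.125, 0.25, 0.5, 0.75)


def pulse_wave(duty_index, steps=8):
    """Return one period of an idealized pulse wave as a list of 0/1 samples."""
    high_steps = round(DUTY_RATIOS[duty_index] * steps)
    return [1] * high_steps + [0] * (steps - high_steps)


for index in range(4):
    print(index, pulse_wave(index))
```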
The VIN channel is an analog signal received directly from the cartridge, allowing external hardware to supply a fifth sound channel. No licensed games used this feature, and it was omitted from the Game Boy Advance.
POCKET MUSIC
Despite some online claims, Pocket Music does not use VIN. It refuses to run on the GBA for a different reason: the developer couldn’t figure out how to silence buzzing associated with sample playback on the wave channel.
[1] The speaker merges back the two channels, losing the stereo aspect entirely.
Common concepts
APU
The Game Boy’s sound chip is called the APU.
The APU runs off the same master clock as the rest of the Game Boy, which is to say, it is fully synced with the CPU and PPU. This also means that the APU runs about 2.4% faster on the SGB1, increasing frequencies by as much and thus sounding slightly higher-pitched. The SGB2 rectifies this issue.
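The size of the SGB1 pitch shift can be estimated from the two clock rates. The figures below are assumptions taken from commonly cited values (4.194304 MHz for the Game Boy, and the NTSC SNES clock divided by 5 for the SGB1); the conversion to cents is standard music arithmetic.

```python
import math

GB_CLOCK_HZ = 4_194_304    # Game Boy master clock (assumed value)
SGB1_CLOCK_HZ = 4_295_454  # SGB1 clock (assumed: NTSC SNES clock / 5)

ratio = SGB1_CLOCK_HZ / GB_CLOCK_HZ
speedup_percent = (ratio - 1) * 100
# 100 cents make one semitone; 1200 cents make one octave.
pitch_shift_cents = 1200 * math.log2(ratio)

print(f"{speedup_percent:.2f}% faster, {pitch_shift_cents:.1f} cents sharp")
```

Under these assumptions the shift comes out at a bit under half a semitone, which matches the description of a slightly higher pitch.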
All interfaces to the APU use durations instead of frequencies, which may be confusing, as signal theory and music are more typically based on the latter. In this document, durations are therefore expressed in terms of the corresponding frequencies: for example, a “256 Hz tick” lasts 1 ∕ 256th of a second.
The length of APU ticks is not affected by CGB double speed, so the APU works just the same regardless of CPU speed.
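To make the “N Hz tick” notation concrete, the sketch below converts a tick rate into a duration, both in seconds and in master clock cycles. The 4.194304 MHz clock rate is an assumption (the usual figure for the Game Boy at normal speed), and the helper names are hypothetical.

```python
MASTER_CLOCK_HZ = 4_194_304  # assumed normal-speed master clock


def tick_seconds(rate_hz):
    """Duration of one tick, in seconds."""
    return 1 / rate_hz


def tick_master_cycles(rate_hz):
    """Duration of one tick, in master clock cycles."""
    return MASTER_CLOCK_HZ // rate_hz


for rate in (64, 256):
    print(rate, "Hz tick:", tick_seconds(rate), "s,",
          tick_master_cycles(rate), "cycles")
```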
Terminology
The Game Boy’s APU is actually full of tricky details; this chapter will mostly describe the intended / common behavior, and often paper over bugs & quirks. Readers wishing to learn more should read the APU details chapter.
Triggering
Triggering a channel causes it to turn on if it wasn’t already on [2], and to start playing its wave from the beginning [3]. Most changes to a channel’s parameters take effect immediately, but some require re-triggering the channel.
Volume & envelope
The volume can be controlled in two ways: there is a “master volume” control [4] (which has separate settings for the left and right outputs), and each channel’s volume can be individually set as well (CH3’s less precisely than the others).
Additionally, an envelope can be configured for CH1, CH2 and CH4, which allows automatically adjusting the volume over time. The parameters that can be controlled are the initial volume, the envelope’s direction (but not its slope), and its duration. Internally, all envelopes are ticked at 64 Hz, and every 1–7 of those ticks, the volume will be increased or decreased.
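The envelope behavior described above can be modeled with a small simulation. This is a minimal sketch of the intended behavior only: it assumes a volume range of 0 to 15 and a pace of 1 to 7 envelope ticks per step, ignores the exact alignment of ticks relative to the trigger, and uses hypothetical function names.

```python
ENVELOPE_TICK_HZ = 64


def envelope_volumes(initial_volume, increasing, pace, ticks):
    """Return the volume after each 64 Hz envelope tick.

    pace is the number of envelope ticks between volume steps (1 to 7).
    """
    if not 1 <= pace <= 7:
        raise ValueError("pace must be between 1 and 7")
    volume = initial_volume
    history = []
    for tick in range(1, ticks + 1):
        if tick % pace == 0:
            step = 1 if increasing else -1
            volume = max(0, min(15, volume + step))  # assumed 0..15 range
        history.append(volume)
    return history


def fade_out_seconds(initial_volume, pace):
    """Time for a decreasing envelope to reach volume 0."""
    return initial_volume * pace / ENVELOPE_TICK_HZ


print(envelope_volumes(15, increasing=False, pace=2, ticks=6))
print(fade_out_seconds(15, pace=1), fade_out_seconds(15, pace=7))
```

In this model, a full fade from the maximum volume takes between roughly a quarter of a second (pace 1) and a little over one and a half seconds (pace 7).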
Length timer
All channels can be individually set to automatically shut themselves down after a certain amount of time.
If the functionality is enabled, a channel’s length timer ticks up [5] at 256 Hz (tied to DIV-APU) from the value it’s initially set at. When the length timer reaches 64 (256 for the wave channel), the channel is turned off.
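From this description, the time a channel plays before shutting itself down can be estimated as follows. The sketch assumes the timer starts ticking immediately at an even 256 Hz; on real hardware the first tick depends on where the 256 Hz cycle happens to be when the channel is triggered. `length_seconds` is a hypothetical helper.

```python
LENGTH_TICK_HZ = 256


def length_seconds(initial_value, is_wave_channel=False):
    """Approximate play time before the length timer turns the channel off."""
    limit = 256 if is_wave_channel else 64
    if not 0 <= initial_value < limit:
        raise ValueError(f"initial value must be in 0..{limit - 1}")
    return (limit - initial_value) / LENGTH_TICK_HZ


print(length_seconds(0))                        # longest for CH1, CH2, CH4
print(length_seconds(0, is_wave_channel=True))  # longest for CH3
```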
Frequency
Music notes and audio waves are typically manipulated in terms of frequency [6], i.e. how often the signal repeats per second. However, as explained above, the Game Boy APU primarily works with durations; thus, periods will be used instead of frequencies. [7]
The terms “period” and “period value” throughout this document refer to a parameter that has a somewhat nonintuitive relationship with frequency. See the description of each NRx3 register for more information.
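As an illustration of that relationship, the sketch below converts between period values and frequencies. It assumes an 11-bit period value (0 to 2047) and the commonly documented formulas for the pulse and wave channels; the register descriptions remain the authoritative reference, and the function names are hypothetical.

```python
def _check_period(period_value):
    if not 0 <= period_value <= 2047:
        raise ValueError("period value must be in 0..2047")


def pulse_frequency_hz(period_value):
    """Tone frequency of a pulse channel (assumed formula)."""
    _check_period(period_value)
    return 131072 / (2048 - period_value)


def wave_frequency_hz(period_value):
    """Tone frequency of the wave channel (assumed formula)."""
    _check_period(period_value)
    return 65536 / (2048 - period_value)


def pulse_period_value(frequency_hz):
    """Closest pulse-channel period value for a target frequency."""
    period_value = round(2048 - 131072 / frequency_hz)
    _check_period(period_value)
    return period_value


a4 = pulse_period_value(440)
print(a4, pulse_frequency_hz(a4))
```

Note that a larger period value gives a shorter duration, `2048 - period_value`, and therefore a higher frequency. This is why the relationship is described as nonintuitive.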
[2] If the channel’s DAC is off, the channel will not turn on.
[3] Except for pulse channels, whose phase position is only ever reset by turning the APU off. This is usually not noticeable unless changing the duty cycle mid-note.
[4] This is separate from the physical volume knob located on the side of the console.
[5] Internally, the length timer is inverted when written, and then ticks down until it reaches 0. But the effect is as if the counter ticked up.
[6] There is also pitch, which is merely a measure of how we perceive frequency. The higher the frequency, the higher the pitch; therefore, pitch will be omitted from the rest of the document.
[7] Actually, the APU interfaces don’t work strictly with periods, but with values that can be thought of as negative periods.