Home

👋 Welcome to gb-asm-tutorial! This tutorial will teach you how to make games for the Game Boy and Game Boy Color.

⚠️

While the Game Boy and Game Boy Color are almost the same console, the Game Boy Advance is entirely different. However, the GBA is able to run GB and GBC games! If you are looking to program GBC games and run them on a GBA, you’re at the right place; however, if you want to make games specifically for the GBA, please check out Tonc instead.

Controls

There are some handy icons near the top of your screen!

  • The “burger” toggles the navigation side panel;
  • The brush allows selecting a different color theme;
  • The magnifying glass pops up a search bar;
  • The world icon lets you change the language of the tutorial;
  • The printer gives a single-page version of the entire tutorial, which you can print if you want;
  • The GitHub icon links to the tutorial’s source repository;
  • The edit button allows you to suggest changes to the tutorial, provided that you have a GitHub account.

Additionally, there are arrows to the left and to the right of the page (they are at the bottom instead on mobile) to more easily navigate to the next page.

With that said, you can get started by simply navigating to the following page :)

Licensing

In short:

  • Code within the tutorial is essentially public domain, meaning that you are allowed to copy it freely without restrictions.
  • You are free to copy the tutorial’s contents (prose, diagrams, etc.), modify them, and share that, but you must give credit and license any copies permissively.
  • This site’s source code can be freely copied, but you must give a license and copyright notice.

Full details, please follow these links for more information on the respective licenses:

  • All the code contained within the tutorial itself is licensed under CC0. To the extent possible under law, all copyright and related or neighboring rights to code presented within GB ASM Tutorial have been waived. This work is published from France.
  • The contents (prose, images, etc.) of this tutorial are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
  • Code used to display and format the site is licensed under the MIT License unless otherwise specified.

Roadmap

The tutorial is split into three sections. I strongly advise you go through the tutorial in order!

In Part Ⅰ, we run our first “Hello World!” program, which we then dissect to learn what makes the Game Boy tick.

In Part Ⅱ, we program our first game, a clone of Arkanoid; we learn how to prod the hardware into having something we can call a “game”. Along the way, we will make plenty of mistakes, so we can learn how to debug our code.

And finally, Part Ⅲ is about “advanced” use of the hardware, where we learn how to make even better-looking games, and we program a Shoot ’Em Up!

I hope the tutorial will work for you!! But if it doesn’t (the format doesn’t work well for everyone, and that’s okay), I encourage you to look at some other resources, which might work better for you. It’s also fine to take a break from time to time; feel free to read at your own pace, and to ask for clarifications if anything isn’t clear to you.


Unfortunately, this tutorial is a work in progress. Currently, Part Ⅱ is being written, and Part Ⅲ should follow soon after.

Help and feedback

If you are stuck in a certain part of the tutorial, want some advice, or just wish to chat with us, the GBDev community chat is the place to go! I actively participate there, and don’t be afraid to ask questions! (The “ASM” channel should be the most appropriate to discuss the tutorial, by the way.)

If you are interested in contributing to or translating the tutorial, thank you! Follow the link above, I’d be happy to work with you.

Noticed a problem with the tutorial? Please check out our issue tracker; if there is no open issue about your problem, please create a new one, or reach out to us via the link above.

If you prefer email, my address is [email protected]<domain>, where you replace <domain> with this website’s domain name (it ends with .fr). Anti-spam measure, I hope you understand.

Setup

First, we should set up our dev environment. We will need:

  1. A POSIX environment
  2. RGBDS v0.5.1 (though v0.5.0 should be compatible)
  3. GNU Make (preferably a recent version)
  4. A code editor
  5. A debugging emulator

❓😕

The following install instructions are provided on a “best-effort” basis, but may be outdated, or not work for you for some reason. Don’t worry, we’re here to help: ask away in GBDev, and we’ll help you with installing everything!

Tools

Linux & macOS

Good news: you’re already fulfilling step 1! You just need to install RGBDS, and maybe update GNU Make.

macOS

At the time of writing this, macOS (up to 11.0, the current latest release) ships a very outdated GNU Make. You can check it by opening a terminal, and running make --version, which should indicate “GNU Make” and a date, among other things.

If your Make is too old, you can update it using Homebrew’s formula make. At the time of writing, this should print a warning that the updated Make has been installed as gmake; you can either follow the suggestion to use it as your “default” make, or use gmake instead of make in this tutorial.

Linux

Once RGBDS is installed, open a terminal and run make --version to check your Make version (which is likely GNU Make).

If make cannot be found, you may need to install your distribution’s build-essentials.

Windows

The sad truth is that Windows is a terrible OS for development; however, you can install environments that solve most issues.

On Windows 10, your best bet is WSL, which sort of allows running a Linux distribution within Windows. Install WSL 1 or WSL 2, then a distribution of your choice, and then follow these steps again, but for the Linux distribution you installed.

If WSL is not an option, you can use MSYS2 or Cygwin instead; then check out RGBDS’ Windows install instructions. As far as I’m aware, both of these provide a sufficiently up-to-date version of GNU Make.

If you have programmed for other consoles, such as the GBA, check if MSYS2 isn’t already installed on your machine. This is because devkitPro, a popular homebrew development bundle, includes MSYS2.

Code editor

Any code editor is fine; I personally use Sublime Text with its RGBDS syntax package; however, you can use any text editor, including Notepad, if you’re crazy enough. Awesome GBDev has a section on syntax highlighting packages, see there if your favorite editor supports RGBDS.

Emulator

Using an emulator to play games is one thing; using it to program games is another. The two aspects an emulator must fulfill to allow an enjoyable programming experience are:

  • Debugging tools: When your code goes haywire on an actual console, it’s very difficult to figure out why or how. There is no console output, no way to gdb the program, nothing. However, an emulator can provide debugging tools, allowing you to control execution, inspect memory, etc. These are vital if you want GB dev to be fun, trust me!
  • Good accuracy: Accuracy means “how faithful to the original console something is”. Using a bad emulator for playing games can work (to some extent, and even then…), but using it for developing a game makes it likely to accidentally render your game incompatible with the actual console. For more info, read this article on Ars Technica (especially the An emulator for every game section at the top of page 2). You can compare GB emulator accuracy on Daid’s GB-emulator-shootout.

The emulator I will be using for this tutorial is BGB (1.5.9 when I’m writing this). It’s Windows-only, but macOS and Linux users can install Wine to be able to run it, and macOS users will additionally have to use the 64-bit version. Other debugging emulators are possible (such as SameBoy on macOS), but I will be giving directives for and including screenshots of BGB.

Hello World!

In this lesson, we will begin by assembling our first program. The rest of this chapter will be dedicated to explaining how and why it works.

Note that we will need to type a lot of commands, so open a terminal now. It’s a good idea to create a new directory (mkdir gb_hello_world, for example, then cd gb_hello_world to enter the new directory).

Grab the following files (right-click each link, “Save Link As…”), and place them all in this new directory:

Then, still from a terminal within that directory, run the following three commands.

⚠️

To clarify where each individual command begins, I’ve added a $ before each command, but don’t type them!

rgbasm -L -o hello-world.o hello-world.asm
rgblink -o hello-world.gb hello-world.o
rgbfix -v -p 0xFF hello-world.gb

‼️

Be careful with arguments! Some options, such as -o here, use the argument after them as a parameter:

  1. rgbasm -L -o hello-world.asm hello-world.o won’t work (and may corrupt hello-world.asm!)
  2. rgbasm -L hello-world.asm -o hello-world.o will work
  3. rgbasm hello-world.asm -o hello-world.o -L will also work

If you need whitespace within an argument, you must quote it:

  1. rgbasm -L -o hello world.o hello world.asm won’t work
  2. rgbasm -L -o "hello world.o" "hello world.asm" will work

It should look like this:

(If you encounter an error you can’t figure out by yourself, don’t be afraid to ask us! We’ll sort it out.)

Congrats! You just assembled your first Game Boy ROM! Now, we just need to run it; open BGB, right-click the window that opened, and load hello-world.gb.

🤕

If you are using Wine and BGB is failing to start (or gives a blank/black screen), try these troubleshooting tips.

And in any case, don’t hesitate to ask for help!

You could also take a flash cart (I use the EverDrive GB X5, but there are plenty of alternatives), load up your ROM onto it, and run it on an actual console!

Picture of the Hello World running on a physical DMG

Well, now that we have something working, it’s time to peel back the curtains…

The toolchain

So, in the previous lesson, we built a nice little “Hello World!” ROM. Now, let’s find out exactly what we did.

Let’s begin by explaining what rgbasm and rgblink do.

RGBASM is an assembler. It is responsible for reading the source code (in our case, hello-world.asm and hardware.inc), and generating blocks of code with some “holes”. RGBASM does not always have enough information to produce a full ROM, so it does most of the work, and stores its intermediary results in what’s known as object files (hence the .o extension).

RGBLINK is a linker. Its job is taking object files (or, like in our case, just one), and “linking” them into a ROM, which is to say: filling the aforementioned “holes”. RGBLINK’s purpose may not be obvious with programs as simple as this Hello World, but it will become much clearer in Part Ⅱ.

So: Source code → rgbasm → Object files → rgblink → ROM, right? Well, not exactly.

RGBFIX

RGBLINK does produces a ROM, but it’s not quite usable yet. See, actual ROMs have what’s called a header. It’s a special area of the ROM that contains metadata about the ROM; for example, the game’s name, Game Boy Color compatibility, and more. For simplicity, we defaulted a lot of these values to 0 for the time being; we’ll come back to them in Part Ⅱ.

However, the header contains three crucial fields:

When the console first starts up, it runs a little program known as the boot ROM, which reads and draws the logo from the cartridge, and displays the little boot animation. When the animation is finished, the console checks if the logo matches a copy that it stores internally; if there is a mismatch, it locks up! And, since it locks up, our game never gets to run… 😦 This was meant as an anti-piracy measure; however, that measure has since then been ruled as invalid, so don’t worry, we are clear! 😄

Similarly, the boot ROM also computes a checksum of the header, supposedly to ensure that it isn’t corrupted. The header also contains a copy of this checksum; if it doesn’t match what the boot ROM computed, then the boot ROM also locks up!

The header also contains a checksum over the whole ROM, but nothing ever uses it. It doesn’t hurt to get it right, though.

Finally, the header also contains the ROM’s size, which is required by emulators and flash carts.

RGBFIX’s role is to fill in the header, especially these 3 fields, which are required for our ROM to be guaranteed to run fine. The -v option instructs RGBFIX to make the header valid, by injecting the Nintendo logo and computing the two checksums. The -p 0xFF option instructs it to pad the ROM to a valid size, and set the corresponding value in the “ROM size” header field.

Alright! So the full story is: Source code → rgbasm → Object files → rgblink → “Raw” ROM → rgbfix → “Fixed” ROM. Good.

You might be wondering why RGBFIX’s functionality hasn’t been included directly in RGBLINK. There are some historical reasons, but RGBLINK can also be used to produce things other than ROMs (especially via the -x option), and RGBFIX is sometimes used without RGBLINK anywhere in sight.

File names

Note that RGBDS does not care at all about the files’ extensions. Some people call their source code .s, for example, or their object files .obj. The file names don’t matter, either; it’s just practical to keep the same name.

Binary and hexadecimal

Before we talk about the code, a bit of background knowledge is in order. When programming at a low level, understanding of binary and hexadecimal is mandatory. Since you may already know about both of these, a summary of the RGBDS-specific information is available at the end of this lesson.

So, what’s binary? It’s a different way to represent numbers, in what’s called base 2. We’re used to counting in base 10, so we have 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. Here’s how digits work:

  42 =                       4 × 10   + 2
     =                       4 × 10^1 + 2 × 10^0
                                  ↑          ↑
    These tens come from us counting in base 10!

1024 = 1 × 1000 + 0 × 100  + 2 × 10   + 4
     = 1 × 10^3 + 0 × 10^2 + 2 × 10^1 + 4 × 10^0
       ↑          ↑          ↑          ↑
And here we can see the digits that make up the number!

ℹ️

^ here means “to the power of”, where X^N is equal to multiplying X with itself N times, and X ^ 0 = 1.

Decimal digits form a unique decomposition of numbers in powers of 10 (decimal is base 10, remember?). But why stop at powers of 10? We could use other bases instead, such as base 2. (Why base 2 specifically will be explained later.)

Binary is base 2, so there are only two digits, called bits: 0 and 1. Thus, we can generalize the principle outlined above, and write these two numbers in a similar way:

  42 =                                                    1 × 32  + 0 × 16  + 1 × 8   + 0 × 4   + 1 × 2   + 0
     =                                                    1 × 2^5 + 0 × 2^4 + 1 × 2^3 + 0 × 2^2 + 1 × 2^1 + 0 × 2^0
                                                              ↑         ↑         ↑         ↑         ↑         ↑
                                          And since now we're counting in base 2, we're seeing twos instead of tens!

1024 = 1 × 1024 + 0 × 512 + 0 × 256 + 0 × 128 + 0 × 64  + 0 × 32  + 0 × 16  + 0 × 8   + 0 × 4   + 0 × 2   + 0
     = 1 × 2^10 + 0 × 2^9 + 0 × 2^8 + 0 × 2^7 + 0 × 2^6 + 0 × 2^5 + 0 × 2^4 + 0 × 2^3 + 0 × 2^2 + 0 × 2^1 + 0 × 2^0
       ↑          ↑         ↑         ↑         ↑         ↑         ↑         ↑         ↑         ↑         ↑

So, by applying the same principle, we can say that in base 2, 42 is written as 101010, and 1024 as 10000000000. Since you can’t tell ten (decimal 10) and two (binary 10) apart, RGBDS assembly has binary numbers prefixed by a percent sign: 10 is ten, and %10 is two.

Okay, but why base 2 specifically? Rather conveniently, a bit can only be 0 or 1, which are easy to represent as “ON” or “OFF”, empty or full, etc! If you want, at home, to create a one-bit memory, just take a box. If it’s empty, it stores a 0; if it contains something, it stores a 1. Computers thus primarily manipulate binary numbers, and this has a slew of implications, as we will see throughout this entire tutorial.

Hexadecimal

To recap, decimal isn’t practical for a computer to work with, instead relying on binary (base 2) numbers. Okay, but binary is really impractical to work with. Take %10000000000, aka 2048; when in decimal only 4 digits are required, binary instead needs 12! And, did you notice that I actually wrote one zero too few? Fortunately, hexadecimal is here to save the day! 🦸

Base 16 works just the same as every other base, but with 16 digits, called nibbles: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F.

  42 =            2 × 16   + 10
     =            2 × 16^1 + A × 16^0

1024 = 4 × 256  + 0 × 16   + 0
     = 4 × 16^2 + 0 × 16^1 + 0 × 16^0

Like binary, we will use a prefix to denote hexadecimal, namely $. So, 42 = $2A, and 1024 = $400. This is much more compact than binary, and slightly more than decimal, too; but what makes hexadecimal very interesting is that one nibble corresponds exactly to 4 bits!

NibbleBits
$0%0000
$1%0001
$2%0010
$3%0011
$4%0100
$5%0101
$6%0110
$7%0111
$8%1000
$9%1001
$A%1010
$B%1011
$C%1100
$D%1101
$E%1110
$F%1111

This makes it very easy to convert between binary and hexadecimal, while retaining a compact enough notation. Thus, hexadecimal is used a lot more than binary. And, don’t worry, decimal can still be used 😜

(Side note: one could point that octal, i.e. base 8, would also work for this; however, we will primarily deal with units of 8 bits, for which hexadecimal works much better than octal. RGBDS supports octal via the & prefix, but I have yet to see it used.)

💡

If you’re having trouble converting between decimal and binary/hexadecimal, check if your favorite calculator program doesn’t have a “programmer” mode, or a way to convert between bases.

Summary

  • In RGBDS assembly, the hexadecimal prefix is $, and the binary prefix is %.
  • Hexadecimal can be used as a “compact binary” notation.
  • Using binary or hexadecimal is useful when individual bits matter; otherwise, decimal works just as well.
  • For when numbers get a bit too long, RGBASM allows underscores between digits (123_465, %10_1010, $DE_AD_BE_EF, etc.)

Registers

Alright! Now that we know what bits are, let’s talk about how they’re used. Don’t worry, this is mostly prep work for the next section, where we will—finally!—look at the code 👀

First, if you opened BGB, you have been greeted with just the Game Boy screen. So, it’s time we pop the debugger open! Right-click the screen, select “Other”, and click “Debugger”. Oh, and while we’re at this, we might as well increase the screen size a little.

The debugger may look intimidating at first, but don’t worry, soon we’ll be very familiar with it! For now, let’s focus on this small box near the top-right, the register viewer.

Picture of the register viewer’s location

⚠️

The register viewer shows both CPU registers and some hardware registers. This lesson will only deal with CPU registers, so that’s why we will be ignoring some of these entries here.

What are CPU registers? Well, imagine you’re preparing a cake. You will be following a recipe, whose instructions may be “melt 125g of chocolate and 125g of butter, blend with 2 eggs” and so on. You will fetch some ingredients from the fridge as needed, but you don’t cook inside the fridge; for that, you have a small workspace.

Registers are pretty much the CPU’s workspace. They are small, tiny chunks of memory embedded directly in the CPU (only 10 bytes for the Game Boy’s CPU, and even modern CPUs have less than a kilobyte if you don’t count SIMD registers). Operations are not performed directly on data stored in memory, which would be equivalent to breaking eggs directly inside our fridge, but they are performed on registers.

ℹ️

There are exceptions to this rule, like many other “rules” I will give in this tutorial; I will paper over them to keep the mental complexity reasonable, but don’t treat my word as gospel either.

General-purpose registers

CPU registers can be placed into two categories: general-purpose and special-purpose. A “general-purpose” register (GPR for short) can be used for storing arbitrary integer numbers. Some GPRs are special nonetheless, as we will see later; but the distinction is “can I store arbitrary integers in it?”.

I won’t introduce special-purpose registers quite yet, as their purpose wouldn’t make sense yet. Rather, they will be discussed as the relevant concepts are introduced.

The Game Boy CPU has seven 8-bit GPRs: a, b, c, d, e, h, and l. “8-bit” means that, well, they store 8 bits. Thus, they can store integers from 0 to 255 (%1111_1111 aka $FF).

a is the accumulator, and we will see later that it can be used in special ways.

A special feature is that these registers, besides a, are paired up, and the pairs can be treated as the 16-bit registers bc, de, and hl. The pairs are not separate from the individual registers; for example, if d contains 192 ($C0) and e contains 222 ($DE), then de contains 49374 ($C0DE) = 192 × 256 + 222. The other pairs work similarly.

Modifying de actually modifies both d and e at the same time, and modifying either individually also affects the pair. How do we modify registers? Let’s see how, with our first assembly instructions!

Assembly basics

Alright, now that we know what the tools do, let’s see what language RGBASM speaks. I will take a short slice of the beginning of hello-world.asm, so that we agree on the line numbers, and you can get some syntax highlighting even if your editor doesn’t support it.

INCLUDE "hardware.inc"

SECTION "Header", ROM0[$100]

	jp EntryPoint

	ds $150 - @, 0 ; Make room for the header

EntryPoint:
	; Shut down audio circuitry
	ld a, 0
	ld [rNR52], a

Let’s analyze it. Note that I will be ignoring a lot of RGBASM’s functionality; if you’re curious to know more, you should wait until parts II and III, or read the docs.

Comments

We’ll start with line 10, which should appear gray above. Semicolons ; denote comments. Everything from a semicolon to the end of the line is ignored by RGBASM. As you can see on line 7, comments need not be on an otherwise empty line.

Comments are a staple of every good programming language; they are useful to give context as to what code is doing. They’re the difference between “Pre-heat the oven at 180 °C” and “Pre-heat the oven at 180 °C, any higher and the cake would burn”, basically. In any language, good comments are very useful; in assembly, they play an even more important role, as many common semantic facilities are not available.

Instructions

Assembly is a very line-based language. Each line can contain one of two things:

  • a directive, which instructs RGBASM to do something, or
  • an instruction1, which is written directly into the ROM.

We will talk about directives later, for now let’s focus on instructions: for example, in the snippet above, we will ignore lines 1 (INCLUDE), 7 (ds), and 3 (SECTION).

To continue the cake-baking analogy even further, instructions are like steps in a recipe. The console’s processor (CPU) executes instructions one at a time, and that… eventually does something! Like baking a cake, drawing a “Hello World” image, or displaying a Game Boy programming tutorial! *wink* *wink*

Instructions have a mnemonic, which is a name they are given, and operands, which indicate what they should act upon. For example, in “melt the chocolate and butter in a saucepan”, the whole sentence would be the instruction, the verb “melt” would be the mnemonic, and “chocolate”, “butter”, and “saucepan” the operands, i.e. some kind of parameters to the operation.

Let’s discuss the most fundamental instruction, ld. ld stands for “LoaD”, and its purpose is simply to copy data from its right operand (RHS) into its left operand (LHS). For example, take line 11’s ld a, 0: it copies (“loads”) the value 0 into the 8-bit register a2. If you look further in the file, line 33 has ld a, b, which causes the value in register b to be copied into register a.

InstructionMnemonicEffect
LoadldCopies values around

ℹ️

Due to CPU limitations, not all operand combinations are valid for ld and many other instructions; we will talk about this when writing our own code later.

🤔

RGBDS has an instruction reference worth bookmarking, and you can also consult it locally with man 7 gbz80 if RGBDS is installed on your machine (except Windows…). The descriptions there are more succinct, since they’re intended as reminders, not as tutorials.

Directives

In a way, instructions are destined to the console’s CPU, and comments are destined to the programmer. But some lines are neither, and are instead sort of metadata destined to RGBDS itself. Those are called directives, and our Hello World actually contains three of those.

Including other files

INCLUDE "hardware.inc"

Line 1 includes hardware.inc3. Including a file has the same effect as if you copy-pasted it, but without having to actually do that.

It allows sharing code across files easily: for example, if two files a.asm and b.asm were to include hardware.inc, you would only need to modify hardware.inc once for the modifications to apply to both a.asm and b.asm. If you instead copy-pasted the contents manually, you would have to edit both copies in a.asm and b.asm to apply the changes, which is more tedious and error-prone.

hardware.inc defines a bunch of constants related to interfacing with the hardware. Constants are basically names with a value attached, so when you write out their name, they are replaced with their value. This is useful because, for example, it is easier to remember the address of the LCD Control register as rLCDC than $FF40.

We will discuss constants in more detail in Part Ⅱ.

Sections

Let’s first explain what a “section” is, then we will see what line 3 does.

A section represents a contiguous range of memory, and by default, ends up somewhere not known in advance. If you want to see where all the sections end up, you can ask RGBLINK to generate a “map file” with the -m flag:

rgblink hello-world.o -m hello-world.map

…and we can see, for example, where the "Tilemap" section ended up:

  SECTION: $05a6-$07e5 ($0240 bytes) ["Tilemap"]

Sections cannot be split by RGBDS, which is useful e.g. for code, since the processor executes instructions one right after the other (except jumps, as we will see later). There is a balance to be struck between too many and not enough sections, but it typically doesn’t matter much until banking is introduced into the picture—and it won’t be until much, much later.

So, for now, let’s just assume that one section should contain things that “go together” topically, and let’s examine one of ours.

SECTION "Header", ROM0[$100]

So! What’s happening here? Well, we are simply declaring a new section; all instructions and data after this line and until the next SECTION one will be placed in this newly-created section. Before the first SECTION directive, there is no “active” section, and thus generating code or data will be met with a Cannot output data outside of a SECTION error.

The new section’s name is “Header”. Section names can contain any characters (and even be empty, if you want), and must be unique4. The ROM0 keyword indicates which “memory type” the section belongs to (here is a list). We will discuss them in Part Ⅱ.

The [$100] part is more interesting, in that it is unique to this section. See, I said above that:

a section […] by default, ends up somewhere not known in advance.

However, some memory locations are special, and so sometimes we need a specific section to span a specific range of memory. To enable this, RGBASM provides the [addr] syntax, which forces the section’s starting address to be addr.

In this case, the memory range $100–$14F is special, as it is the ROM’s header. We will discuss the header in a couple lessons, but for now, just know that we need not to put any of our code or data in that space. How do we do that? Well, first, we begin a section at address $100, and then we need to reserve some space.

Reserving space

	jp EntryPoint

	ds $150 - @, 0 ; Make room for the header

Line 7 claims to “Make room for the header”, which I briefly mentioned just above. For now, let’s focus on what ds actually does.

ds is used for statically allocating memory. It simply reserves some amount of bytes, which are set to a given value. The first argument to ds, here $150 - @, is how many bytes to reserve. The second (optional) argument, here 0, is what value to set each reserved byte to5.

We will see why these bytes must be reserved in a couple of lessons.

It is worth mentioning that this first argument here is an expression. RGBDS (thankfully!) supports arbitrary expressions essentially anywhere. This expression is a simple subtraction: $150 minus @, which is a special symbol that stands for “the current memory address”.

A symbol is essentially “a name attached to a value”, usually a number. We will explore the different types of symbols throughout the tutorial, starting with labels in the next section.

A numerical symbol used in an expression evaluates to its value, which must be known when compiling the ROM—in particular, it can’t depend on any register’s contents.

Oh, but you may be wondering what the “memory addresses” I keep mentioning are. Let’s see about those!


1

Technically, instructions in RGBASM are implemented as directives, basically writing their encoded form to the ROM; but the distinction between the instructions in the source code and those in the final ROM is not worth bringing up right now.

2

The curious reader may ask where the value is copied from. The answer is simply that the "immediate" byte ($00 in this example) is stored in ROM just after the instruction’s opcode byte, and it’s what gets copied to a. We will come back to this when we talk about how instructions are encoded later on.

3

hardware.inc itself contains more directives, in particular to define a lot of symbols. They will be touched upon much later, so we won’t look into hardware.inc yet.

4

Section names actually only need to be unique for “plain” sections, and function differently with “unionized” and “fragment” sections, which we will discuss much later.

5

Actually, since RGBASM 0.5.0, ds can accept a list of bytes, and will repeat the pattern for as many bytes as specified. It just complicates the explanation slightly, so I omitted it for now. Also, if the argument is omitted, it defaults to what is specified using the -p option to RGBASM.

Memory

🎉

Congrats, you have just finished the hardest lessons of the tutorial! Since you have the basics, from now on, we’ll be looking at more and more concrete code.

If we look at line 29, we see ld a, [de]. Given what we just learned, this copies a value into register a… but where from? What do these brackets mean? To answer that, we need to talk about memory.

What’s a memory?

The purpose of memory is to store information. On a piece of paper or a whiteboard, you can write letters to store the grocery list, for example. But what can you store in a computer memory? The answer to that question is current1. Computer memory is made of little cells that can store current. But, as we saw in the lesson about binary, the presence or absence of current can be used to encode binary numbers!

tl;dr: memory stores numbers. In fact, memory is a long array of numbers, stored in cells. To uniquely identify each cell, it’s given a number (what else!) called its address. Like street numbers! The first cell has address 0, then address 1, 2, and so on. On the Game Boy, each cell contains 8 bits, i.e. a byte.

How many cells are there? Well, this is actually a trick question…

The many types of memory

There are several memory chips in the Game Boy, but we can put them into two categories: ROM and RAM2. ROM simply designates memory that cannot be written to3, and RAM memory that can be written to.

Due to how they work, the CPU, as well as the memory chips, can only use a single number for addresses. Let’s go back to the “street numbers” analogy: each memory chip is a street, with its own set of numbers, but the CPU has no idea what a street is, it only deals with street numbers. To allow the CPU to talk to multiple chips, a sort of “postal service”, the chip selector, is tasked with translating the CPU’s street numbers into a street & street number.

For example, let’s say a convention is established where addresses 0 through 1999 go to chip A’s addresses 0–1999, 2000–2999 to chip B’s 0–999, and 3000–3999 to chip C’s 0–999. Then, if the CPU asks for the byte at address 2791, the chip selector will ask chip B for the byte at its own address 791, and forward the reply to the CPU.

Since addresses dealt with by the CPU do not directly correspond to the chips’ addresses, we talk about logical addresses (here, the CPU’s) versus physical addresses (here, the chips’), and the correspondence is called a memory map. Since we are programming the CPU, we will only be dealing with logical addresses, but it’s crucial to keep in mind that different addresses may be backed by different memory chips, since each chip has unique characteristics.

This may sound complicated, so here is a summary:

  • Memory stores numbers, each 8-bit on the Game Boy.
  • Memory is accessed byte by byte, and the cell being accessed is determined by an address, which is just a number.
  • The CPU deals with all memory uniformly, but there are several memory chips each with their own characteristics.

Game Boy memory map

Let’s answer the question that introduced this section: how many memory cells are there on the Game Boy? Well, now, we can reframe this question as “how many logical addresses are there?” or “how many physical addresses are there in total?”.

Logical addresses, which again are just numbers, are 16-bit on the Game Boy. Therefore, there are 2^16 = 65536 logical addresses, from $0000 to $FFFF. How many physical addresses, though? Well, here is a memory map courtesy of Pan Docs (though I will simplify it a bit):

StartEndNameDescription
$0000$7FFFROMThe game ROM, supplied by the cartridge.
$8000$9FFFVRAMVideo RAM, where graphics are stored and arranged.
$A000$BFFFSRAMSave RAM, optionally supplied by the cartridge to save data to.
$C000$DFFFWRAMWork RAM, general-purpose RAM for the game to store things in.
$FE00$FE9FOAMObject Attribute Memory, where “objects” are stored.
$FF00$FF7FI/ONeither ROM nor RAM, but this is where you control the console.
$FF80$FFFEHRAMHigh RAM, a tiny bit of general-purpose RAM which can be accessed faster.
$FFFF$FFFFIEA lone I/O byte that’s separated from the rest for some reason.

$8000 + $2000 + $2000 + $2000 + $A0 + $80 + $7F + 1 adds up to $E1A0, or 57760 bytes of memory that can be actually accessed. The curious reader will naturally ask, “What about the remaining 7776 bytes? What happens when accessing them?”; the answer is: “It depends, it’s complicated; avoid accessing them”.

Labels

Okay, memory addresses are nice, but you can’t possibly expect me to keep track of all these addresses manually, right?? Well, fear not, for we have labels!

Labels are symbols which basically allow attaching a name to a byte of memory. A label is declared like at line 9 (EntryPoint:): at the beginning of the line, write the label’s name, followed by a colon, and it will refer to the byte right after itself. So, for example, EntryPoint refers to the ld a, 0 right below it (more accurately, the first byte of that instruction, but we will get there when we get there).

If you peek inside hardware.inc, you will see that for example rNR52 is not defined as a label. That’s because they are constants, which we will touch on later; since they can be used mostly like labels, we will conflate the two for now.

Writing out a label’s name is equivalent to writing the address of the byte it’s referencing (with a few exceptions we will see in Part Ⅱ). For example, consider the ld de, Tiles at line 25. Tiles (line 64) is referring to the first byte of the tile data; if we assume that the tile data ends up being stored starting at $0193, then ld de, Tiles is equivalent to ld de, $0193!

What’s with the brackets?

Right, we came into this because we wanted to know what the brackets in ld a, [de] mean. Well, they can basically be read as “at address…”. For example, ld a, b can be read as “copy into a the value stored in b”; ld a, [$5414] would be read as “copy into a the value stored at address $5414”, and ld a, [de] would be read as “copy into a the value stored at address de”. Wait, what does that mean? Well, if de contains the value $5414, then ld a, [de] will do the same thing as ld a, [$5414].

If you’re familiar with C, these brackets are basically how the dereference operator is implemented.

hli

An astute reader will have noticed the ld [hli], a just below the ld a, [de] we have just studied. [de] makes sense because it’s one of the register pairs we saw a couple lessons ago, but [hli]? It’s actually a special notation, which can also be written as [hl+]. It functions as [hl], but hl is incremented just after memory is accessed. [hld]/[hl-] is the mirror of this one, decrementing hl instead of incrementing it.

An example

So, if we look at the first two instructions of CopyTiles:

	ld a, [de]
	ld [hli], a

…we can see that we’re copying the byte in memory pointed to by de (that is, whose address is contained in de) into the byte pointed to by hl. Here, a serves as temporary storage, since the CPU is unable to perform ld [hl], [de] directly.

While we’re at this, let’s examine the rest of .copyTiles in the following lessons!


1

Actually, this depends a lot on the type of memory. A lot of memory nowadays uses magnetic storage, but to keep the explanation simple, and to parallel the explanation of binary given earlier, let’s assume that current is being used.

2

There are other types of memory, such as flash memory or EEPROM, but only Flash has been used on the Game Boy, and for only a handful of games; so we can mostly forget about them.

3

No, really! Mask ROM is created by literally punching holes into a layer of silicon using acid, and e.g. the console’s boot ROM is made of hard-wired transitors within the CPU die. Good luck writing to that!
“ROM” is sometimes (mis)used to refer to “persistent memory” chips, such as flash memory, whose write functionality was disabled. Most bootleg / “repro” Game Boy cartridges you can find nowadays actually contain flash; this is why you can reflash them using specialized hardware, but original cartridges cannot be.

Header

Let’s go back to a certain line near the top of hello-world.asm.

	ds $150 - @, 0 ; Make room for the header

What is this mysterious header, why are we making room for it, and more questions answered in this lesson!

What is the header?

First order of business is explaining what the header is. It’s the region of memory from $0104 to $014F (inclusive). It contains metadata about the ROM, such as its title, Game Boy Color compatibility, size, two checksums, and interestingly, the Nintendo logo that is displayed during the power-on animation.

You can find this information and more in the Pan Docs.

Interestingly, most of the information in the header does not matter on real hardware (the ROM’s size is determined only by the capacity of the ROM chip in the cartridge, not the header byte). In fact, some prototype ROMs actually have incorrect header info!

Most of the header was only used by Nintendo’s manufacturing department to know what components to put in the cartridge when publishing a ROM. Thus, only ROMs sent to Nintendo had to have a fully correct header; ROMs used for internal testing only needed to pass the boot ROM’s checks, explained further below.

However, in our “modern” day and age, the header actually matters a lot. Emulators (including hardware emulators such as flashcarts) must emulate the hardware present in the cartridge. The header being the only source of information about what hardware the ROM’s cartridge should contain, they rely on some of the values in the header.

Boot ROM

The header is intimately tied to what is called the boot ROM.

The most observant and/or nostalgic of you may have noticed the lack of the boot-up animation and the Game Boy’s signature “ba-ding!” in BGB. When the console powers up, the CPU does not begin executing instructions at address $0100 (where our ROM’s entry point is), but at $0000.

However, at that time, a small program called the boot ROM, burned within the CPU’s silicon, is “overlaid” on top of our ROM! The boot ROM is responsible for the startup animation, but it also checks the ROM’s header! Specifically, it verifies that the Nintendo logo and header checksums are correct; if either check fails, the boot ROM intentionally locks up, and our game never gets to run :(

For the curious

You can find a more detailed description of what the boot ROM does in the Pan Docs, as well as an explanation of the logo check. Beware that it is quite advanced, though.

If you want to enable the boot ROMs in BGB, you must obtain a copy of the boot ROM(s), whose SHA256 checksums can be found in their disassembly for verification. If you wish, you can also compile SameBoy’s boot ROMs and use those instead, as a free-software substitute.

Then, in BGB’s options, go to the System tab, set the paths to the boot ROMs you wish to use, tick Enable bootroms, select the appropriate system, and click OK or Apply. Now, just reset the emulator, and voilà!

A header is typically called “valid” if it would pass the boot ROM’s checks, and “invalid” otherwise.

RGBFIX

RGBFIX is the third component of RGBDS, whose purpose is to write a ROM’s header. It is separate from RGBLINK so that it can be used as a stand-alone tool. Its name comes from that RGBLINK typically does not produce a ROM with a valid header, so the ROM must be “fixed” before it’s production-ready.

RGBFIX has a bunch of options to set various parts of the header; but the only two that we are using here are -v, which produces a valid header (so, correct Nintendo logo and checksums), and -p 0xFF, which pads the ROM to the next valid size (using $FF as the filler byte), and writes the appropriate value to the ROM size byte.

If you look at other projects, you may find RGBFIX invocations with more options, but these two should almost always be present.

So, what’s the deal with that line?

Right! This line.

	ds $150 - @, 0 ; Make room for the header

Well, let’s see what happens if we remove it (or comment it out).

rgbasm -L -o hello-world.o hello-world.asm
rgblink -o hello-world.gb -n hello-world.sym hello-world.o

(I am intentionally not running RGBFIX; we will see why in a minute.)

Make sure the boot ROMs are not enabled for this! If they are, make sure to disable them (untick their box in the options, click OK or Apply, and reset the emulator).

“This rom would not work on a real gameboy.”

As I explained, RGBFIX is responsible for writing the header, so we should use it to fix these warnings.

rgbfix -v -p 0xFF hello-world.gb
warning: Overwrote a non-zero byte in the Nintendo logo
warning: Overwrote a non-zero byte in the header checksum

I’m sure these warnings are nothing to be worried about… (Depending on your version of RGBDS, you may have gotten different warnings, or none at all.)

Let’s run the ROM…

Screenshot of BGB reporting “Unsupported RAM size”

… dismiss this pesky warning, and…

Screenshot of BGB's debugger, the title bar reads "invalid opcode"
When the debugger pops open on its own, and the title bar reads "invalid opcode", you might have screwed up somewhere.

“This is fine” meme strip

Okay, so, what happened?

As we can see from the screenshot, PC is at $0105. What is it doing there?

…Oh, EntryPoint is at $0103. So the jp at $0100 went there, and started executing instructions (3E CE is the raw form of ld a, $CE), but then $ED does not encode any valid instruction, so the CPU locks up.

But why is EntryPoint there? Well, as you may have figured out from the warnings RGBFIX printed, it overwrites the header area in the ROM. However, RGBLINK is not aware of the header (because RGBLINK is not only used to generate ROMs!), so you must explicitly reserve space for the header area.

🥴

Forgetting to reserve this space, and having a piece of code or data ending up there then overwritten, is a common beginner mistake that can be quite puzzling. Fortunately, RGBFIX since version 0.5.1 warns when it detects this mistake, as shown above.

So, we prevent disaster like this:

SECTION "Header", ROM0[$100]

	jp EntryPoint

	ds $150 - @, 0 ; Make room for the header

The directive ds stands for “define space”, and allows filling a range of memory. This specific line fills all bytes from $103 to $14F (inclusive) with the value $00. Since different pieces of code and/or data cannot overlap, this ensures that the header’s memory range can safely be overwritten by RGBFIX, and that nothing else accidentally gets steamrolled instead.

It may not be obvious how this ds ends up filling that specific memory range. The 3-byte jp covers memory addresses $100, $101, and $102. (We start at $100 because that’s where the SECTION is hardcoded to be.) When RGBASM processes the ds directive, @ (which is a special symbol that evaluates to “the current address”) thus has the value $103, so it fills $150 - $103 = $4D bytes with zeros, so $103, $104, …, $14E, $14F.

Bonus: the infinite loop

(This is not really linked to the header, but I need to explain it somewhere, and here is as good a place as any.)

You may also be wondering what the point of the infinite loop at the end of the code is for.

Done:
	jp Done

Well, simply enough, the CPU never stops executing instructions; so when our little Hello World is done and there is nothing left to do, we must still give the CPU some busy-work: so we make it do nothing, forever.

We cannot let the CPU just run off, as it would then start executing other parts of memory as code, possibly crashing. (See for yourself: remove or comment out these two lines, re-compile the ROM, and see what happens!)

Operations & flags

Alright, we know how to pass values around, but just copying numbers is no fun; we want to modify them!

The GB CPU does not provide every operation under the sun (for example, there is no multiplication instruction), but we can just program those ourselves with what we have. Let’s talk about some of the operations that it does have; I will be omitting some not used in the Hello World for now.

Arithmetic

The simplest arithmetic instructions the CPU supports are inc and dec, which INCrement and DECrement their operand, respectively. (If you aren’t sure, “to increment” means “to add 1”, and “to decrement” means “to subtract 1”.) So for example, the dec bc at line 32 of hello-world.asm simply subtracts 1 from bc.

Okay, cool! Can we go a bit faster, though? Sure we can, with add and sub! These respectively ADD and SUBtract arbitrary values (either a constant, or a register). Neither is used in the tutorial, but a sibling of sub’s is: have you noticed little cp over at line 17? cp allows ComParing values. It works the same as sub, but it discards the result instead of writing it back. “Wait, so it does nothing?” you may ask; well, it does update the flags.

Flags

The time has come to talk about the special-purpose register (remember those?) f, for, well, flags. The f register contains 4 bits, called “flags”, which are updated depending on an operation’s results. These 4 flags are:

NameDescription
ZZero flag
NAddition/subtraction
HHalf-carry
CCarry

Yes, there is a flag called “C” and a register called “c”, and they are different, unrelated things. This makes the syntax a bit confusing at the beginning, but they are always used in different contexts, so it’s fine.

We will forget about N and H for now; let’s focus on Z and C. Z is the simplest flag: it gets set when an operation’s result is 0, and gets reset otherwise. C is set when an operation overflows or underflows.

What’s an overflow? Let’s take the simple instruction add a, 42. This simply adds 42 to the contents of register a, and writes the result back into a.

    ld a, 200
    add a, 42

At the end of this snippet, a equals 200 + 42 = 242, great! But what if I write this instead?

    ld a, 220
    add a, 42

Well, one could think that a would be equal to 220 + 42 = 262, but that would be incorrect. Remember, a is an 8-bit register, it can only store eight bits of information! And if we were to write 262 in binary, we would get %100000110, which requires at least 9 bits… So what happens? Simply, that ninth bit is lost, and the value that we end up with is %00000110 = 6. This is called an overflow: after adding, we get a value smaller than what we started with.

We can also do the opposite with sub, and—for example—subtract 42 from 6; as we know, for all X and Y, X + Y - Y = X, and we just saw that 220 + 42 = 6 (this is called modulo 256 arithmetic, by the way); so, 6 - 42 = (220 + 42) - 42 = 220. This is called an underflow: after subtracting, we get a value greater than what we started with.

When an operation is performed, it sets the carry flag if an overflow or underflow occurred, and clears it otherwise. (We will see later that not all operations update the carry flag, though.)

Summary

  • We can add and subtract numbers.
  • The Z flag lets us know if the result was 0.
  • However, registers can only store a limited range of integers.
  • Going outside this range is called an overflow or underflow, for addition and subtraction respectively.
  • The C flag lets us know if either occurred.

Comparison

Now, let’s talk more about how cp is used to compare numbers. Here is a refresher: cp subtracts its operand from a and updates flags accordingly, but doesn’t write the result back. We can use flags to check properties about the values being compared, and we will see in the next lesson how to use the flags.

The simplest interaction is with the Z flag. If it’s set, we know that the subtraction yielded 0, i.e. a - operand == 0; therefore, a == operand! If it’s not set, well, then we know that a != operand.

Okay, checking for equality is nice, but we may also want to perform comparisons. Fret not, for the carry flag is here to do just that! See, when performing a subtraction, the carry flag gets set when the result goes below 0—but that’s just a fancy way of saying “becomes negative”!

So, when the carry flag gets set, we know that a - operand < 0, therefore that a < operand..! And, conversely, we know that if it’s not set, a >= operand. Great!

Instruction summary

InstructionMnemonicEffect
AddaddAdds values to a
SubtractsubSubtracts values from a
ComparecpCompares values with what’s contained in a

Jumps

Once this lesson is done, we will be able to understand all of CopyTiles!

So far, all the code we have seen was linear: it executes top to bottom. But this doesn’t scale: sometimes, we need to perform certain actions depending on the result of others (“if the crêpes start sticking, grease the pan again”), and sometimes, we need to perform actions repeatedly (“If there is some batter left, repeat from step 5”).

Both of these imply reading the recipe non-linearly. In assembly, this is achieved using jumps.

The CPU has a special-purpose register called “PC”, for Program Counter. It contains the address of the instruction currently being executed1, like how you’d keep in mind the number of the recipe step you’re currently doing. PC increases automatically as the CPU reads instructions, so “by default” they are read sequentially; however, jump instructions allow writing a different value to PC, effectively jumping to another piece of the program. Hence the name.

Okay, so, let’s talk about those jump instructions, shall we? There are four of them:

InstructionMnemonicEffect
JumpjpJump execution to a location
Jump RelativejrJump to a location close by
CallcallCall a subroutine
ReturnretReturn from a subroutine

We will focus on jp for now. jp, such as the one line 5, simply sets PC to its argument, jumping execution there. In other words, after executing jp EntryPoint (line 5), the next instruction executed is the one below EntryPoint (line 16).

🤔

You may be wondering what is the point of that specific jp. Don’t worry, we will see later why it’s required.

Conditional jumps

Now to the really interesting part. Let’s examine the loop responsible for copying tiles:

	; Copy the tile data
	ld de, Tiles
	ld hl, $9000
	ld bc, TilesEnd - Tiles
CopyTiles:
	ld a, [de]
	ld [hli], a
	inc de
	dec bc
	ld a, b
	or a, c
	jp nz, CopyTiles

Don’t worry if you don’t quite get all the following, as we’ll see it live in action in the next lesson. If you’re having trouble, try going to the next lesson, watch the code execute step by step; then, coming back here, it should make more sense.

First, we copy Tiles, the address of the first byte of tile data, into de. Then, we set hl to $9000, which is the address where we will start copying the tile data to. ld bc, TilesEnd - Tiles sets bc to the length of the tile data: TilesEnd is the address of the first byte after the tile data, so subtracting Tiles to that yields the length.

So, basically:

  • de contains the address where data will be copied from;
  • hl contains the address where data will be copied to;
  • bc contains how many bytes we have to copy.

Then we arrive at the main loop. We read one byte from the source (line 29), and write it to the destination (line 30). We increment the destination (via the implicit inc hl done by ld [hli], a) and source pointers (line 31), so the following loop iteration processes the next byte.

Here’s the interesting part: since we’ve just copied one byte, that means we have one less to go, so we dec bc. (We have seen dec two lessons ago; as a refresher, it simply decreases the value stored in bc by one.) Since bc contains the amount of bytes that still need to be copied, it’s trivial to see that we should simply repeat the operation if bc != 0.

😓

dec usually updates flags, but unfortunately dec bc doesn’t, so we must check if bc reached 0 manually.

ld a, b and or a, c “bitwise OR” b and c together; it’s enough to know for now that it leaves 0 in a if and only if bc == 0. And or updates the Z flag! So, after line 34, the Z flag is set if and only if bc == 0, that is, if we should exit the loop.

And this is where conditional jumps come into the picture! See, it’s possible to conditionally “take” a jump depending on the state of the flags.

There are four “conditions”:

NameMnemonicDescription
ZerozZ is set (last operation had a result of 0)
Non-zeronzZ is not set (last operation had a non-zero result)
CarrycC is set (last operation overflowed)
No carryncC is not set (last operation did not overflow)

Thus, jp nz, CopyTiles can be read as “if the Z flag is not set, then jump to CopyTiles”. Since we’re jumping backwards, we will repeat the instructions again: we have just created a loop!

Okay, we’ve been talking about the code a lot, and we have seen it run, but we haven’t really seen how it runs. Let’s watch the magic unfold in slow-motion in the next lesson!


1

Not exactly; instructions may be several bytes long, and PC increments after reading each byte. Notably, this means that when an instruction finishes executing, PC is pointing to the following instruction. Still, it’s pretty much “where the CPU is currently reading from”, but it’s better to keep it simple and avoid mentioning instruction encoding for now.

Tracing

Ever dreamed of being a wizard? Well, this won’t give you magical powers, but let’s see how emulators can be used to control time!

First, make sure to focus the debugger window. Let’s first explain the debugger’s layout: Top-left is the code viewer, bottom-left is the data viewer, top-right are some registers (as we saw in the registers lesson), and bottom-right is the stack viewer. What’s the stack? We will answer that question a bit later… in Part Ⅱ 😅

Setup

For now, let’s focus on the code viewer. If you focus the debugger while the emulator is running, it will pause, as indicated by the screen window’s title changing to bgb (debugging).

Okay, but what’s with this weird syntax? Well, there are several syntaxes for Game Boy assembly, and BGB doesn’t use RGBDS’ by default. We can fix that in the options, which we can access by either:

  • Right-clicking the screen and selecting “Options”
  • In the debugger, open the “Window” menu, and select “Options”
  • Press F11 while focusing either the screen or debugger

BGB has a ton of options, but don’t worry, it’s got good defaults, so we don’t need to look at most of them for now. Select the “Debug” tab, and set “Disasm syntax” to “rgbds”. Oh, and check the “Local symbols” box, too. Now, click “OK” to apply the options.

Screenshot of the options window

You can also customize a lot of key bindings in the options, but I will stick to the default ones for this tutorial for the sake of simplicity.

Alright, great! But where are the labels? Well, as we have seen a couple of lessons ago, labels are merely a convenience provided by RGBASM, but they are not part of the ROM itself. It is very much inconvenient to debug without them, and so sym files (for “symbols”) have been developed. Let’s run RGBLINK to generate a sym file for our ROM:

rgblink -n hello-world.sym hello-world.o

‼️

The file names matter! When looking for a ROM’s sym file, BGB takes the ROM’s file name, strips the extension (here, .gb), replaces it with .sym, and looks for a file in the same directory with that name.

Then, in the debugger, we can go in the “File” menu and select “reload SYM file”.

Screenshot of the debugger showing the labels in the code viewer

Much better!

🔍

If a sym file is loaded, pressing Tab allows toggling whether labels are displayed or not.

Stepping

When pausing execution, the debugger will automatically focus on the instruction the CPU is about to execute, as indicated by the line highlighted in blue. Screenshot of the debugger showing that the highlighted line corresponds to PC

ℹ️

The instruction highlighted in blue is always what the CPU is about to execute, not what it just executed. Keep this in mind.

If we want to watch execution from the beginning, we need to reset the emulator. Go into the debugger’s “Run” menu, and select “Reset”, or tap your numpad’s * key.

The blue line should automatically move to address $01001, and now we’re ready to trace! All the commands for that are in the “Run” menu.

  • “Run” simply unpauses the emulator. Clicking on the screen also does the same.
  • “Trace” (more commonly known as “Step Into”) and “Step Over” advance the emulator by one instruction. They only really differ on the call instruction and interrupts, neither of which we are using here, so we will use “Trace”.
  • The other options are not relevant for now.

We will have to “Trace” a bunch of times, so it’s a good idea to use the key shortcut. If we press F7 once, the jp EntryPoint is executed. And if we press it a few more times, can see the instructions being executed, one by one!

Now, you may notice the WaitVBlank loop runs a lot of times, but what we are interested in is the CopyTiles loop. We can easily skip over it in several ways; this time, we will use a breakpoint. We will place the breakpoint on the ld de, Tiles at 00:0162; either double-click on that line, or select it and press F2. The line will turn red:

Debugger screenshot showcasing the breakpoint

Then you can resume execution either by clicking the screen or pressing F9, and BGB will automatically pause. Whenever BGB is running, and the (emulated) CPU is about to execute an instruction a breakpoint was placed on, it automatically pauses.

Debugger screenshot showcasing execution paused on the breakpoint

You can see where execution is being paused both from the green arrow and the value of PC.

If we trace the next three instructions, we can see the three arguments to the CopyTiles loop getting loaded into registers.

The state of some registers at the beginning of the CopyTiles loop

For fun, let’s watch the tiles as they’re being copied. For that, obviously, we will use the data viewer, and position it at the destination. As we can see from the image above, that would be $9000!

Select the data viewer (either click somewhere in it, or use Ctrl+Tab to switch focus, as indicated by the grey bar on the left), and press Ctrl+G (for “Goto”). In the popup, type the address you wish to go to, in our case 9000 (sans dollar sign!!).

Awesome, right?

What next?

Congrats, you have just learned how to use a debugger! We have only scratched the surface, though; we will use more of BGB’s tools to illustrate the next parts. Don’t worry, from here on, lessons will go with a lot more images—you’ve made it through the hardest part!


1

Why does execution start at $0100? That’s because it’s where the boot ROM hands off control to our game once it’s done.

Tiles

💭

“Tiles” were called differently in documentation of yore. They were usually called “patterns” or “characters”, the latter giving birth to the “CHR” abbreviation which is sometimes used to refer to tiles.

For example, on the NES, tile data is usually provided by the cartridge in either CHR ROM or CHR RAM. The term “CHR” is typically not used on the Game Boy, though exchanges between communities cause terms to “leak”, so some refer to the area of VRAM where tiles are stored as “CHR RAM” or “CHR VRAM”, for example.

As with all such jargon whose meaning may depend on who you are talking to, I will stick to “tiles” across this entire tutorial for consistency, being what is the most standard in the GB dev community now.

Well, copying this data blindly is fine and dandy, but why exactly is the data “graphics”?

Screenshot of some tile definitions in the code
Ah, yes, pixels.

Let’s see about that!

Helpful hand

Now, figuring out the format with an explanation alone is going to be very confusing; but fortunately, BGB got us covered thanks to its VRAM viewer. You can open it either by selecting “Window” then “VRAM viewer”, or pressing F5 when the debugger is in focus. It is also in “Other” in the screen’s right-click menu.

Screenshot of the VRAM viewer’s “Tiles” tab

By default, it should open on the “Tiles” tab like in the image, but if not, please select that. We will come to the other tabs in due time. This one shows the tiles present in the Game Boy’s video memory (or “VRAM”).

🤔

I encourage you to experiment with the VRAM viewer, hover over things, tick and untick checkboxes, see by yourself what’s what. Any questions you might have will be answered in due time, don’t worry! And if what you’re seeing later on doesn’t match my screenshots, ensure that the checkboxes match mine.

Don’t mind the “Nintendo” in the top-left; we did not put it there ourselves, and we will see why it’s there later. We will also be ignoring the right half of the tile grid, as it is exclusive to the Game Boy Color: we will get there when we get there.

Short primer

You may have heard of tiles before, especially as they were really popular in 8-bit and 16-bit systems. That’s no coincidence: tiles are very useful. Instead of storing every on-screen pixel (144 × 160 pixels × 2 bits/pixel = 46080 bits = 5760 bytes, compared to the console’s 8192 bytes of VRAM), pixels are grouped into tiles, and then tiles are assembled in various ways to produce the final image.

In particular, tiles can be reused very easily and at basically no cost, saving a lot of memory! In addition, manipulating whole tiles at once is much cheaper than manipulating the individual pixels, so this spares processing time as well.

The concept of a “tile” is very general, but on the Game Boy, tiles are always 8 by 8 pixels. Often, hardware tiles are grouped to manipulate them as larger tiles (often 16×16); to avoid the confusion, those are referred to as meta-tiles.

“bpp”?

You may be wondering where that “2 bits/pixel” figure earlier came from… This is something called “bit depth”.

See, colors are not stored in the tiles themselves! Instead, it works like a coloring book: the tile itself contains 8 by 8 indices, not colors; you give the hardware a tile and a set of colors—a palette—and it colorizes them! (This is also why color swaps were very common back then: you could create enemy variations by storing tiny palettes instead of large different graphics.)

Anyway, as it is, Game Boy palettes are 4 colors large.1 This means that the indices into those palettes, stored in the tiles, can be represented in only two bits! This is called “2 bits per pixel”, noted “2bpp”.

With that in mind, we are ready to explain how these bytes turn into pixels!

Encoding

As I explained, each pixel takes up 2 bits. Since there are 8 bits in a byte, you might expect each byte to contain 4 pixels… and you would be neither entirely right, nor entirely wrong. See, each row of 8 pixels is stored in 2 bytes, but neither of these bytes contains the info for 4 pixels. (Think of it like a 10 € banknote torn in half: neither half is worth anything, but the full bill is worth, well, 10 €.)

For each pixel, the least significant bit of its index is stored in the first byte, and the most significant bit is stored in the second byte. Since each byte is a collection of one of the bits for each pixel, it’s called a bitplane.

The leftmost pixel is stored in the leftmost bit of both bytes, the pixel to its right in the second leftmost bit, and so on. The first pair of bytes stores the topmost row, the second byte the row below that, and so on.

Here is a more visual demonstration; note that it’s for the SNES, but its 2BPP format is the same as the Game Boy’s.

This encoding may seem a little weird at first, and it can be; it’s made to be more convenient for the hardware to decode, keeping the circuitry simple and low-power. It even makes a few cool tricks possible, as we will see (much) later!

You can read up more about the encoding in the Pan Docs and ShantyTown’s site.

In the next lesson, we shall see how colors are applied!


1

Other consoles can have varying bit depths; for example, the SNES has 2bpp, 4bpp, and 8bpp depending on the graphics mode and a few other parameters.

Palettes

In the previous lesson, I briefly mentioned that colors are applied to tiles via palettes, but we haven’t talked much about those yet.

The black & white Game Boy has three palettes, one for the background called BGP (“BackGround Palette”), and two for the objects called OBP0 and OBP1 (“OBject Palette 0/1”). If you are wondering what “objects” are, you will have to wait until Part Ⅱ to find out; for now, let’s focus on the background.

🌈

The Game Boy Color introduced, obviously, colors, and this was mainly done by reworking the way palettes are handled. We will not talk about Game Boy Color features in Part Ⅰ for the sake of simplicity, but we will do so in later parts.

Please select the “Palettes” tab of BGB’s VRAM viewer.

Screenshot of the VRAM viewer’s Palette tab

🥴

The VRAM viewer’s layout may seem weird. As mentioned in the previous tip box, the Game Boy Color changed the way palettes work, and the Super Game Boy throws another wrench into the mix.

The VRAM viewer uses the same layout for monochrome and Color, leading to this admittedly strange layout.

We will be taking a look at the “BGP” line. As I explained before, tiles store “color indices” for each pixel, which are used to index into the palette. Color number 01 is the leftmost in that line, and number 3 is the rightmost.

So, in our case, color number 0 is “white”, color number 1 is “light gray”, number 2 is “dark gray”, and number 3 “black”. I put air quotes because “black” isn’t true black, and “white” isn’t true white. Further, note that the original Game Boy had shades of green, but the later Game Boy Pocket’s screen produced shades of gray instead. And, even better, the Game Boy Color will automatically colorize games that lack Game Boy Color support!

Screenshot of our Hello World, automatically colorized by the Game Boy Color

All this to say, one shouldn’t expect specific colors out of a Game Boy game2, just four more or less bright colors.

Getting our hands dirty

Well, so far in this tutorial, besides running the Hello World, we have been pretty passive, watching it unfold. What do you say we start prodding the ROM a bit?

In BGB’s debugger, select the “Window” menu, and open the “IO map” (or just press F10 within the debugger).

Screenshot of the IO map

While the VRAM viewer offers a visual representation of the palette, the IO map shows the nitty-gritty: how it’s encoded. The IO map also lets us modify BGP easily; but to do so, we need to understand how values we write are turned into colors.

Encoding

Fortunately, the encoding is very simple. I will explain it, and at the same time, give an example with the palette we have at hand, $E4.

Take the byte, break its 8 bits into 4 groups of 2.

[BGP] = $E4
$E4 = %11100100 (refresh your memory in the "Binary and hexadecimal" lesson if needed!)
That gets broken down into %11, %10, %01, %00

Color number 0 is the rightmost “group”, color number 3 is the leftmost one. Simple! And this matches what the VRAM viewer is showing us: color number 0, the rightmost, is the brightest (%00), up to color number 3, the leftmost and the darkest (%11).

Lights out

For fun, let’s make the screen completely black. We can easily do this by setting all colors in the palette to black (%11). This would be %11 %11 %11 %11 = $FF.

In the IO map, click the text box left of “BGP”, erase the “E4”, type “FF”, and hit Enter. BGP immediately updates, turning the screen black!

Screenshot of the palette viewer with the BGP line entirely black
Observe how the BGP line is entirely black now. Also, I could have shown a screenshot of the black screen, but that would have been silly.

What if we wanted to take the original palette, but invert it? %11 would become %00, %01 would become %10, %10 would become %01, and %00 would become %11. We would get thus:

%11_10_01_00
 ↓  ↓  ↓  ↓
%00_01_10_11

(I’m not giving the value in hexadecimal, use this as an opportunity to exercise your bin-to-hex conversions!)

Screenshot of the color-inverted Hello World
If you got it right, it should look like this!

If you go to the Tile Viewer and uncheck the “show paletted” checkbox, you will notice that the tile data stays the same regardless of how the palette is modified! This is an advantage of using palettes: fading the screen in and out is very cheap, just modifying a single byte, instead of having to update every single on-screen pixel.

Got all that? Then let’s take a look at the last missing puzzle piece in the Hello World’s rendering process, the tilemap!


1

Numbering often starts at 0 when working with computers. We will understand why later, but for now, please bear with it!

2

Well, it is possible to detect these different models and account for them, but this would require taking plenty of corner cases into consideration, so it’s probably not worth the effort.

Tilemap

🧐

Some spell them “tile map”, some “tilemap”. I will be using the latter by preference, but I also stay consistent with it in the code (Tilemap and not TileMap), as well as later when we will talk about attribute maps (“attrmap” and Attrmap instead of AttrMap).

We are almost there. We have seen how graphics on the Game Boy are composed of 8×8 “tiles”, and we have seen how color is added into the mix.

But we have not seen yet how those tiles are arranged into a final picture!

Tiles are basically a grid of pixels; well, the tilemaps are basically a grid of tiles! To allow for cheap reuse, tiles aren’t stored in the tilemap directly; instead, tiles are referred to by an ID, which you can see in BGB’s VRAM viewer.

Screenshot highlighting where a tile's ID can be seen
The ID is displayed in hexadecimal without a prefix, so this is tile number $10, aka 16. As you may have noticed, the tiles are displayed in rows of 16, so it's easier to locate them by hexadecimal ID. Nifty!

Now, of course, tile IDs are numbers, like everything that computers deal with. IDs are stored in bytes, so there are 256 possible tile IDs. However, the astute reader will have noticed that there are 384 tiles in total1! By virtue of the pigeonhole principle, this means that some IDs refer to several tiles at the same time.

Indeed, BGB reports that the first 128 tiles have the same IDs as the last 128. There exists a mechanism to select whether IDs 0–127 reference the first or last 128 tiles, but for simplicity’s sake, we will overlook this for now, so please ignore the first (topmost) 128 tiles for the time being.

Now, please turn your attention to the “BG map” tab of BGB’s VRAM viewer, pictured below.

Screenshot of the tilemap viewer

You may notice that the image shown is larger than what is displayed on-screen. Only part of the tilemap, outlined by a thicker border in the VRAM viewer, is displayed on-screen at a given time. We will explain this in more detail in Part Ⅱ.

Here we will be able to see the power of tile reuse in full force. As a convenience and a refresher, here are the tiles our Hello World loads into VRAM:

Enlarged view of the tiles loaded in VRAM

You can see that we only loaded a single “blank” tile ($00, the first aka. top-left one), but it can be repeated to cover the whole background at no extra cost!

Repetition can be more subtle: for example, tile $01 is used for the top-left corner of the H, E, L, L, and W (red lines below)! The R, L, and D also both share their top-left tile ($2D, blue lines below); and so on. You can confirm this by hovering over tiles in the BG map tab, which shows the ID of the tile at that position.

Diagram of some tile mappings
Here are some examples of tile reuse. Not everything is drawn, as it would become a mess.

All in all, we can surmise that displaying graphics on the Game Boy consists of loading “patterns” (the tiles), and then telling the console which tile to display for each given location.


1

The even more astute (astuter?) reader will have noticed that 384 = 3 × 128. Thus, tiles are often conceptually grouped into three “blocks” of 128 tiles each, which BGB shows as separated by thicker horizontal lines.

Wrapping up

Congrats! You have made it through the first part of this tutorial. By this point, you have a basic enough understanding of the console that you know how to display a picture. And hey, that doesn’t sound like much, but consider everything you have seen so far—there is a lot that goes into it!

🥳

Honestly, congrats on coming this far—many people have given up earlier than this. So you can give yourself a pat on the back, you honestly deserve it! Now may also be a good time to take a break if you are reading all this in a single trait. I encourage you to give it a little time to sink in, and maybe go back to the lessons you struggled on the most. Maybe a second read can help.


And yes, you could simply have let a library handle all that. However, the details always leak through eventually, so knowing about them is helpful, if only for debugging.

Plus, understanding what’s really going on under the hood makes you a better programmer, even if you don’t end up using ASM in the long run. Amusingly, even modern systems work similarly to older ones in unexpected places, so some things you just learned will carry over! Trust me, everything you have learned and will learn is worth it! ✊

That said, right now, you may have a lot of questions.

  • Why do we turn off the LCD?
  • We know how to make a static picture, but how to we add motion into the mix?
  • Also, how do I get input from the player?
  • The code mentions shutting down audio, but how do I play some of those famed beeps and bloops?
  • Writing graphics in that way sound tedious, is there no other way?
  • Actually, wait, how do we make a game out of all this??

… All of that answered, and more, in Part Ⅱ! 👀

Getting started

In this lesson, we will start a new project from scratch. We will make a Breakout / Arkanoid clone, which we’ll call “Unbricked”! (Though you are free to give it any other name you like, as it will be your project.)

Open a terminal and make a new directory (mkdir unbricked), and then enter it (cd unbricked), just like you did for “Hello, world!”.

Start by creating a file called main.asm, and include hardware.inc in your code.

INCLUDE "hardware.inc"

You may be wondering what purpose hardware.inc serves. Well, the code we write only really affects the CPU, but does not do anything with the rest of the console (not directly, anyway). To interact with other components (like the graphics system, say), Memory-Mapped I/O (MMIO) is used: basically, memory in a certain range (addresses $FF00–FF7F) does special things when accessed.

These bytes of memory being interfaces to the hardware, they are called hardware registers (not to be mistaken with the CPU registers). For example, the “PPU status” register is located at address $FF41. Reading from that address reports various bits of info regarding the graphics system, and writing to it allows changing some parameters. But, having to remember all the numbers (non-exhaustive list) would be very tedious—and this is where hardware.inc comes into play! hardware.inc defines one constant for each of these registers (for example, rSTAT for the aforementioned “PPU status” register), plus some additional constants for values read from or written to these registers.

Don’t worry if this flew over your head, we’ll see an example below with rLCDC and LCDCF_ON.

By the way, the r stands for “register”, and the F in LCDCF stands for “flag”.

Next, make room for the header. Remember from Part Ⅰ that the header is where some information that the Game Boy relies on is stored, so you don’t want to accidentally leave it out.

SECTION "Header", ROM0[$100]

	jp EntryPoint

	ds $150 - @, 0 ; Make room for the header

The header jumps to EntryPoint, so let’s write that now:

EntryPoint:
	; Do not turn the LCD off outside of VBlank
WaitVBlank:
	ld a, [rLY]
	cp 144
	jp c, WaitVBlank

	; Turn the LCD off
	ld a, 0
	ld [rLCDC], a

The next few lines wait until “VBlank”, which is the only time you can safely turn off the screen (doing so at the wrong time could damage a real Game Boy, so this is very crucial). We’ll explain what VBlank is and talk about it more later in the tutorial.

Turning off the screen is important because loading new tiles while the screen is on is tricky—we’ll touch on how to do that in Part 3.

Speaking of tiles, we’re going to load some into VRAM next, using the following code:

	; Copy the tile data
	ld de, Tiles
	ld hl, $9000
	ld bc, TilesEnd - Tiles
CopyTiles:
	ld a, [de]
	ld [hli], a
	inc de
	dec bc
	ld a, b
	or a, c
	jp nz, CopyTiles

This loop might be reminiscent of part Ⅰ. It copies starting at Tiles to $9000 onwards, which is the part of VRAM where our tiles are going to be stored. Recall that $9000 is where the data of background tile $00 lies, and the data of subsequent tiles follows right after. To get the number of bytes to copy, we will do just like in Part Ⅰ: using another label at the end, called TilesEnd, the difference between it (= the address after the last byte of tile data) and Tiles (= the address of the first byte) will be exactly that length.

That said, we haven’t written Tiles nor any of the related data yet. We’ll get to that later!

Almost done now—next, write another loop, this time for copying the tilemap.

	; Copy the tilemap
	ld de, Tilemap
	ld hl, $9800
	ld bc, TilemapEnd - Tilemap
CopyTilemap:
	ld a, [de]
	ld [hli], a
	inc de
	dec bc
	ld a, b
	or a, c
	jp nz, CopyTilemap

Note that while this loop’s body is exactly the same as CopyTiles’s, the 3 values loaded into de, hl, and bc are different. These determine the source, destination, and size of the copy, respectively.

DRY

If you think that this is super redundant, you are not wrong, and we will see later how to write actual, reusable functions. But there is more to them than meets the eye, so we will start tackling them much later.

Finally, let’s turn the screen back on, and set a background palette. Rather than writing the non-descript number %10000001 (or $81 or 129, to taste), we make use of two constants graciously provided by hardware.inc: LCDCF_ON and LCDCF_BGON. When written to rLCDC, the former causes the PPU and screen to turn back on, and the latter enables the background to be drawn. (There are other elements that could be drawn, but we are not enabling them yet.) Combining these constants must be done using |, the binary “or” operator; we’ll see why later.

	; Turn the LCD on
	ld a, LCDCF_ON | LCDCF_BGON
	ld [rLCDC], a

	; During the first (blank) frame, initialize display registers
	ld a, %11100100
	ld [rBGP], a

Done:
	jp Done

There’s one last thing we need before we can build the ROM, and that’s the graphics. We will draw the following screen:

Layout of unbricked

In hello-world.asm, tile data had been written out by hand in hexadecimal; this was to let you see how the sausage is made at the lowest level, but boy is it impractical to write! This time, we will employ a more friendly way, which will let us write each row of pixels more easily. For each row of pixels, instead of writing the bitplanes directly, we will use a backtick (`) followed by 8 characters. Each character defines a single pixel, intuitively from left to right; it must be one of 0, 1, 2, and 3, representing the corresponding color index in the palette.

If the character selection isn’t to your liking, you can use RGBASM’s -g option or OPT g to pick others. For example, rgbasm -g '.xXO' (...) or OPT g.xXO would swap the four characters to ., x, X, and O respectively.

For example:

	dw `01230123 ; This is equivalent to `db $55,$33`

You may have noticed that we are using dw instead of db; the difference between these two will be explained later. We already have tiles made for this project, so you can copy this premade file, and paste it at the end of your code.

Then copy the tilemap from this file, and paste it after the TilesEnd label.

You can build the ROM now, by running the following commands in your terminal:

rgbasm -L -o main.o main.asm
rgblink -o unbricked.gb main.o
rgbfix -v -p 0xFF unbricked.gb

If you run this in your emulator, you should see the following:

Screenshot of our game

That white square seems to be missing! You may have noticed this comment earlier, somewhere in the tile data:

	dw `22322232
	dw `23232323
	dw `33333333
	; Paste your logo here:

TilesEnd:

The logo tiles were left intentionally blank so that you can choose your own. You can use one of the following pre-made logos, or try coming up with your own!

Add your chosen logo’s data (click one of the “Source” links above) after the comment, build the game again, and you should see your logo of choice in the bottom-right!

Objects

The background is very useful when the whole screen should move at once, but this is not ideal for everything. For example, a cursor in a menu, NPCs and the player in a RPG, bullets in a shmup, or balls in an Arkanoid clone… all need to move independently of the background. Thankfully, the Game Boy has a feature that’s perfect for these! In this lesson, we will talk about objects (sometimes called “OBJ”).

The above description may have made you think of the term “sprite” instead of “object”. The term “sprite” has a lot of meanings depending on context, so, to avoid confusion, this tutorial tries to use specific alternatives instead, such as object, metasprite, actor, etc.

Each object allows drawing one or two tiles (so 8×8 or 8×16 pixels, respectively) at any on-screen position—unlike the background, where all the tiles are drawn in a grid. Therefore, an object consists of its on-screen position, a tile ID (like with the tilemap), and some extra properties called “attributes”. These extra properties allow, for example, to display the tile flipped. We’ll see more about them later.

Just like how the tilemap is stored in VRAM, objects live in a region of memory called OAM, meaning Object Attribute Memory. Recall from above that an object consists of:

  • Its on-screen position
  • A tile ID
  • The “attributes”

These are stored in 4 bytes: one for the Y coordinate, one for the X coordinate, one for the tile ID, and one for the attributes. OAM is 160 bytes long, and since 160 ∕ 4 = 40, the Game Boy stores a total of 40 objects at any given time.

There is a catch, though: an object’s Y and X coordinate bytes in OAM do not store its on-screen position! Instead, the on-screen X position is the stored X position minus 8, and the on-screen Y position is the stored Y position minus 16. To stop displaying an object, we can simply put it off-screen, e.g. by setting its Y position to 0.

These offsets are not arbitrary! Consider an object’s maximum size: 8 by 16 pixels. These offsets allow objects to be clipped by the left and top edges of the screen. The NES, for example, lacks such offsets, so you will notice that objects always disappear after hitting the left or top edge of the screen.

Let’s discover objects by experimenting with them!

First off, when the Game Boy is powered on, OAM is filled with a bunch of semi-random values, which may cover the screen with some random garbage. Let’s fix that by first clearing OAM before enabling objects for the first time. Let’s add the following just after the CopyTilemap loop:

	ld a, 0
	ld b, 160
	ld hl, _OAMRAM
ClearOam:
	ld [hli], a
	dec b
	jp nz, ClearOam

This is a good time to do that, since just like VRAM, the screen must be off to safely access OAM.

Once OAM is clear, we can draw an object by writing its properties.

	ld hl, _OAMRAM
	ld a, 128 + 16
	ld [hli], a
	ld a, 16 + 8
	ld [hli], a
	ld a, 0
	ld [hli], a
	ld [hl], a

Remember that each object in OAM is 4 bytes, in the order Y, X, Tile ID, Attributes. So, the object’s top-left pixel lies 128 pixels from the top of the screen, and 16 from its left. The tile ID and attributes are both set to 0.

You may remember from the previous lesson that we’re already using tile ID 0, as it’s the start of our background’s graphics. However, by default objects and backgrounds use a different set of tiles, at least for the first 128 IDs. Tiles with IDs 128–255 are shared by both, which is useful if you have a tile that’s used both on the background and by an object.

If you press F5 in BGB to open the VRAM viewer, you should see three distinct sections.

image

Because we need to load this to a different area, we’ll use the address $8000 and load a graphic for our game’s paddle. Let’s do so right after CopyTilemap:

	; Copy the tile data
	ld de, Paddle
	ld hl, $8000
	ld bc, PaddleEnd - Paddle
CopyPaddle:
	ld a, [de]
	ld [hli], a
	inc de
	dec bc
	ld a, b
	or a, c
	jp nz, CopyPaddle

And don’t forget to add Paddle to the bottom of your code.

Paddle:
	dw `13333331
	dw `30000003
	dw `13333331
	dw `00000000
	dw `00000000
	dw `00000000
	dw `00000000
	dw `00000000
PaddleEnd:

Finally, let’s enable objects and see the result. Objects must be enabled by the familiar rLCDC register, otherwise they just don’t show up. (This is why we didn’t have to clear OAM in the previous lessons.) We will also need to initialize one of the object palettes, rOBP0. There are actually two object palettes, but we’re only going to use one.

	; Turn the LCD on
	ld a, LCDCF_ON | LCDCF_BGON | LCDCF_OBJON
	ld [rLCDC], a

	; During the first (blank) frame, initialize display registers
	ld a, %11100100
	ld [rBGP], a
	ld a, %11100100
	ld [rOBP0], a

Movement

Now that you have an object on the screen, let’s move it around. Previously, the Done loop did nothing; let’s rename it to Main and use it to move our object. We’re going to wait for VBlank before changing OAM, just like we did before turning off the screen.

Main:
    ; Wait until it's *not* VBlank
    ld a, [rLY]
    cp 144
    jp nc, Main
WaitVBlank2:
	ld a, [rLY]
	cp 144
	jp c, WaitVBlank2

	; Move the paddle one pixel to the right.
	ld a, [_OAMRAM + 1]
	inc a
	ld [_OAMRAM + 1], a
	jp Main

🤨

Here, we are accessing OAM without turning the LCD off, but it’s still safe. Explaining why requires a more thorough explanation of the Game Boy’s rendering, so let’s ignore it for now.

Now you should see the paddle moving… very quickly. Because it moves by a pixel ever frame, it’s going at a speed of 60 pixels per second! To slow this down, we’ll use a variable.

So far, we have only worked with the CPU registers, but you can create global variables too! To do this, let’s create another section, but putting it in WRAM0 instead of ROM0. Unlike ROM (“Read-Only Memory”), RAM (“Random-Access Memory”) can be written to; thus, WRAM, or Work RAM, is where we can store our game’s variables.

Add this to the bottom of your file:

SECTION "Counter", WRAM0
wFrameCounter: db

Now we’ll use the wFrameCounter variable to count how many frames have passed since we last moved the paddle. Every 10th frame, we’ll move the paddle by one pixel, slowing it down to 6 pixels per second. Don’t forget that RAM is filled with garbage values when the Game Boy starts, so we need to initialize our variables before first using them.

	ld a, 0
	ld [wFrameCounter], a

Main:
	ld a, [rLY]
	cp 144
	jp nc, Main
WaitVBlank2:
	ld a, [rLY]
	cp 144
	jp c, WaitVBlank2

	ld a, [wFrameCounter]
	inc a
	ld [wFrameCounter], a
	cp a, 15 ; Every 15 frames (a quarter of a second), run the following code
	jp nz, Main

	; Reset the frame counter back to 0
	ld a, 0
	ld [wFrameCounter], a

	; Move the paddle one pixel to the right.
	ld a, [_OAMRAM + 1]
	inc a
	ld [_OAMRAM + 1], a
	jp Main

Alright! Up next is us taking control of that little paddle.

Functions

So far, we have only written a single “flow” of code, but we can already spot some snippets that look redundant. Let’s use functions to “factor out” code!

For example, in three places, we are copying chunks of memory around. Let’s write a function below the jp Main, and let’s call it Memcpy, like the similar C function:

; Copy bytes from one area to another.
; @param de: Source
; @param hl: Destination
; @param bc: Length
Memcopy:
	ld a, [de]
	ld [hli], a
	inc de
	dec bc
	ld a, b
	or a, c
	jp nz, Memcopy
	ret

The new ret instruction should immediately catch our eye. It is, unsurprisingly, what makes execution return to where the function was called from. Importantly, many languages have a definite “end” to a function: in C or Rust, that’s the closing brace }; in Pascal or Lua, the keyword end, and so on; the function implicitly returns when execution reaches its end. However, this is not the case in assembly, so you must remember to add a ret instruction at the end of the function to return from it! Otherwise, the results are unpredictable.

Notice the comment above the function, explaining which registers it takes as input. This comment is important so that you know how to interface with the function; assembly has no formal parameters, so comments explaining them are even more important than with other languages. We’ll see more of those as we progress.

There are three places in the initialization code where we can use the Memcpy function. Find each of these copy loops and replace them with a call to Memcpy; for this, we use the call instruction. The registers serve as parameters to the function, so we’ll leave them as-is.

BeforeAfter
	; Copy the tile data
	ld de, Tiles
	ld hl, $9000
	ld bc, TilesEnd - Tiles
CopyTiles:
	ld a, [de]
	ld [hli], a
	inc de
	dec bc
	ld a, b
	or a, c
	jp nz, CopyTiles
	; Copy the tile data
	ld de, Tiles
	ld hl, $9000
	ld bc, TilesEnd - Tiles
	call Memcopy
	; Copy the tilemap
	ld de, Tilemap
	ld hl, $9800
	ld bc, TilemapEnd - Tilemap
CopyTilemap:
	ld a, [de]
	ld [hli], a
	inc de
	dec bc
	ld a, b
	or a, c
	jp nz, CopyTilemap
	; Copy the tilemap
	ld de, Tilemap
	ld hl, $9800
	ld bc, TilemapEnd - Tilemap
	call Memcopy
	; Copy the tile data
	ld de, Paddle
	ld hl, $8000
	ld bc, PaddleEnd - Paddle
CopyPaddle:
	ld a, [de]
	ld [hli], a
	inc de
	dec bc
	ld a, b
	or a, c
	jp nz, CopyPaddle
	; Copy the tile data
	ld de, Paddle
	ld hl, $8000
	ld bc, PaddleEnd - Paddle
	call Memcopy

In the next chapter, we’ll write another function, this time to read player input.

Input

We have the building blocks of a game here, but we’re still lacking player input. A game that plays itself isn’t very much fun, so let’s fix that.

Paste this code below your Main loop. Like Memcpy, this is a function that can be reused from different places, using the call instruction.

Input:
  ; Poll half the controller
  ld a, P1F_GET_BTN
  call .onenibble
  ld b, a ; B7-4 = 1; B3-0 = unpressed buttons

  ; Poll the other half
  ld a, P1F_GET_DPAD
  call .onenibble
  swap a ; A3-0 = unpressed directions; A7-4 = 1
  xor a, b ; A = pressed buttons + directions
  ld b, a ; B = pressed buttons + directions

  ; And release the controller
  ld a, P1F_GET_NONE
  ldh [rP1], a

  ; Combine with previous wCurKeys to make wNewKeys
  ld a, [wCurKeys]
  xor a, b ; A = keys that changed state
  and a, b ; A = keys that changed to pressed
  ld [wNewKeys], a
  ld a, b
  ld [wCurKeys], a
  ret

.onenibble
  ldh [rP1], a ; switch the key matrix
  call .knownret ; burn 10 cycles calling a known ret
  ldh a, [rP1] ; ignore value while waiting for the key matrix to settle
  ldh a, [rP1]
  ldh a, [rP1] ; this read counts
  or a, $F0 ; A7-4 = 1; A3-0 = unpressed keys
.knownret
  ret

Unfortunately, reading input on the Game Boy is fairly involved (as you can see!), and it would be quite difficult to explain what this function does right now. So, I ask that you make an exception, and trust me that this function does read input. Alright? Good!

Now that we know how to use functions, let’s call the UpdateKeys function in our main loop to read user input. UpdateKeys writes the held buttons to a location in memory that we called wCurKeys, which we can read from after the function returns. Because of this, we only need to call UpdateKeys once per frame.

This is important, because not only is it faster to reload the inputs that we’ve already processed, but it also means that we will always act on the same inputs, even if the player presses or releases a button mid-frame.

First, let’s set aside some room for the two variables that UpdateKeys will use; paste this at the end of the main.asm:

SECTION "Input Variables", WRAM0
wCurKeys: db
wNewKeys: db

Each variable must reside in RAM, and not ROM, because ROM is “Read-Only” (so you can’t modify it). Additionally, each variable only needs to be one byte large, so we use db (“Define Byte”) to reserve one byte of RAM for each.

We’re going to use the and opcode, which we can use to set the zero flag (z) to the value of the bit. We can use this along with the PADF constants in hardware.inc to read a particular key.

Main:
	ld a, [rLY]
	cp 144
	jp nc, Main
WaitVBlank2:
	ld a, [rLY]
	cp 144
	jp c, WaitVBlank2

	; Check the current keys every frame and move left or right.
	call Input

	; First, check if the left button is pressed.
CheckLeft:
	ld a, [wCurKeys]
	and a, PADF_LEFT
	jp z, CheckRight
Left:
	; Move the paddle one pixel to the left.
	ld a, [_OAMRAM + 1]
	dec a
	; If we've already hit the edge of the playfield, don't move.
	cp a, 15
	jp z, Main
	ld [_OAMRAM + 1], a
	jp Main

; Then check the right button.
CheckRight:
	ld a, [wCurKeys]
	and a, PADF_RIGHT
	jp z, Main
Right:
	; Move the paddle one pixel to the right.
	ld a, [_OAMRAM + 1]
	inc a
	; If we've already hit the edge of the playfield, don't move.
	cp a, 105
	jp z, Main
	ld [_OAMRAM + 1], a
	jp Main

Now, if you compile the project, you should be able to move the paddle left and right using the d-pad!! Hooray, we have the beginnings of a game!

Collision

Being able to move around is great, but there’s still one object we need for this game: a ball! Just like with the paddle, the first step is to create a tile for the ball and load it into VRAM.

Graphics

Add this to the bottom of your file along with the other graphics:

Ball:
	dw `00033000
	dw `00322300
	dw `03222230
	dw `03222230
	dw `00322300
	dw `00033000
	dw `00000000
	dw `00000000
BallEnd:

Now copy it to VRAM somewhere in your initialization code, e.g. after copying the paddle’s tile.

	; Copy the ball tile
	ld de, Ball
	ld hl, $8010
	ld bc, BallEnd - Ball
	call Memcopy

In addition, we need to initialize an entry in OAM, following the code that initializes the paddle.

	; Initialize the paddle sprite in OAM
	ld hl, _OAMRAM
	ld a, 128 + 16
	ld [hli], a
	ld a, 16 + 8
	ld [hli], a
	ld a, 0
	ld [hli], a
	ld [hli], a
	; Now initialize the ball sprite
	ld a, 100 + 16
	ld [hli], a
	ld a, 32 + 8
	ld [hli], a
	ld a, 1
	ld [hli], a
	ld a, 0
	ld [hli], a

As the ball bounces around the screen its momentum will change, sending it in different directions. Let’s create two new variables to track the ball’s momentum in each axis: wBallMomentumX and wBallMomentumY.

SECTION "Counter", WRAM0
wFrameCounter: db

SECTION "Input Variables", WRAM0
wCurKeys: db
wNewKeys: db

SECTION "Ball Data", WRAM0
wBallMomentumX: db
wBallMomentumY: db

We will need to initialize these before entering the game loop, so let’s do so right after we write the ball to OAM. By setting the X momentum to 1, and the Y momentum to -1, the ball will start out by going up and to the right.

	; Now initialize the ball sprite
	ld a, 100 + 16
	ld [hli], a
	ld a, 32 + 8
	ld [hli], a
	ld a, 1
	ld [hli], a
	ld a, 0
	ld [hli], a

	; The ball starts out going up and to the right
	ld a, 1
	ld [wBallMomentumX], a
	ld a, -1
	ld [wBallMomentumY], a

Prep work

Now for the fun part! Add a bit of code at the beginning of your main loop that adds the momentum to the OAM positions. Notice that since this is the second OAM entry, we use + 4 for Y and + 5 for X. This can get pretty confusing, but luckily we only have two objects to keep track of. In the future, we’ll go over a much easier way to use OAM.

Main:
	ld a, [rLY]
	cp 144
	jp nc, Main
WaitVBlank2:
	ld a, [rLY]
	cp 144
	jp c, WaitVBlank2

	; Add the ball's momentum to its position in OAM.
	ld a, [wBallMomentumX]
	ld b, a
	ld a, [_OAMRAM + 5]
	add a, b
	ld [_OAMRAM + 5], a

	ld a, [wBallMomentumY]
	ld b, a
	ld a, [_OAMRAM + 4]
	add a, b
	ld [_OAMRAM + 4], a

You might want to compile your game again to see what this does. If you do, you should see the ball moving around, but it will just go through the walls and then fly offscreen.

To fix this, we need to add collision detection so that the ball can bounce around. We’ll need to repeat the collision check a few times, so we’re going to make use of two functions to do this.

Please do not get stuck on the details of this next function, as it uses some techniques and instructions we haven’t discussed yet. The basic idea is that it converts the position of the sprite to a location on the tilemap. This way, we can check which tile our ball is touching so that we know when to bounce!

; Convert a pixel position to a tilemap address
; hl = $9800 + X + Y * 32
; @param b: X
; @param c: Y
; @return hl: tile address
GetTileByPixel:
	; First, we need to divide by 8 to convert a pixel position to a tile position.
	; After this we want to multiply the Y position by 32.
	; These operations effectively cancel out so we only need to mask the Y value.
	ld a, c
	and a, %11111000
	ld l, a
	ld h, 0
	; Now we have the position * 8 in hl
	add hl, hl ; position * 16
	add hl, hl ; position * 32
	; Convert the X position to an offset.
	ld a, b
	srl a ; a / 2
	srl a ; a / 4
	srl a ; a / 8
	; Add the two offsets together.
	add a, l
	ld l, a
	adc a, h
	sub a, l
	ld h, a
	; Add the offset to the tilemap's base address, and we are done!
	ld bc, $9800
	add hl, bc
	ret

The next function is called IsWallTile, and it’s going to contain a list of tiles which the ball can bounce off of.

; @param a: tile ID
; @return z: set if a is a wall.
IsWallTile:
	cp a, $00
	ret z
	cp a, $01
	ret z
	cp a, $02
	ret z
	cp a, $04
	ret z
	cp a, $05
	ret z
	cp a, $06
	ret z
	cp a, $07
	ret

This function might look a bit strange at first. Instead of returning its result in a register, like a, it returns it in a flag: Z! If at any point a tile matches, the function has found a wall and exits with Z set. If the target tile ID (in a) matches one of the wall tile IDs, the corresponding cp will leave Z set; if so, we return immediately (via ret z), with Z set. But if we reach the last comparison and it still doesn’t set Z, then we will know that we haven’t hit a wall and don’t need to bounce.

Putting it together

Time to use these new functions to add collision detection! Add the following after the code that updates the ball’s position:

BounceOnTop:
	; Remember to offset the OAM position!
	; (8, 16) in OAM coordinates is (0, 0) on the screen.
	ld a, [_OAMRAM + 4]
	sub a, 16 + 1
	ld c, a
	ld a, [_OAMRAM + 5]
	sub a, 8
	ld b, a
	call GetTileByPixel ; Returns tile address in hl
	ld a, [hl]
	call IsWallTile
	jp nz, BounceOnRight
	ld a, 1
	ld [wBallMomentumY], a

You’ll see that when we load the sprite’s positions, we subtract from them before calling GetTileByPixel. You might remember from the last chapter that OAM positions are slightly offset; that is, (0, 0) in OAM is actually completely offscreen. These sub instructions undo this offset.

However, there’s a bit more to this: you might have noticed that we subtracted an extra pixel from the Y position. That’s because (as the label suggests), this code is checking for a tile above the ball. We actually need to check all four sides of the ball so we know how to change the momentum according to which side collided, so… let’s add the rest!

BounceOnRight:
	ld a, [_OAMRAM + 4]
	sub a, 16
	ld c, a
	ld a, [_OAMRAM + 5]
	sub a, 8 - 1
	ld b, a
	call GetTileByPixel
	ld a, [hl]
	call IsWallTile
	jp nz, BounceOnLeft
	ld a, -1
	ld [wBallMomentumX], a

BounceOnLeft:
	ld a, [_OAMRAM + 4]
	sub a, 16
	ld c, a
	ld a, [_OAMRAM + 5]
	sub a, 8 + 1
	ld b, a
	call GetTileByPixel
	ld a, [hl]
	call IsWallTile
	jp nz, BounceOnBottom
	ld a, 1
	ld [wBallMomentumX], a

BounceOnBottom:
	ld a, [_OAMRAM + 4]
	sub a, 16 - 1
	ld c, a
	ld a, [_OAMRAM + 5]
	sub a, 8
	ld b, a
	call GetTileByPixel
	ld a, [hl]
	call IsWallTile
	jp nz, BounceDone
	ld a, -1
	ld [wBallMomentumY], a
BounceDone:

That was a lot, but now the ball bounces around your screen! There’s just one last thing to do before this chapter is over, and thats ball-to-paddle collision.

Paddle bounce

Unlike with the tilemap, there’s no position conversions to do here, just straight comparisons. However, for these, we will need the carry flag. The carry flag is notated as C, like how the zero flag is notated as Z, but don’t confuse it with the c register!

A refresher on comparisons

Just like Z, you can use the carry flag to jump conditionally. However, while Z is used to check if two numbers are equal, C can be used to check if a number is greater than or smaller than another one. For example, cp a, b sets C if a < b, and clears it if a >= b. (If you want to check a <= b or a > b, you can use Z and C in tandem with two jp instructions.)

Armed with this knowledge, let’s work through the paddle bounce code:

	; First, check if the ball is low enough to bounce off the paddle.
	ld a, [_OAMRAM]
	ld b, a
	ld a, [_OAMRAM + 4]
	cp a, b
	jp nz, PaddleBounceDone ; If the ball isn't at the same Y position as the paddle, it can't bounce.
	; Now let's compare the X positions of the objects to see if they're touching.
	ld a, [_OAMRAM + 5] ; Ball's X position.
	ld b, a
	ld a, [_OAMRAM + 1] ; Paddle's X position.
	sub a, 8
	cp a, b
	jp c, PaddleBounceDone
	add a, 8 + 16 ; 8 to undo, 16 as the width.
	cp a, b
	jp nc, PaddleBounceDone

	ld a, -1
	ld [wBallMomentumY], a

PaddleBounceDone:

The Y position’s check is simple, since our paddle is flat. However, the X position has two checks which widen the area the ball can bounce on. First we add 16 to the ball’s position; if the ball is more than 16 pixels to the right of the paddle, it shouldn’t bounce. Then we undo this by subtracting 16, and while we’re at it, subtract another 8 pixels; if the ball is more than 8 pixels to the left of the paddle, it shouldn’t bounce.

jp c, DoNotBounce jp nc, DoNotBounce - 8 + 8 + 16

Paddle width

You might be wondering why we checked 16 pixels to the right but only 8 pixels to the left. Remember that OAM positions represent the upper-left corner of a sprite, so the center of our paddle is actually 4 pixels to the right of the position in OAM. When you consider this, we’re actually checking 12 pixels out on either side from the center of the paddle.

12 pixels might seem like a lot, but it gives some tolerance to the player in case their positioning is off. If you’d prefer to make this easier or more difficult, feel free to adjust the values!

BONUS: tweaking the bounce height

You might notice that the ball seems to “sink” into the paddle a bit before bouncing. This is because the ball bounces when its top row of pixels aligns with the paddle’s top row (see the image above). If you want, try to adjust this so that the ball bounces when its bottom row of pixels touches the paddle’s top.

Hint: you can do this with just a single instruction!

Answer:
	ld a, [_OAMRAM]
	ld b, a
	ld a, [_OAMRAM + 4]
+	sub a, 6
	cp a, b

Alternatively, you can add add a, 6 just after ld a, [_OAMRAM].

In both cases, try playing with that 6 value; see what feels right!

Work in progress

🚧 🚧 🚧 🚧 🚧 🚧 🚧

As explained in the initial tutorial presentation, Part Ⅱ consists of us building an Arkanoid game. However, this is not finished yet; lessons are uploaded as they are made, so the tutorial just abruptly stops at some point. Sorry!

Please hold tight while we are working on this, follow us on Twitter for updates, and go to the next page to find out what you can do in the meantime!

Thank you for your patience 😊 and see you around on GBDev!

Where to go next

Oh. Well, you’ve reached the end of the tutorial… And yes, as you can see, it’s not finished yet. I’m actively working on fixing that, though, please be a little patient :)

In the meantime, the best course of action is to peruse the resources in the next section, and experiment by yourself. Well, given that, it may be a good idea to ask around for advice. A lot of the problems and questions you will be encountering have already been solved, so others can—and will!—help you getting started faster. If you enjoyed the tutorial, please consider contributing.

Resources

A.k.a. “where to go from here”.

Help channels

Other tutorials

  • evie’s interrupts tutorial should help you understand how to use interrupts, and what they are useful for.
  • tbsp’s “Simple GB ASM examples” is a collection of ROMs, each built from a single, fairly short source file. If you found this tutorial too abstract and/or want to get your feet wet, this is a good place to go to!
  • GB assembly by example, Daid’s collection of code snippets. Consider this a continuation of the tutorial, but without explanations; it’s still useful to peruse them and ask about it, they are overall good quality.

Complements

Did you enjoy the tutorial or one of the above? The following should prove useful along the rest of your journey!

Special Thanks

Big thank you to Twoflower/Triad for making the Hello World graphic.

I can’t thank enough Chloé and many others for their continued support.

Thanks to the GBDev community for being so nice throughout the years.

You are all great. Thank you so very much.


Thank you to the Rust language team for making mdBook, which powers this book (this honestly slick design is the stock one!!)

Greets to AYCE, Phantasy, TPPDevs/RainbowDevs, Plutiedev, lft/kryo :)

Shoutouts to Eievui, Rangi, MarkSixtyFour, ax6, Baŝto, bbbbbr, and bitnenfer!