12. Basic Platformer

In this example, I will set up a basic platform jumper that scrolls left to right. Again, I will have a level metatile map mirrored in the RAM. Since we have 2 screens worth of background, we will need 2 pages of RAM ($200).

The first thing you need for a platformer is…gravity. Every frame, sprites affected by gravity need to fall (++Y), unless they are standing on a platform. I will be calculating if the bottom of the metasprite is aligned to the background metatiles, and if so, check if the metatile below is a platform (for both bottom left and bottom right).

Here’s the platform / gravity code…

// first check the bottom left corner of character
 // which nametable am I in?
 NametableB = Nametable;
 Scroll_Adjusted_X = (X1 + Horiz_scroll + 3); // left
 high_byte = Scroll_Adjusted_X >> 8;
 if (high_byte != 0){ // if H scroll + Sprite X > 255, then we should use
  ++NametableB;   // the other nametable's collision map
  NametableB &= 1; // keep it 0 or 1
 }
 // we want to find which metatile in the collision map this point is in...is it solid?
 collision_Index = (((char)Scroll_Adjusted_X>>4) + ((Y1+16) & 0xf0));
 collision = 0;
 Collision_Down(); // if on platform, ++collision
// now check the bottom right corner of character
// ...same as above, but using (X1 + Horiz_scroll + 12); // right

// oh, I might as well show you what collision_Down() looks like...
void Collision_Down(void){
 if (NametableB == 0){ // first collision map
  temp = C_MAP[collision_Index];
  collision += PLATFORM[temp];
 }
 else { // second collision map
  temp = C_MAP2[collision_Index];
  collision += PLATFORM[temp];
 }
}
// platform array has either 0 (no collision) or 1 (collision)
// based on what type of block you are standing on

// then gravity
if (collision == 0){
   Y_speed += 2;
}
else {
   Y_speed = 0;
   Y1 &= 0xf0; // align to metatile
}
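
The index math above is easier to check on a PC than on the NES. Here is a minimal sketch of the same lookup written as a plain C function. The name lookup_metatile and its parameter list are assumptions made for illustration; only the arithmetic follows the snippet above.

```c
/* Host-side sketch (not the NES build): find the metatile under a point
   when the level spans two 256-pixel-wide collision maps.
   lookup_metatile is a hypothetical helper, not part of the original code. */
unsigned char lookup_metatile(const unsigned char *map0,
                              const unsigned char *map1,
                              unsigned char nametable,
                              unsigned char scroll_x,
                              unsigned char x, unsigned char y)
{
    unsigned int adjusted_x = (unsigned int)x + scroll_x; /* 0..510 */
    unsigned char index;

    if (adjusted_x > 255) {              /* crossed into the other map */
        nametable = (unsigned char)((nametable + 1) & 1);
    }
    /* column = low byte / 16, row = y / 16, 16 metatiles per row */
    index = (unsigned char)(((adjusted_x & 0xff) >> 4) + (y & 0xf0));
    return nametable ? map1[index] : map0[index];
}
```

Passing the point (X1 + 3, Y1 + 16) checks the bottom left corner, and (X1 + 12, Y1 + 16) checks the bottom right.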

The next thing I’m going to work on is smooth movement and smooth jumping. We need to use a lot of variables. 2 for X position, 2 for Y (the high byte = the screen position). 2 for X speed, 2 for Y speed. 1 for X acceleration. You could use constant acceleration, but I want deceleration to be a bit faster. And, a max speed constant or two….

Actually, I punted on this one, and I just made 1 variable for each of those things, and I’m using the high nibble (upper 4 bits) of Xspeed as the speed. I had planned to do it like described above, but this seemed easier in my head. This is what I ended up doing…

Horiz_scroll += (X_speed >> 4);
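
To make the high-nibble idea concrete, here is a small sketch that can be compiled on a PC. The constants ACCEL and MAX_SPEED and both function names are made-up values for illustration, not taken from the source code.

```c
/* Sketch of 4.4 fixed-point speed: the high nibble of x_speed is whole
   pixels per frame, the low nibble is a fraction of a pixel.
   ACCEL and MAX_SPEED are assumed tuning values. */
#define ACCEL      0x04  /* assumed: speed grows by 1/4 pixel each frame */
#define MAX_SPEED  0x20  /* assumed: top speed of 2 pixels per frame */

/* Called once per frame while the button is held. */
unsigned char accelerate(unsigned char x_speed)
{
    x_speed = (unsigned char)(x_speed + ACCEL);
    if (x_speed > MAX_SPEED) x_speed = MAX_SPEED;
    return x_speed;
}

/* Whole pixels to add to the scroll this frame. */
unsigned char pixels_this_frame(unsigned char x_speed)
{
    return (unsigned char)(x_speed >> 4);
}
```

Note that the fractional nibble is simply dropped each frame rather than accumulated, so the speed steps from 0 to 1 to 2 pixels per frame. That is the trade-off of using a single byte.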

We could also add a portion of the screen (in the middle), where L and R don’t move the scroll, but rather the X position of the metasprite. I didn’t do that here, because I want to keep it very simple.

I may come back and revise this code, to have 2 bytes for Xspeed and 2 for Xposition, etc. It would produce smoother movement.

Here’s the link…




11. Scrolling

The $2005 register controls the bg scrolling. The first write sets the horizontal scroll and the second write sets the vertical scroll. Writes to $2005 flip-flop back and forth: x, y, x, y. Reading from $2002 will reset the flip-flop back to x if you’re not sure where you’re at.

99% of NES games only have enough PPU memory to fill 2 screens (nametables). There are addresses for 4, so 2 will be mirrors of the other 2. For emulated games, we have to describe the cart layout in the iNES header. Bytes 6-7 describe the mapper. The low bit of byte 6 describes the mirroring. 0 = horizontal mirroring = scrolling up and down. 1 = vertical mirroring = scrolling right and left.
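
As a quick illustration of those header bytes, here is a sketch that reads the mirroring bit and the mapper number from a 16-byte iNES header. The function names are invented for this example.

```c
/* Sketch: decode two fields of a 16-byte iNES header.
   Function names are illustrative only. */

/* Returns 1 for vertical mirroring (left/right scrolling),
   0 for horizontal mirroring (up/down scrolling). */
int ines_is_vertical_mirroring(const unsigned char header[16])
{
    return header[6] & 0x01;
}

/* The low nibble of the mapper number sits in the high nibble of byte 6,
   and the high nibble of the mapper number in the high nibble of byte 7. */
int ines_mapper_number(const unsigned char header[16])
{
    return (header[6] >> 4) | (header[7] & 0xf0);
}
```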

Gauntlet is special, it has 4 screen scrolling. That required an extra 2k RAM chip on the cartridge. MMC3 games can switch mirroring from horizontal to vertical. But, most games only have one mirroring mode, and most have only 2 nametables to use.

In this first example, it will be set up with vertical mirroring (Note the startup code reset.s sets the mirroring to vertical in the header). Moving the U/D/L/R on the controller will scroll the background around. The red letters are the H scroll value and the V scroll value. (I had to make them sprites so they wouldn’t move with the background.) I highly recommend using FCEUX and having the Nametable Viewer on while moving around.

It looks like this…

Nametable #0 // Nametable #1

Nametable #0 // Nametable #1


As soon as you cross over above 255 (ff) on the H scroll, I have it switch the Nametable assigned to the $2000 register (PPU_CTRL). At that point you can continue scrolling right another 255.

My tool chain here was: make letters in Photoshop, index to 4 colors, cut and paste into YY-CHR, save. Open chr with NES Screen Tool. Draw the backgrounds. Save as .rle (compressed, as a c header file). Include them in the C code, and load them at the start. Movement now is changing the scroll position (rather than moving the sprites around).

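void move_logic(void) {
 if ((joypad1 & RIGHT) != 0){
   state = Going_Right;
   ++Horiz_scroll;
   if (Horiz_scroll == 0) // rolled over from 0xff, crossed into the next nametable
      ++Nametable;
 }
 if ((joypad1 & LEFT) != 0){
   state = Going_Left;
   --Horiz_scroll;
   if (Horiz_scroll == 0xff) // rolled under from 0, crossed into the previous nametable
      --Nametable;
 }
 Nametable = Nametable & 1; // keep it 1 or 0
 if ((joypad1 & DOWN) != 0){
   state = Going_Down;
   ++Vert_scroll;
   if (Vert_scroll == 0xf0)
      Vert_scroll = 0;
 }
 if ((joypad1 & UP) != 0){
   state = Going_Up;
   --Vert_scroll;
   if (Vert_scroll == 0xff)
      Vert_scroll = 0xef;
 }
}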

And, here is how my ‘every_frame’ function looks…

void every_frame(void) {
 OAM_DMA = 2; // push all the sprite data from the ram at 200-2ff to the sprite memory
 PPU_CTRL = (0x90 + Nametable); // NMI on, plus which nametable to show
 PPU_MASK = 0x1e; // screen is on
 SCROLL = Horiz_scroll;
 SCROLL = Vert_scroll;  // setting the new scroll position
}


In this second example, it is set up for horizontal mirroring. (Note the startup code reset.s sets the mirroring to horizontal in the header) Now it looks like this…

Nametable #0 // Nametable #0

Nametable #2 // Nametable #2

You might notice the max V scroll is $ef. The screen is only 240 pixels high, so I have it jump from ef (239) to zero in the next move.

The only difference in the code is how the nametable is adjusted to write to the $2000 (PPU_CTRL) register. Rather than flopping between 0-1, it flops 0 or 2…

PPU_CTRL = (0x90 + (Nametable << 1));
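
The two cases can be folded into one helper. This is only a sketch; the function name and the horizontal_mirroring flag are assumptions, and the examples above simply hard-code one form or the other.

```c
/* Sketch: build the value written to PPU_CTRL ($2000) in both examples.
   0x90 = NMI enabled (bit 7) + background tiles from pattern table 1 (bit 4).
   Bits 0-1 select the base nametable. */
unsigned char ppu_ctrl_value(unsigned char nametable, int horizontal_mirroring)
{
    nametable &= 1;                 /* only two real nametables exist */
    if (horizontal_mirroring) {
        nametable <<= 1;            /* pick #0 or #2 (stacked vertically) */
    }
    return (unsigned char)(0x90 + nametable);
}
```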

Here’s the source code…



10. Background Collisions

A bit more challenging than Sprite vs Sprite collisions. Since most games use 16×16 pixel metatiles and 16×16 pixel metasprites, we will use this as the example. Currently, I have it set to move only 1 pixel R or L or U or D per frame. Every time we move, we need to check for collision. I just check every frame.

The standard, apparently, is to allow the move, check collisions, then eject if collision.

First we’re going to allow the X movement, check collision, eject in X direction. Then we’re going to allow the Y movement and eject in Y if collision.

Another thing… It’s a pain to try to read from the PPU. You’d have to calculate the correct nametable address, AND fetch it from the PPU (during V-blank) in a timely manner in which you can then have game logic around movements and still have enough V-blank time to update Sprites…so let’s drop that idea.

What we are going to do is have a map of the background metatiles that we can nicely fit in 1 page of RAM. This can index to a collision array to indicate which tiles are solid. In my example, metatile 0 = not solid and metatile 1 = solid, so I don’t need to index to a solidity array. If we had more metatiles, we would need a solidity array.
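
For the case with more metatile types, a solidity array could look something like the sketch below. The table contents, the array name IS_SOLID, and the function name are all invented for illustration, and the map is assumed to contain only the values 0-3.

```c
/* Sketch: the map stores a metatile number, and a second table says
   which metatile types are solid. Table contents are made up. */
static const unsigned char IS_SOLID[4] = {
    0, /* metatile 0: empty      */
    1, /* metatile 1: brick      */
    0, /* metatile 2: decoration */
    1  /* metatile 3: ground     */
};

/* Returns 1 if the pixel (x, y) lies inside a solid metatile.
   c_map is a 16x16 map of metatile numbers (one 256-byte page). */
unsigned char point_is_solid(const unsigned char *c_map,
                             unsigned char x, unsigned char y)
{
    /* column = x / 16, row = y / 16, 16 metatiles per row */
    unsigned char index = (unsigned char)((x >> 4) + (y & 0xf0));
    return IS_SOLID[c_map[index]];
}
```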

[On a side note, I’m going to load this map into the RAM, but I suppose this was an unnecessary step. I could have just assigned a pointer to the ROM location of the collision map, and indexed using that pointer.]

How do we make a collision map? Sounds time consuming. Well, I used Tiled to make the collision map. First I constructed metatiles in NES Screen Tool (not too hard, we only have 2 here). And, I took a screenshot and cropped it in Photoshop to a 256×256 image. Then I brought that into Tiled, with 16×16 map of 16×16 tiles. Then I redrew the background using metatiles, and then Exported .csv format files (1 file per background). I had to edit the .csv file slightly so it looks like this…

const unsigned char c2[]={

Then I included it in the C code, and referenced its address in an array of addresses. (If any of this is confusing, I included all the files with the source code).


Whenever we press ‘start’ to load a new background, it also loads the collision map into $300-3ff of the RAM. Note, I had to edit the .cfg file to get this to work like I wanted. Specifically, I added a “MAP” segment and defined it to be at $300 and $100 in size. In the C code, I just defined an empty array to be there…

#pragma bss-name(push, "MAP")
unsigned char C_MAP[256];

[Again, probably not necessary, it COULD have gone anywhere, but I like knowing that it’s exactly at $300. When I open the hex editor debugging tool of FCEUX, and scroll to $300, it’s nice to know that I’m looking at the collision map.]

and did this when Start is pressed, to transfer the collision map from the ROM to the RAM at $300.

p_C_MAP = All_Collision_Maps[which_BGD]; // pointer to a collision map
for (index = 0; index < 240; ++index){
 C_MAP[index] = p_C_MAP[index]; // transfer to the RAM
}

When I looked back at this later, I thought, “Why didn’t I use memcpy?” So I replaced it with memcpy, and it turns out this innocent-looking ‘for loop’ takes 42688 cycles to complete… 9 times longer than memcpy. Here’s how it should look…

#include <string.h> // declares memcpy
p_C_MAP = All_Collision_Maps[which_BGD]; // pointer to a collision map
memcpy (C_MAP, p_C_MAP, 240);

Then I wondered if memcpy is really the fastest. So, I rewrote this transfer in ASM, just to see if it could be done more efficiently. (That source code is below, in the lesson8c.zip)

The ASM version was only 4% faster than memcpy, so I guess it’s safe to use memcpy to transfer bytes. They all appear to work exactly the same, so it’s a bit of a moot point. Later on, when our code is much longer, and we’re worried about fitting all the logic in 1 frame, then we can worry about which code is faster.

And we do a little math to see if we are colliding with the background. First we do moves R or L, calculate the right and left side of the sprite (ours is narrow), and then check (if right) the top right and bottom right corners, to see if they are inside a solid tile….

if ((joypad1 & RIGHT) != 0){
//top right
  corner = ((X1_Right_Side & 0xf0) >> 4) + (Y1_Top & 0xf0);
  if (C_MAP[corner] > 0)
   X1 = (X1 & 0xf0) + 3; //if collision, realign
//bottom right
  corner = ((X1_Right_Side & 0xf0) >> 4) + (Y1_Bottom & 0xf0);
  if (C_MAP[corner] > 0)
   X1 = (X1 & 0xf0) + 3; //if collision, realign
}

Why +3? If you remember from before, our sprite is blank on the 3 pixels on the left (and right).

If collision left, we adjust right…

X1 = (X1 & 0xf0) + 12;

Then I move Up or Down, and then check the top or bottom corners for collision. Our sprite is a full 16 pixels tall…

If collision down, adjustment upward…

Y1 = (Y1 & 0xf0);

If collision above, adjust downward…

Y1 = (Y1 & 0xf0) + 7;

[I just made all these up quickly. They seem to work. I suppose, if our sprites are moving faster than 1 pixel per frame, this logic might not work, and would have to be rewritten.]

Again, sprites always show up on the screen 1 pixel lower than you would expect. You could make an adjustment before updating the Y coordinates to the sprite RAM (OAM). On platform games, I think it looks ok. Top-down games might look odd (as though the character is standing on the wall below him.)


Here’s the source code, the 3 versions only differ in that one ‘for loop’ / ‘memcpy’ / asm transfer. I thought it would be less confusing if they were separated.





9. Drawing a full background

Ok, we learned earlier that you can only draw to the screen… 1. During V-blank or 2. While the screen is off. With option 1, you can only draw 2 rows per frame, and with 30 rows to draw it will take 15-16 frames to do. With option 2 you get 1-2 frames of black screen (or whatever the default BG color is) and the new background is fully shown.

I prefer the second option, as the code is easier. I’ve used NES Screen Tool to create the backgrounds quickly, and used the Save Nametable as .RLE to compress it. I will then be using the UNRLE asm code to decompress it. (another thing that you don’t need to know how it works, it works, forget about it and move on).

I’ve set it to load a new background every time Start is pressed. I have it also checking against previous frame presses, so that we don’t get multiples in a row.

This code…

  1. Gets input every frame
  2. If Current press = Start, and last frame press != Start…
  3. Set flag to turn off screen
  4. Next V-blank, turn off screen
  5. Draw a new screen
  6. Wait till V-blank, turn on screen
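
Step 2 in the list above (pressed now, but not pressed last frame) can be sketched as a small function. The function name and the joypad1_old parameter are assumptions for illustration; START uses the same bit as the joypad1 layout in this series.

```c
/* Sketch: detect a button that went down on this frame only. */
#define START 0x10

/* Returns 1 only on the frame where Start changes from released to
   pressed, so holding the button does not retrigger. */
unsigned char start_just_pressed(unsigned char joypad1,
                                 unsigned char joypad1_old)
{
    return (joypad1 & START) != 0 && (joypad1_old & START) == 0;
}
```

Each frame, the current buttons would be copied into the old-buttons variable before reading the controller again.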


Draw the new background…

void Draw_Background(void) {
 PPU_ADDRESS = 0x20; //address $2000 = start of nametable #0
 PPU_ADDRESS = 0x00;
 UnRLE(All_Backgrounds[which_BGD]); //uncompresses our data
 Wait_Vblank(); //don't turn on screen until in v-blank
 All_On(); //turn the screen back on
 ++which_BGD; //next background for the next press
 if (which_BGD == 4) //shuffles between 0-3
   which_BGD = 0;
}

const unsigned char * const All_Backgrounds[]={n1,n2,n3,n4};
// pointers to the addresses of each background



Here’s the link to the source code. Now ‘Start’ will change the background…



8. Sprite Collisions

The easiest kind of collision detection is sprite vs. sprite. In this case I will be doing metasprites of 16×16 pixels. That seems to be a standard NES size.

“Are 2 things touching?” is a process of comparisons. Is A (left side) less than B (right side)? Is A (right side) more than B (left side)? Is A (top) less than B (bottom)? Is A (bottom) more than B (top)?

If all of these are true, then object A is touching object B. I prefer to define sprite positions by the top, left pixel of the top, left sprite. All other sides are offsets from that position. In my example it looks like this…

A_left_side_X = A_X + 3;
A_right_side_X = A_X + 12;
A_top_Y = A_Y;
A_bottom_Y = A_Y + 15;
//(etc for B)
if (A_left_side_X <= B_right_side_X && 
   A_right_side_X >= B_left_side_X && 
   A_top_Y <= B_bottom_Y && A_bottom_Y >= B_top_Y){do something}

Why is the left side X + 3? Because the left 3 pixels of our Sprite are blank…


However, because we are always using unsigned char variables, things > 255 will roll over to 0, making problems at the edges (when you walk off the edge of the screen). Here’s a little hack to fix it, so that if A_X = 250, then + 12 won’t = 6 (absurd) but 255 (wrong, but still more reasonable). It seems to work.

I suppose we could have made these variables 16-bit (int), but I think that would slow down our collision code (speed is especially important when we’re checking collisions between 20+ objects on the screen). Or, another option would be to make sure sprites never go off the edge of the screen.

 A_left_side_X = A_X + 3;
 if (A_left_side_X < A_X) A_left_side_X = 255; //if overflow, set to max high
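
The same overflow check can be wrapped in a helper so every side uses it. This is just a sketch, and the name add_clamped is made up.

```c
/* Sketch: add an offset to an 8-bit coordinate, clamping at 255
   instead of wrapping around to a small number. */
unsigned char add_clamped(unsigned char base, unsigned char offset)
{
    unsigned char sum = (unsigned char)(base + offset);
    if (sum < base) sum = 255;  /* wrapped past 255, so clamp to max */
    return sum;
}
```

With this, the right side would be add_clamped(A_X, 12) and the bottom would be add_clamped(A_Y, 15).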

In our next example object B is auto-moved (with previously used code), and controller moves A. Every frame they touch, our scoreboard will go up by 1. Here’s the code that adjusts the scoreboard (score5 = ones digit)…

 if (score5 > 9){
  ++score4; // carry into the tens digit
  score5 = 0;
 }
 if (score4 > 9){
  ++score3;
  score4 = 0;
 }
 if (score3 > 9){
  ++score2;
  score3 = 0;
 }
 if (score2 > 9){
  ++score1;
  score2 = 0;
 }
 if (score1 > 9){ //if overflow, rolls back to 0
  score1 = 0;
  score2 = 0;
  score3 = 0;
  score4 = 0;
  score5 = 0;
 }
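
For comparison, here is a sketch of the same carry idea with the five digits stored in an array instead of five separate variables. The array layout and the function name are assumptions, not how the source code does it.

```c
/* Sketch: add 1 to a 5-digit decimal score stored one digit per byte.
   digits[0] is the leftmost digit, digits[4] is the ones digit. */
void add_one_point(unsigned char digits[5])
{
    int i;
    for (i = 4; i >= 0; --i) {
        ++digits[i];
        if (digits[i] <= 9) return;  /* no carry needed, done */
        digits[i] = 0;               /* carry into the next digit left */
    }
    /* falling out of the loop means 99999 rolled over to 00000 */
}
```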


And, every time the score is changed, it sets a flag to do this at the beginning of the next V-blank, to actually draw those number tiles onto the screen. We absolutely must only write to the PPU during V-blank, that’s why we wait till the beginning of the next V-blank. Remember from before, that we have NMI’s turned ‘on’ to know exactly when the V-blank has begun.

void PPU_Update (void) {
 PPU_ADDRESS = 0x20;
 PPU_ADDRESS = 0x8c;
 PPU_DATA = score1+1; //I made tile 0 = blank, tile 1 = "0", tile 2 = "1", etc
 PPU_DATA = score2+1; //so we have to add 1 to the digit to get the corresponding tile
 PPU_DATA = score3+1;
 PPU_DATA = score4+1;
 PPU_DATA = score5+1;
}
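
The address $208c used above is row 4, column 12 of nametable #0. Here is a sketch of how such an address can be worked out; the function name is made up for illustration.

```c
/* Sketch: PPU address of a tile position in a nametable.
   Nametable #0 starts at $2000, each nametable is $400 bytes apart,
   and each row holds 32 tiles. */
unsigned int nametable_address(unsigned char nametable,
                               unsigned char row, unsigned char column)
{
    return 0x2000u + nametable * 0x400u + row * 32u + column;
}
```

The high byte of the result is the first write to PPU_ADDRESS and the low byte is the second write.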


Here’s the link…



7. Input

Next we will cover button presses. Not too complicated… Joypad 1 is at $4016 and Joypad 2 is at $4017. I read them once a frame, right after all the PPU updates, the sprite update, and setting the scroll.

I always have 2 variables for each, the current buttons, and last frame’s button presses. First, write 1 to $4016, then write 0 to $4016. Now you can read button presses (one at a time = 8 reads per Joypad) and shift them into the new button variables. The order of the bits read is A, B, Select, Start, Up, Down, Left, Right.
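
The shifting step can be modeled on a PC with a short sketch. This is not the ASM routine from the source code; the function name and the reads array (standing in for eight reads of $4016) are assumptions for illustration.

```c
/* Sketch (PC-side model): pack eight serial controller reads into one
   byte. reads[] holds bit 0 of each read, in hardware order:
   A, B, Select, Start, Up, Down, Left, Right. */
unsigned char pack_buttons(const unsigned char reads[8])
{
    unsigned char buttons = 0;
    unsigned char i;
    for (i = 0; i < 8; ++i) {
        /* shift the earlier buttons left, put the new bit at the bottom */
        buttons = (unsigned char)((buttons << 1) | (reads[i] & 1));
    }
    return buttons;  /* A ends up in bit 7, Right in bit 0 */
}
```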

We will use the metasprite from the last time and let the user move him around. I thought long and hard, and decided to write this bit in ASM. You don’t need to understand it. It works. You can move on and worry about other things. 🙂

You can call an ASM function from the C code. You just need a prototype, and the linker will put them together when you build.

void Get_Input(void);

Of course you don’t have to use (void), as I tend to do. If you pass a variable, I highly recommend you use __fastcall__ which will store 1 passed variable in the A and X registers, rather than the C Stack.

If a function is declared as __fastcall__ (or fastcall), the last (rightmost) parameter is not passed on the stack, but passed in the primary register to the called function. This is A in case of an eight bit value, A/X in case of a 16 bit value, and A/X/sreg in case of a 32 bit value.


Ok, back to Get_Input();

Just call this function once per frame, and it will store the button presses for you.

Here’s the link to the source code. Now you can move the little guy with a controller. You’ll see I added a file called asm4c.s which has our asm function. Also, the compile.bat file has been changed to include/link asm4c into our final output file. Also, I replaced the auto-move code with this function…

void move_logic(void) {
 if ((joypad1 & RIGHT) != 0){
  state = Going_Right;
  ++X1;
 }
 if ((joypad1 & LEFT) != 0){
  state = Going_Left;
  --X1;
 }
 if ((joypad1 & DOWN) != 0){
  state = Going_Down;
  ++Y1;
 }
 if ((joypad1 & UP) != 0){
  state = Going_Up;
  --Y1;
 }
}

I had to define “RIGHT” “LEFT” “UP” “DOWN” to how they would appear in the joypad1 variable. Each bit is a button. It will either be a 0 or a 1. We are masking just that bit with the ‘&’ — which will be != 0 if that specific button was pressed.

#define RIGHT  0x01
#define LEFT  0x02
#define DOWN  0x04
#define UP   0x08
#define START  0x10
#define SELECT  0x20
#define B_BUTTON 0x40
#define A_BUTTON 0x80

Here’s the link to the source code…



6. Sprites

A sprite is an 8×8 object that can move freely on the screen. (Technically, you can set it to 8×16 also). Nearly all characters are constructed out of sprites. There are several exceptions…some end game bosses are actually background tiles, and Little Mac in Punch-Out (also, the hero of Kung Fu) is actually made of background tiles. (The screen horizontal scroll is shifted mid-frame to give the appearance of movement).

I know what you’re thinking, 8×8 is so small. That’s why we connect many sprites together into a Metasprite. A 2×2 block of sprites = small Mario. A 2×4 block of sprites = big Mario.


There are a few limitations. The NES has only 64 sprites. Also, you can only put 8 sprites on the same scanline (left to right). Any more and the lower priority sprites won’t show up on that scanline. Maybe you’ve seen games with flickering sprites, but actually the NES doesn’t naturally flicker, that’s just a way to keep them visible on the screen when there are too many. The programmer does this by shuffling sprite priorities each frame. A flickering sprite is better than an invisible sprite.

What’s a sprite priority? Sprite 0 uses OAM addresses 0-3. Sprite 1 uses OAM addresses 4-7. Sprite 2 uses OAM addresses 8-B. Etc. Sprite 0 will always have priority over the other sprites. Sprite 1 will have the next highest priority, then Sprite 2, etc.

That means that if they occupy the same space, the lower numbered (ie. higher priority) sprites will show up on top. If there are more than 8 sprites on a scanline, the 8 lowest numbered (ie. higher priority) sprites [on that scanline] will show up and the others will disappear.
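
Here is a simplified model of that 8-per-scanline rule, meant to be compiled on a PC. It assumes 8×8 sprites and ignores everything else the PPU does; the function name and the visible array are invented for this example.

```c
/* Sketch (simplified model): which sprites are drawn on one scanline.
   oam is the 256-byte sprite table (4 bytes per sprite, Y first).
   visible[n] is set to 1 if sprite n is drawn on that scanline.
   Returns the number of sprites drawn (never more than 8). */
unsigned char sprites_on_scanline(const unsigned char oam[256],
                                  unsigned char scanline,
                                  unsigned char visible[64])
{
    unsigned char n, count = 0;
    for (n = 0; n < 64; ++n) {
        /* an 8x8 sprite with OAM Y = y covers scanlines y+1 .. y+8 */
        unsigned int top = (unsigned int)oam[n * 4] + 1;
        visible[n] = 0;
        if (scanline >= top && scanline < top + 8 && count < 8) {
            visible[n] = 1;   /* lower-numbered sprites get slots first */
            ++count;
        }
    }
    return count;
}
```

Shuffling which sprite goes in which OAM slot each frame changes who wins those 8 slots, which is where the flicker comes from.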

How do we get sprites to show up on screen? In the start-up code, I set the y coordinates of all sprites to $f8 (just off the bottom of the screen). [I think anything between $f0 and $ff would work]. Each sprite uses 4 bytes of memory (some docs call this OAM)…

1. Y coordinate

2. Tile number

3. Attributes *

4. X coordinate

*attributes = flipping, which palette, whether it will appear above or below the background. See the wiki…
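
As a rough sketch of how the attribute byte is put together: bits 0-1 pick the palette, bit 5 puts the sprite behind the background, bit 6 flips it horizontally, and bit 7 flips it vertically. The macro and function names below are made up for illustration.

```c
/* Sketch: build a sprite attribute byte from a palette number and flags. */
#define ATTR_BEHIND_BG 0x20  /* bit 5: draw behind the background */
#define ATTR_FLIP_H    0x40  /* bit 6: flip horizontally */
#define ATTR_FLIP_V    0x80  /* bit 7: flip vertically */

unsigned char make_attribute(unsigned char palette, unsigned char flags)
{
    /* only palettes 0-3 exist, so keep the low 2 bits */
    return (unsigned char)((palette & 0x03) | flags);
}
```

For example, a sprite using palette 1 and facing the other way would use make_attribute(1, ATTR_FLIP_H).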



Note: this picture is using 0x700 as the Sprite page of RAM, whereas the example below is using 0x200 as the Sprite page of RAM.

So, to make a sprite show up, you assign it coordinates, a tile, and pick a palette. You could do this by writing the Sprite Address to $2003 and then writing the Sprite data to $2004. This is rarely done, as there is a much more efficient Memory Copy (DMA) option for sprites. So, we will use $200-2ff of the RAM to store the values, then wait for V-Blank, then write 0 to $2003, then write 2 to $4014 (DMA from RAM 200)…and all the sprite data will be transferred to the sprite memory (OAM).

In our example here, we will make a metasprite of 4 sprites. And, some code to auto move it.

(Note, sprites appear on screen 1 pixel lower than you would expect, so in order to line up to the background, you will have to subtract 1 from its y coordinate. If you look closely at the Mario graphic above, you will see his feet are 1 pixel into the floor.)

#pragma bss-name(push, "ZEROPAGE")

unsigned char NMI_flag;
unsigned char Frame_Count;
unsigned char index;
unsigned char index4;
unsigned char X1;
unsigned char Y1;
unsigned char move;
unsigned char move_count;
#pragma bss-name(push, "OAM")
unsigned char SPRITES[256];
// I've mapped OAM to ram addresses 200-2ff, in the cfg file
const unsigned char PALETTE[]={
0x19, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,
0x19, 0x37, 0x24, 1,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0};
const unsigned char MetaSprite_Y[] = {0, 0, 8, 8}; // relative y coordinates
const unsigned char MetaSprite_Tile[] = {0, 1, 0x10, 0x11}; // tile numbers
const unsigned char MetaSprite_Attrib[] = {0, 0, 0, 0}; // attributes = flipping, palette
const unsigned char MetaSprite_X[] = {0, 8, 0, 8}; // relative x coordinates
// we are using 4 sprites, each one has a relative position from the top left sprite0
void every_frame(void) {
 OAM_DMA = 2; // push all the sprite data from the ram at 200-2ff to the sprite memory
 PPU_CTRL = 0x90; // NMIs on
 PPU_MASK = 0x1e; // screen is on
 SCROLL = 0;
 SCROLL = 0;  // just double checking that the scroll is set to 0
}
// this automates changes to the sprites, like their position
void update_Sprites (void) {
 index4 = 0;
 for (index = 0; index < 4; ++index ){
  SPRITES[index4] = MetaSprite_Y[index] + Y1; // relative y + master y
  ++index4;
  SPRITES[index4] = MetaSprite_Tile[index]; // tile numbers
  ++index4;
  SPRITES[index4] = MetaSprite_Attrib[index]; // attributes, all zero here
  ++index4;
  SPRITES[index4] = MetaSprite_X[index] + X1; // relative x + master x
  ++index4;
 }
} // this is not very efficient, but for teaching purposes, this is clearer

void main (void) {
 All_Off(); // turn off screen
 X1 = 0x7f; // set the starting position of the sprites
 Y1 = 0x77; // near the middle of the screen
 All_On(); // turn on screen
 while (1){ // infinite loop
  while (NMI_flag == 0); // wait till NMI
  NMI_flag = 0;
  every_frame(); // should be done first every v-blank
  if (move == 0) ++X1;
  if (move == 1) ++Y1;
  if (move == 2) --X1;
  if (move == 3) --Y1;
  ++move_count;
  if (move_count == 20){ // does a move for 20 frames, then switches moves
   move_count = 0;
   ++move;
  }
  if (move == 4) move = 0; // keeps the moves between 0-3
  update_Sprites(); // copy the new position into the sprite page
 }
}

And, I also made a slightly better version, where the sprite changes tiles for each move, so he appears to be turning. Relevant changes were…

const unsigned char MetaSprite_Tile[] = { //more tiles
 2, 3, 0x12, 0x13, // right
 0, 1, 0x10, 0x11, // down
 6, 7, 0x16, 0x17, // left
 4, 5, 0x14, 0x15}; // up
void update_Sprites (void) {
 move4 = move << 2; // same as move * 4
 index4 = 0;
 for (index = 0; index < 4; ++index ){
  SPRITES[index4] = MetaSprite_Y[index] + Y1; // relative y + master y
  ++index4;
  SPRITES[index4] = MetaSprite_Tile[index + move4]; // tile numbers
  ++index4;
  SPRITES[index4] = MetaSprite_Attrib[index]; // attributes, all zero here
  ++index4;
  SPRITES[index4] = MetaSprite_X[index] + X1; // relative x + master x
  ++index4;
 }
}


And here is the link to the source code…