23. Advanced Mapper – MMC3

MMC3 is the 2nd most popular mapper. Many of the most popular games used it, including several of the Mega Man games and Super Mario Bros. 2 and 3. This photo is borrowed from BootGod. The little chip at the top says MMC3B.


MMC3 PRG banks are $2000 in size. Max size is 512k (64 of those PRG banks).

MMC3 CHR banks are $400 in size. Max size is 256k (256 of those CHR banks). This extra small size makes it possible to change 1/4 of a tileset.

Only one of the tilesets can do this, though. The other can only change 1/2 at a time. Which one is up to you, but I prefer the BG to have the smaller banks, so that I can more easily animate the background without wasting too much memory.
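To double-check the bank arithmetic, here is a tiny sketch in plain C. It is meant for a PC compiler, not cc65, and the names MMC3_PRG_BANK_SIZE, MMC3_CHR_BANK_SIZE and bank_count() are made up for this illustration. They are not part of the example code.

```c
#include <stdint.h>

/* Bank sizes quoted in the text. */
#define MMC3_PRG_BANK_SIZE 0x2000u /* 8k  */
#define MMC3_CHR_BANK_SIZE 0x0400u /* 1k  */

/* How many banks a ROM of rom_bytes bytes splits into. */
uint32_t bank_count(uint32_t rom_bytes, uint32_t bank_size)
{
    return rom_bytes / bank_size;
}
```

With these numbers, bank_count(512 * 1024, MMC3_PRG_BANK_SIZE) gives 64 and bank_count(256 * 1024, MMC3_CHR_BANK_SIZE) gives 256. A 128k PRG ROM works out to 16 banks of $2000.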

MMC3 can have WRAM, sometimes with a battery for saving the game.

MMC3 can change between Horizontal and Vertical mirroring, freely.

Best of all, MMC3 has a scanline counter hardware IRQ. That means you can time mid-frame changes, such as scrolling changes, very precisely. You can do parallax scrolling, like the train stage in Ninja Gaiden 2.

Since IRQ code needs to be programmed in Assembly, instead of having you, the casual C user, try to write your own… I made an automated system, that can do most anything you would need it to do mid-screen. (I will discuss a little later)

I don’t want to get into all the details of how the hardware works, which you can find here, if you want to read them.


There are multiple screen splits (timed by IRQ) and I’m animating the gears by swapping CHR banks. Also, the music is in a swappable bank.
Let’s go over everything that had to be changed…


I made several $2000 byte swappable banks. The C code and all the libraries need to go in a fixed bank. I decided to keep $a000-ffff fixed to the last 3 banks. That means all the swappable banks will go in the $8000 slot… which is why we have all those banks start at $8000. Technically, MMC3 has the ability to swap the $a000-bfff bank, but I’m adapting the code from the MMC1 example, which only expects 1 swappable region, so let’s pretend that you can’t change $a000.

The startup code (reset code) needs to be in the very last $e000-ffff area, because this is the only bank that, by default, we know the position of. Vectors also go here.

Again, this mapper can have WRAM at $6000, so I made that segment to test it. You would need to change the BSS name to XRAM (what I called this segment) to declare a variable or array here.

In the Assembly code, just inserting a .segment like .segment “CODE” makes everything below that point map into that bank.

In the C code, you need to use a pragma, like
#pragma bss-name(push, "ZEROPAGE")
#pragma bss-name(pop)


#pragma rodata-name ("BANK0")
#pragma code-name ("BANK0")

to direct the various types of things into the correct bank.

At the bottom of MMC3_128_128.cfg, it defines some symbols: the mapper number (NES_MAPPER=4) and the size of each ROM chip (NES_PRG_BANKS=8). MMC3 can be expanded to 512k PRG and 256k CHR, if you need.



This is the reset code and where most .s files are included. For example, mmc3_code.asm is included here. All in the “STARTUP” segment, so we know it is mapped to $e000-ffff, the default fixed bank.

I inserted a basic MMC3 initialization, which puts all the banks in place (and bank 0 into the $8000 slot). It also, importantly, does a CLI to allow IRQs to work on the CPU chip.

Also, I defined “SOUND_BANK 12”, and put the music data there, in BANK 12. All the music code will swap that bank in place, to play songs.


MMC3 folder


is all the hidden code that you shouldn’t have to worry about. Except if you want to change the A12_INVERT from $80 to 0. This determines which tileset will get the smaller bank $400 size and which will get the larger $800. I have it $80 so that background has the smaller tile banks, so that swapping BG CHR banks (to animate the background) uses less space.

If you prefer to have smaller sprite banks, change this number to 0. This also changes which CHR bank mode you need to change BG vs sprite tiles.

;if invert bit is 0
;mode 0 changes $0000-$07FF
;mode 1 changes $0800-$0FFF

;mode 2 changes $1000-$13FF
;mode 3 changes $1400-$17FF
;mode 4 changes $1800-$1BFF
;mode 5 changes $1C00-$1FFF

;if invert bit is $80
;mode 0 changes $1000-$17FF
;mode 1 changes $1800-$1FFF

;mode 2 changes $0000-$03FF
;mode 3 changes $0400-$07FF
;mode 4 changes $0800-$0BFF
;mode 5 changes $0C00-$0FFF
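
The chart above can also be written as a small lookup function. This is only a sketch in plain C for checking the table on a PC. The chr_range struct and chr_mode_range() are hypothetical names, not functions from the MMC3 folder.

```c
#include <stdint.h>

/* PPU address range affected by one CHR mode. */
typedef struct {
    uint16_t start; /* first PPU address */
    uint16_t end;   /* last PPU address (inclusive) */
} chr_range;

/* invert is 0x00 or 0x80 (the A12_INVERT value); mode is 0..5. */
chr_range chr_mode_range(uint8_t invert, uint8_t mode)
{
    chr_range r;
    uint16_t size;

    if (mode < 2) {
        /* modes 0-1: the two large $800 banks */
        r.start = (uint16_t)(mode * 0x800u);
        size = 0x800u;
    } else {
        /* modes 2-5: the four small $400 banks */
        r.start = (uint16_t)(0x1000u + (mode - 2u) * 0x400u);
        size = 0x400u;
    }
    if (invert & 0x80u) {
        /* invert bit set: the two tilesets trade places */
        r.start = (uint16_t)(r.start ^ 0x1000u);
    }
    r.end = (uint16_t)(r.start + size - 1u);
    return r;
}
```

For example, chr_mode_range(0x80, 5) covers $0C00-$0FFF, which matches the last line of the chart.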


is the bank swapping code. This is the same as the MMC1 example.
You only need the banked_call() function. Anytime you call a function that is in a swappable bank, you use..

banked_call(unsigned char bankId, void (*method)(void))

(which bank is it in, the name of the function)
Don’t nest more than 10 function calls deep this way, or it will crash.


Here is the new stuff that you need to know.

set_prg_8000(unsigned char bank_id);

this changes which bank is mapped to $8000. I prefer that you use the banked_call() function, which automatically does this for you. You could use this to read data from a certain bank.


returns the number of the bank at $8000


don’t use this, for how I have it set up. It would change the bank mapped at $a000. Currently, we have code mapped here, which could crash if it was missing.

set_chr_mode_0(unsigned char chr_id);
set_chr_mode_1(unsigned char chr_id);
set_chr_mode_2(unsigned char chr_id);
set_chr_mode_3(unsigned char chr_id);
set_chr_mode_4(unsigned char chr_id);
set_chr_mode_5(unsigned char chr_id);

these change which parts of the CHR ROM are mapped to the tilesets.
See the chart above.

set_mirroring(unsigned char mirroring)

to change the PPU mirroring layout. Horizontal or Vertical.


this could turn the WRAM on and off. I turned it ON by default.


turns off the irq, and redirects the IRQ SYSTEM to point to 0xff (end).

set_irq_ptr(char * address)

turns on the IRQ SYSTEM, and points it to a char array.



So I should talk about this IRQ stuff. An IRQ is a hardware interrupt.
The MMC3 has a scanline counter. Once it is set, it counts down each line that is drawn. When it reaches zero, it jumps to the IRQ code. The standard use for the scanline IRQ is to make mid-screen scrolling changes, such as parallax scrolling.

The IRQ code would need to be written in Assembly. Instead of expecting you to learn this to make it work, I wrote an automated system that will parse a set of instructions (char array) and execute the necessary code behind the scenes. Here’s how it works.

A value < 0xf0, it’s a scanline count
zero is valid, it triggers an IRQ at the end of the current line

if >= 0xf0…
f0 = 2000 write, next byte is write value
f1 = 2001 write, next byte is write value
f2-f4 unused – future TODO ?
f5 = 2005 write, next byte is H Scroll value
f6 = 2006 write, next 2 bytes are write values

f7 = change CHR mode 0, next byte is write value
f8 = change CHR mode 1, next byte is write value
f9 = change CHR mode 2, next byte is write value
fa = change CHR mode 3, next byte is write value
fb = change CHR mode 4, next byte is write value
fc = change CHR mode 5, next byte is write value

fd = very short wait, no following byte
fe = short wait, next byte is quick loop value
(for fine tuning timing of things)

ff = end of data set
Once it sees a scanline value or an 0xff, it exits the parser.
Small example…
irq_array[0] = 47; // wait 48 lines

irq_array[1] = 0xf5; // 2005, H scroll change
irq_array[2] = 0; // change it to zero
irq_array[3] = 0xff; // end of data

This will cause it to, at the very top of the screen, set the scanline counter for 47. The screen will draw the top 48 lines (it’s usually 1 more than the number) then an IRQ will fire.

It will read the next lines, and change the Horizontal scroll to zero, then it will see the 0xff, and exit.
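
To make the parsing rules concrete, here is a rough PC-side model of one pass of the parser, written in plain C. The real parser is Assembly inside the library and writes to hardware registers. This sketch only logs which commands it saw. The names irq_pass and irq_run_pass() are invented for the illustration, and treating the unused f2-f4 codes as one-value commands is an arbitrary assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_WRITES 16

typedef struct {
    uint8_t cmd[MAX_WRITES];   /* command bytes seen, e.g. 0xf5 */
    uint8_t value[MAX_WRITES]; /* first value byte of each command */
    size_t  count;             /* number of commands logged */
    int     next_scanlines;    /* counter value set on exit, -1 at 0xff */
} irq_pass;

/* Run one pass starting at data[pos].
   Returns the index where the next pass would resume. */
size_t irq_run_pass(const uint8_t *data, size_t pos, irq_pass *out)
{
    out->count = 0;
    out->next_scanlines = -1;
    for (;;) {
        uint8_t b = data[pos++];
        if (b < 0xf0) {
            /* scanline count: set the counter and exit */
            out->next_scanlines = b;
            return pos;
        }
        if (b == 0xff) {
            /* end of data: exit without a new counter, stay on 0xff */
            return pos - 1;
        }
        if (out->count < MAX_WRITES) {
            out->cmd[out->count] = b;
            /* 0xfd has no value byte; the others have at least one */
            out->value[out->count] = (b == 0xfd) ? 0 : data[pos];
            out->count++;
        }
        if (b == 0xf6) {
            pos += 2; /* two value bytes (high, low) */
        } else if (b != 0xfd) {
            pos += 1; /* one value byte (assumed for f2-f4 too) */
        }
    }
}
```

Running it on the small example, the first pass (at the top of the screen) logs nothing and reports a scanline count of 47. The second pass logs one 0xf5 command with value 0 and stops at the 0xff.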

If it saw another scanline value (any # < 0xf0) it would instead set another scanline counter. And so on. Let’s look at the example.


There are 4 IRQ splits here (red lines). I’m doing quite a lot here. This is sort of an extreme example. Let’s go over the array, and see all what’s going on.


Anything put BEFORE the first scanline count will happen in v-blank,  and affect the entire screen.

irq_array[0] = 0xfc; // CHR mode 5, change the 0xc00-0xfff tiles
irq_array[1] = 8; // top of the screen, static value
irq_array[2] = 47; // value < 0xf0 = scanline count, 1 less than #

0xfc, change some tiles, mode 5, use CHR bank 8. Every frame this will always be 8 at the top of the screen. If you look at the video or the ROM, you see the top gear is not spinning. I did this on purpose to show the tile change mid-screen better.

It sees the 47, which is less than 0xf0, so it sets the scanline counter and exits.

At line 48 of the screen, an IRQ fires, the IRQ parser reads the next command.

irq_array[3] = 0xf5; // H scroll change, do first for timing
temp = scroll2 >> 8;
irq_array[4] = temp; // scroll value
irq_array[5] = 0xfc; // CHR mode 5, change the 0xc00-0xfff tiles
irq_array[6] = 8 + char_state; // value = 8,9,10,or 11
irq_array[7] = 29; // scanline count

First, it changes the Horizontal scroll. This code actually takes up an entire scanline, to try to time it near the end of a scanline so it’s not so visible. Note, you can’t change the Vertical scroll this way, only the Horizontal.

Then, it changes the background tiles, and cycles them through 4 options. This causes the gear to spin. The entire screen below this point is affected.

Then, it sets another scanline count and exits. 30 scanlines later another IRQ fires. It then reads this…

irq_array[8] = 0xf5; // H scroll change
temp = scroll3 >> 8;
irq_array[9] = temp; // scroll value
irq_array[10] = 0xf1; // $2001 test changing color emphasis
irq_array[11] = 0xfe; // value COL_EMP_DARK 0xe0 + 0x1e
irq_array[12] = 30; // scanline count

We change the H scroll again. Just to show what ELSE we can do, it’s writing a value to the $2001 register, setting all the color emphasis bits, which darkens the screen. If you move the sprite guy down, you see that he is also affected.

Another scanline count is set, for 30. It takes 31 lines, probably.
irq_array[16] = 0xf5; // H scroll change
irq_array[17] = 0; // to zero, to set the fine X scroll
irq_array[18] = 0xf6; // 2 writes to 2006 shifts screen
irq_array[19] = 0x20; // need 2 values…
irq_array[20] = 0x00; // PPU address $2000 = top of screen
irq_array[21] = 0xff; // end of data

And just to be fancy, I’m showing an example of using the $2006 PPU register mid-screen. But, to make it work, I also needed to set the $2005 register to zero with the 0xf5 command. The 0xf6 command is the only one that is expecting 2 values after it. High byte, then Low byte, of a PPU address.

Writing to $2006 mid-frame causes the scroll to realign to that address. In this example, the address $2000 is the top of the nametable, which means that the top of the nametable is drawn AGAIN, below that point.

Instead of seeing a value < 0xf0, it sees an 0xff, and exits without setting another scanline count. The parser does nothing until the top of the next frame.
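
Adding up the scanline counts from the walkthrough gives a rough idea of where each split lands. This sketch assumes the "one more than the number" rule holds for every split, which the text only says is usually the case. split_lines() is a hypothetical helper, not part of the library.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill lines[i] with the approximate screen line of split i,
   assuming a counter value of N fires after N + 1 lines. */
void split_lines(const uint8_t *counts, size_t n, unsigned *lines)
{
    unsigned line = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        line += (unsigned)counts[i] + 1u;
        lines[i] = line;
    }
}
```

With the three counts shown in the listing (47, 29, 30), this puts the splits at about lines 48, 78 and 109.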


In the code, a double buffering system was added so that you aren’t editing the same array that the system is currently reading. Every irq_array[] reference above has been replaced with double_buffer[]. Then, at the end of the frame logic, it waits until the IRQ system is done, with while(!is_irq_done() ).

is_irq_done() returns zero if it’s not done, and 0xff if it is done. The preceding ! not operator flips the result, so it becomes a “while this function returns zero, do nothing”.

Then, it copies the double_buffer array to the irq_array.
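
The end-of-frame step might look roughly like this. It is a PC-side sketch: is_irq_done_stub() stands in for the library's is_irq_done(), the array size of 32 is a guess, and publish_irq_list() is a made-up name.

```c
#include <stdint.h>
#include <string.h>

#define IRQ_ARRAY_SIZE 32 /* assumed size, pick what your list needs */

uint8_t irq_array[IRQ_ARRAY_SIZE];     /* read by the IRQ system */
uint8_t double_buffer[IRQ_ARRAY_SIZE]; /* written by the game logic */

/* Stand-in for is_irq_done(): returns 0xff when done, 0 when not.
   On the NES the IRQ code updates this; here it is just a variable. */
uint8_t fake_irq_done = 0xff;
uint8_t is_irq_done_stub(void) { return fake_irq_done; }

/* Wait for the IRQ system to finish, then publish the new list. */
void publish_irq_list(void)
{
    while (!is_irq_done_stub()) {
        /* do nothing until this frame's IRQs are finished */
    }
    memcpy(irq_array, double_buffer, sizeof(irq_array));
}
```

The point of the design is that the game logic only ever writes to double_buffer, so the IRQ system never reads a half-edited list.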


Swappable PRG banks

These work the same as the MMC1 code, except that the bank size is $2000.

To make the code / data go in BANK 0, use these directives

#pragma rodata-name ("BANK0")
#pragma code-name ("BANK0")

The function_bank0() function is compiled into bank 0. To call it, we use

banked_call(0, function_bank0);

0 is the bank, function_bank0 is the name of the function.

This jumps to banked_call(), which is in the fixed bank, it automatically swaps bank 0 in place, and then jumps to function_bank0() with a function pointer. Then it swaps the previous bank back in place, and returns.

As you see, this means that I can’t pass arguments to function_bank0().
So, we have to pass arguments via global variables. arg1, arg2, for example. Obviously, this isn’t safe practice. You could immediately pass those values to static local variables, to avoid errors.
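
Here is one way the "copy the globals right away" idea could look. This is a sketch, and add_args_banked() and result are hypothetical names. On the NES you would reach the function through banked_call(). On a PC you can call it directly to try it.

```c
unsigned char arg1;   /* global "argument" slots, as in the text */
unsigned char arg2;
unsigned char result; /* hypothetical global used to hand a value back */

/* Takes no parameters, so it copies the globals into static locals
   as its very first step. */
void add_args_banked(void)
{
    static unsigned char a;
    static unsigned char b;

    a = arg1;
    b = arg2;
    /* From here on only a and b are used, so a later change to
       arg1 or arg2 (say, by a nested call) can't affect this code. */
    result = (unsigned char)(a + b);
}
```

Usage would be: set arg1 and arg2, then call the function, then read result.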

If you call a function in the SAME swapped bank that you are already in, you can safely use regular function calls.

You can call a function from one swapped bank to another swapped bank. But, don’t keep going from one to another swapped bank too much, not more than 10 deep.

Other important notes.

For the IRQ scanline counter to work, the background must use tileset #0 (which is the default) and sprites must use tileset #1, bank_spr(1). And the screen must be on, ppu_on_all().

Here’s the link to the example code.


Future plans… maybe some time, I will make a UxROM (perhaps UNROM 512) code example. This would require VRAM, which would involve compression. Otherwise, the PRG works the same as the MMC1, which we already have working example code, so it wouldn’t be too difficult to convert.

Review – 150 Classic NES Games for GBA


I bought this cart for about $20 US from some store on Amazon. It uses the PocketNES emulator to run NES games on a Game Boy Advance.


Here is a list of games on it…

Adventure Island 1, 2, 3, & 4,
Adventures of Lolo 1, 2 & 3,
Battletoads & Double Dragon
Bionic Commando,
Blades of Steel,
Blaster Master,
Bomberman 1 & 2,
A Boy and His Blob,
Bubble Bobble 1 & 2,
Castlevania 1, 2 & 3,
Chip n Dale Rescue Rangers,
Cobra Triangle,
Contra Force,
Devil World,
Donkey Kong,
Donkey Kong Jr.,
Double Dragon 1, 2 & 3,
Dr. Mario,
Dragon Warrior 1, 2, 3 & 4,
Duck Tales,
Fester’s Quest,
Final Fantasy 1, 2 & 3,
Fire ‘n Ice,
Ganbare Goemon,
Gargoyles Quest 2,
Ghosts ‘n Goblins,
Goonies 2,
Guardian Legend,
Gun Nac,
Holy Diver,
Ice Climber,
Ice Hockey,
Ikari Warriors,
Journey to Silius,
Kickle Cubicle,
Kid Icarus,
Kid Niki,
Kirby’s Adventure,
Kung Fu,
Legend of Zelda 1, & 2,
Legend of Zelda Outlands,
Legendary Wings,
Little Nemo,
Little Samson,
Lode Runner,
Maniac Mansion,
Marble Madness,
Mario Bros.,
Mario Open Golf,
Mega Man 1 thru 6,
Metal Gear,
Mickey Mousecapade,
Micro Machines,
Moon Crystal,
Mr. Gimmick,
Ms. Pac-Man,
New Ghostbusters II,
Ninja Gaiden 1, 2 & 3,
Over Horizon,
Parasol Stars,
Pipe Dream,
Power Blade,
Prince of Persia,
Rad Racer,
RC Pro-Am,
Ring King,
River City Ransom,
Rush’n Attack,
Rygar,
Section Z,
Skate or Die 2,
Snake Rattle n Roll,
Snake’s Revenge,
Solomon’s Key,
Space Invaders,
Star Force,
Star Soldier,
Star Tropics 1 & 2,
Summer Carnival ’92,
Super C,
Super Mario Bros. 1, 2 (USA & Japan) & 3,
Sweet Home,
Tecmo Super Bowl,
TMNT 1, 2 & 3,
To The Earth,
Twinbee 1, 2 & 3,
Vice Project Doom,
Wario’s Woods,
Wizards & Warriors,
Yoshi’s Cookie,

However, Cobra Triangle immediately crashes when the game starts. Klax has glitched graphics. And To The Earth is a zapper game, and PocketNES has no zapper support, so it is also unplayable. 147 playable, out of 150. Here’s Klax…


Some of the games have very minor glitches, usually on the title screen. The other games seem to play ok. The first thing I noticed was that the games are a bit dark and some games with tiny bullets are hard to see. This is with max brightness…


I believe the GBA SP has an internal light source, which would help. I suppose I’m too used to new cell phones, which have very bright screens.

There are several display options…


Scaled BG and Obj (squashed)



Unscaled, you can’t see the whole screen, but L and R buttons shift the screen up and down… which is kind of a pain.



Unscaled Follow. Same as above, but the screen tries to snap to the main character. You still can’t see the floor below.



I think the Scaled version is the most playable, even if everything is squashed. It cuts a few pixels off the top and bottom, so you can’t really see the ENTIRE screen.


Press L + R to get the menu, A to choose, B to cancel.


You can do save states, which really sells this game for me. Not having to replay the entire game every time you play is a huge benefit.

Exit takes you to the start of the game selection menu, with selection at the first game. Restart also takes you to the game selection menu, but selection is at the last game played.

Some games auto save the SRAM. Games like Dragon Warrior. However, there isn’t enough memory on the cartridge to save all 150 games. At some point when I was scrolling through the games it gave me a weird error message of memory full and I was required to delete one of the in-game saves.

The music / sound quality is good. The game selection is good. I wish it had Mike Tyson’s Punch-Out!! but PocketNES can’t emulate it correctly, so they left it out. I read that Castlevania 3 isn’t fully emulated, but it seems to play ok. I don’t see any problems.

Here’s some more options.


Somewhere I read that you should keep Autosave OFF or something bad will happen.

However, you can do this for savestates…

Quick load: R+START loads the last savestate of the current rom.
Quick save: R+SELECT saves to the last savestate of the current rom


Another thing you can do is speed up and slow down the game…

Speed modes: L+START switches between throttled/unthrottled/slomo mode

But, speed up is TOO FAST and slomo is TOO SLOW. I don’t find this feature useful.


Overall Review = A-

Not perfect, but definitely worth the $20.


22. Advanced mapper MMC1

A mapper is some circuitry on the cartridge that allows you to “map” more than 32k of PRG ROM and/or more than 8k of CHR ROM to the NES. By dividing a larger ROM into smaller “banks” and redirecting read/writes to different banks, you can trick the NES into allowing much larger ROMs. MMC1 was the most common mapper.

MMC1 – 681 games
MMC3 – 600 games
UxROM – 270 games
NROM – 248 games
CNROM – 155 games
AxROM – 76 games
*source BootGod


I borrowed this from Kevtris’s website. The smaller chip on the bottom left says “Nintendo MMC1A”.


MMC1 has the ability to change PRG banks and CHR banks. It can have PRG sizes up to 256k and CHR sizes up to 128k. (some rare variants could go up to 512k in PRG, but that won’t be discussed). It can change the mirroring from Horizontal to Vertical to One Screen. Metroid and Kid Icarus were MMC1, and they switch mirroring to scroll in different directions.

The MMC1 boards frequently had WRAM of 8k ($2000) bytes at $6000-7FFF, which could be battery backed to save a game. When I tried to make an NROM demo with WRAM, several emulators (and my PowerPak) decided that the WRAM didn’t exist because no NROM games ever had WRAM. But you wouldn’t have that problem with MMC1.

(In the most common arrangement…) The last PRG bank is fixed to $C000-FFFF and the $8000-BFFF can be mapped to any of the other banks. PRG banks are 16k ($4000) in size. Graphics can be swapped too. You can either change the entire pattern tables (PPU $0-1FFF) or change each separately (PPU $0-FFF and $1000-1FFF). CHR banks are 4k ($1000) in size. One thing you can do with swappable CHR banks is animate the background like Kirby’s Adventure does (by changing CHR banks every few frames).
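
To picture what the mapper is doing in this common arrangement, here is a sketch that translates a CPU address into an offset inside the PRG ROM file. It is plain C for a PC, and mmc1_prg_offset() is a made-up helper. The real work is done by the chip, not by code.

```c
#include <stdint.h>

#define MMC1_PRG_BANK_SIZE 0x4000u /* 16k */

/* Translate a CPU address ($8000-$FFFF) to an offset into PRG ROM,
   with the swappable bank at $8000 and the last bank fixed at $C000. */
uint32_t mmc1_prg_offset(uint16_t cpu_addr, uint8_t swapped_bank,
                         uint8_t total_banks)
{
    uint8_t bank;

    if (cpu_addr >= 0xC000u) {
        bank = (uint8_t)(total_banks - 1u); /* fixed: always the last bank */
    } else {
        bank = swapped_bank;                /* whatever is swapped in */
    }
    return (uint32_t)bank * MMC1_PRG_BANK_SIZE + (cpu_addr & 0x3FFFu);
}
```

With a 128k PRG ROM (8 banks), the reset vector at $FFFC always reads from offset $1FFFC, no matter which bank is swapped in at $8000.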

I chose a 128k PRG ROM and 128k CHR ROM, and have it set to change each tileset separately.

Behind the scenes, the MMC1 mapper has registers at $8000,$A000,$C000, and $E000. It has to write 5 times to each, because it technically can only send 1 bit at a time. The $8000 register is the MMC1 control, the $A000 register changes the first CHR bank (tileset #0), the $C000 register changes the second CHR bank (tileset #1) (does nothing if CHR are in 8k mode), and the $E000 register changes which PRG bank is mapped to $8000-BFFF.
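
The "5 writes, 1 bit at a time" behavior can be modeled like this. It is a simplified toy in plain C: the real chip has a single shift register shared by all four register addresses, plus a reset bit that is left out here. mmc1_port, mmc1_write_bit() and mmc1_write5() are names I made up for the sketch.

```c
#include <stdint.h>

/* Toy model of writing to one MMC1 register. */
typedef struct {
    uint8_t shift;   /* bits collected so far */
    uint8_t writes;  /* writes since the last complete value */
    uint8_t latched; /* last complete 5-bit value */
} mmc1_port;

/* One CPU write: only bit 0 of data is taken. */
void mmc1_write_bit(mmc1_port *p, uint8_t data)
{
    p->shift = (uint8_t)((p->shift >> 1) | ((data & 1u) << 4));
    p->writes++;
    if (p->writes == 5) {
        /* fifth write: the collected value takes effect */
        p->latched = p->shift;
        p->shift = 0;
        p->writes = 0;
    }
}

/* Send a 5-bit value, least significant bit first. */
void mmc1_write5(mmc1_port *p, uint8_t value)
{
    int i;
    for (i = 0; i < 5; i++) {
        mmc1_write_bit(p, value);
        value >>= 1;
    }
}
```

This is why a single bank change costs five store instructions on MMC1, and part of why the banked calls are slow.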

This is all tricky to program, in general, and more so for cc65. It is important to keep the main C code and all libraries in the fixed bank, including the init code (crt0.s) where the reset code and vectors are and neslib.s where the nmi code is. Level data should go in swapped banks. Infrequently used code should go in swapped banks.

Music is special. You would typically reserve an entire bank for music code and data. And all the music functions have to swap the music code/data in place to use it. You will need to explicitly put the music in a certain bank and change the SOUND_BANK definition to match it (in crt0.s).

Most of this new code was written by cppchriscpp with slight modification by me. Here’s the link to Chris’s code…



Things I changed.

I included all the files in the MMC1 folder. The .c and .h file at the top of the main .c file. The .asm files are included near the bottom of crt0.s.

In the header (crt0.s), flags 6 holds the mapper number, 1 (MMC1). The NES_MAPPER symbol is defined in the .cfg file. Flags 8 indicates 1 PRG RAM (WRAM) bank. At the top of crt0.s, SOUND_BANK needs to be correct, with the music put in the corresponding segment.

Also in crt0.s, I added the MMC1 reset code, and include the 2 .asm files in the MMC1 folder. I put the music in BANK 6, and now bank 6 is swapped before the music init code is called. All CHR files are put in the CHARS segment, which is 128k in size (it’s not completely filled).

The neslib.s file in the LIB folder has also been changed, specifically the nmi code and the music functions.

Each segment is defined in the .cfg file… MMC1_128_128.cfg. In the asm files, you just have to put a .segment “BANK4” to put everything below that in BANK 4. In the .c and .h files, you have to do this…

#pragma rodata-name ("BANK4")
#pragma code-name ("BANK4")

RODATA for Read Only data, like constant arrays.
CODE for code, of course.

Look at the ROM in a hex editor, and you can see how the linker constructed the ROM. I specifically wrote strings called “BANK0” in bank #0 and “BANK1” in bank #1, etc.


What’s new?

Banked calls are necessary, when calling a function in another bank.

banked_call(unsigned char bankId, void (*method)(void));

What this does is push the current PRG bank on an array, swap a new one in place, call the function with a function pointer, return, and pop the old bank back into place, then return. You can even nest banked_calls from one swapped bank to another, but there is a limit of 10 deep before it breaks. In fact, these banked_calls are very slow, so try to stay in one bank as much as possible before switching.
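
The push / swap / call / pop sequence can be simulated on a PC like this. It is not the real banked_call() (that one writes to the mapper), just a sketch with stand-in names: banked_call_sim(), fake_set_prg_bank(), inner_fn() and outer_fn() are all hypothetical. The overflow check is my addition. The text only says that nesting past 10 breaks.

```c
#include <stdint.h>

#define MAX_BANK_DEPTH 10

uint8_t current_bank;               /* bank "mapped" at $8000 */
uint8_t bank_stack[MAX_BANK_DEPTH]; /* saved banks */
uint8_t bank_depth;                 /* entries in use */

/* Stand-in for the mapper write; on a PC it only records the number. */
void fake_set_prg_bank(uint8_t bank) { current_bank = bank; }

/* Returns 0 on success, -1 if the nesting limit would be exceeded. */
int banked_call_sim(uint8_t bank, void (*method)(void))
{
    if (bank_depth >= MAX_BANK_DEPTH) {
        return -1;
    }
    bank_stack[bank_depth++] = current_bank;     /* push the old bank */
    fake_set_prg_bank(bank);                     /* swap the new one in */
    method();                                    /* call via pointer */
    fake_set_prg_bank(bank_stack[--bank_depth]); /* pop the old bank back */
    return 0;
}

/* Example functions: outer_fn "lives" in bank 1 and nests a call
   into bank 2. */
uint8_t seen_inner; /* bank observed inside the nested call */
void inner_fn(void) { seen_inner = current_bank; }
void outer_fn(void) { (void)banked_call_sim(2, inner_fn); }
```

If current_bank starts at 7 and you run banked_call_sim(1, outer_fn), the inner function sees bank 2, and after everything returns, bank 7 is back in place.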

Also, music functions are in their own bank, and it has to do a similar…save bank, swap new one, jump there, return, pop bank, return… thing for any music or sfx call. So, try to minimize how many sfx you call in a frame.

set_prg_bank(unsigned char bank_id);

Use this to read data from a swappable bank (from the fixed bank). It sets a specific bank at $8000, and then you can access the data there.

set_chr_bank_0(unsigned char bank_id);
set_chr_bank_1(unsigned char bank_id);

Use these to change the CHR banks. bank_0 for the first set (which I used for background). bank_1 for the second set (which I used for sprites).

set_mirroring() to change the mirroring from Horizontal to Vertical to Single Screen.

If you are doing a split screen, like with a sprite zero hit, you could set the CHR bank that shows at the top of the screen with this function.


And turn it off with this function.




Look at the code…

in main(), it uses

banked_call(BANK_0, function_bank0);

This function swaps bank #0 into place, then calls function_bank0(), which prints some text, “BANK0”, on the screen.

banked_call(BANK_1, function_bank1);

Does the same, but if you look at function_bank1(), it also calls

banked_call(BANK_2, function_bank2);

to show that you can nest one banked call inside another…up to 10 deep.

And we see that both “BANK1” and “BANK2” printed, so both of those worked.

Next we see that this banked_call() can’t take any extra arguments, so
you would have to pass arguments with global variables…I called them arg1 and arg2.

arg1 = 'G'; // must pass arguments with globals
arg2 = '4';
banked_call(BANK_3, function_bank3);

function_bank3() prints “BANK3” and “G4”, so we know that worked. Passing arguments by global is error prone, so be careful.

Skipping to banked_call(BANK_5, function_bank5);

function_bank5() also calls function_2_bank5() which is also in the same bank. You would use standard function calls for that, and not banked_call(). It printed “BANK5” and “ALSO THIS” so we know it worked alright. Use regular functions if it’s in the same bank.

Finally, banked_call(BANK_6, function_bank6); reads 2 bytes from the WRAM at $6000-7FFF. Just to have an example of it working. In the .cfg file I stated that there is a BSS segment there called XRAM. At the top of this .c file I declared a large array wram_array[] of 0x2000 bytes. You can read and write to it as needed.

It printed “BANK6” and “AC” (our test values) correctly.

Once we return to the main() function, we know we are in the fixed bank. Without using banked_call() we could swap a bank in place using set_prg_bank(). We could do that to read data in bank… like, for example, level data. You just read from it normally, as if that bank was always there.

I recommend you never use set_prg_bank() and then jump to that bank without using banked_call(). The bank isn’t saved to the internal bank variable. If an NMI is triggered, the nmi code swaps the music code in place and then uses the internal bank variable to reset the swapped bank before returning… and that would be the wrong bank, and it would crash. Actually, this scenario might work without crashing. But I still recommend using the banked_call().

There is an infinite loop next, that reads the controller, and processes each button press.

Start button = changes the CHR bank. It calls this function…


Which changes the background tileset. You notice that the sprite (the round guy) never changes. Sprites are using the second tileset. If we wanted to change the second tileset, we would use…





I am also testing the music code. Start calls a DMC sample. Button A calls the song play function music_play(). Button B calls a sound effect sfx_play(). and Select pauses and unpauses the music with music_pause().

I just wanted to make sure that the music code is working correctly, because I rewrote the function code. All the music data and code is in bank #6, and the code swaps in bank #6 and then calls the music function. Then swaps banks back again before returning.

I didn’t show any examples of changing the mirroring, but that is possible too.


Link to the code…



I’m glad I got this working. Now on to actual game code. Oh, also… I was using MMC1_128_128.cfg but you could double the PRG ROM to 256k, by using the MMC1_256_128.cfg (edit the compile.bat linker command line arguments).

You could easily turn the WRAM at $6000-7FFF into save RAM by editing the header. Bit 1 of flags 6 indicates battery-backed SRAM, so add 2 to flags 6 in the header in crt0.s.
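
As a quick check of the "add 2" rule, here is how flags 6 of an iNES header is put together. In the real project this byte is written in Assembly in crt0.s. The C function ines_flags6() below is only a hypothetical illustration of the bit layout.

```c
#include <stdint.h>

/* Build iNES header byte 6 ("flags 6").
   bit 0    = vertical mirroring
   bit 1    = battery-backed RAM
   bits 4-7 = low nibble of the mapper number */
uint8_t ines_flags6(uint8_t mapper, int vertical_mirroring, int battery)
{
    uint8_t f = (uint8_t)((mapper & 0x0Fu) << 4);

    if (vertical_mirroring) {
        f = (uint8_t)(f | 0x01u);
    }
    if (battery) {
        f = (uint8_t)(f | 0x02u);
    }
    return f;
}
```

For MMC1 (mapper 1), the byte is $10 without a battery and $12 with one, which is the same as adding 2.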

Maybe next time I will make an MMC3 demo, which can easily use 512k PRG ROM and 256k CHR ROM, and has a scanline counter. Also, there is a homebrew mapper 30, the oversized UNROM 512 board with extra CHR-RAM and (optional) 4-screen mirroring. It would be easy to adapt the banked call system to either.


It has been brought to my attention that some of the MMC1 boards do not always reliably boot with the last bank in the fixed position at $c000. Some MMC1 games put reset vectors in every bank, which put the correct bank in place. My example code does not do that. It assumes the last bank is fixed and in place. Every emulator that I know of will boot it with the last bank fixed at $c000. But, this assumption may break if loaded onto a real cartridge. The user might have to hit reset a few times to get it to work. Or, they might get angry and play another game.


All Direction Scrolling

I’ve been planning to make a game that can freely scroll in all directions, like a modern game. The issue with the NES is that it only has enough VRAM for 2 nametables. So, the background is mirrored left to right (horizontal mirror) or top to bottom (vertical mirror). You could use 4 screen mirroring, but that required an extra RAM chip on the cartridge, which was an expense that almost no games used (and people seem to think this is a bit of a cheat, as it isn’t using the basic hardware).

And, when you only have 2 nametables to work with, moving in the perpendicular direction will always be visible. The oldest TVs tended to cut off as much as 8 pixels from the edges of the screen, so maybe you couldn’t see the changes. But, modern TVs and emulators can show 100% of the pixels (256 x 240). So, we want to minimize the visible changes. Including the attribute table, which only has 16×16 granularity.

Vertical Mirroring

#0 #1

#0 #1

A side scroller layout can easily hide the right and left scroll changes, just off screen. But you would see the changes at the top and bottom. Ideally you would have it only update when the Y scroll is aligned to the 8’s (08, 18, 28, 38, etc), so that at most 8 pixels at the top and 8 pixels at the bottom show changes, which would be hidden by most old TVs. Like this.


Another option is to turn the screen off with carefully timed writes to the 2001 register. This would require a mapper with a scanline counter, like MMC3.


…here the bottom 16 pixels are hidden by turning the screen off at the bottom (with a write of zero to 2001). The bottom is the safest / easiest way to hide the visual glitches.

Or, a little more excessive, this game turns both the top 16 pixels off and the bottom 16 pixels off…


I guess for a more balanced visual. But, not really necessary. The screen is turned off at the top of the screen, then back on at line 16 and then back off again at line 224.


Horizontal Mirroring

#0 #0

#2 #2

With horizontal mirroring, you can easily hide changes in the top and bottom just off screen, but the right and left side changes would be visible.

I’m sure you’ve played Super Mario Bros 3, and noticed the tiles change on the right side of the screen, and show the wrong color. So why would you intentionally put the changing tiles at exactly where the user is looking? Oh, well. Let’s see what other games did.

The NES does have a way of hiding the left 8 pixels of the screen in the 2001 register.


By resetting these bits to zero xxxx x00x the left 8 pixels will use the universal BG color (3f00) for the entire strip. Thus, you would (ideally) only have 8 pixels worth of changes visible, which might be hidden on the overscan of the old TV. Here on the right…
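
In code, clearing those two bits is a simple AND. This sketch only computes the value you would write to 2001. The macro names and hide_left_column() are made up for the example.

```c
#include <stdint.h>

#define MASK_BG_LEFT8  0x02u /* show background in the left 8 pixels */
#define MASK_SPR_LEFT8 0x04u /* show sprites in the left 8 pixels */

/* Clear both "left 8 pixels" bits (xxxx x00x), leave the rest alone. */
uint8_t hide_left_column(uint8_t ppu_mask)
{
    return (uint8_t)(ppu_mask & ~(MASK_BG_LEFT8 | MASK_SPR_LEFT8));
}
```

For example, a mask of $1E (everything on) becomes $18, which keeps the background and sprites on but blanks the left strip.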


To be the least visible, you would change tiles every 8 X scroll movements (hidden on the left) and change attribute tables on the 8’s (08, 18, 28, 38, etc) to best hide that on the far right.
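
The timing rule could be tested like this. It is a sketch with hypothetical function names. I am assuming the tile column is updated whenever the X scroll is a multiple of 8, since the text only says "every 8".

```c
#include <stdint.h>

/* A new tile column is due every 8 pixels of X scroll
   (assumed here to be on multiples of 8). */
int tile_update_due(uint8_t scroll_x)
{
    return (scroll_x & 0x07u) == 0;
}

/* An attribute update is due on the 8's: $08, $18, $28, $38, ... */
int attribute_update_due(uint8_t scroll_x)
{
    return (scroll_x & 0x0Fu) == 0x08u;
}
```

So at X scroll $18 both checks pass, while at $10 only the tile check does.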

And, another interesting choice, Kirby changes tiles and attributes on the left 16 pixels, which would be the simplest for programming. And, change attribute tables on the 0’s.


But we could go 1 step further and draw a column of black sprites along the right edge. Combined with the left 8 turned off, this is the maximum level of hiding scrolling glitches with standard hardware. If programmed right, you shouldn’t see any tiles changing nor attribute table glitches.


…but this requires sprites to be in 8×16 mode, and steals 15 sprites from you. And, worst of all, reduces the number of usable sprites per horizontal line from 8 to 7.

I bet you thought that was enough, right, but look at this game (vertical mirroring)…


…that cuts 16 pixels off the top and more than 16 pixels off the bottom, and has the left 8 pixels turned off. Hmm. That might be a bit excessive.


Single Screen Mirroring

A few games (AxROM) use this and attempt to do all direction scrolling. Cobra Triangle minimizes the attribute table glitches by just having only 1 BG palette used for the entire play area, and then switches to another screen for the bottom area (which uses a different BG palette). The left 8 pixels are turned off to hide left to right scrolling changes, and the swapped screen at the bottom hides the top to bottom changes.


After further consideration, Cobra Triangle might just be a lousy example. Some games use single screen mirroring and do a great job. They use all 4 palettes, and they use the HUD to hide the screen errors. Some good examples are Battletoads, Solar Jetman, and Wizards & Warriors 2 and 3. They all look great, so maybe this is a good idea.

[Screenshots: Battletoads (USA) and Solar Jetman - Hunt for the Golden Warpship (USA)]

They both turn the left 8 pixels off, and use a sprite zero hit to switch the nametables and scrolling. Solar Jetman even draws a tiny dot on the bottom left on the background so the sprite zero hit will work correctly. That must take some fancy programming, since the dot needs to move slightly each frame.



Alternating Mirroring

Some mappers (like MMC1 and MMC3) can change between Horizontal and Vertical mirroring. Games on these mappers usually have sections that are strictly side scrolling…


and some sections that are strictly vertical scrolling…


But, I don’t consider these all-direction scrolling. And it requires a special mapper.



I haven’t written the code to make my own all-direction scroller, yet. I did some test code for 4 screen mirroring and scrolling any direction, but I would prefer to rewrite it with standard 2 screen mirroring.

And I think the easiest would be to do what Kirby did, with the left 8 pixels turned off and attribute tables updated on the left on the 0’s (10, 20, 30, 40) of X scroll change. So I might attempt that first.


What is everything?

Someone asked me to explain all the files. Let’s try to do that. Look at the most complicated one.


in BG/

The .tmx files are from Tiled map editor. Tiled uses the Metatiles.png and Sprites.png files as its tilesets, and the exported tilemaps are the .csv files. You can’t import a csv into a C project, so I wrote some python scripts to convert .csv files into .c files. There are two different python files, one for background and one for sprites, so all the .c files with “SP” are sprite object lists.

All the .c files are included into the project.


The .nes file is our game that can be run in an emulator.

The .s file is the assembly generated by the cc65 compiler, just in case you want to debug by looking through the ASM.

The labels.txt file is a list of all the addresses of every label in the code. You can also use this for debugging, by setting breakpoints for these addresses.

in LIB/

are all the neslib files (neslib.h and neslib.s) and all the functions that I wrote (nesdoug.h and nesdoug.s). The .s files are included in crt0.s near the bottom. The .h files are included in the main c file.


The .ftm files are Famitracker 0.4.6 files. 1 for music and 1 for sfx. The music file was exported as a .txt file (included here) and processed with text2data (from Shiru’s famitone2 files) into TestMusic3.s. The sfx was exported as .nsf (Nintendo Sound Format) and then processed with nsf2data (also famitone2) into SoundFx.s.

famitone2.s is the famitone code (again, Shiru’s website). All the .s files are included somewhere in crt0.s. If we had DPCM samples, they would also be included in crt0.s in the “SAMPLES” segment.

in NES_ST/

The .nss files are NES Screen Tool 2.3 files. I saved one (title) as a .h compressed RLE, which the game includes and decompresses. I made a .nss file with all the kinds of blocks in the game (metatiles), and saved it as a .nam (uncompressed list of the entire screen), which the meta.py python script converted into a C array in metatiles.txt. I also did a print screen of both the metatile.nss and sprite.nss files, cropped them down in GIMP, and saved those as the .png files up in the BG/ folder (used by Tiled map editor).

The other files are…

License.txt – just the MIT licence

Sprites.h – arrays of all the metasprite definitions, generated by NES Screen Tool (or by hand, which is sometimes faster)

compile.bat – to recompile the project (on Windows)

crt0.s – is all the startup code, but also a convenient place to include asm or binary files.

full_game.c – the game code. I like to use notepad++ to write my code.

full_game.chr – the graphics file, split into 2 sections, the 256 BG tiles, and the 256 Sprite tiles.

full_game.h – variables and constants and prototypes, and some constant arrays

level_data.c – lists of game data and things included for each level

nrom_32k_vert.cfg – the linker file for ld65, tells where each segment goes in the final binary.

screenshot26.png – just a picture of the game.

And some more info on compile.bat

This is a list of command line inputs for compiling the game from scratch. If you change the name of your project, change “set name=full_game” to match the main code filename. Or you can change this line

cc65 -Oirs %name%.c --add-source

to

cc65 -Oirs filename.c --add-source

where filename is the name of the .c file. You can process multiple C files this way, just make sure to also use ca65 to convert each .s file into a .o object file.

-Oirs are optimizations.

--add-source tells it to put the source code in comments in the asm file.

If you have multiple object files you need to change the linker (ld65) line to list every .o file.

ld65 -C nrom_32k_vert.cfg -o %name%.nes crt0.o %name%.o nes.lib -Ln labels.txt

to

ld65 -C nrom_32k_vert.cfg -o targetname.nes crt0.o file1.o file2.o file3.o nes.lib -Ln labels.txt

-o blah tells it what to name the output file.

If you use cl65 instead of cc65/ca65/ld65, then make sure to set the target to NES (-t nes). Otherwise it uses a strange default target, which does not use standard ASCII encoding.

del *.o deletes the object files.

move /Y labels.txt BUILD\
move /Y %name%.s BUILD\
move /Y %name%.nes BUILD\

moves these files into the BUILD folder


The final command just runs the game, which is in the BUILD folder.


Hope this helps.



22. Zapper / Power Pad



The zapper gun came with many NES consoles. They are pretty common, but you need a CRT TV for them to work. You can also play on most emulators using the mouse to click on the screen (make sure you set the 2nd player input to “zapper”).

It is possible to play a modified game on a non-CRT TV using a Tommee brand Zapp Gun. You would need to add a 3-4 frame delay in the code, since LCD screens typically have 3-4 frames of lag. You couldn’t play the standard Duck Hunt, but some people have been working on modifying zapper games to have a delay before reading the zapper.

Ok, so we’ve all seen Duck Hunt. Do you know how it works?


The game reads the 2nd player port until it gets a signal that the trigger has been pulled. Once that happens, it blacks out the background and replaces 1 object per frame with a big white blob. If there are 2 objects on screen, it will display the first object on 1 frame and the other object on the next frame… to tell which object you hit (if any).
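The one-object-per-frame idea can be sketched like this. It is not Duck Hunt’s actual code, just an illustration: it assumes the game has already recorded, for each object, whether the light sensor fired during the frame in which only that object was drawn white. The function and parameter names are hypothetical.

```c
/* sensor_hits[i] is 1 if the gun saw light during the frame in which
   only object i was drawn as a white box, 0 otherwise.
   Returns the index of the object that was hit, or -1 for a miss. */
int find_hit_object(const unsigned char *sensor_hits, int object_count)
{
    int i;
    for (i = 0; i < object_count; ++i) {
        if (sensor_hits[i]) {
            return i;  /* light seen on this object's frame */
        }
    }
    return -1;  /* no frame produced light: the shot missed */
}
```

Because each object gets its own frame, the cost is one extra frame of blacked-out screen per object on screen.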

Duck Hunt uses a 32×32 pixel box for standard ducks (4 sprites by 4 sprites), which is about the size you want for a medium level of difficulty. Other games had much bigger things to shoot and much bigger white boxes. To vary the difficulty, they would then change the amount of time that the target was on screen.


Duck Hunt used a 24×16 pixel box for clay pigeons (3 wide by 2 high), which is kind of difficult to hit. 24×24 is a little more reasonable, and that’s what I used.


I had to write a different controller reading routine, due to the zapper using completely different pins than the standard controllers. I threw in these zaplib.s and zaplib.h files.

zap_shoot(1) takes 1 arg (0 for port 1, 1 for port 2). It returns 1 if it reads that the trigger has been pulled, and 0 otherwise.

I also checked to make sure the last frame didn’t also have trigger pulled.

zapper_ready = pad2_zapper^1;

zapper_ready is just the opposite of the last frame’s trigger state. That’s a bitwise XOR, if you’re not familiar with the caret operator “^”.
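Here is a small sketch of how that check could be used so that holding the trigger only counts as one shot. The variable names pad2_zapper and zapper_ready follow the text; the wrapper function is hypothetical, and it assumes the trigger reading is always exactly 0 or 1.

```c
/* Trigger state saved from the previous frame (1 = pulled). */
static unsigned char pad2_zapper = 0;

/* Call once per frame with the current trigger reading (0 or 1).
   Returns 1 only on the frame where the trigger goes from
   released to pulled, so a held trigger fires a single shot. */
unsigned char new_trigger_pull(unsigned char trigger_now)
{
    unsigned char zapper_ready = pad2_zapper ^ 1; /* opposite of last frame */
    pad2_zapper = trigger_now;                    /* remember for next frame */
    return (unsigned char)(trigger_now & zapper_ready);
}
```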

So when we get that signal, I turn off the BG… ppu_mask(0x16); and draw a white box where the star was… draw_box();

0x16 is 0001 0110 in binary. Notice that the xxxx 0xxx bit is zero. That’s the “show BG” bit. It’s off, but the sprites are still on.

Then immediately, I turn the BG back on, ppu_mask(0x1e); so the flicker is minimized.

0x1e is standard display. 0001 1110 in binary has the xxxx 1xxx bit active again.
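To double check those two values, here is a tiny sketch that toggles only the “show BG” bit. The macro and function names are made up for this example.

```c
/* The "show BG" bit of the PPU mask value (the xxxx 1xxx bit). */
#define MASK_SHOW_BG 0x08

/* Turn the background off, leaving every other bit alone. */
unsigned char bg_off(unsigned char mask)
{
    return (unsigned char)(mask & ~MASK_SHOW_BG);
}

/* Turn the background back on. */
unsigned char bg_on(unsigned char mask)
{
    return (unsigned char)(mask | MASK_SHOW_BG);
}
```

Starting from 0x1e, bg_off() gives 0x16, and bg_on() on that result gives 0x1e again.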

Then I call zap_read(1), which takes 1 arg (0 for port 1, 1 for port 2).

It is a loop that waits until it either gets a signal that the gun sees white, or the end of the frame has been reached. It returns 0 for fail and 1 for success.

I tested this on a real CRT on a real NES, and it seems to work fine at 5 ft. This game uses port 2 for the zapper, which is kind of standard. But, you could make a game that uses port 1, if you wanted to. zaplib.s will have to be included in crt0.s and zaplib.h will have to be included in your main c file.

If you want the game to run on a Famicom, you should put the zapper reads on port 2. The zapper will have to be plugged into the expansion port which is read from port 2.



Also see the wiki for more technical info.




Power Pad


The Power Pad is a little less standard. There weren’t many games that used it. Just like the zapper, the Power Pad uses different pins than the standard controller, so I had to rewrite the code that reads the input. See padlib.s and padlib.h.

read_powerpad(1) takes 1 arg, 0 for port 1, 1 for port 2.

It returns a 2 byte value (unsigned int). I made these constants to represent each foot pad.

#define POWERPAD_4 0x8000
#define POWERPAD_2 0x4000
#define POWERPAD_3 0x2000
#define POWERPAD_1 0x1000
#define POWERPAD_12 0x0800
#define POWERPAD_5 0x0400
#define POWERPAD_8 0x0200
#define POWERPAD_9 0x0100

#define POWERPAD_6 0x0040
#define POWERPAD_10 0x0010
#define POWERPAD_11 0x0004
#define POWERPAD_7 0x0001
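
As a rough sketch of how the returned value might be tested against these constants, the helpers below check one pad and count how many pads are pressed. Two of the constants are repeated so the example compiles by itself. POWERPAD_ALL and both function names are hypothetical and are not part of padlib.

```c
#define POWERPAD_4 0x8000
#define POWERPAD_7 0x0001

/* OR of all twelve POWERPAD_n constants listed above (hypothetical name). */
#define POWERPAD_ALL 0xff55u

/* Returns 1 if the given pad bit is set in the value read from the mat. */
unsigned char powerpad_is_down(unsigned int state, unsigned int pad)
{
    return (unsigned char)((state & pad) != 0);
}

/* Counts how many of the twelve pads are pressed. */
unsigned char powerpad_count(unsigned int state)
{
    unsigned char count = 0;
    state &= POWERPAD_ALL;   /* ignore bits that are not pads */
    while (state) {
        state &= state - 1;  /* clear the lowest set bit */
        ++count;
    }
    return count;
}
```

In a game, state would be the value returned by read_powerpad(1).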

padlib.s will have to be included in crt0.s and padlib.h will have to be included in your main c file. This was tested on a real Power Pad on a real NES.

If you want the game to run on a Famicom, you should put the power pad reads on port 2. The power pad will have to be plugged into the expansion port which is read from port 2.



Also see the wiki for more technical info.



21. Finished Platformer

First thing I added was a title screen. To be honest, I made this as quickly as possible, just to show the proof of concept. I made it in NES Screen Tool, and exported the background as a compressed RLE .h file “title.h”. So the title mode waits for the user to press start, and cycles the color a bit to be a little less boring.


temp1 = get_frame_count();
temp1 = (temp1 >> 3) & 3;

title_color_rotate is an array of 4 possible colors.
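
The two lines above turn the frame counter into an index from 0 to 3 that steps once every 8 frames. A minimal sketch of just that calculation (the function name is hypothetical):

```c
/* Shifting right by 3 divides the frame count by 8, so the result
   only changes every 8 frames. Masking with 3 keeps it in 0..3. */
unsigned char title_color_index(unsigned char frame_count)
{
    return (unsigned char)((frame_count >> 3) & 3);
}
```

The result would then be used as an index into title_color_rotate, so the color cycles through all 4 entries every 32 frames.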

Now, I didn’t like making 1 room at a time, so I made it so you could have 1 long Tiled file, and export it to csv, and convert it to arrays with CSV2C_BIG.py.


(I had it auto generate an array of pointers at the bottom, but I didn’t end up using those, so I deleted them, and instead made 1 larger array of pointers, with all levels in it).

const unsigned char * const Levels_list[]={

I am using 2 kinds of enemies and 2 kinds of coins.

And then, I made a picture with all the Sprite objects on it (with transparency for background), and imported that as a separate tileset in Tiled, and added another layer, where the Sprite objects are placed. I placed the enemies on that 2nd layer (as tiles), and exported another csv file. Then I wrote another .py file “CSV2C_SP.py” to convert those into arrays.


Well, I didn’t end up using it exactly like this. It mixes the coins and the enemies, and I want them in separate arrays. So, I cut and pasted the different kinds of objects into 2 different arrays. But the .py file is helpful, and definitely sped this up.

These arrays might need to be edited slightly, like if we need coins at a different X offset from the tile grid.


Again, I made 2 kinds of enemies. The chasers now collide with walls. I was going to use the same code that the hero used, but decided it was too slow to check many objects this way, so I wrote a much simpler one.

bg_collision_fast(). This only checks 2 points instead of 4.

The chaser code isn’t very smart, they only move X and never change Y. If you put them on a platform, they would float right off it like a ghost. Maybe in the future I will edit this with improved enemy move logic, so he won’t float off a platform, but rather change directions, or fall, or something.

The other enemy is the bouncer. He just bounces up and down. He checks the floor below when falling, to stop exactly at the floor point, reusing the same code from the hero checking the floor.


The second kind of coin is just an end-of-level marker. I suppose we could have added some cool sound fx for reaching the end of the level, or an animation. That would have been nice. Currently, it just fades to black in the switch mode.

Oh, yeah. I added more modes. More game states. title, game, pause, end, switch (transition from one level to another). How these work is fairly obvious.

Debugging wasn’t too bad. Mostly I was worried about running past the frame and getting slowdown. I was testing code by placing gray_line() around functions. This helped me speed up things. I combined some of the enemy steps to speed them up. And I would put a gray_line() at the end of the game logic to see how far down on the screen we were. Here’s one of the tests, back when I thought I was going to use sprite zero hit and a HUD above.


We don’t want to make our enemy logic too much more complex, nor put too many objects on the same screen, or we might get slow down, so we need to test as we go, and see how many enemies we can fit on a screen before it crawls to a halt. I think we can handle 7 or 8. That’s more than I need, so we’re still ok.

And finally, I put the # of coins as sprites in the top left. I didn’t put it too high up, where it might get cut off on some old TVs. 16 pixels down is fine.

Oh, almost forgot. The sprite shuffling code. Remember from the Sprite page, I mentioned that you can only have 8 sprites per horizontal line? Well, since that is a possibility, we must add some kind of shuffling to the position of each object inside the OAM buffer, so that we don’t have an enemy disappear entirely.

The simplest way to do this would be to change the starting position each frame. oam_set(rand8() * 4);  or something like that. It wouldn’t be very good, though.

I decided to go through the list of enemies in a different order each frame.

const unsigned char shuffle_array[]={

So, the first pass, it goes forward 0-15. The second pass it goes in reverse. The 3rd pass it does even then odds. The 4th pass, reverse that. This would break immediately if we changed the # of enemies, so it could use some improvement. Also, I’m not shuffling coins, so I have to make sure there aren’t too many coins on the same horizontal line.
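
The four orders described above can also be computed instead of stored in a table. This is only a sketch of that alternative, not the table used in the game: MAX_ENEMY and the function names are hypothetical, and it assumes the number of enemies is even.

```c
#define MAX_ENEMY 16  /* hypothetical name; must be an even number */

/* Evens first (0, 2, 4, ...), then odds (1, 3, 5, ...). */
static unsigned char evens_then_odds(unsigned char i)
{
    if (i < MAX_ENEMY / 2) {
        return (unsigned char)(i * 2);
    }
    return (unsigned char)((i - MAX_ENEMY / 2) * 2 + 1);
}

/* Returns which enemy to process at step i of the given pass.
   pass 0: forward, pass 1: reverse,
   pass 2: evens then odds, pass 3: the reverse of pass 2. */
unsigned char shuffle_index(unsigned char pass, unsigned char i)
{
    switch (pass & 3) {
    case 0:  return i;
    case 1:  return (unsigned char)((MAX_ENEMY - 1) - i);
    case 2:  return evens_then_odds(i);
    default: return evens_then_odds((unsigned char)((MAX_ENEMY - 1) - i));
    }
}
```

A computed order like this keeps working if MAX_ENEMY changes to another even number, at the cost of a few more cycles per enemy than a table lookup.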

And here’s our working game, with 3 levels, 8 rooms each. This could have been expanded a bit. It takes about 2000 bytes per level. We have about 16000 bytes left, so we could have added 7-8 more levels… so maybe 10 levels total. If we needed more than that, we would need to think about a different mapper, or maybe some kind of compression.





And, that’s all. Go make some games.