Things I Forgot

There are so many topics to discuss. It would be very difficult to cover 100%. Let’s briefly mention a few of the things I forgot to mention.


Multiplication and Division

SNES has hardware to perform multiplication and division. Actually, it has 2 ways to do multiplication. You can look at my EasySNES code to find examples of these (link below, search for multiply: and divide:). They are a little slow, so you can’t expect to do this 100x a frame. You have to wait a several cycles before you get a result for the regular multipy and divide functions (see the NOP opcodes, which do nothing but wait).

If you aren’t using Mode 7, there is a second multiply function (signed) which is much faster. It’s called multiply_fast: in this link.

Any other higher level math will have to be done with LUTs (look up tables), precalculated byte arrays.


Development Cart

I have a Super Everdrive, made by Krikzz, from Ukraine. And, ooh, they even have one that supports the SuperFX chip. I wish I had that one.

Well, this one is over $200 US, but the basic model is less than $100, and it is well worth it. It works great. It uses a MicroSD card to hold the game ROMs.


Mode 7

This is the big enchilada, but I haven’t quite figured out this mode. Especially, setting up a tool chain for editing. Mode 7 can stretch and zoom and rotate. Many of the coolest SNES games use this in some way. It’s just currently above my skill level.

I do plan to work on this in the future. I might even make my own tools. But, that might be months, or even years from now.


IRQ Timers

This is for timing mid screen events. You should try to use HDMA instead. If all 8 HDMA channels are being used, you could do a 9th thing with IRQ timers.

You need to enable IRQ timers (probably just the V timer). and CLI to enable IRQs on the CPU. And you need to add code to the IRQ handler. Once set, the counter will trigger an IRQ signal once the PPU reaches a specific scanline. H counter would fire an IRQ signal every scanline, and you probably don’t want that.


Enhancement Chips

Another thing that is a bit over my head. The SA-1 chip is just a much faster 65816 chip, and that might be the easiest to use.

Some chip names = DSP1, DSP1A, DSP1B, DSP2, DSP3, DSP4, GSU1 (aka MarioChip1 aka SuperFX), GSU2, GSU2-SP1, OBC1, SA-1, S-DD1, S-RTC, SPC7110, ST010, ST011, ST018, CX4.

Some of the functions they do…

decompression code

trigonometry functions

image zooming and rotation

converting bitmaps to tiles

drawing vector graphics, triangles

real time clock

enemy AI functions (probably wouldn’t be useful to you)


The cool chip is the SuperFX chip (GSU). That’s what StarFox used. It would be nice if I could figure it out, and explain it. But, I can not.


Other Modes

Hi resolution. Modes 5 and 6 are double horizontal resolution. They can also, optionally, do an interlaced mode which doubles vertical resolution. Very few games used hi resolution.

Offset per tile Modes 2 and 4. I need to investigate these a bit more. I don’t want to put incorrect information here.



For LoROM, SRAM is mapped to banks $70–$7D in the $0000-$7FFF addresses. And also in the $FE-$FF banks in the $0000-$7FFF addresses. (7e and 7f banks are the WRAM, so that couldn’t be used for SRAM). That gives a total possible 512kB SRAM (battery backed save RAM).

HiROM, as usual, is completely different.

You will also need to indicate in the header that the ROM is using SRAM. I think that’s mapped to $FFD7, but it’s this line in the header.asm file

.byte $00 ; backup RAM size

The value is (2^# in kB). 3 is 8kB, 4 is 16kB, 5 is 32kB, 6 is 64kB, 7 is 128 kB, 8 is 256kB, and 9 is 512kB. 0 means 0kB.

Oh, and the previous line, mapped to $FFD6, should have the d1 bit (0000 0010) set. To indicate a battery for the SRAM.

Once you have correctly set this, the emulator should automatically be creating SRAM save files, that persist after power off. You can freely read and write to this anytime, and you can save your game by keeping the progress stored in the SRAM.


SNES main page

Color Math

SNES programming tutorial. Example 12.


What is color math? If you’ve ever worked with Photoshop, it would be like blending 2 different layers. In this case the layers are the MAIN screen and the SUB screen. Everything we have done so far deals with the MAIN screen. So let me try to explain the SUB screen.

All that stuff that the SNES does to produce a picture, putting layers on top of each other, tile priorities, sprite priorities, etc… it does all that TWICE. If you set the settings for the MAIN screen exactly the same as the settings for the SUB screen, it would produce the exact same picture TWICE… with 1 difference.  The main screen uses color index zero as the backdrop color (any pixel that is transparent), and the sub screen uses the “fixed color” as the backdrop color (register $2132).

You would never see the SUB screen, unless you turned on the color math registers, which would then blend the 2 pictures together, using either addition or subtraction. And then there is an optional halving step after that. Each pixel on the screen, the R values are added or subtracted, and the G value, then the B value. That value is clamped to the max and min without overflow. (each RBG value is 0-31)

Let’s say we have it set to ADD. And the main screen pixel is gray 15,15,15, and the sub screen pixel is dark red 10,0,0. The final pixel would be 25,15,15.

If we added the HALF option, each value would shift right once (rounding down), giving a final pixel of 12,7,7.

If we set the color math to SUBTRACT (no halving), the final pixel would be 5,15,15. The RGB values of the sub screen are subtracted from the RBG values on the main screen.

If we added the HALF option, each value would shift right once (rounding down), giving a final pixel of 2,7,7.

Note, any pixel in the sub screen that is transparent will not be halved.

The main use for Color Math is for transparency effects. You will want Adding and Halving. That would equally blend the main and sub screen.

The least useful setting is the subtract and halving. That would just produce a very dark picture, and almost no games used this.


There is a completely different kind of color math operation, that uses ONLY the fixed color. That color is applied to the entire MAIN screen, and if halving is set, it will work for the whole screen. If you set the fixed color register to green, and had the color math set to ADD, it would add a green tint to the screen.

The fixed color register is weird. The wiki example suggest writing each color separately to it (3 writes for R,B, and G). However, you could set them all to a specific value with 1 write. Such as LDA #$E0, STA $2132 would set all fixed colors to zero.

Before we dive into the code, here’s a video. You can probably skip most of this video, which goes into too many details about how the 2 different screens are generated.


Example ROM

I put BG1 on the main screen (gray rocks) and BG2 on the sub screen (color bars).

No effect. Color Math disabled.



Just the Sub screen. (seen by setting the “clipping always to black” bits in the color math logic, and adding the sub screen).


Note, the top left is black (non-zero index). The bottom left is zero index (transparent).  The sub screen will show the “fixed color” (register 2132) where there is transparent. Right now the fixed color is black. Color halving will not work for a transparent pixel on the sub screen. If you notice, the bottom left square will not change at all for these examples, even when halving is indicated.


Color Math Adding.



Color Math Adding and Halving.



Color Math Subtracting.



Color Math Subtracting and Halving.



Fixed color only (red at 50%), Color Math Adding.



Here’s a YouTube video of the Example code.


Example Code

cc = main screen black if... *
--mm---- = prevent color math if... *
------0- = fixed color
------1- = sub screen
d is for an unrelated thing

* 00 => Never
  01 => Outside Color Window only
  10 => Inside Color Window only
  11 => Always

0------- add
1------- subtract
-0------ normal
-1------ result is halved
b = backdrop, o = sprites, 4321 = layers enabled for color math


So let’s go over each examples.

1- no effect, turn off color math

lda #$30 ; = off
sta color_add_sel ; $2130
;and make sure fixed color is black
lda #$e0 ; RGB, value = 0
sta color_fixed ; $2132

2- adding

lda #$02 ; color math with subscreen
sta color_add_sel ; $2130

;adding, not half, affect all layers 
lda #$3f
sta color_add_des ; $2131

3- adding and half, same as last one, just add one bit to the 2131 write

;adding, half, affect all layers 
lda #$7f
sta color_add_des ; $2131

4- subtracting

lda #$02 ; color math with subscreen
sta color_add_sel ; $2130

;subtracting, not half, affect all layers 
lda #$bf
sta color_add_des ; $2131

5- subtracting and half. Same as last one, but add one bit to the 2131 write

;subtract, half, affect all layers	
lda #$ff
sta color_add_des ; $2131

6- fixed color only

;turn on color math, fixed color mode
lda #$00
sta color_add_sel ; $2130

;adding, not half, affect all layers 
lda #$3f
sta color_add_des ; $2131

;set the fixed color to red 50%
lda #$2f ;red at 50%
sta color_fixed ; $2132

We could have also set half mode.

7- see just the sub screen. We did this by setting the “always clip main screen to black” bits in 2130, and then adding the sub screen to the now completely black main screen.

lda #$c2 ;= clip main always to black
sta color_add_sel ; $2130

;adding, not half, affect all layers 
lda #$3f
sta color_add_des ; $2131


Other examples

Color math only affects some sprites. Only sprites that use palettes 4-7 are affected by color math. That is why Mario (and the little ghosts) are solid.

Super Mario World (USA)_006

Windowing can affect where the color math applies. With HDMA adjusting the window, you can make some cool effects.

Contra III - The Alien Wars (USA)_000


Super Mario World (USA)_008


Tint the whole screen (adding a fixed color)… actually, upon further investigation, this is subtracting, which makes the screen slightly darker than the original. Also, the COLOR MATH is not in fixed color mode, it’s in subscreen mode, but NOTHING is enabled on the subscreen, so the subscreen is filled with the backdrop color (which for the sub screen is the fixed color). I guess that works too.

Legend of Zelda, The - A Link to the Past (USA)_000


Smooth Transparencies (add and halving). This is the most common transparency effect on the SNES.

Legend of Zelda, The - A Link to the Past (USA)_001

Sparkster, the water.

Sparkster (USA)_000


And creating shadows (subtracting) Mortal Kombat II. It’s hard to tell, but their shadows are created by color math subtraction. You could also give the appearance of clouds moving overhead by subtracting a cloud shape and having it scroll.

Mortal Kombat II (USA)_000




SNES main page

HDMA Examples

SNES programming tutorial. Example 11.


HDMA is a way to write to PPU registers while the screen is drawing. You can change values at specific scanlines, to create unique effects.

The H is for H-Blank. Remember before, when we talked about V-blank (vertical blank), where the PPU isn’t doing anything for a short while after drawing each screen? Well, it also pauses a VERY SHORT time after drawing each horizontal line. Just long enough for the 8 HDMA channels to quickly change a register or send data, before the screen goes to write the next line.

They work in order, 1,2,3,4,5,6,7,8. They can all write 1 thing (1,2 or 4 bytes) per line. Or you can set them to wait a specific number of lines before changing a value.

HDMA uses the same registers as the DMA registers, and you shouldn’t use both at the same time. You should write zero to HDMA enable ($420c) before performing a DMA. Because the oldest revision of the SNES has a bug where it can crash if they both happen at the same time.

Here’s an interesting video on DMA and HDMA.



Here’s some things you can do with HDMA.

Changing the BG color with HDMA, to create a color gradient.



Changing the Window (there are 2 windows) with HDMA, to block off portions of the screen. The windows have left and right registers, which need to be written every scanline to create these shapes.

Super Mario World (USA)_000

Super Mario World (USA)_003

Mode 7 parameters. (lots of registers to change hundreds of times a frame).



How HDMA works

(for this link, scroll down to 43×0)

When you look at the HDMA registers, it looks like you need to put MORE values than a DMA (DMA uses 4300-4305 for channel 0 and HDMA uses 4300-430a)… but you actually write LESS values. Let’s go over them briefly, then in detail. For channel 0.

4300 – yes
4301 – yes
4302-4 – yes
4305-6 – no
4307 – yes, only if using Indirect Mode.
4308-a – no

4300 - Control Register. da---ttt

D=direction, probably want 0, from CPU to PPU.

A=HDMA mode. 0 for direct, 1 for indirect. More on this later.

TTT = transfer mode, which will vary by which register we use.

000 => 1 register write once (1 byte: p)
001 => 2 registers write once (2 bytes: p, p+1)
010 => 1 register write twice (2 bytes: p, p)
011 => 2 registers write twice each (4 bytes: p, p, p+1, p+1)
100 => 4 registers write once (4 bytes: p, p+1, p+2, p+3)
101 => 2 registers write twice alternate (4 bytes: p, p+1, p, p+1)
110 => 1 register write twice (2 bytes: p, p)
111 => 2 registers write twice each (4 bytes: p, p, p+1, p+1)

4301 – the PPU destination register. 21xx. So if you write $22 here, the HDMA will write to the $2122, the CGRAM Data Register.

4302-4 – the address of the HDMA table. 2=low, 3=middle, 4=upper/bank #.

4307 – if using Indirect HDMA, this is the bank # of the Indirect Addresses.

Anything marked “no”, don’t touch them. They are used by the HDMA hardware.

Then you write the channel (bitfield, each bit represents a channel, 1 for ch0, 2 for ch1, 4 for ch2, 8 for ch3, etc) to $420c, the HDMA enable register. Presumably, you would do this each frame during v-blank.


OK, so we are pointing the HDMA registers to a table (byte array). For a direct mode, the table would be a scanline count, then (depending on the TTT mode) 1,2, or 4 bytes to be written. Then another scanline count, then more bytes. Scanline count, bytes. Scanline count, bytes. Etc, until it sees a zero in the scanline count slot. From my own examples…

.byte 32, $0f
.byte 32, $1f
.byte 32, $2f
.byte 32, $4f
.byte 32, $6f
.byte 32, $9f
.byte 32, $ff
.byte 0 ;end

That reads 32 lines, value $0f. 32 lines, value $1f. 32 lines… etc down to the terminating 0. One interesting thing is that the $0f is written immediately at the very top of the screen. THEN it waits 32 lines.

Here’s another example, when the transfer mode is “1 register write twice”.

.byte 10, 0, 0
.byte 10, 1, 0
.byte 10, 2, 0
.byte 10, 3, 0
.byte 10, 4, 0
.byte 10, 5, 0
.byte 10, 6, 0
.byte 10, 7, 0
.byte 10, 8, 0
.byte 10, 9, 0
.byte 0 ;end

10 is the scanline count. Then 2 bytes to write. Then 10 scanline count. Then 2 bytes. Etc.

Indirect Mode

43×0 register, we can set it to Indirect. Only with indirect do you need to write to 43×7, the bank of the indirect address. The table will always be sets of 3s. First the scanline count, then an indirect address (ie. pointer) to where our data is. I wrote the HDMA table like this…

.byte 8
.addr $1000
.byte 8
.addr $1002
.byte 8
.addr $1004
.byte 8
.addr $1006
.byte 8
.addr $1008
.byte 0 ;end

The .addr directive outputs a 16 bit value, low byte then high byte. I think you could have also used the .word directive. So, 8 is the scanline count, then an indirect address. Our bank byte is $7e, so the first one points to WRAM $7e1000. The second one points to $7e1002. Etc.

One of the advantages of the indirect system is that you can have a repeated pattern that changes.

I had copied the Indirect Table to $7e1000. It looks like this.

.byte 0, 0
.byte 3, 0
.byte 6, 0
.byte 7, 0
.byte 8, 0
.byte 7, 0
.byte 6, 0
.byte 3, 0
.byte 0, 0
.byte $fd, 0
.byte $fa, 0
.byte $f9, 0
.byte $f8, 0
.byte $f9, 0
.byte $fa, 0
.byte $fd, 0
.byte 0,0

All of these are values to be written with HDMA to a PPU register. In this case, a horizontal scroll register, which is write twice (low then high bytes).

This is example 3. I am also shuffling these values every 4 frames, which causes the movement of the the sine wave.


Example Code

No effect. HDMA is turned off by writing zero to $420c.



Example 1. Changing the BG color.

I’m actually setting up 2 separate HDMA transfers. First to set the CG address to zero. Second to write 2 bytes to change the #0 color. You have to rewrite the address each time, because it auto-increments when writing a color.

stz $4300 ;1 register, write once
lda #$21 ;pal_addr
sta $4301 ;destination
ldx #.loword(H_TABLE1)
stx $4302 ;address
lda #^H_TABLE1
sta $4304 ;address

lda #2
sta $4310 ;1 register, write twice
lda #$22 ;pal_data
sta $4311 ;destination
ldx #.loword(H_TABLE2)
stx $4312 ;address
lda #^H_TABLE2
sta $4314 ;address

lda #3 ;channels 1 and 2
sta hdma_enable ;$420c

And we have 2 HDMA tables (see the example code). Each time we are waiting 10 scanlines between changes. Each time, adding a little more red.

On a side note, you could set this up as a single HDMA channel. With a “2 registers write twice each” mode. You would be doing double writes to the CG address, and you would need 4 bytes after each scanline count.



Example 2. Changing window 1 left and right positions.

A window punches a hole in one or more layer. There are 2 windows, but we only need 1 for this example. The only parameters you can set are left and right (and inverse, and combinations with the other window). But with HDMA, you can adjust the window parameters as the screen draws, and draw a shape. Circle shapes are very popular.

If the left position is > than the right position, the window will not appear. That is what we are doing for the top and the bottom of the screen. You also have to tell it which layers are affected with the $212e (window for main screen) and with the $2123-5 registers.

I’m using 2 HDMA channels, and writing 1 byte to 1 register. To 2126 and 2127.

(Again this could have been done with 1 channel with a 2 register transfer mode).

This example shows the Multi Single Scanline feature. If the scanline count is >128 (not including 128), it signals a series of single scanline writes. You can omit the scanline count for a number of lines (ie. subtract 128 from the scanline count number).

.byte 60, $ff ;first we wait 60 scanlines
.byte $c0 ;192-128 = 64 lines of single entries
.byte $7f ;1st write value
.byte $7e ;2nd write value
.byte $7d ;3rd write value
.byte $7c ;4th write value
...etc...64 lines.
.byte 0 ;end

It waits 1 scanline between each write.

Each line, I am moving the left position and right position further apart, and then closer together, which forms a diamond shape.



Example 3. Changing BG1 horizontal scrolling position.

This was already discussed above, in the Indirect Mode section. We are using a sine wave pattern to create a wave in the picture, writing twice to 1 register, the horizontal scroll of BG1.

I wanted to include at least 1 example of indirect mode. I copied the value table to the RAM, so I was able to change the values to make the pattern move. See the Shuffle_f3 function in the HMDA3.asm file. The table of Indirect Addresses points to the RAM where our actual values are stored.

This example would have been even nicer if we wrote new values every scanline. Currently, we are only changing values every 8 scanlines (to make the table simpler / smaller). Maybe even every 2 scanlines would have been enough.



Example 4. Changing the Mosaic filter.

This is the simplest example. A single write to a single register. I haven’t discussed the Mosaic filter before, $2106 . The upper nibble is the mosaic amount (0 = normal, 1 = 2×2, etc up to $f = 16×16), and the lower nibble says which layers are affected. Sprites are never affected.

stz $4300 ;1 register, write once
lda #$06 ;mosaic
sta $4301 ;destination
ldx #.loword(H_TABLE6)
stx $4302 ;address
lda #^H_TABLE6
sta $4304 ;address

lda #1 ;channel 1
sta hdma_enable ;$420c

It waits 32 lines before increasing the mosaic value. There are bigger squares at the bottom. You probably wouldn’t use this exact HDMA effect in a game, but it is just an example of what is possible. You can change so many settings, even the BG mode $2105, which layers are active, change the location of a tilemap or tileset. I think you can even write new data to the VRAM (a little bit at a time).


Here’s a YouTube video of these examples.



Because HDMA settings need to be rewritten every frame, this code example would break in a lag frame… like if our game logic ran so long that it took 2 frames to complete one loop. You would have half of the frames missing the HDMA effect.

Therefore, you should put HDMA code inside the NMI code. NMI code is guaranteed to execute every frame, during v-blank. You should also put your DMA code here (copying bytes to the VRAM or OAM). Some games have very elaborate NMI code. I don’t currently have a proper example with HDMA in the NMI code. I’ll put that on my TODO list.




SNES main page

SNES Music

SNES programming tutorial. Example 10.


Here’s an interesting video on the SPC700 and SNES audio.


Today, we are going to talk about SNES music. The APU (Sony SPC700) is a different chip entirely, and has its own 64k of RAM. At the beginning of our program, we need to load the APU with our SPC file. It is an audio program that runs automatically.

The APU is connected to an 8 channel DSP (digital sound processor). The song will direct the DSP to play different sound samples at different rates to make tones. If you are familiar with MIDI, it is similar. The samples can be looped or not. The samples are compressed into a native compression called BRR (bit rate reduction).

BRR samples are very large, and you will probably be only able to fit 10-15 samples. Each will have to be edited (perhaps with Audacity) to less than a second each, and at a reduced sample rate. We are going to work with SNESGSS (written by Shiru). I got the files from here, and from the source code for one of Shiru’s SNES projects.

However, don’t use the .exe here. Apparently, there is a bug in the SNES hardware, where if the APU reads a value at the exact moment it is changed, it sometimes reads the wrong value. This can cause the game to crash. See the discussion here…

Calima wrote a bug patch, which I have directly patched into the .exe (now called snesgssP.exe (p for patch). You can get it here…

and also grab the music.asm file. You will need it.

Anyway, the patch works, but it forced me to overwrite something, and I chose to disable the streaming function. I tested this on a real SNES dozens of times with no problems.


SNESGSS prefers to have 16-bit MONO WAV samples at sample rate 32000 or 16000. I have tried 8000, but there is no improvement.

There seems to be a bug in Audacity, when you resample to another rate (Tracks/Resample), it seems to work, but then saves it to the original rate. But if you resample it, cut it, and then open a new window and start a new project at the target rate, and then paste it and then save it, it works. I don’t know what’s up with that.

Recording at the desired rate has no problems. 16000 seems to be a nice sweet spot on audio quality and file size.

SNESGSS also suggests tuning the samples to B +21 cents. I did not. I left all my samples at C. They are not in tune with the samples provided with SNESGSS, which I did not use. I think those are tuned to B +21.


Hit the WAV button near the middle of the screen to load your samples. Setting the envelopes similar to this seems good (15, 1, 7, 16) You can press the 2x or 4x buttons if you run out of room for files, to downsample by half. You can set a Loop, and adjust the numbers in “from” and “to”… which I find incredibly difficult to use.

I wouldn’t mess with the volume or EQ settings. That is something you should have done in Audacity while editing. Just keep in mind that the SNES tends to weaken the upper range and make bright sounds feel dull. You might have to do a treble boost for the lead instruments.

This tracker will convert our samples to BRR, but not until your final export. Unfortunately, you can’t import BRR samples to it from other sources. You could use other SNESGSS instruments (the ones that Shiru provided, for example).


Here you can check the size of all the files. Obliviously, you can’t have a bigger SPC file than 64k, the size of the APU RAM.

I should note, that we only load 1 song in the APU RAM at a time. Staring a new song will load a new song (over the previous song), so that only one song is loaded at any time. That should give you a little flexibility on overall size.


Here is the main editor. You type Z-M keys for lower octave, Q-P keys for upper octave. You can change the octave by pressing the octave button. So, this is a standard tracker, it goes downward as the song plays.

You can toggle channels on and off by clicking on the word “channel 1”, etc. You can divide things into sections. Press the spacebar to mark the end of a section. Then you can repeat the previous section with an R00 command.

The order of things is Note, Instrument, Volume 0-99, and Special effects. The SP column is for song speed (smaller is faster). You can scroll up and down with PgUp and PgDn keys, and also Home and End goes to the next section.

CTRL+End marks the end of the song, and CTRL+Home marks the loop back point.

You can import Famitracker and MIDI files (notes only), but I haven’t tried.


On this page, you can mark a song as a “Sound effect”.

Once the songs are done, you File/Export. And that will produce several files.

spc700.bin is our main SPC file. It holds the program and the samples and the sound effects data.

music_1.bin (one file per song) is the song data.

sounds.asm and sounds.h we don’t need. Don’t include them. This was for a different assembler / C compiler. You might want to look at it to find the value of each sound effect.

.define SFX_DING 0

…tells us that the DING sound effect is called with the value zero.

If you look in

you will see a file called This is a python script to split the SPC file into 2 pieces if needed. Let me explain that a bit. We are mapping our SNES game as LoROM, which means that our banks are only 32k. The SPC file could be 64k in size. It needs to be split up to be included without editing the linker file.

Also the music loader function will fail to load correctly if it is >32k. Because of that, I have been copying the SPC file to the 7f0000-7fffff WRAM first, and then calling the INIT function (which copies the SPC file to the APU RAM).

The example code, however, is smaller than 32k, so this step is unnecessary.



Let’s go over the music.asm file, which you should have grabbed from one of my example folders. I had to modify the original code to work with ca65.

spc_init – should be called at the start of the game, with interrupts off (NMI, IRQ, controllers). With AXY16 you load A with the address of the of SPC file (spc700.bin) and X with the bank of the SPC file, and JSL to spc_init.

This is all well and good if the SPC file is < 32k, but if it’s over 32k, and we are mapped as LoROM (ROM banks of 32k), I have had to first copy all the SPC files to $7f0000-7fffff WRAM and then load A with 0000 and X with $7f and then JSL to spc_init.

spc_init expects the SPC file to be contiguous.

SPC file > 32k also means I need to either modify the linker script, or split the SPC file into 2 chunks, and include them into 2 different banks. I decided on the later. You could do this in a hex editor. I wrote a little python script to do the same ( If you had python3 installed, you would call a command line promt and type “ spc700.bin” and it would split it into 2 files (smaller than 32k).

By the way. Running this function takes a long time. It could take 2 seconds or more.

spc_load_data is an internal function, for loading data to the APU RAM.

spc_play_song loads a song (data) to the APU RAM and then starts playing it. This also should be done with interrupts off. Note that this system only loads one song at a time to the APU RAM. If you have a song in and then load another song, it overwrites the first song.

With AXY16 load A with the address of the song data (ike music_1.bin) and load X with the bank of the song, then JSL to spc_play_song. Once it’s done, it will begin playing the song automatically.

spc_command_asm is an internal function. It’s what sends signals to the APU.

spc_stereo is to set mono (default) or stereo audio. Load A (8 or 16) with 0 for mono, 1 for stereo. Audio channels can be panned left or right.

spc_global_volume is to set the max volume, 0-127. It can also be used to fade in or fade out. One of the variables is called speed, and it is the step value, to go from previous volume to the new volume. 255 is the default speed, which is instant change (any value >= 127 would be instant). Speed of 7 seems nice for a fade, and will take 2 seconds to transition. Don’t give it a speed of zero, the volume won’t change.

AXY8 or AXY16, load A with the speed of volume change (1-255), and load X with the new volume (0-127), then jJSL spc_global_volume.

The SNES has a master volume variable, which affects all channels. That’s what this sets, and doesn’t affect individual channel volumes.

spc_channel_volume sets the max volume for an individual audio channel. AXY8 or AXY16, load A with the channels and load X with the volume (0-127) and the JSL to spc_channel_volume. I’m not sure what circumstances I would use this. Maybe to silence or dim a lead instrument, for a change in dramatic tone.

Note, the channel here is a bitfield, with each bit representing a channel.

0000 0001 = channel 1
0000 0010 = channel 2
0000 0100 = channel 3
0000 1000 = channel 4
0001 0000 = channel 5
0010 0000 = channel 6
0100 0000 = channel 7
1000 0000 = channel 8

For example, LDA #$42 (0100 0010) would effect channels 2 and 7.

music_stop stops the song. JSL here.

music_pause will pause and unpause the song (and not effect the sound effects that are playing). Load A (8 or 16) with 1 for pause and 0 for unpause, then JSL here.

sound_stop_all stops all sounds, song and sound effects. JSL here.

sfx_play_center plays a sound effect, pan center. With AXY8 or AXY16, load A with the # of the sound effect, load X with the max volume of the sound effect (0-127), and load Y with the channel (0-7), the sound effect should play. Channel needs to be higher than the max channel for the song playing. Therefore, you must reserve some empty channels in the song, if you want sound effects to play with it.

sfx_play_left, is the same, but pan left.

sfx_play_right, is the same, but pan right.

sfx_play is an internal function that the 3 above functions call.

Streaming has been removed.



This is the audio loading code from the example. It was for an SPC file smaller than 32k.

;copy music to 7f0000
BLOCK_MOVE (music_code_end-music_code), music_code, $7f0000

;copy the music code and samples to the Audio RAM 
lda #0000
ldx #$7f ;address 7f0000
jsl spc_init

lda #$0001
jsl spc_stereo

…and at the bottom we have

.segment "RODATA6"
.incbin "MUSIC/spc700.bin"


In another test project, I had an SPC file bigger than 32k. I split the SPC file in half and put then in ROM banks 6 and 7.

.segment "RODATA6"
.incbin "MUSIC/spc700_1.bin"

.segment "RODATA7"
.incbin "MUSIC/spc700_2.bin"

And the copying to the APU RAM was the same, except that I had 2 move instructions to copy to $7f0000.

;copy music to 7f0000
BLOCK_MOVE $8000, music_code, $7f0000

BLOCK_MOVE (music_code2_end-music_code2), music_code2, $7f8000

;copy the music code and samples to the Audio RAM 
lda #0000
ldx #$7f ;address 7f0000
jsl spc_init

…so that spc_init could load the entire SPC file as one contiguous chunk.


Then I load the song, and start it playing (before I turn on NMI interrupts).

lda #.loword(song1)
ldx #^song1
jsl spc_play_song

By the way “.loword()” gets a 16 bit value from a 24 bit label. ^ gets the bank of a label.


Now I just need to set up a trigger for the sound effect. We already have that yellow block triggering the screen to go dark, so I just snuck in a little more code there. I didn’t want it re-starting the same sound effect over and over and over each frame, so I added a variable to remember the LAST FRAME, if we were over the yellow block, and skip a trigger in that case.

cmp bright_var2 ;compare to last frame
beq Past_Yellow ;skip if last frame is true

lda #0 ;= ding
ldx #127 ;= volume
ldy #6 ; = channel
jsl sfx_play_center

Our song plays from channels 1-4 (ie. 0-3), and our sound effect uses 2 channels, so we could have set this to 4,5, or 6. This function is zero based index, ie. values 0-7. So 6 means it will play on channels 7 and 8. Sorry for flip flopping between zero based and one based numbers. Hope this isn’t too confusing.

However, if we loaded X with 0,1,2, or 3. It would not play. If we loaded X with 7, only the first channel of the sound effect would play.


Here’s a picture of the demo again. It looks the same as the previous example.


Here’s a Youtube video, if you want to hear it.


There are other programs for getting music onto a Super Nintendo.

You could use SNESMOD with OpenMPT. I still need to research this more before I can recommend it. I have heard that a version of SNESMOD by AugustusBlackheart and KungFuFurby is good. Sorry I can’t be more informative here.

Another program, BRRTools, can convert audio files to BRR. I haven’t used it, but the SNESGSS tool uses the same code. It says you can turn BRR samples into WAV and WAV into BRR. This could be a way to use existing BRR samples in our SNESGSS projects (by using this tool to convert them into WAV files first).


SNES main page


BG Collision

SNES programming tutorial. Example 9.


This time we are going to make a collision map, and make a sprite collide with the background. The actual graphics are not that important.

I took some pictures of some blocks (and a sketch of a cube with eyes) and Photoshopped them (GIMP) into 16×16 sized PNG (indexed 16 colors) and converted them to SNES .chr files with Superfamiconv. Actually, I expanded them to 32×16, with the left just filled with black. I thought that would give us a consistent zero index color of black, and it seems to have worked. Then I loaded everything into my M1TE tool.

I made the BG map in M1TE. It would have been nice if it supported 16×16 tiles, (all of our blocks are 16×16) but I didn’t program in 16×16 tile mode yet. I can load this into the game, no problem, but there is no easy way to make a collision map out of this. I could type one by hand. It would be an array of numbers 16×14 (224 total). Zero for blank, 1 for wall. Each number would represent a 16×16 square area of the screen.


This time I used Tiled Map Editor to make a collision map. I loaded a picture of my 3 tiles as the tileset (see right side), and recreated the BG map that I had made in M1TE. The entire purpose of this is to export a .csv file of our collision map… which is that collision array I was talking about.


The CSV file exported from Tiled.


I added some .byte directives so it can be loaded as a byte array into the asm code. 0 is blank, 1 is red wall, 2 is the yellow square. Maybe in the future I will program some clever app or tool to speed this up. This time I just copy and pasted the word “.byte” to each line and resaved it.



Our code calculates where our guy is on the map, and prevents movements if it’s over a 1 (wall). We now collide with the red walls.


How does it do that? Let’s go over the code. So our byte array has each block 16×16. We need to divide x and y pixel coordinates by 16 (the same as shift right 4x). But we also need to multiply the y by 16 to get to the correct row in our array, which cancels out the divide 16. So the algorithm is (Y & 0xf0) + (X >> 4). If we look at that index in the byte array, it will tell us if a point is in a wall or not. This is the code, with X and Y registers holding the X and Y coordinates…

and #$f0
sta temp1
lsr a
lsr a
lsr a
lsr a
ora temp1
lda HIT_MAP, x

I handled each direction separately. First do the X move, then see if any of the corners of our sprite are inside a wall. If yes, revert to previous X position. Then do the Y move, see if any of the corners of our sprite are inside a wall. If yes, revert to previous Y position.

This code would need to be a little more complex if we move more than 1 pixel per frame. If we are moving 2-3 pixels per frame, and the distance to the wall is 1 pixel, we should allow 1 pixel movement toward the wall… and not be stuck 1 pixel away from the wall. So, this code will need to be improved.


Touching the yellow square will darken the screen. We are just looking if 1 point (the middle of our guy) is over a 2 in the collision map, and changing the screen brightness variable. Remember that the $2100 register is the screen brightness. I am writing to it every frame, during v-blank. Full brightness is $0f. Half brightness is $07.


If we were scrolling in a larger world, the collision map would have to be the size of the world. You could have it compressed, and decompress it to the WRAM. You would have to keep track of X and Y movements with 2 byte variables. One thing I would not recommend is trying to read from the VRAM to see what kind of tile you are standing over. The visuals of the level should probably be separate from the collision map.

One more thing. It wouldn’t be too much trouble to turn this simple example into a platformer. You would just need to add gravity, which is adding a little bit to the Y speed every frame, and then cancelling that if your feet touch the floor. Jumping would be a sudden negative Y speed.

This is a really cool page that explains collision maps in more detail.


SNES main page


BG Scrolling

SNES programming tutorial. Example 8.


So, this isn’t so complicated. I’m using the Example 4 backgrounds, and scrolling them with the controllers. I’m not going to go over the process of making backgrounds again. We will just talk about the scrolling code.

If you press A, B, X, or Y, you will toggle which background is selected. Visible by the sprite in the corner (1,2,3). This is the map_selected variable, which has a value 0-2.

The up/down/left/right functions will do a case switch style check on the map_selected variable. Normally, you would do CMP #1, CMP #2, CMP #3, etc. But you don’t actually need to do a CMP #0. This is something I see new 6502/65816 programmers do. The previous line “lda map_selected” already sets the z flag if map_selected is zero. Lot’s of instructions set the z (zero) and n (negative) flags. LDA, LDX, LDY, TAX, TXA, TXY, PLA, PLX, PLY, etc. If a register is loaded with zero, the z flag is set and BEQ will work.

  lda map_selected
  bne @1or2
@0: ;BG1
  dec bg1_x
  bra @end
  cmp #1
  bne @2
@1: ;BG2
  dec bg2_x
  bra @end
@2: ;BG3
  dec bg3_x

Let’s follow this for each value. If map_selected is zero, the BNE won’t branch, it goes to the @0, dec bg1_x and then exits. If map_selected is 1, the first BNE will branch to @1or2. A is still loaded with map_selected, we compare it to #1, the BNE won’t branch, so we do @1, dec bg2_x and exit. If map_selected is 2, the first BNE branches to @1or2, cmp #1 is false, so the bne @2 branches us to th @2 dec bg3_x line.

Notice, moving the map right means decreasing the horizontal scroll variable. Moving it left means increasing it. Likewise, moving a screen down is decreasing the vertical scroll, and moving it up is increasing it.

Scrolling registers are write twice (8 bit) each. Always write twice. You can actually write to these registers any time, but we want to do it during v-blank so we don’t get any shearing of the background in the middle for 1 frame. Near the top of the game loop, we have jsr set_scroll. Let’s look at set_scroll.

lda bg1_x
sta bg1_scroll_x ;$210d 
stz bg1_scroll_x
lda bg1_y
sta bg1_scroll_y ;$210e
stz bg1_scroll_y

lda bg2_x
sta bg2_scroll_x ;$210f
stz bg2_scroll_x
lda bg2_y
sta bg2_scroll_y ;$2110
stz bg2_scroll_y

lda bg3_x
sta bg3_scroll_x ;$2111
stz bg3_scroll_x
lda bg3_y
sta bg3_scroll_y ;$2112
stz bg3_scroll_y

bg1_x is a 1 byte variable, because our maps are set to 1 screen only (32×32 map and 8×8 tiles). If you made the tilemap bigger (or made the tile size larger), you would need 2 bytes for each scroll variable. With 64×32 our x needs 9 bits. If you also increase tilesize to 16×16 then we need 10 bits.

You can move each layer independently. Usually, you would have BG1 be the foreground and BG2 be the background and BG3 be either the far background or the HUD (scoreboard) always fixed in one place in the front.



SNES main page