6502 based CPUs

From SizeCoding
Revision as of 12:35, 19 February 2021 by Wil (talk | contribs) (Sync with frame: $d011 and $d012 was mixed up)



Wanting to start sizecoding on a 6502 platform in this day and age can be tough.


So here is a bit of help to get you started:

The 6502 processor

The 6502 processor can be seen as the 8-bit micro ARM chip. It has only 3 registers (the accumulator and the X and Y index registers) and a handful of instructions to work with.

Addressing modes

To be added.

Zero page

When using the 6502 for sizecoding, you'll mostly be working from the zero page, because zero-page addressing takes one byte less per instruction than absolute addressing.
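The size benefit of the zero page can be illustrated with a small sketch. It encodes LDA with the standard opcodes $A5 (zero page) and $AD (absolute); the helper name `encode_lda` is made up for this example and is not part of any assembler.

```python
# Standard 6502 opcodes for LDA in two addressing modes.
LDA_ZEROPAGE = 0xA5
LDA_ABSOLUTE = 0xAD


def encode_lda(address):
    """Encode LDA using the shortest addressing mode that reaches `address`."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address out of range")
    if address <= 0xFF:
        # Zero page: opcode plus a single address byte.
        return bytes([LDA_ZEROPAGE, address])
    # Absolute: opcode plus a little-endian 16-bit address.
    return bytes([LDA_ABSOLUTE, address & 0xFF, address >> 8])


if __name__ == "__main__":
    for addr in (0x80, 0x0280):
        code = encode_lda(addr)
        print(f"LDA ${addr:04X}: {code.hex()} ({len(code)} bytes)")
```

Every access to a variable kept in the zero page saves one byte this way, which adds up quickly in a 64 or 128 byte intro.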

General 6502 Resources

Atari 8-bit family

The systems in this family are: Atari 400, 800, 5200, 1200XL, 600XL, 800XL, 130XE, 65XE, 800XE and XEGS.

The Atari 8-bit systems consist of the 6502 with custom hardware for graphics and sound.

Setting up

Setting up your development platform for the Atari 8-bit systems is quite easy. First get the following tools:

  • Assembler: MADS Assembler - This assembler has nice macros for creating binaries and SNA snapshot files out of the box. You can download it at https://mads.atari8.info/
  • Emulator(s): I found Altirra to work best for my use case. Make sure to use the original Rev2 ROM for best compatibility.

Special Memory Addresses


Video display

Video display on the Atari 8-bit systems uses the ANTIC and GTIA chips. Information can be found here:

Sync with frame

RTCLOK      equ $0012
      lda RTCLOK+2
waits
      cmp RTCLOK+2
      beq waits

Or, if you don't mind trashing RTCLOK:

RTCLOK      equ $0012
waits
      lsr RTCLOK+2
      bcc waits

This is two bytes shorter.

Getting something on screen

;fill screen with charset,(88,89)=an address
 org $600; free 6th page:600-6ff
 ldy #0
fl: tya
 sta (88),y ; store character code Y at screen address + Y
 iny
 bne fl
 jmp *

To be added soon.

SDMCTL	= $022f
HPOSP0  = $d000
SIZEP0  = $d008
GRAFP0  = $d00d
COLPM0  = $d012

WSYNC	= $d40a
VCOUNT	= $d40b

sinewave	= $0600		; to $06ff

		org $80

main
	; disable all graphics/colors
	ldx #0
	stx SDMCTL	

	ldy #$7f
	sty SIZEP0	; size p0=127
	ldx #0
	ldy #$3f
make_sine
value_lo
			lda #0		; operand = low byte of value (self-modified)
			clc
delta_lo
			adc #0		; operand = low byte of delta (self-modified)
			sta value_lo+1
value_hi
			lda #0		; operand = high byte of value (self-modified)
delta_hi
			adc #0		; operand = high byte of delta (self-modified)
			sta value_hi+1
			sta sinewave+$c0,x
			sta sinewave+$80,y
			eor #$7f
			sta sinewave+$40,x
			sta sinewave+$00,y
			lda delta_lo+1
			adc #8
			sta delta_lo+1
			bcc nothing
			inc delta_hi+1
nothing
			inx
			dey
			bpl make_sine

updateloop
		; vblank
		lda VCOUNT
		bne updateloop

		; clear graphics
		sta HPOSP0
		sta GRAFP0

		ldy #0
		lda #47
		sta COLPM0
yloop
		tya           ; graphics shape = y
		sta WSYNC
		sta GRAFP0

		; a = sin(frame+y)+48
		adc 20		; 20 = RTCLOK+2, the frame counter
		tax
		lda sinewave,x
		adc #48
		sta HPOSP0
		iny
                bne yloop
		jmp updateloop

		run main


The Atari 8-bit uses the POKEY chip to generate sound.


; from Analog, 1985-09,pp.25-32

AUDF1	= $d200
AUDC1	= $d201
AUDCTL	= $d208
SKCTL	= $d20f

.MACRO SOUND ; voice,pitch,dist,vol;,dur
	lda #%2
	sta AUDF1+2*%1
    lda #[[%3 * 16] | %4] ;lda #[[%3 shl 4] or %4]
    sta AUDC1+2*%1
.ENDM

 lda #0
 sta AUDCTL
 lda #3
 sta SKCTL

 SOUND 0,121,10,8

 jmp *
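The two register values written by the SOUND macro can be worked out by hand. The sketch below packs distortion and volume into the control byte the same way the macro does and estimates the resulting pitch. The base clock of 63921 Hz (the NTSC "64 kHz" clock selected when AUDCTL is 0) and the divide-by-2*(AUDF+1) formula are stated here as assumptions, and both function names are made up for this example.

```python
# Assumed NTSC base clock for POKEY channels when AUDCTL = 0.
POKEY_BASE_CLOCK_HZ = 63921.0


def audc_value(distortion, volume):
    """Pack distortion (high nibble) and volume (low nibble) like the macro."""
    if not (0 <= distortion <= 15 and 0 <= volume <= 15):
        raise ValueError("distortion and volume must be in 0..15")
    return (distortion << 4) | volume


def tone_frequency_hz(audf):
    """Approximate output frequency for a divider value, under the assumed clock."""
    if not 0 <= audf <= 255:
        raise ValueError("AUDF must be in 0..255")
    return POKEY_BASE_CLOCK_HZ / (2 * (audf + 1))


if __name__ == "__main__":
    print(f"AUDC = ${audc_value(10, 8):02X}")
    print(f"pitch 121 is roughly {tone_frequency_hz(121):.1f} Hz")
```

For `SOUND 0,121,10,8` this gives a control byte of $A8 and a tone close to middle C.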

Make some noise

To be added soon.

Additional Resources

Sizecoding resources for the Atari 8-bit are:

Commodore 64

The Commodore system consists of a 6502-compatible MOS 6510 processor with custom hardware for graphics and sound.

Setting up

Setting up your development platform for the Commodore systems is quite easy. First get the following tools:

  • Assembler: To be added
  • Emulator(s): VICE is the way to go

Memory map

    0-1          $0-$1      PROCESSOR PORT REGISTERS, not usable as RAM
   0-255         $0-$FF     ZEROPAGE, mostly used by BASIC interpreter and Kernal
  256-511      $100-$1FF    SYSTEM STACK
  512-600      $200-$258    INPUT BUFFER
  601-630      $259-$276    FILE SYSTEM DATA
  631-640      $277-$280    KEYBOARD BUFFER, filled by Kernal ISR upon key presses 
  641-678      $281-$2A6    VARIOUS KERNAL VARIABLES
  679-767      $2A7-$2FF    Unused (89 byte)
  768-779      $300-$30B    KERNAL VECTORS
  780-783      $30C-$30F    REGISTER VALUES A,X,Y,SR for SYS
  784-819      $310-$333    KERNAL VECTORS
  820-827      $334-$33B    Unused (8 byte)
  828-1019     $33C-$3FB    DATASETTE BUFFER
 1020-1023     $3FC-$3FF    Unused (4 byte)
 1024-2023     $400-$7E7    SCREEN MEMORY (25 lines times 40 columns)
 2024-2039     $7E8-$7F7    Unused (16 byte)
 2040-2047     $7F8-$7FF    SPRITE COSTUME POINTERS
 2048          $800         must contain a value of 0 so that the BASIC program can be RUN
 2049-40959    $801-$9FFF   BASIC AREA (code, variables, arrays, strings)
40960-49151   $A000-$BFFF   BASIC ROM (unused RAM underneath)
49152-53247   $C000-$CFFF   UPPER RAM AREA (unused)
53248-53294   $D000-$D02E   VIC-II (memory-mapped graphics chip registers)
54272-54300   $D400-$D41C   SID (memory-mapped sound chip registers)
55296-56295   $D800-$DBE7   Color RAM, only lower 4 bits are stored, higher nibble undefined
56296-56319   $DBE8-$DBFF   Unused (24 half-bytes)
56320-56335   $DC00-$DC0F   CIA#1, memory-mapped I/O chip registers for keyboard, joystick, mouse, datasette, IRQ control
56576-56591   $DD00-$DD0F   CIA#2, memory-mapped I/O chip registers for serial bus, RS232, NMI control
57344-65535   $E000-$FFFF   KERNAL ROM
65534-65535   $FFFE-$FFFF   Execution address of interrupt service routine, can only be set if KERNAL is turned off

For a detailed explanation of the C64 memory map check out Commodore 64 memory map on Joe Forster/STA homepage.

Program file overhead

A standard CBM file consists of a 2 byte loading address followed by the data that is placed into memory starting at this address. In addition, if your machine program should be able to be started with "RUN", you need a BASIC stub that is loaded into memory from $0801 to $080c, typically followed by your machine program:

*=$0801
        .word $080b   ;address of next BASIC line
        .word LINENO  ;line number, can be chosen between 0 and 63999
        .byte $9e     ;token for SYS command
        .byte $32,$30,$35,$39  ;address to jump to in ASCII code: "2059" 
        .byte $00     ;end of BASIC line
start   lax #0        ;actually address of next BASIC line, $xx00 means end of BASIC program, use $xx for an opcode
; your code here

Together with the 2 byte load address, this makes an overhead of 14 bytes for your program, coming with A and X initialised to 0. To reduce this, you can

  • omit the BASIC line and tell users to start your program by directly entering "SYS address"
  • save your program from $0802 on; the byte at $0801 is recovered automatically after loading by the relink function. This reduces your program file size by 1 byte
  • accept the loss and use the 2 bytes establishing LINENO for something useful; they are copied to zeropage addresses $39/$3a automatically, saving you the effort of initializing those addresses
  • use an autostart method (see below)

Some compos also specify that the BASIC loader does not count towards the overall filesize, so check the rules before you worry.
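The 14 byte overhead of the standard BASIC stub can be checked by assembling the file header in a few lines of Python. This is only a sketch: `build_prg` is a hypothetical helper, and the byte values are the ones given in the listing above ($9e for SYS, "2059" as the jump target, and $ab $00 for the leading `lax #0`).

```python
import struct

LOAD_ADDRESS = 0x0801   # where the BASIC stub is loaded
SYS_TOKEN = 0x9E        # BASIC token for SYS
LAX_IMMEDIATE = 0xAB    # undocumented opcode behind "lax #0"


def build_prg(line_number, code=b""):
    """Return load address + BASIC stub + 'lax #0' + machine code."""
    if not 0 <= line_number <= 63999:
        raise ValueError("line number must be in 0..63999")
    entry = LOAD_ADDRESS + 10  # $080B, first byte after the BASIC line
    basic_line = struct.pack("<HH", entry, line_number)
    basic_line += bytes([SYS_TOKEN]) + str(entry).encode("ascii") + b"\x00"
    # $AB $00 doubles as the "next line" pointer; high byte 0 ends the program.
    first_opcode = bytes([LAX_IMMEDIATE, 0x00])
    return struct.pack("<H", LOAD_ADDRESS) + basic_line + first_opcode + code


if __name__ == "__main__":
    prg = build_prg(2021)
    print(prg.hex(), len(prg), "bytes of overhead")
```

The line number is free to choose, which is why it can carry two bytes of useful data as described in the list above.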


Unlike the Commodore 128, the Commodore 64 has no dedicated built-in auto-boot feature, but it is possible to autorun a program after loading by specifying the load address so that it overwrites a vector or address that is called by the operating system. There are multiple possibilities:

Overwriting CHROUT vector at $326-$327

The vector at address $326-$327 points at the CHROUT routine, normally at address $F1CA. If you put the entry address of your program into $326/27, your code will be called when the operating system wants to print "READY." after loading. Note that the vector at address $328-$329 points at the STOP routine that checks the status of Stop key indicator. Since this routine is called multiple times before your code is done loading, its default value $F6ED needs to be preserved.

Application example:

*=$0326
        .word start              
        .byte $ed,$f6
start
; rest of code

Together with the two byte load address, the overwritten CHROUT vector and the preserved STOP vector, your program file will have a deadweight of 6 bytes: $26 $03 $2a $03 $ed $f6, less than half of the standard version with a BASIC stub with SYS.


  • If the part of your program after $32a is longer than 214 bytes, part of your program will load into the screen at $400. Thus, issuing a clear screen will overwrite your program. It is possible to move the screen memory somewhere else, but that requires storing a new value into address $D018, which when done with an LDA # / STA pair will cost you 5 bytes
  • Since you changed the CHROUT pointer, standard KERNAL functions for printing won't work. For output of single characters, you can still use the direct address $F1CA. Alternatively, you can restore the vectors by calling $FD15, which will overwrite the memory $314 to $333 with the default values.
  • The memory area $33C-$3FB is the datasette buffer, thus loading a program into this area won't work when loading from a cassette.
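The six header bytes of the CHROUT autostart method follow directly from the addresses involved: the load address $0326, an entry point right after the two vectors at $032A, and the default STOP vector $F6ED. The sketch below builds them, using the hypothetical helper names `build_autostart_prg` and `bytes_before_screen`.

```python
import struct

CHROUT_VECTOR = 0x0326  # load address, start of the overwritten vector
STOP_DEFAULT = 0xF6ED   # default STOP vector that must be preserved
SCREEN_BASE = 0x0400    # default screen memory


def build_autostart_prg(code):
    """Return load address + new CHROUT vector + preserved STOP vector + code."""
    entry = CHROUT_VECTOR + 4  # code starts right after both vectors ($032A)
    header = struct.pack("<HHH", CHROUT_VECTOR, entry, STOP_DEFAULT)
    return header + code


def bytes_before_screen():
    """Number of code bytes that fit between the entry point and the screen."""
    return SCREEN_BASE - (CHROUT_VECTOR + 4)


if __name__ == "__main__":
    print(build_autostart_prg(b"").hex())
    print(bytes_before_screen(), "bytes fit before screen memory")
```

The second helper reproduces the 214 byte limit mentioned in the caveats above.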

Overwriting BASIC idle loop vector at $302-$303

After loading the program, the BASIC idle loop will be called. By overwriting the vector at $302/$303 it is possible to autostart your program. Since the memory before $302 can be safely overwritten, the program would be arranged so that $303 is the last address of your program. If your program requires a loop back to the start, the vector can be used as part of a JMP where the following two bytes are already there, saving one byte in comparison to a loop done with a branch command.

Application example:

*=$304 - (end - start)
start:
        ;your code here
        ; ...
        jmp     start    ;the address for start lies in $302/$303, thus will be called by the Kernal after loading
end:

In cases where the vector has a practical use, the overhead in filesize is only the 2 byte loading address. For programs larger than 93 bytes, keep in mind that addresses before $2A7 are in use by the Kernal.
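Because the program has to end exactly at $303, its load address depends on its length. A minimal sketch of that arithmetic, with the hypothetical helpers `load_address` and `overwrites_kernal_variables`:

```python
VECTOR_END = 0x0304      # first address after the idle loop vector $302/$303
FIRST_UNUSED = 0x02A7    # start of the unused area listed in the memory map


def load_address(program_length):
    """Load address for a program whose last byte must land at $303."""
    if program_length < 2:
        raise ValueError("the program must at least contain the vector itself")
    return VECTOR_END - program_length


def overwrites_kernal_variables(program_length):
    """True if the program reaches below the unused area at $2A7."""
    return load_address(program_length) < FIRST_UNUSED


if __name__ == "__main__":
    for length in (32, 93, 94):
        print(length, hex(load_address(length)), overwrites_kernal_variables(length))
```

A 93 byte program starts exactly at $2A7, and anything longer begins to touch Kernal variables.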

Overwriting return address on stack at $1F8

When the load routine is called, the stack pointer is always at $FA on a stock C64. Thus the return address from the loading process is stored at memory addresses $1F8 (low byte) and $1F9 (high byte). During the loading routine, stack usage will clobber the stack below $1F8, but a program loaded to $1F8 and above will be fine, and the overwritten address stored at $1F8/$1F9 will be called after loading.

Application example:

*=$01F8
        .word start-1   ;adjust for the RTS command pulling the address from stack and increasing it by 1
start:
        ;your code here

For programs larger than 95 bytes, keep in mind that addresses after $259 are in use by the Kernal. You might get away with overwriting some values there if you don't use disk operations, but better check the memory map beforehand.

Together with the two byte load address and the overwritten address at $1F8/$1F9, your program file will have a deadweight of 4 byte when using this method.

Putting your program into zeropage overwriting part of CHRGET

CHRGET is a machine code routine for the BASIC interpreter that is placed in the zero page at $73-$8A. By putting your program at $7C and following, the CHRGET routine runs directly into your code.

Application example:

*=$7C
        ;your code here

With this method, there is no need for a start address vector. The only deadweight in your program is the loading address. However, the zero page is a busy place: if your program extends beyond address $8f (that is, it has more than 22 bytes on disk), you have to deal with some parts of your program being overwritten. For example, the status byte in $90 is set after loading.

Loading data into address $AE has an interesting side effect - the vector of $AE/$AF defines the address where the loaded bytes are written to. If $AE gets overwritten with, for example, $40, the next byte will be written to $41 (because the vector is incremented after the write). This can be used to split your program into different parts of memory.

Video display

Video display on the Commodore has the following native video modes:

  • Standard text mode, 40x25 characters, 1 individual foreground color per character, 1 common background color
  • Multicolor text mode, 40x25 characters, only 8 foreground colors, can be mixed with hires chars, foreground < 8 denotes a hires char, with foreground >= 8 a pair of two bits (brick pixel) define color: background, foreground & %111, common color 1+2
  • Extended background color text mode, 40x25 characters, 1 individual foreground color per character, 1 out of 4 background colors, only 64 chars
  • Hires bitmap, 320x200 pixels, an 8x8 field can have an individual foreground and an individual background color
  • Multicolor bitmap, 160x200 pixels, a pair of two bits (brick pixel) define color: background, screen color lower nibble, screen color higher nibble, color from color RAM

Sync with frame

Depending on the VIC type, there are 262/263 raster lines (NTSC) or 312 raster lines (PAL-B). Therefore, the VIC uses the register at $D012 and the highest bit of register $D011 to represent the current raster line. Checking only the lower 8 bits and waiting for them to be 0 would therefore not work, because this happens twice per frame. However, any number greater than 64 will only trigger once. Using a value of 255, we sync to a raster line at the end of the screen, well below the part where graphics are normally displayed:

waitsync:
        ldx $D012
        inx              ;X becomes 0 only when the raster register reads 255
        bne waitsync

As long as more than 63 cycles have passed since the sync, the next sync will happen in the next frame. This works fine as long as we don't miss line 255 due to an interrupt. If your program has interrupts that are not synced to a raster line, you can change the condition as follows:

        lda #$E3
waitsync:
        cmp $D012
        bcs waitsync

Now we have a window between raster line $E4 and $FF to trigger our sync, which we will meet even if a standard Kernal interrupt comes in-between. Of course, our main program now has to burn around 1700 cycles to avoid a retriggering in the same frame.
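The reasoning about which raster values are safe to wait for can be verified by listing every line of a frame and looking only at its low 8 bits, as a read of $D012 would. The sketch below uses the line counts given above and the 63 cycles per line mentioned for PAL; the helper names are made up for illustration.

```python
PAL_LINES = 312
NTSC_LINES = 263
PAL_CYCLES_PER_LINE = 63


def lines_matching(total_lines, low_byte):
    """Raster lines of one frame whose low 8 bits equal `low_byte`."""
    return [line for line in range(total_lines) if line & 0xFF == low_byte]


def window_cycles(first_line, last_line, cycles_per_line=PAL_CYCLES_PER_LINE):
    """Cycles spent between the start of first_line and the end of last_line."""
    return (last_line - first_line + 1) * cycles_per_line


if __name__ == "__main__":
    print("low byte 0 on PAL:", lines_matching(PAL_LINES, 0))
    print("low byte 255 on PAL:", lines_matching(PAL_LINES, 255))
    print("window $E4-$FF:", window_cycles(0xE4, 0xFF), "cycles")
```

Waiting for 0 matches two lines per frame, waiting for 255 matches exactly one, and the $E4 to $FF window is 28 lines long, which is where the figure of roughly 1700 cycles comes from.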

Getting something on screen

Output a character:

        lda     #C       ;C must be the char code according to the PETSCII code table
        jsr     $FFD2    ;you can also use the direct address $F1CA, function preserves A,X,Y and enables interrupt

Since the PETSCII code table also contains control characters, this method can also be used to clear the screen, move the cursor or set the text color. However, some functions are shorter (and faster) when calling the respective KERNAL function directly:

Clear screen             JSR $E544
Cursor home position     JSR $E566
Output newline character JSR $AAD7
Set text color           LDA #COLOR, STA $286
Set border color         LDA #COLOR, STA $D020   ;not possible with a PETSCII control code
Set background color     LDA #COLOR, STA $D021   ;not possible with a PETSCII control code

Copy characters directly to screen memory:

.macpack cbm   ;scrcode macro
SCREEN_BASE=$400 ;default screen memory
LINE=5    ;for example
COLUMN=10 ;for example

        ldx     #endtext-text-1
@loop:  lda     text,x
        sta     SCREEN_BASE+LINE*40+COLUMN,x
        dex
        bpl     @loop
        rts
text:   scrcode "hello world!"
endtext:

Note that the text needs to be stored in screen code format, not in PETSCII (see Commodore 64 PETSCII code to screen code conversion for the differences). Furthermore, the example above only writes to the screen memory, not the color RAM, assuming that the color RAM already contains the intended text color. This is the case after issuing a clear screen on most C64 machines, except for some old ROM revisions. To be on the safe side you would have to write to color RAM as well, which costs another 5 bytes:

.macpack cbm     ;scrcode macro
SCREEN_BASE=$400 ;default screen memory
COLOR_RAM  =$D800
LINE       =5    ;for example
COLUMN     =10   ;for example
COLOR      =1    ;white

        ldx     #endtext-text-1
@loop:  lda     text,x
        sta     SCREEN_BASE+LINE*40+COLUMN,x
        lda     #COLOR
        sta     COLOR_RAM  +LINE*40+COLUMN,x
        dex
        bpl     @loop
        rts
text:   scrcode "hello world!"
endtext:
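The address arithmetic and the PETSCII to screen code step from the two listings above can be reproduced offline. The following sketch assumes the default screen at $400 and color RAM at $D800, and it only converts the PETSCII range $20-$5A (space, digits, punctuation and unshifted letters), where codes $40 and up map to screen codes by subtracting $40. The helper names are hypothetical.

```python
SCREEN_BASE = 0x0400   # default screen memory
COLOR_RAM = 0xD800     # color RAM
COLUMNS, LINES = 40, 25


def cell_offset(line, column):
    """Offset of a character cell, as in LINE*40+COLUMN."""
    if not (0 <= line < LINES and 0 <= column < COLUMNS):
        raise ValueError("position outside the 40x25 screen")
    return line * COLUMNS + column


def to_screen_codes(text):
    """Convert text in the PETSCII range $20-$5A to screen codes."""
    codes = []
    for ch in text:
        c = ord(ch)
        if 0x20 <= c <= 0x3F:
            codes.append(c)           # space, digits, punctuation: unchanged
        elif 0x40 <= c <= 0x5A:
            codes.append(c - 0x40)    # '@' and letters: subtract $40
        else:
            raise ValueError(f"character {ch!r} is outside this sketch's range")
    return bytes(codes)


if __name__ == "__main__":
    offset = cell_offset(5, 10)
    print(f"screen ${SCREEN_BASE + offset:04X}, color ${COLOR_RAM + offset:04X}")
    print(to_screen_codes("HELLO WORLD!").hex())
```

For line 5 and column 10 this gives screen address $04D2 and color RAM address $D8D2.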


The Commodore 64 uses the famous SID chip to generate sound. To be added soon.

Make some noise

To be added soon.

Useful Kernal functions

Get a pseudo random number

        lda     #1   
        jsr     $E09A
        lda     $63

The sequence of random numbers is always the same after a cold start or reset, the first 20 numbers are:


Increment a zero page counter, return in X register, clear Y register

This is part of the string handling function in the BASIC ROM, but the code could be useful for some loop stuff.

        jsr     $B5FF    ;does inc $23, ldx $23, ldy #$00, rts

Additional Resources

Apple II

The Apple II is an 8-bit home computer and one of the world's first highly successful mass-produced microcomputer products. It was designed primarily by Steve Wozniak.

Setting up

Compilation can be done as follows (master.dsk can be found with applewin)

acme hl.asm
java -jar AppleCommander-1.3.5.jar -d master.dsk hl
java -jar AppleCommander-1.3.5.jar -p master.dsk hl B 24576 < hl.bin

Memory Map

  256-511      $100-$1FF    SYSTEM STACK
  512-767      $200-$2FF    KEYBOARD CHARACTER BUFFER
  976-1023     $3D0-$3FF    SYSTEM VECTORS
 1024-2047     $400-$7FF    TEXT AND LO-RES GRAPHICS PAGE 1 <--- !!!
 8192-16383   $2000-$3FFF   HI-RES PAGE 1 OR FREE SPACE <--- !!!
16384-24575   $4000-$5FFF   HI-RES PAGE 2 OR FREE SPACE
24576-38399   $6000-$95FF   FREE SPACE AND STRING STORAGE
38400-49151   $9600-$BFFF   DOS
49152-53247   $C000-$CFFF   I/O HARDWARE (RESERVED)
63488-65535   $F800-$FFFF   SYSTEM MONITOR


Graphics Modes

Text mode 40x24; on the Apple IIe, 80x24 is also available - use PR#3 to switch mode, or the hardware switch
LowRes 40x48, 16 colors: https://en.wikipedia.org/wiki/Apple_II_graphics
Hires mode 280x192,6 colors: https://www.xtof.info/blog/?p=768

https://mrob.com/pub/xapple2/colors.html https://archive.org/details/HiRes_Color_Graphics_on_the_Apple_II_Computer_by_Wozniak

However, for sizecoding you almost never want direct access to Apple II graphics, because the Apple II graphics modes are horrible. The only fast way to do things is with large lookup tables. To do hires you need to divide by 7, which, as you can imagine, is a bit difficult to do compactly on the 6502. Double-hires is even crazier on top of that. Deater did manage a color-bar style effect in double-hires in 128B, but that relied on some crazy tricks with the firmware BASIC routines, definitely not direct access.

Lores and hires can each be used in mixed mode (graphics plus text) or as full-screen graphics. The interleaved screen layout leaves unused gaps known as screen holes (https://retrocomputing.stackexchange.com/questions/2534/what-are-the-screen-holes-in-apple-ii-graphics). The GBASCALC ($F847) routine calculates the address of a horizontal line: in: A = Y, out: GBASL/GBASH ($26/$27) = address. See also https://www.callapple.org/uncategorized/use-of-apple-ii-color-graphics-in-assembly-language/
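The row addresses that GBASCALC returns follow the interleaved layout of text and lo-res page 1. The sketch below uses the commonly documented formula for that layout rather than a transcription of the ROM routine. It takes a row number from 0 to 23, and `row_address` and `used_addresses` are hypothetical names.

```python
PAGE1_BASE = 0x0400   # text and lo-res graphics page 1
ROW_BYTES = 40


def row_address(row):
    """Start address of text/lo-res row 0..23 on page 1."""
    if not 0 <= row <= 23:
        raise ValueError("row must be in 0..23")
    # Rows are interleaved: 8 blocks of 128 bytes, each holding 3 rows.
    return PAGE1_BASE + 0x80 * (row & 7) + 0x28 * (row >> 3)


def used_addresses():
    """All page 1 addresses that are actually displayed."""
    used = set()
    for row in range(24):
        start = row_address(row)
        used.update(range(start, start + ROW_BYTES))
    return used


if __name__ == "__main__":
    for row in (0, 1, 8, 23):
        print(f"row {row:2d}: ${row_address(row):04X}")
    holes = sorted(set(range(0x400, 0x800)) - used_addresses())
    print(len(holes), "bytes of screen holes, first at", hex(holes[0]))
```

Each 128 byte block holds three rows of 40 bytes, so the last 8 bytes of every block are never displayed. These are the screen holes.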

Getting something on screen

Here is an example of a XOR texture, created by g0blinish

 !to "HL.bin", plain	; set output file and format
 !cpu 6502		; set processor type

GBASL	=	$26
GBASH	=	$27
SETGR    =     $FB40 ; setup LoRes
GBASCALC = $F847 ; calc Address

CLRTEXT =  $C050 ;display graphics 
SETTEXT =  $C051 ;display text 

CLRMIXED = $C052 ;clear mixed mode- enable full graphics 
SETMIXED = $C053 ;enable graphics/text mixed mode 

PAGE1 =    $C054 ;select text/graphics page1 
PAGE2 =    $C055 ;select text/graphics page2 

CLRHIRES = $C056 ;select Lo-res 
SETHIRES = $C057 ;select Hi-res 


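TMP      = $FA   ; row counter in zero page

	*=$6000      ; 24576, the address used with AppleCommander

	JSR   SETGR      ;GR
	BIT CLRMIXED ; full screen

	LDA #0 ; A=0
	STA TMP ; POKE $FA,A

YLP	LDA TMP ; A=PEEK($FA)
;	LSR ; A=A/2
	JSR GBASCALC ; GBASL/GBASH = address of row A
	LDY #0;Y=0

XLP	TYA ; A=Y
	EOR TMP ; A=A xor PEEK($FA)
	and #$0F ; A=A and 15
	TAX ; X=A
	LDA COLORS,X ; A=PEEK(COLORS+X)
	STA(GBASL),Y ; POKE PEEK($26)+256*PEEK($27)+Y,A
	INY ; Y=Y+1
	CPY #40 ; Y=40?
	BNE XLP
	INC TMP ; POKE $FA,PEEK($FA)+1
	LDA TMP ; A=PEEK($FA)
	CMP #24 ; A=24?
	BNE YLP
M1 JMP M1 ; replace with RTS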

COLORS ;N*17, pixel format is AAAABBBB, AAAA - upper dot, BBBB - lower dot
!byte $00,$11,$22,$33,$44,$55,$66,$77
!byte $88,$99,$AA,$BB,$CC,$DD,$EE,$FF


Here is an example of using the speaker, based on the following BASIC program:

; 50  POKE 768,V: POKE 769,P - 255 *  INT (P / 256): POKE 800,1 + P / 256
; 60  CALL 770: RETURN 
; 95  FOR K = 1 TO N: READ V(K),P(K): NEXT K
; 100  FOR K = 1 TO N:V = V(K):P = P(K)
; 110  GOSUB 50
;!byte 173,48,192,136,208,5,206,1,3,240,9,202,208,245,174,0,3,76,2,3,206,32,3,208,240,96
 !to "HL.bin", plain	; set output file and format
 !cpu 6502		; set processor type

; 95  FOR K = 1 TO N: READ V(K),P(K): NEXT K
; 100  FOR K = 1 TO N:V = V(K):P = P(K)
	*=$6000      ; 24576, the address used with AppleCommander
ini
	lda #70
	sta cnt+1
	lda #music&255
	sta gotbyte+1
	lda #music/256
	sta gotbyte+2

lop
	jsr gotbyte
	sta L300
	jsr gotbyte	; skip the high byte of V
	jsr gotbyte
	sta L301
	jsr gotbyte
	adc #1
	sta L320
	jsr beep
	dec cnt+1
cnt lda #70
	bne lop
; 110  GOSUB 50
; 50  POKE 768,V: POKE 769,P - 255 *  INT (P / 256): POKE 800,1 + P / 256
; 60  CALL 770: RETURN 
	jmp ini
gotbyte
	lda music
	inc gotbyte+1
	bne noinch
	inc gotbyte+2
noinch
	rts
;!byte 173,48,192,136,208,5,206,1,3,240,9,202,208,245,174,0,3,76,2,3,206,32,3,208,240,96
beep
	ldy #1
	ldx #1
loc_302
		LDA	$C030
loc_305
		DEY
		BNE	loc_30D
		DEC	L301
loc_30B
		BEQ	loc_316
loc_30D
		DEX
		BNE	loc_305
		LDX	L300
		JMP	loc_302
loc_316
		DEC	L320
		BNE	loc_30B
		RTS
L301 !byte 0
L300 !byte 0
L320 !byte 0
music
 !word 76,192,85,64,96,64,102,64,114,128,114,64,96,64,102,64,114,64,128,64
 !word 114,64,152,64,171,64,152,512,76,192,85,64,96,64,102,64,114,128,114,64
 !word 96,64,102,64,114,64,128,64,114,64,152,64,171,64,152,512,85,64,85,64
 !word 85,64,96,64,144,128,144,64,128,64,76,128,85,64,96,64,144,128,114,64
 !word 96,64,102,128,114,64,128,64,128,128,114,64,128,64,114,512,85,64,85,64
 !word 85,64,96,64,144,128,144,64,128,64,76,128,85,64,96,64,144,128,114,64
 !word 96,64,102,128,114,64,128,64,128,64,128,128,96,64,85,64,96,64,102,64,114,64,114,64

Additional Resources

Atari Lynx

The Atari Lynx consists of a 65C02-compatible processor (the 65SC02 core in the Mikey chip) with custom hardware for graphics and sound.

Setting up

Setting up your development platform for the Atari Lynx:

  • Assembler: -
  • Emulator(s): -

Video display

To be added soon.

Getting something on screen

To be added soon.


To be added soon.

Make some noise

To be added soon.

Additional Resources

Sizecoding resources for the Atari Lynx are sparse:

  • 42Bastian's website (link to be added)