- 1 Introduction
- 2 Atari 8-bit family
- 3 Commodore 64
- 3.1 Setting up
- 3.2 Memory map
- 3.3 Program file overhead
- 3.4 Autostart
- 3.5 Video display
- 3.6 Sound
- 3.7 Useful Kernal functions
- 3.8 Additional Resources
Wanting to start sizecoding on a 6502 platform in this day and age can be tough.
So here is a bit of help to get you started:
The 6502 processor
The 6502 processor can be seen as the ARM chip of the 8-bit micros. It has only 3 registers (accumulator, X and Y) and a handful of instructions to work with.
To be added.
When using the 6502 for sizecoding, you'll mostly be working from the zero page, since zero-page addressing takes one byte less per instruction than absolute addressing.
General 6502 Resources
- 6502.org http://www.6502.org/
- 6502 instruction reference http://www.6502.org/tutorials/6502opcodes.html
- 6502 books http://retro.hansotten.nl/6502-sbc/
- 6502 Assembler tutorial https://dwheeler.com/6502/oneelkruns/asm1step.html
- Easy 6502 code tester https://skilldrick.github.io/easy6502/
- Synthetic instructions https://wiki.nesdev.com/w/index.php/Synthetic_instructions#8-bit_rotate
Atari 8-bit family
The systems in this family are: Atari 400, 800, 5200, 1200XL, 600XL, 800XL, 130XE, 65XE, 800XE and XEGS.
The Atari 8-bit systems consist of the 6502 with custom hardware for graphics and sound.
Setting up your development platform for the Atari 8-bit systems is quite easy. First, get the following tools:
- Assembler: MADS Assembler - This assembler has nice macros for creating Binaries and SNA snapshot files out of the box. You can download it at https://mads.atari8.info/
- Emulator(s): I found Altirra to work best for my use case. Make sure to use the original Rev2 ROM for best compatibility.
Special Memory Addresses
- FRAMECOUNTER_HIGH = 19
- FRAMECOUNTER_LOW = 20
Video display on the Atari 8-bit systems uses the ANTIC and GTIA chips. Information can be found here:
Sync with frame
RTCLOK  equ $0012
        lda RTCLOK+2
waits   cmp RTCLOK+2
        beq waits
Or, if you don't mind trashing RTCLOK:
RTCLOK  equ $0012
waits   lsr RTCLOK+2
        bcc waits
Which is two bytes shorter.
Getting something on screen
; fill screen with charset, (88,89) = an address
    org $600        ; free 6th page: $600-$6ff
    ldy #0
fl: tya
    sta (88),y
    iny
    bne fl
    jmp *
To be added soon.
SDMCTL = $022f
HPOSP0 = $d000
SIZEP0 = $d008
GRAFP0 = $d00d
COLPM0 = $d012
FRAMECOUNTER_HIGH = 19
FRAMECOUNTER = 20
WSYNC = $d40a
VCOUNT = $d40b
sinewave = $0600 ; to $06ff

    org $80
main
    ; disable all graphics/colors
    ldx #0
    stx SDMCTL
    ldy #$7f
    sty SIZEP0 ; size p0=127
    ldx #0
    ldy #$3f
make_sine:
value_lo
    lda #0
    clc
delta_lo
    adc #0
    sta value_lo+1
value_hi
    lda #0
delta_hi
    adc #0
    sta value_hi+1
    sta sinewave+$c0,x
    sta sinewave+$80,y
    eor #$7f
    sta sinewave+$40,x
    sta sinewave+$00,y
    lda delta_lo+1
    adc #8
    sta delta_lo+1
    bcc nothing
    inc delta_hi+1
nothing
    inx
    dey
    bpl make_sine
updateloop:
    ; vblank
    lda VCOUNT
    bne updateloop
    ; clear graphics
    sta HPOSP0
    sta GRAFP0
    ldy #0
    lda #47
    sta COLPM0
yloop:
    tya ; graphics shape = y
    sta WSYNC
    sta GRAFP0
    ; a = sin(frame+y)+48
    tya
    adc FRAMECOUNTER
    tax
    lda sinewave,x
    adc #48
    sta HPOSP0
    iny
    bne yloop
    jmp updateloop

    run main
The Atari 8-bit uses the POKEY chip to generate sound.
; from Analog, 1985-09, pp.25-32
    *=$2000
AUDF1=$D200
AUDC1=$D201
AUDCTL=$D208
SKCTL=$D20F

    .MACRO SOUND ; voice,pitch,dist,vol;,dur
    lda #%2
    sta AUDF1+2*%1
    lda #[[%3 * 16] | %4] ;lda #[[%3 shl 4] or %4]
    sta AUDC1+2*%1
    .ENDM

    lda #0
    sta AUDCTL
    lda #3
    sta SKCTL
    SOUND 0,121,10,8
    jmp *
Make some noise
To be added soon.
Sizecoding resources for the Atari 8-bit are:
- Mapping the Atari https://www.atariarchives.org/mapping/
- Atari 8bit Memory map https://www.atariarchives.org/mapping/memorymap.php
- Fready's undocumented 6502 opcodes https://github.com/FreddyOffenga/6502
- Atari OS Rev2 disassembly for MADS assembler https://github.com/ilmenit/A800-OS-XL-Rev2
- Fready's github https://github.com/FreddyOffenga/
Commodore 64
The Commodore 64 consists of a 6502-compatible MOS 6510 processor with custom hardware for graphics and sound.
Setting up
Setting up your development platform for the Commodore systems is quite easy. First, get the following tools:
- Assembler: To be added
- Emulator(s): VICE is the way to go
Memory map
Decimal       Hex           Description
0-1           $0-$1         PROCESSOR PORT REGISTERS, not usable as RAM
0-255         $0-$FF        ZEROPAGE, mostly used by BASIC interpreter and Kernal
256-511       $100-$1FF     SYSTEM STACK
512-600       $200-$258     INPUT BUFFER
601-630       $259-$276     FILE SYSTEM DATA
631-640       $277-$280     KEYBOARD BUFFER, filled by Kernal ISR upon key presses
641-678       $281-$2A6     VARIOUS KERNAL VARIABLES
679-767       $2A7-$2FF     Unused (89 bytes)
768-779       $300-$30B     KERNAL VECTORS
780-783       $30C-$30F     REGISTER VALUES A,X,Y,SR for SYS
784-819       $310-$333     KERNAL VECTORS
820-827       $334-$33B     Unused (8 bytes)
828-1019      $33C-$3FB     DATASETTE BUFFER
1020-1023     $3FC-$3FF     Unused (4 bytes)
1024-2023     $400-$7E7     SCREEN MEMORY (25 lines times 40 columns)
2024-2039     $7E8-$7F7     Unused (16 bytes)
2040-2047     $7F8-$7FF     SPRITE COSTUME POINTERS
2048          $800          must contain a value of 0 so that the BASIC program can be RUN
2049-40959    $801-$9FFF    BASIC AREA (code, variables, arrays, strings)
40960-49151   $A000-$BFFF   BASIC ROM (unused RAM underneath)
49152-53247   $C000-$CFFF   UPPER RAM AREA (unused)
53248-53294   $D000-$D02E   VIC-II (memory-mapped graphics chip registers)
54272-54300   $D400-$D41C   SID (memory-mapped sound chip registers)
55296-56295   $D800-$DBE7   Color RAM, only lower 4 bits are stored, higher nibble undefined
56296-56319   $DBE8-$DBFF   Unused (24 half-bytes)
56320-56335   $DC00-$DC0F   CIA#1, memory-mapped I/O chip registers for keyboard, joystick, mouse, datasette, IRQ control
56576-56591   $DD00-$DD0F   CIA#2, memory-mapped I/O chip registers for serial bus, RS232, NMI control
57344-65535   $E000-$FFFF   KERNAL ROM
65534-65535   $FFFE-$FFFF   Execution address of interrupt service routine, can only be set if KERNAL is turned off
For a detailed explanation of the C64 memory map, see the Commodore 64 memory map on Joe Forster/STA's homepage (https://sta.c64.org/cbm64mem.html).
Program file overhead
A standard CBM file consists of a 2 byte loading address followed by the data that is placed into memory starting at this address. In addition, if your machine program should be able to be started with "RUN", you need a BASIC stub that is loaded into memory from $0801 to $080c, typically followed by your machine program:
*=$0801
    .word $080b             ;address of next BASIC line
    .word LINENO            ;line number, can be chosen between 0 and 63999
    .byte $9e               ;token for SYS command
    .byte $32,$30,$35,$39   ;address to jump to in ASCII code: "2059"
    .byte $00               ;end of BASIC line
start
    lax #0                  ;actually address of next BASIC line, $xx00 means end of BASIC program, use $xx for an opcode
    ; your code here
Together with the 2 byte load address, this makes an overhead of 14 bytes for your program, coming with A and X initialised to 0. To reduce this, you can
- omit the BASIC line and tell users to start your program by directly entering "SYS address"
- save your program from $0802 on: the byte at $0801 is recovered automatically after loading by the relink function, which reduces your program file size by 1 byte
- accept the loss and use the 2 bytes that establish LINENO for something good: they are copied to zeropage addresses $39/$3a automatically, saving you the effort of initializing those addresses
- use an autostart method (see below)
Some compos also specify that the BASIC loader does not count towards the overall filesize, so check the rules before you worry.
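To see where the 14 bytes come from, the file can be laid out by hand. The following Python sketch is only an illustration of the byte layout described above: the function name `build_prg`, its `payload` argument and the default line number are hypothetical, and $AB is assumed as the opcode of the undocumented `lax #imm` instruction.

```python
import struct

def build_prg(payload: bytes, lineno: int = 2024) -> bytes:
    """Return a C64 PRG image: load address, one-line BASIC stub, payload."""
    load_address = 0x0801
    sys_target = 0x080B                      # first byte after the BASIC line
    stub = struct.pack("<H", sys_target)     # link to the "next BASIC line"
    stub += struct.pack("<H", lineno)        # line number, 0..63999
    stub += bytes([0x9E])                    # token for SYS
    stub += str(sys_target).encode("ascii")  # "2059"
    stub += bytes([0x00])                    # end of BASIC line
    stub += bytes([0xAB, 0x00])              # lax #0, doubles as the $xx00 end-of-program link
    return struct.pack("<H", load_address) + stub + payload

prg = build_prg(b"\x60")     # hypothetical payload: a single RTS
print(len(prg) - 1)          # 14 bytes of overhead
```

The two line-number bytes are free to carry any value in the allowed range, which is what the LINENO trick in the list above relies on.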
Autostart
Unlike the Commodore 128, the Commodore 64 has no dedicated built-in auto-boot feature, but it is possible to autorun a program after loading by specifying the load address so that it overwrites a vector or address that is called by the operating system. There are multiple possibilities:
Overwriting CHROUT vector at $326-$327
The vector at address $326-$327 points at the CHROUT routine, normally at address $F1CA. If you put the entry address of your program into $326/27, your code will be called when the operating system wants to print "READY." after loading. Note that the vector at address $328-$329 points at the STOP routine that checks the status of Stop key indicator. Since this routine is called multiple times before your code is done loading, its default value $F6ED needs to be preserved.
*=$0326
    .word start
    .byte $ed,$f6
start
    ; rest of code
Together with the two byte load address, the overwritten CHROUT vector and the preserved STOP vector, your program file will have a deadweight of 6 bytes: $26 $03 $2a $03 $ed $f6, less than half of the standard version with a BASIC stub with SYS.
- If your program after $32a is longer than 214 bytes, part of your program will load into the screen at $400. Thus, issuing a clear screen will overwrite your program. It is possible to move the screen memory somewhere else, but that requires storing a new value into address $D018, which when done with an LDA # / STA pair will cost you 5 bytes
- Since you changed the CHROUT pointer, standard KERNAL functions for printing won't work. For output of single characters, you can still use the direct address $F1CA. Alternatively, you can restore the vectors by calling $FD15, which will overwrite the memory $314 to $333 with the default values.
- The memory area $33C-$3FB is the datasette buffer, thus loading a program into this area won't work when loading from a cassette.
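The header bytes and the space left before screen memory follow directly from the addresses above. A minimal Python sketch of that arithmetic, assuming the code starts immediately after the two vectors (the function and constant names are made up for this illustration):

```python
import struct

LOAD_ADDRESS = 0x0326    # CHROUT vector
STOP_DEFAULT = 0xF6ED    # default STOP vector, must be preserved
SCREEN_BASE = 0x0400

def chrout_autostart_header() -> bytes:
    """Load address, new CHROUT vector and preserved STOP vector."""
    start = LOAD_ADDRESS + 4             # code follows the two vectors
    return struct.pack("<HHH", LOAD_ADDRESS, start, STOP_DEFAULT)

header = chrout_autostart_header()
print(header.hex(" "))                   # 26 03 2a 03 ed f6
print(SCREEN_BASE - (LOAD_ADDRESS + 4))  # 214 bytes before the screen starts
```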
Overwriting BASIC idle loop vector at $302-$303
After loading the program, the BASIC idle loop will be called. By overwriting the vector at $302/$303 it is possible to autostart your program. Since the memory before $302 can be safely overwritten, the program would be arranged so that $303 is the last address of your program. If your program requires a loop back to the start, the vector can be used as part of a JMP where the following two bytes are already there, saving one byte in comparison to a loop done with a branch command.
*=$304 - (end - start)
start:
    ;your code here
    ; ...
    jmp start   ;the address for start lies in $302/$303, thus will be called by the Kernal after loading
end:
In cases where the vector has a practical use, the overhead in filesize is only the 2 byte loading address. For programs larger than 93 bytes, keep in mind that addresses before $2A7 are in use by the Kernal.
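The origin expression `$304 - (end - start)` and the 93 byte limit can be checked with a few lines of Python. This is just a sketch of the arithmetic; `origin_for` is a hypothetical helper name:

```python
VECTOR_END = 0x0303      # last byte of the BASIC idle loop vector
FREE_START = 0x02A7      # start of the unused area in the memory map

def origin_for(length: int) -> int:
    """Load address that makes the last program byte land at $0303."""
    return VECTOR_END + 1 - length

print(hex(origin_for(93)))            # 0x2a7
print(origin_for(94) < FREE_START)    # True: one byte more reaches Kernal variables
```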
Overwriting return address on stack at $1F8
When the load routine is called, the stack pointer is always at $FA on a stock C64. Thus the return address from the loading process is stored in memory address $1F8 (low byte) and $1F9 (high byte). During the loading routine, stack usage will clobber the stack below $1F8, but a program loaded to $1F8 and above will be fine and the overwritten address stored at $1F8/$1F9 will be called after loading.
*=$1F8
    .word start-1   ;adjust for the RTS command pulling the address from stack and increasing it by 1
start:
    ;your code here
For programs larger than 95 bytes, keep in mind that addresses after $259 are in use by the Kernal. You might get away with overwriting some values here if you don't have disk operations, but better check the memory map beforehand.
Together with the two byte load address and the overwritten address at $1F8/$1F9, your program file will have a deadweight of 4 bytes when using this method.
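As a quick cross-check of the numbers above, the 4 header bytes and the size budget can be computed as follows. This is a sketch under the assumption that the code starts right after the overwritten return address; the names are invented for the example:

```python
import struct

RETURN_SLOT = 0x01F8     # return address of the load routine on the stack
KERNAL_DATA = 0x0259     # first address that is used by the Kernal again

def stack_autostart_header() -> bytes:
    """Load address followed by the return address that points at the code."""
    start = RETURN_SLOT + 2
    # RTS adds 1 to the address it pulls, so start-1 is stored.
    return struct.pack("<HH", RETURN_SLOT, start - 1)

print(stack_autostart_header().hex(" "))   # f8 01 f9 01
print(KERNAL_DATA - (RETURN_SLOT + 2))     # 95 bytes of code fit before $259
```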
Putting your program into zeropage overwriting part of CHRGET
CHRGET is a machine code routine for the BASIC interpreter that is placed in the zero page at $73-$8A. By putting your program at $7C and following, the CHRGET routine runs directly into your code.
*=$7C
start:
    ;your code here
With this method, there is no need for a start address vector. The only deadweight in your program is the loading address. However, the zero page is a busy place: if your program extends beyond address $8f (that is, it has more than 22 bytes on disk), you have to deal with some parts of your program being overwritten. For example, the status byte in $90 is set after loading.
Loading data into address $A9 has an interesting side effect - the vector of $AE/$AF defines the address where the loaded bytes are written to. If $AE gets overwritten with, for example, $40, the next byte will be written to $41 (because the vector is incremented after the write). This can be used to split your program into different parts of memory.
Video display
The Commodore 64 has the following native video modes:
- Standard text mode, 40x25 characters, 1 individual foreground color per character, 1 common background color
- Multicolor text mode, 40x25 characters, only 8 foreground colors, can be mixed with hires chars, foreground < 8 denotes a hires char, with foreground >= 8 a pair of two bits (brick pixel) define color: background, foreground & %111, common color 1+2
- Extended background color text mode, 40x25 characters, 1 individual foreground color per character, 1 out of 4 background colors, only 64 chars
- Hires bitmap, 320x200 pixels, an 8x8 field can have an individual foreground and an individual background color
- Multicolor bitmap, 160x200 pixels, a pair of two bits (brick pixel) define color: background, screen color lower nibble, screen color higher nibble, color from color RAM
Sync with frame
Depending on the VIC type, there are 262/263 raster lines (NTSC) or 312 raster lines (PAL-B). Therefore, the VIC uses the register at $D012 and the highest bit of register $D011 to represent the current raster line. Checking only the lower 8 bits and waiting for them to be 0 would therefore not work, because this happens two times per frame. However, any number greater than 64 will only trigger once. Using a value of 255, we would sync to a raster line at the end of the screen, well below the part where graphics are normally displayed:
waitsync:
    ldx $D012
    inx
    bne waitsync
As long as more than 63 cycles have passed since the sync, the next sync will happen at the next frame. This works fine as long as we don't miss line 255 due to an interrupt. If your program has interrupts that are not synced to a raster line, you can change the condition as follows:
    lda #$E3
waitsync:
    cmp $D012
    bcs waitsync
Now we have a window between raster line $E4 and $FF to trigger our sync, which we will meet even if a standard Kernal interrupt comes in-between. Of course, our main program now has to burn around 1700 cycles to avoid a retriggering in the same frame.
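The figure of roughly 1700 cycles can be derived from the size of the window. A rough Python sketch of that estimate, assuming the PAL value of 63 cycles per raster line used above and ignoring cycles taken by the VIC (the function name is hypothetical):

```python
CYCLES_PER_LINE = 63     # PAL, as used in the text above
PAL_LINES = 312

def sync_window_cycles(first_line: int = 0xE4, last_line: int = 0xFF) -> int:
    """Cycles spent while the raster low byte stays inside the trigger window."""
    lines = last_line - first_line + 1
    return lines * CYCLES_PER_LINE

print(sync_window_cycles())             # 1764
print(PAL_LINES * CYCLES_PER_LINE)      # 19656 cycles per PAL frame
print(PAL_LINES - 256)                  # 56 low-byte values occur twice per frame
```

The last line shows why small compare values are unsuitable on PAL: raster lines 256 to 311 repeat the low-byte values 0 to 55.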
Getting something on screen
Output a character:
    lda #C      ;C must be the char code according to the PETSCII code table
    jsr $FFD2   ;you can also use the direct address $F1CA, function preserves A,X,Y and enables interrupt
Since the PETSCII code table also contains control characters, this method can also be used to clear the screen, move the cursor or set the text color. However, some functions are shorter (and faster) by directly calling the respective KERNAL function:
Clear screen               JSR $E544
Cursor home position       JSR $E566
Output newline character   JSR $AAD7
Set text color             LDA #COLOR, STA $286
Set border color           LDA #COLOR, STA $D020 ;not possible with a PETSCII control code
Set background color       LDA #COLOR, STA $D021 ;not possible with a PETSCII control code
Copy characters directly to screen memory:
.macpack cbm            ;scrcode macro
SCREEN_BASE=$400
LINE=5                  ;for example
COLUMN=10               ;for example

    ldx #endtext-text-1
@loop:
    lda text,x
    sta SCREEN_BASE+LINE*40+COLUMN,x
    dex
    bpl @loop
text:
    scrcode "hello world!"
endtext:
Note that the text needs to be stored in screen code format, not in PETSCII (see Commodore 64 PETSCII code to screen code conversion for the differences). Furthermore, the example above only writes to the screen memory, not the color RAM, assuming that the color RAM already contains the intended text color. This is the case after issuing a clear screen on most C64 machines, except for some old ROM revisions. To be on the safe side, you would have to write to color RAM as well, which costs another 5 bytes:
.macpack cbm            ;scrcode macro
SCREEN_BASE=$400
COLOR_RAM =$D800
LINE      =5            ;for example
COLUMN    =10           ;for example
COLOR     =1            ;white

    ldx #endtext-text-1
@loop:
    lda text,x
    sta SCREEN_BASE+LINE*40+COLUMN,x
    lda #COLOR
    sta COLOR_RAM +LINE*40+COLUMN,x
    dex
    bpl @loop
text:
    scrcode "hello world!"
endtext:
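The address expressions used in both listings resolve to fixed numbers at assembly time. A small Python sketch of the same calculation, with a hypothetical helper `cell_addresses`, shows which screen and color RAM cells the example values LINE=5 and COLUMN=10 refer to:

```python
SCREEN_BASE = 0x0400
COLOR_RAM = 0xD800
COLUMNS = 40

def cell_addresses(line, column):
    """Return (screen address, color RAM address) of a character cell."""
    offset = line * COLUMNS + column
    return SCREEN_BASE + offset, COLOR_RAM + offset

screen, color = cell_addresses(5, 10)
print(hex(screen), hex(color))      # 0x4d2 0xd8d2
```

The same offset applies to both areas, which is why the second listing can reuse the X register for both stores.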
Sound
The Commodore 64 uses the famous SID chip to generate sound. To be added soon.
Make some noise
To be added soon.
Useful Kernal functions
Get a pseudo random number
    lda #1
    jsr $E09A
    lda $63
The sequence of random numbers is always the same after a cold start or reset, the first 20 numbers are:
Delay 1 Millisecond
Delay ~ 2 Seconds
This is actually Commodore's memory testing routine that is run every time the computer is turned on. It takes about 2 seconds to execute.
Increment a zero page counter, return in X register, clear Y register
This is part of the string handling function in the BASIC ROM, but the code could be useful for some loop stuff.
jsr $B5FF ;does inc $23, ldx $23, ldy #$00, rts
Additional Resources
- Codebase 64 https://codebase64.org/doku.php?id=base:start
- Commodore 64 Memory map https://sta.c64.org/cbm64mem.html
- C64 BASIC & KERNAL ROM Disassembly https://www.pagetable.com/c64ref/c64disasm/
- No More Secrets https://csdb.dk/release/?id=198357 Everything about illegal opcodes and their use in a C64 system