<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://www.sizecoding.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Trixter</id>
		<title>SizeCoding - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="http://www.sizecoding.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Trixter"/>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/wiki/Special:Contributions/Trixter"/>
		<updated>2026-05-05T11:46:09Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.27.0</generator>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Case_Studies&amp;diff=1897</id>
		<title>Case Studies</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Case_Studies&amp;diff=1897"/>
				<updated>2026-04-20T18:12:56Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Sometimes it can be helpful to examine a tinyprog in detail, seeing what choices were made and why:&lt;br /&gt;
&lt;br /&gt;
:[[MELT.COM]]: We take a cute program from the 1980s and see how far we can crunch it down&lt;br /&gt;
&lt;br /&gt;
:[[m8trix 8b]]: An &amp;quot;Enter The Matrix&amp;quot; screen display in only 8 bytes&lt;br /&gt;
&lt;br /&gt;
:[[4is256]]: Řrřola's Tetris is a full Tetris clone with scoring&lt;br /&gt;
&lt;br /&gt;
:[[paint16b]]: Hellmood's paint program in 16 bytes (really)&lt;br /&gt;
&lt;br /&gt;
:[[Indian Spirit]]: An American Indian tapestry with music in 32 bytes&lt;br /&gt;
&lt;br /&gt;
:[[Essence]]: Fakery of path tracing and lighting, with audio, in 64 bytes&lt;br /&gt;
&lt;br /&gt;
:[[Memories]]: Detailed writeup on [https://www.pouet.net/prod.php?which=85227 &amp;quot;Memories&amp;quot; (256 bytes)]&lt;br /&gt;
&lt;br /&gt;
:[[Game of Life 32b]]: Writeup for Game of Life in 32 bytes&lt;br /&gt;
&lt;br /&gt;
:[[Christmas Tree]]: Implementation of a Christmas Tree on several platforms &lt;br /&gt;
&lt;br /&gt;
:[[MUZAK58]]: Playing some background byte-beat music through COVOX LPTDAC in 58 bytes&lt;br /&gt;
&lt;br /&gt;
== External Case Studies ==&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Brainfuck Brainfuck] is a very tiny language, having only 8 commands and an instruction pointer.  Several compilers and interpreters have been made; Gynvael archived [http://gynvael.coldwind.pl/bf125/ many different versions along with their source code], the smallest of which was 125 bytes.  qkumba took that as inspiration and created his own [http://pferrie.epizy.com/misc/tiny/brainfck.htm brainfuck compiler in 100 bytes] (104 for one that is fully compliant).  (Note:  The term &amp;quot;compiler&amp;quot; is often misused in these descriptions; the majority of these programs are actually interpreters that load a brainfuck program and execute it token by token.  This one is an exception in that it really does compile the code entirely into native instructions first.)&lt;br /&gt;
&lt;br /&gt;
[https://revival-studios.com/marquee/blog/blog_homogenic.html Small Beginnings: The development of Homogenic] Development write-up on [https://www.pouet.net/prod.php?which=84537 Homogenic, by Marquee Design (256 bytes)]&lt;br /&gt;
&lt;br /&gt;
[https://revival-studios.com/marquee/blog/blog_nanorail.html Moving on: The development of Nanorail] Development write-up on [https://www.pouet.net/prod.php?which=84624 Nanorail, by Marquee Design (256 bytes)]&lt;br /&gt;
&lt;br /&gt;
[https://revival-studios.com/marquee/blog/blog_enigma.html Full Circle: The development of Enigma] Development write-up on [https://www.pouet.net/prod.php?which=85232 Enigma, by Marquee Design (256 bytes)]&lt;br /&gt;
&lt;br /&gt;
[https://revival-studios.com/marquee/blog/blog_microdose.html Crunching content: The development of Microdose] Development write-up on [https://www.pouet.net/prod.php?which=85677 Microdose, by Marquee Design (128 bytes)]&lt;br /&gt;
&lt;br /&gt;
[https://abaddon.hu/crevenge/history.html The long journey to 1k real-time ray-tracing] Detailed technical writeup on [https://www.pouet.net/prod.php?which=87078 Chrome Revenge by Abaddon (1024 bytes)]&lt;br /&gt;
&lt;br /&gt;
[http://canonical.org/~kragen/demo/fr-016.html Disassembly of Farbrausch's &amp;quot;fr-016: bytes&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
[http://canonical.org/~kragen/demo/klappquadrat.html A disassembly of the 64-byte version of Klappquadrat]&lt;br /&gt;
&lt;br /&gt;
[https://trixter.oldskool.org/2012/12/17/maze-generation-in-thirteen-bytes/ Maze generation in 10 bytes]&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode/comments/1se0iyt/writeup_for_16_byte_intro_rainbow_surf_1st_place/ Rainbow Surf write-up by the author of the Revision 2026 winning entry]&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=1185</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=1185"/>
				<updated>2022-06-28T22:14:46Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* .COM file defaults */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode located at segment A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is contained in a single segment, you can be sloppy: running past either end of the buffer just wraps around within the segment and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
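&lt;br /&gt;
As a minimal sketch of how simple the addressing is: the pixel at (x,y) lives at offset y*320+x.  The coordinates (160,100) and color 15 are arbitrary choices for illustration, and the code favors clarity over size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ax,0013h        ;AH=00h (set video mode), AL=13h&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        mov     di,100*320+160  ;offset = y*320 + x for the pixel at (160,100)&lt;br /&gt;
        mov     al,15           ;color 15 is white in the default palette&lt;br /&gt;
        stosb                   ;store AL at ES:DI&lt;br /&gt;
        xor     ax,ax           ;AH=00h: wait for a keypress&lt;br /&gt;
        int     16h&lt;br /&gt;
        mov     ax,0003h        ;back to 80x25 text mode&lt;br /&gt;
        int     10h&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;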
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=CS&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
BP=09xx&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&lt;br /&gt;
flags=xx02&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(For a full list of values per specific version of DOS, consult http://www.fysnet.net/yourhelp.htm)&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the value &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
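&lt;br /&gt;
To make the saving concrete, here is a small sketch comparing common ways to terminate.  The character printed is an arbitrary choice, and the byte counts refer to the instruction encodings only:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ah,02h          ;DOS function 02h: write the character in DL&lt;br /&gt;
        mov     dl,'A'&lt;br /&gt;
        int     21h&lt;br /&gt;
        ret                     ;1 byte: pops 0000h, continues at CS:0000h (INT 20h)&lt;br /&gt;
&lt;br /&gt;
;Alternatives that do not rely on the stack contents:&lt;br /&gt;
;       int     20h             ;2 bytes&lt;br /&gt;
;       mov     ax,4C00h        ;5 bytes together with the INT 21h below&lt;br /&gt;
;       int     21h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that the &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; trick only works if SP is back at its initial value when it executes, so every PUSH must have been matched by a POP.&lt;br /&gt;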
&lt;br /&gt;
Different DOS versions may differ slightly in the above.  A full chart exists at http://www.fysnet.net/yourhelp.htm to help you check if your specific target has different defaults than the standard settings above.&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the end of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
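&lt;br /&gt;
A minimal boot sector skeleton in NASM might look like the following sketch.  The character printed and the label names are arbitrary, and it assumes nothing about register contents beyond the load address:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 7C00h                       ;BIOS loads the boot sector at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     ax,0E21h        ;AH=0Eh (teletype output), AL='!'&lt;br /&gt;
        xor     bx,bx           ;BH=0: display page 0&lt;br /&gt;
        int     10h&lt;br /&gt;
hang:&lt;br /&gt;
        jmp     hang            ;nothing to return to, so loop forever&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad the code to 510 bytes&lt;br /&gt;
        dw      0AA55h          ;stored little-endian as 55h, AAh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Assembling this with &amp;lt;code&amp;gt;-fbin&amp;lt;/code&amp;gt; should produce a file of exactly 512 bytes.&lt;br /&gt;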
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)               &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x,BX,DX,CX,AX (SP value is ignored)      &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
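&lt;br /&gt;
As a sketch of how much work the one-byte string opcodes in the chart do, the following fills the whole mode 13h screen with a single &amp;lt;code&amp;gt;STOSB&amp;lt;/code&amp;gt; behind a &amp;lt;code&amp;gt;REP&amp;lt;/code&amp;gt; prefix (two bytes in total).  The color is an arbitrary choice, and the setup is written for clarity rather than size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ax,0013h        ;AH=00h (set video mode), AL=13h&lt;br /&gt;
        int     10h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        cld                     ;make sure DI increments&lt;br /&gt;
        xor     di,di           ;start at offset 0&lt;br /&gt;
        mov     cx,320*200      ;64000 pixels&lt;br /&gt;
        mov     al,1            ;color 1 is blue in the default palette&lt;br /&gt;
        rep     stosb           ;store AL at ES:DI, advance DI, repeat CX times&lt;br /&gt;
        xor     ax,ax           ;AH=00h: wait for a keypress&lt;br /&gt;
        int     16h&lt;br /&gt;
        mov     ax,0003h        ;back to 80x25 text mode&lt;br /&gt;
        int     10h&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;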
&lt;br /&gt;
Additionally, in 16-bit code there are one-byte register forms of some opcodes, such as &amp;lt;code&amp;gt;DEC reg16&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;INC reg16&amp;lt;/code&amp;gt;, and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
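&lt;br /&gt;
The following sketch lists a few of these forms next to longer encodings for comparison; the opcode bytes shown are for 16-bit code and can be checked with ndisasm:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
bits 16&lt;br /&gt;
&lt;br /&gt;
        inc     ax              ;40      1 byte&lt;br /&gt;
        dec     ax              ;48      1 byte&lt;br /&gt;
        xchg    ax,bx           ;93      1 byte&lt;br /&gt;
        mov     ax,bx           ;89 D8   2 bytes&lt;br /&gt;
        inc     al              ;FE C0   2 bytes&lt;br /&gt;
        dec     al              ;FE C8   2 bytes&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So if the old value of AX is no longer needed, &amp;lt;code&amp;gt;XCHG AX,BX&amp;lt;/code&amp;gt; can stand in for &amp;lt;code&amp;gt;MOV AX,BX&amp;lt;/code&amp;gt;, and a &amp;lt;code&amp;gt;DEC AL&amp;lt;/code&amp;gt; can become &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; provided AH is known to be zero at that point.&lt;br /&gt;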
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until a keypress is detected, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AX=0000 due to mainloop exit condition&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== How to think like a sizecoder ===&lt;br /&gt;
&lt;br /&gt;
This example framework can be shrunk!  Think a little bit about how you might do it, then check out the example below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        les     bx,[bx]         ;contains 0x9FFF at program start; close enough to 0xA000 ;-)&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to ES:xxxx to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
                                ;Don't care if we set text mode, user can just MODE CO80&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By casually disregarding the user experience :-) and fudging the start of video memory, our basic framework is now 13 bytes.&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=6502&amp;diff=847</id>
		<title>6502</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=6502&amp;diff=847"/>
				<updated>2021-03-18T02:37:34Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: Added link to deater's youtube presentation&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Wanting to start sizecoding on a 6502 platform in this day and age can be tough. &lt;br /&gt;
&lt;br /&gt;
[[File:6502.jpg|thumb]]&lt;br /&gt;
&lt;br /&gt;
So here is a bit of help to get you started:&lt;br /&gt;
&lt;br /&gt;
=== The 6502 processor  ===&lt;br /&gt;
The 6502 processor can be seen as the 8-bit micro ARM chip. &lt;br /&gt;
It has only 3 registers (Accumulator, X and Y registers) and a handful of instructions to work with.&lt;br /&gt;
&lt;br /&gt;
=== Addressing modes ===&lt;br /&gt;
To be added.&lt;br /&gt;
&lt;br /&gt;
=== Zero page ===&lt;br /&gt;
When using the 6502 for sizecoding, you'll mostly be working from the zero page.&lt;br /&gt;
&lt;br /&gt;
=== General 6502 Resources ===&lt;br /&gt;
* 6502.org http://www.6502.org/&lt;br /&gt;
* 6502 instruction reference http://www.6502.org/tutorials/6502opcodes.html&lt;br /&gt;
* 6502 books http://retro.hansotten.nl/6502-sbc/&lt;br /&gt;
* 6502 Assembler tutorial https://dwheeler.com/6502/oneelkruns/asm1step.html&lt;br /&gt;
* Easy 6502 code tester https://skilldrick.github.io/easy6502/&lt;br /&gt;
* Synthetic instructions https://wiki.nesdev.com/w/index.php/Synthetic_instructions#8-bit_rotate&lt;br /&gt;
&lt;br /&gt;
== Atari 8-bit family ==&lt;br /&gt;
The systems in this family are: Atari 400, 800, 5200, 1200XL, 600XL, 800XL, 130XE, 65XE, 800XE and XEGS.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The Atari 8-bit systems consist of the 6502 with custom hardware for graphics and sound.&lt;br /&gt;
&lt;br /&gt;
=== Setting up ===&lt;br /&gt;
Setting up your development platform for the Atari 8-bit systems is quite easy.  First, get the following tools:&lt;br /&gt;
&lt;br /&gt;
* Assembler: MADS Assembler - This assembler has nice macros for creating Binaries and SNA snapshot files out of the box. You can download it at https://mads.atari8.info/&lt;br /&gt;
* Emulator(s): I found Altirra to work best for my use case. Make sure to use the original Rev2 ROM for best compatibility.&lt;br /&gt;
&lt;br /&gt;
==== Special Memory Addresses ====&lt;br /&gt;
* FRAMECOUNTER_HIGH = 19&lt;br /&gt;
* FRAMECOUNTER_LOW  = 20&lt;br /&gt;
&lt;br /&gt;
=== Video display ===&lt;br /&gt;
Video display on the Atari 8-bit systems uses the ANTIC and GTIA chips. Information can be found here:&lt;br /&gt;
* https://en.wikipedia.org/wiki/ANTIC&lt;br /&gt;
* https://www.atariarchives.org/agagd/chapter1.php&lt;br /&gt;
&lt;br /&gt;
==== Sync with frame  ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
RTCLOK      equ $0012&lt;br /&gt;
      lda RTCLOK+2&lt;br /&gt;
waits&lt;br /&gt;
      cmp RTCLOK+2&lt;br /&gt;
      beq waits&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Or, if you don't mind trashing RTCLOK:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
RTCLOK      equ $0012&lt;br /&gt;
waits&lt;br /&gt;
      lsr RTCLOK+2&lt;br /&gt;
      bcc waits&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This version is two bytes shorter.&lt;br /&gt;
&lt;br /&gt;
==== Getting something on screen ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
;fill screen with charset,(88,89)=an address&lt;br /&gt;
 org $600; free 6th page:600-6ff&lt;br /&gt;
 ldy #0&lt;br /&gt;
fl: tya&lt;br /&gt;
 sta(88),y&lt;br /&gt;
 iny&lt;br /&gt;
 bne fl&lt;br /&gt;
 jmp *&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To be added soon.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
SDMCTL	= $022f&lt;br /&gt;
HPOSP0  = $d000&lt;br /&gt;
SIZEP0  = $d008&lt;br /&gt;
GRAFP0  = $d00d&lt;br /&gt;
COLPM0  = $d012&lt;br /&gt;
&lt;br /&gt;
FRAMECOUNTER_HIGH = 19&lt;br /&gt;
FRAMECOUNTER = 20&lt;br /&gt;
WSYNC	= $d40a&lt;br /&gt;
VCOUNT	= $d40b&lt;br /&gt;
&lt;br /&gt;
sinewave	= $0600		; to $06ff&lt;br /&gt;
&lt;br /&gt;
		org $80&lt;br /&gt;
&lt;br /&gt;
main	&lt;br /&gt;
	; disable all graphics/colors&lt;br /&gt;
	ldx #0&lt;br /&gt;
	stx SDMCTL	&lt;br /&gt;
&lt;br /&gt;
	ldy #$7f&lt;br /&gt;
	sty SIZEP0	; size p0=127&lt;br /&gt;
		&lt;br /&gt;
	ldx #0&lt;br /&gt;
	ldy #$3f&lt;br /&gt;
make_sine:&lt;br /&gt;
value_lo&lt;br /&gt;
			lda #0&lt;br /&gt;
			clc&lt;br /&gt;
delta_lo&lt;br /&gt;
			adc #0&lt;br /&gt;
			sta value_lo+1&lt;br /&gt;
value_hi&lt;br /&gt;
			lda #0&lt;br /&gt;
delta_hi&lt;br /&gt;
			adc #0&lt;br /&gt;
			sta value_hi+1&lt;br /&gt;
 &lt;br /&gt;
			sta sinewave+$c0,x&lt;br /&gt;
			sta sinewave+$80,y&lt;br /&gt;
			eor #$7f&lt;br /&gt;
			sta sinewave+$40,x&lt;br /&gt;
			sta sinewave+$00,y&lt;br /&gt;
 &lt;br /&gt;
			lda delta_lo+1&lt;br /&gt;
			adc #8&lt;br /&gt;
			sta delta_lo+1&lt;br /&gt;
			bcc nothing&lt;br /&gt;
			inc delta_hi+1&lt;br /&gt;
nothing&lt;br /&gt;
			inx&lt;br /&gt;
			dey&lt;br /&gt;
			bpl make_sine&lt;br /&gt;
&lt;br /&gt;
updateloop:&lt;br /&gt;
		; vblank&lt;br /&gt;
		lda VCOUNT&lt;br /&gt;
		bne updateloop&lt;br /&gt;
&lt;br /&gt;
		; clear graphics&lt;br /&gt;
		sta HPOSP0&lt;br /&gt;
		sta GRAFP0&lt;br /&gt;
&lt;br /&gt;
		ldy #0&lt;br /&gt;
		lda #47&lt;br /&gt;
		sta COLPM0&lt;br /&gt;
yloop:&lt;br /&gt;
		tya           ; graphics shape = y&lt;br /&gt;
		sta WSYNC&lt;br /&gt;
		sta GRAFP0&lt;br /&gt;
&lt;br /&gt;
		; a = sin(frame+y)+48&lt;br /&gt;
		tya	&lt;br /&gt;
		adc FRAMECOUNTER&lt;br /&gt;
		tax&lt;br /&gt;
		lda sinewave,x&lt;br /&gt;
		adc #48&lt;br /&gt;
		sta HPOSP0&lt;br /&gt;
                &lt;br /&gt;
                iny&lt;br /&gt;
                bne yloop&lt;br /&gt;
		jmp updateloop&lt;br /&gt;
&lt;br /&gt;
		run main&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sound ===&lt;br /&gt;
The Atari 8-bit systems use the POKEY chip to generate sound.&lt;br /&gt;
* https://en.wikipedia.org/wiki/POKEY&lt;br /&gt;
&lt;br /&gt;
==== BASIC ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
; from Analog, 1985-09,pp.25-32&lt;br /&gt;
*=$2000&lt;br /&gt;
AUDF1=$D200&lt;br /&gt;
AUDC1=$D201&lt;br /&gt;
AUDCTL=$D208&lt;br /&gt;
SKCTL=$D20F&lt;br /&gt;
&lt;br /&gt;
.MACRO SOUND ; voice,pitch,dist,vol;,dur&lt;br /&gt;
	lda #%2&lt;br /&gt;
	sta AUDF1+2*%1&lt;br /&gt;
    lda #[[%3 * 16] | %4] ;lda #[[%3 shl 4] or %4]&lt;br /&gt;
    sta AUDC1+2*%1&lt;br /&gt;
.ENDM&lt;br /&gt;
&lt;br /&gt;
 lda #0&lt;br /&gt;
 sta AUDCTL&lt;br /&gt;
 lda #3&lt;br /&gt;
 sta SKCTL&lt;br /&gt;
&lt;br /&gt;
 SOUND 0,121,10,8&lt;br /&gt;
&lt;br /&gt;
 jmp *&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==== Make some noise ====&lt;br /&gt;
To be added soon.&lt;br /&gt;
&lt;br /&gt;
=== Additional Resources ===&lt;br /&gt;
Sizecoding resources for the Atari 8-bit are:&lt;br /&gt;
* Mapping the Atari https://www.atariarchives.org/mapping/&lt;br /&gt;
* Atari 8bit Memory map https://www.atariarchives.org/mapping/memorymap.php&lt;br /&gt;
* Freddy's undocumented 6502 opcodes https://github.com/FreddyOffenga/6502&lt;br /&gt;
* Atari OS Rev2 disassembly for MADS assembler https://github.com/ilmenit/A800-OS-XL-Rev2&lt;br /&gt;
* Freddy's GitHub https://github.com/FreddyOffenga/&lt;br /&gt;
&lt;br /&gt;
== Commodore 64 ==&lt;br /&gt;
The Commodore system consists of a 6502-compatible MOS 6510 processor with custom hardware for graphics and sound.&lt;br /&gt;
&lt;br /&gt;
=== Setting up ===&lt;br /&gt;
Setting up your development platform for the Commodore systems is quite easy. First get the following tools:&lt;br /&gt;
&lt;br /&gt;
* Assembler: To be added&lt;br /&gt;
* Emulator(s): VICE is the way to go&lt;br /&gt;
&lt;br /&gt;
=== Memory map ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
    0-1          $0-$1      PROCESSOR PORT REGISTERS, not usable as RAM&lt;br /&gt;
   0-255         $0-$FF     ZEROPAGE, mostly used by BASIC interpreter and Kernal&lt;br /&gt;
  256-511      $100-$1FF    SYSTEM STACK&lt;br /&gt;
  512-600      $200-$258    INPUT BUFFER&lt;br /&gt;
  601-630      $259-$276    FILE SYSTEM DATA&lt;br /&gt;
  631-640      $277-$280    KEYBOARD BUFFER, filled by Kernal ISR upon key presses &lt;br /&gt;
  641-678      $281-$2A6    VARIOUS KERNAL VARIABLES&lt;br /&gt;
  679-767      $2A7-$2FF    Unused (89 byte)&lt;br /&gt;
  768-779      $300-$30B    KERNAL VECTORS&lt;br /&gt;
  780-783      $30C-$30F    REGISTER VALUES A,X,Y,SR for SYS&lt;br /&gt;
  784-819      $310-$333    KERNAL VECTORS&lt;br /&gt;
  820-827      $334-$33B    Unused (8 byte)&lt;br /&gt;
  828-1019     $33C-$3FB    DATASETTE BUFFER&lt;br /&gt;
 1020-1023     $3FC-$3FF    Unused (4 byte)&lt;br /&gt;
 1024-2023     $400-$7E7    SCREEN MEMORY (25 lines times 40 columns)&lt;br /&gt;
 2024-2039     $7E8-$7F7    Unused (16 byte)&lt;br /&gt;
 2040-2047     $7F8-$7FF    SPRITE COSTUME POINTERS&lt;br /&gt;
 2048          $800         must contain a value of 0 so that the BASIC program can be RUN&lt;br /&gt;
 2049-40959    $801-$9FFF   BASIC AREA (code, variables, arrays, strings)&lt;br /&gt;
40960-49151   $A000-$BFFF   BASIC ROM (unused RAM underneath)&lt;br /&gt;
49152-53247   $C000-$CFFF   UPPER RAM AREA (unused)&lt;br /&gt;
53248-53294   $D000-$D02E   VIC-II, memory-mapped graphics chip registers&lt;br /&gt;
54272-54300   $D400-$D41C   SID, memory-mapped sound chip registers&lt;br /&gt;
55296-56295   $D800-$DBE7   Color RAM, only lower 4 bits are stored, higher nibble undefined&lt;br /&gt;
56296-56319   $DBE8-$DBFF   Unused (24 half-bytes)&lt;br /&gt;
56320-56335   $DC00-$DC0F   CIA#1, memory-mapped I/O chip registers for keyboard, joystick, mouse, datasette, IRQ control&lt;br /&gt;
56576-56591   $DD00-$DD0F   CIA#2, memory-mapped I/O chip registers for serial bus, RS232, NMI control&lt;br /&gt;
57344-65535   $E000-$FFFF   KERNAL ROM&lt;br /&gt;
65534-65535   $FFFE-$FFFF   Execution address of interrupt service routine, can only be set if KERNAL is turned off&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For a detailed explanation of the C64 memory map, check out [https://sta.c64.org/cbm64mem.html Commodore 64 memory map] on Joe Forster/STA's homepage.&lt;br /&gt;
&lt;br /&gt;
=== Program file overhead ===&lt;br /&gt;
&lt;br /&gt;
A standard CBM file consists of a 2-byte load address followed by the data that is placed into memory starting at this address. In addition, if your machine program should be startable with &amp;quot;RUN&amp;quot;, you need a BASIC stub that is loaded into memory from $0801 to $080c, typically followed by your machine program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
*=$0801&lt;br /&gt;
        .word $080b   ;address of next BASIC line&lt;br /&gt;
        .word LINENO  ;line number, can be chosen between 0 and 63999&lt;br /&gt;
        .byte $9e     ;token for SYS command&lt;br /&gt;
        .byte $32,$30,$35,$39  ;address to jump to in ASCII code: &amp;quot;2059&amp;quot; &lt;br /&gt;
        .byte $00     ;end of BASIC line&lt;br /&gt;
start   lax #0        ;bytes $AB $00 double as the link to the next BASIC line: a high byte of $00 marks the end of the BASIC program, the low byte is free for an opcode&lt;br /&gt;
; your code here&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Together with the 2 byte load address, this makes an overhead of 14 bytes for your program, coming with A and X initialised to 0. To reduce this, you can&lt;br /&gt;
* omit the BASIC line and tell users to start your program by directly entering &amp;quot;SYS ''address''&amp;quot;&lt;br /&gt;
* save your program from $0802 on; the byte at $0801 is recovered automatically after loading by the relink function. This reduces your program file size by 1 byte&lt;br /&gt;
* accept the loss and use the 2 bytes establishing LINENO for something good: they are copied to zeropage addresses $39/$3a automatically, saving you the effort of initializing the addresses&lt;br /&gt;
* use an autostart method (see below)&lt;br /&gt;
&lt;br /&gt;
Some compos also specify that the BASIC loader does not count towards the overall filesize, so check the rules before you worry.&lt;br /&gt;
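&lt;br /&gt;
The LINENO option from the list above can be sketched as follows. This is a minimal, untested example that assumes the line number is still present in $39/$3a when the SYS command runs. The line number 1024 ($0400) is an arbitrary choice for illustration, picked so that it doubles as a pointer to the start of screen memory:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
*=$0801&lt;br /&gt;
        .word $080b   ;address of next BASIC line&lt;br /&gt;
        .word $0400   ;line number 1024, reused as pointer to screen memory&lt;br /&gt;
        .byte $9e     ;token for SYS command&lt;br /&gt;
        .byte $32,$30,$35,$39  ;address to jump to in ASCII code: 2059&lt;br /&gt;
        .byte $00     ;end of BASIC line&lt;br /&gt;
start   lax #0        ;A=X=0, also serves as end of BASIC program&lt;br /&gt;
        tay           ;Y=0&lt;br /&gt;
        lda #1        ;screen code of the letter A&lt;br /&gt;
        sta ($39),y   ;$39/$3a hold $0400, so this writes to the top left screen position&lt;br /&gt;
        rts           ;back to BASIC&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;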
&lt;br /&gt;
=== Autostart ===&lt;br /&gt;
&lt;br /&gt;
Unlike the Commodore 128, the Commodore 64 has no dedicated built-in auto-boot feature, but it is possible to autorun a program after loading by specifying the load address so that it overwrites a vector or address that is called by the operating system. There are multiple possibilities:&lt;br /&gt;
&lt;br /&gt;
==== Overwriting CHROUT vector at $326-$327 ====&lt;br /&gt;
&lt;br /&gt;
The vector at address $326-$327 points at the CHROUT routine, normally at address $F1CA.&lt;br /&gt;
If you put the entry address of your program into $326/27, your code will be called when the operating system wants to print &amp;quot;READY.&amp;quot; after loading.&lt;br /&gt;
Note that the vector at address $328-$329 points at the STOP routine that checks the status of the Stop key indicator. Since this routine is called multiple times before your code is done loading, its default value $F6ED needs to be preserved. &lt;br /&gt;
&lt;br /&gt;
Application example:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
*=$0326&lt;br /&gt;
        .word start              &lt;br /&gt;
        .byte $ed,$f6&lt;br /&gt;
start&lt;br /&gt;
; rest of code&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Together with the two byte load address, the overwritten CHROUT vector and the preserved STOP vector, your program file will have a deadweight of 6 bytes: $26 $03 $2a $03 $ed $f6, less than half of the standard version with a BASIC stub with SYS.&lt;br /&gt;
&lt;br /&gt;
Caveat:&lt;br /&gt;
* If your program after $32a is longer than 214 bytes, part of your program will load into the screen memory at $400. Thus, issuing a clear screen will overwrite your program. It is possible to move the screen memory somewhere else, but that requires storing a new value into address $D018, which costs you 5 bytes when done with an LDA # / STA pair&lt;br /&gt;
* Since you changed the CHROUT pointer, standard KERNAL functions for printing won't work. For output of single characters, you can still use the direct address $F1CA. Alternatively, you can restore the vectors by calling $FD15, which will overwrite the memory $314 to $333 with the default values.&lt;br /&gt;
* The memory area $33C-$3FB is the datasette buffer, thus loading a program into this area won't work when loading from a cassette.&lt;br /&gt;
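&lt;br /&gt;
A minimal sketch of the second caveat, assuming the program no longer needs the modified vector once it has been started: calling $FD15 first makes the regular CHROUT call at $FFD2 usable again. The printed character is an arbitrary choice for illustration:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
*=$0326&lt;br /&gt;
        .word start     ;CHROUT vector, points at our code after loading&lt;br /&gt;
        .byte $ed,$f6   ;default STOP vector $F6ED, must be preserved&lt;br /&gt;
start:&lt;br /&gt;
        jsr $FD15       ;restore the default vectors at $314-$333&lt;br /&gt;
        lda #$41        ;PETSCII code of the letter A&lt;br /&gt;
        jsr $FFD2       ;CHROUT prints again instead of re-entering start&lt;br /&gt;
        jmp *           ;stay here&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;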
&lt;br /&gt;
==== Overwriting BASIC idle loop vector at $302-$303 ====&lt;br /&gt;
&lt;br /&gt;
After loading the program, the BASIC idle loop will be called. By overwriting the vector at $302/$303 it is possible to autostart your program. Since the memory before $302 can be safely overwritten, the program would be arranged so that $303 is the '''last''' address of your program. If your program requires a loop back to the start, the vector can be used as part of a JMP where the following two bytes are already there, saving one byte in comparison to a loop done with a branch command.&lt;br /&gt;
&lt;br /&gt;
Application example:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
*=$304 - (end - start)&lt;br /&gt;
start:&lt;br /&gt;
        ;your code here&lt;br /&gt;
        ; ...&lt;br /&gt;
        jmp     start    ;the address for start lies in $302/$303, thus will be called by the Kernal after loading&lt;br /&gt;
end:&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the vector has a practical use, the overhead in filesize is only the 2 byte loading address. For programs larger than 93 bytes, keep in mind that addresses before $2A7 are in use by the Kernal.&lt;br /&gt;
&lt;br /&gt;
==== Overwriting return address on stack at $1F8 ====&lt;br /&gt;
&lt;br /&gt;
When the load routine is called, the stack pointer is always at $FA on a stock C64. Thus the return address from the loading process is stored in memory address $1F8 (low byte) and $1F9 (high byte). During the loading routine, stack usage will clobber the stack below $1F8, but a program loaded to $1F8 and above will be fine and the overwritten address stored at $1F8/$1F9 will be called after loading.&lt;br /&gt;
&lt;br /&gt;
Application example:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
*=$1F8&lt;br /&gt;
        .word start-1   ;adjust for the RTS command pulling the address from stack and increasing it by 1&lt;br /&gt;
start:&lt;br /&gt;
        ;your code here&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For programs larger than 95 bytes, keep in mind that addresses after $259 are in use by the Kernal. You might get away with overwriting some values there if you don't use disk operations, but better check the memory map beforehand.&lt;br /&gt;
&lt;br /&gt;
Together with the two byte load address and the overwritten address at $1F8/$1F9, your program file will have a deadweight of 4 bytes when using this method.&lt;br /&gt;
&lt;br /&gt;
==== Putting your program into zeropage overwriting part of CHRGET ====&lt;br /&gt;
&lt;br /&gt;
CHRGET is a machine code routine for the BASIC interpreter that is placed in the zero page at $73-$8A. By putting your program at $7C and following, the CHRGET routine runs directly into your code.&lt;br /&gt;
&lt;br /&gt;
Application example:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
*=$7C&lt;br /&gt;
start:&lt;br /&gt;
        ;your code here&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this method, there is no need for a start address vector. The only deadweight in your program is the loading address. However, the zero page is a busy place. If your program extends beyond address $8f (that is, it has more than 22 bytes on disk), you have to deal with some parts of your program being overwritten; for example, the status byte in $90 is set after loading.&lt;br /&gt;
&lt;br /&gt;
Loading data into address $A9 has an interesting side effect - the vector of $AE/$AF defines the address where the loaded bytes are written to. If $AE gets overwritten with, for example, $40, the next byte will be written to $41 (because the vector is incremented after the write). This can be used to split your program into different parts of memory.&lt;br /&gt;
&lt;br /&gt;
=== Video display ===&lt;br /&gt;
Video display on the Commodore has the following native video modes:&lt;br /&gt;
* Standard text mode, 40x25 characters, 1 individual foreground color per character, 1 common background color&lt;br /&gt;
* Multicolor text mode, 40x25 characters, only 8 foreground colors, can be mixed with hires chars, foreground &amp;lt; 8 denotes a hires char, with foreground &amp;gt;= 8 a pair of two bits (brick pixel) define color: background, foreground &amp;amp; %111, common color 1+2&lt;br /&gt;
* Extended background color text mode, 40x25 characters, 1 individual foreground color per character, 1 out of 4 background colors, only 64 chars&lt;br /&gt;
* Hires bitmap, 320x200 pixels, an 8x8 field can have an individual foreground and an individual background color&lt;br /&gt;
* Multicolor bitmap, 160x200 pixels, a pair of two bits (brick pixel) define color: background, screen color lower nibble, screen color higher nibble, color from color RAM&lt;br /&gt;
&lt;br /&gt;
==== Sync with frame  ====&lt;br /&gt;
&lt;br /&gt;
Depending on the VIC type, there are 262/263 raster lines (NTSC) or 312 raster lines (PAL-B). Since this does not fit into 8 bits, the VIC uses the register at $D012 and the highest bit of register $D011 to represent the current raster line. Checking only the lower 8 bits and waiting for them to be 0 would therefore not work because this happens two times per frame. However, any number greater than 64 will only trigger once. Using a value of 255, we would sync to a rasterline at the end of the screen, well below the part where graphics are normally displayed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
waitsync:&lt;br /&gt;
        ldx $D012&lt;br /&gt;
        inx&lt;br /&gt;
        bne waitsync&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As long as more than 63 cycles have passed since the sync, the next sync will happen in the next frame.&lt;br /&gt;
This works fine as long as we don't miss line 255 due to an interrupt. If your program has interrupts that are not synced to a rasterline, you can change the condition as follows:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
        lda #$E3&lt;br /&gt;
waitsync:&lt;br /&gt;
        cmp $D012&lt;br /&gt;
        bcs waitsync&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now we have a window between raster line $E4 and $FF to trigger our sync, which we will meet even if a standard Kernal interrupt comes in-between. Of course, our main program now has to burn around 1700 cycles to avoid a retriggering in the same frame.&lt;br /&gt;
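&lt;br /&gt;
Instead of burning cycles, the main loop can also wait until the raster beam has left the window again. The following is a sketch of that idea, not a size-optimized routine. It assumes that the per-frame work preserves the accumulator; the INC $D020 is only a placeholder for that work:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
        lda #$E3&lt;br /&gt;
waitsync:&lt;br /&gt;
        cmp $D012&lt;br /&gt;
        bcs waitsync     ;loop while the raster low byte is $E3 or less&lt;br /&gt;
        inc $D020        ;per-frame work goes here, must keep A intact&lt;br /&gt;
leave:&lt;br /&gt;
        cmp $D012&lt;br /&gt;
        bcc leave        ;loop while the raster low byte is still above $E3&lt;br /&gt;
        bcs waitsync     ;carry is set here, so this branch is always taken&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;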
&lt;br /&gt;
==== Getting something on screen ====&lt;br /&gt;
&lt;br /&gt;
Output a character:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
        lda     #C       ;C must be the char code according to the PETSCII code table&lt;br /&gt;
        jsr     $FFD2    ;you can also use the direct address $F1CA, function preserves A,X,Y and enables interrupts&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Since the [https://en.wikipedia.org/wiki/PETSCII#Character_set PETSCII code table] also contains control characters, this method can also be used to clear the screen, move the cursor or set the text color. However, some functions are shorter (and faster) by directly calling the respective KERNAL function:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
Clear screen             JSR $E544&lt;br /&gt;
Cursor home position     JSR $E566&lt;br /&gt;
Output newline character JSR $AAD7&lt;br /&gt;
Set text color           LDA #COLOR, STA $286&lt;br /&gt;
Set border color         LDA #COLOR, STA $D020   ;not possible with a PETSCII control code&lt;br /&gt;
Set background color     LDA #COLOR, STA $D021   ;not possible with a PETSCII control code&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Copy characters directly to screen memory:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
.macpack cbm   ;scrcode macro&lt;br /&gt;
SCREEN_BASE=$400&lt;br /&gt;
LINE=5    ;for example&lt;br /&gt;
COLUMN=10 ;for example&lt;br /&gt;
&lt;br /&gt;
        ldx     #endtext-text-1&lt;br /&gt;
@loop:  lda     text,x&lt;br /&gt;
        sta     SCREEN_BASE+LINE*40+COLUMN,x&lt;br /&gt;
        dex&lt;br /&gt;
        bpl     @loop&lt;br /&gt;
        rts              ;return (or continue elsewhere) so the CPU does not run into the text data&lt;br /&gt;
text:   scrcode &amp;quot;hello world!&amp;quot;&lt;br /&gt;
endtext:&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that the text needs to be stored in '''screen code''' format, not in PETSCII (see [https://sta.c64.org/cbm64pettoscr.html Commodore 64 PETSCII code to screen code conversion] for the differences). Furthermore, the example above only writes to the screen memory, not the color RAM, assuming that the color RAM already contains the intended text color. This is the case after issuing a clear screen on most C64 machines, except for some old ROM revisions. To be on the safe side you would have to write to color RAM as well, which costs another 5 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
.macpack cbm     ;scrcode macro&lt;br /&gt;
SCREEN_BASE=$400&lt;br /&gt;
COLOR_RAM  =$D800&lt;br /&gt;
LINE       =5    ;for example&lt;br /&gt;
COLUMN     =10   ;for example&lt;br /&gt;
COLOR      =1    ;white&lt;br /&gt;
&lt;br /&gt;
        ldx     #endtext-text-1&lt;br /&gt;
@loop:  lda     text,x&lt;br /&gt;
        sta     SCREEN_BASE+LINE*40+COLUMN,x&lt;br /&gt;
        lda     #COLOR&lt;br /&gt;
        sta     COLOR_RAM  +LINE*40+COLUMN,x&lt;br /&gt;
        dex&lt;br /&gt;
        bpl     @loop&lt;br /&gt;
        rts              ;return (or continue elsewhere) so the CPU does not run into the text data&lt;br /&gt;
text:   scrcode &amp;quot;hello world!&amp;quot;&lt;br /&gt;
endtext:&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sound ===&lt;br /&gt;
The Commodore 64 uses the famous SID chip to generate sound.&lt;br /&gt;
To be added soon.&lt;br /&gt;
&lt;br /&gt;
==== Make some noise ====&lt;br /&gt;
To be added soon.&lt;br /&gt;
&lt;br /&gt;
=== Useful Kernal functions ===&lt;br /&gt;
&lt;br /&gt;
==== Get a pseudo random number ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
        lda     #1   &lt;br /&gt;
        jsr     $E09A&lt;br /&gt;
        lda     $63&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The sequence of random numbers is always the same after a cold start or reset. The first 20 numbers are:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
 4,24,231,4,177,170,193,101,231,18,196,1,75,48,149,36,124,53,65,154&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Increment a zero page counter, return in X register, clear Y register ====&lt;br /&gt;
&lt;br /&gt;
This is part of the string handling function in the BASIC ROM, but the code could be useful for some loop stuff.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
        jsr     $B5FF    ;does inc $23, ldx $23, ldy #$00, rts&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Additional Resources ===&lt;br /&gt;
* Codebase 64 https://codebase64.org/doku.php?id=base:start&lt;br /&gt;
* Commodore 64 Memory map https://sta.c64.org/cbm64mem.html&lt;br /&gt;
* C64 BASIC &amp;amp; KERNAL ROM Disassembly https://www.pagetable.com/c64ref/c64disasm/&lt;br /&gt;
* No More Secrets https://csdb.dk/release/?id=198357 Everything about illegal opcodes and their use in a C64 system&lt;br /&gt;
&lt;br /&gt;
== Apple II ==&lt;br /&gt;
The Apple II is an 8-bit home computer and one of the world's first highly successful mass-produced microcomputer products. It was designed primarily by Steve Wozniak.&lt;br /&gt;
&lt;br /&gt;
=== Setting up ===&lt;br /&gt;
* ACME 6502 cross-assembler (https://sourceforge.net/projects/acme-crossass/)&lt;br /&gt;
* Apple Commander (http://applecommander.sourceforge.net) for batch compilation&lt;br /&gt;
* AppleWin emulator (https://github.com/AppleWin/AppleWin/releases). Supports Mockingboard card (AY-8910 + speech synthesizer), HDD, Z80 card (for CP/M), mouse etc.&lt;br /&gt;
* CiderPress (https://a2ciderpress.com)&lt;br /&gt;
&lt;br /&gt;
Compilation can be done as follows (master.dsk is included with AppleWin):&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
acme hl.asm&lt;br /&gt;
java -jar AppleCommander-1.3.5.jar -d master.dsk hl&lt;br /&gt;
java -jar AppleCommander-1.3.5.jar -p master.dsk hl B 24576 &amp;lt; hl.bin&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Memory Map ===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
    0-255        $0-$FF     ZERO-PAGE SYSTEM STORAGE&lt;br /&gt;
  256-511      $100-$1FF    SYSTEM STACK&lt;br /&gt;
  512-767      $200-$2FF    KEYBOARD CHARACTER BUFFER&lt;br /&gt;
  768-975      $300-$3CF    OFTEN AVAILABLE AS FREE SPACE FOR USER PROGRAMS&lt;br /&gt;
  976-1023     $3D0-$3FF    SYSTEM VECTORS&lt;br /&gt;
 1024-2047     $400-$7FF    TEXT AND LO-RES GRAPHICS PAGE 1 &amp;lt;--- !!!&lt;br /&gt;
 2048-LOMEM    $800-LOMEM   PROGRAM STORAGE&lt;br /&gt;
 2048-3071     $800-$BFF    TEXT AND LO-RES GRAPHICS PAGE 2 OR FREE SPACE&lt;br /&gt;
 3072-8191     $C00-$1FFF   FREE SPACE UNLESS RAM APPLESOFT IS IN USE&lt;br /&gt;
 8192-16383   $2000-$3FFF   HI-RES PAGE 1 OR FREE SPACE &amp;lt;--- !!!&lt;br /&gt;
16384-24575   $4000-$5FFF   HI-RES PAGE 2 OR FREE SPACE&lt;br /&gt;
24576-38399   $6000-$95FF   FREE SPACE AND STRING STORAGE&lt;br /&gt;
38400-49151   $9600-$BFFF   DOS&lt;br /&gt;
49152-53247   $C000-$CFFF   I/O HARDWARE (RESERVED)&lt;br /&gt;
53248-57343   $D000-$DFFF   APPLESOFT IN LANGUAGE CARD OR ROM&lt;br /&gt;
57344-63487   $E000-$F7FF   APPLESOFT OR INTEGER BASIC IN LANGUAGE CARD OR ROM&lt;br /&gt;
63488-65535   $F800-$FFFF   SYSTEM MONITOR&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Display ===&lt;br /&gt;
&lt;br /&gt;
==== Graphics Modes ====&lt;br /&gt;
: Text mode 40x24; on the Apple IIe, 80x24 is also available (use PR#3 or the hardware switch to change mode)&lt;br /&gt;
: LowRes 40x48, 16 colors: https://en.wikipedia.org/wiki/Apple_II_graphics &lt;br /&gt;
: Hires mode 280x192, 6 colors: https://www.xtof.info/blog/?p=768&lt;br /&gt;
https://mrob.com/pub/xapple2/colors.html&lt;br /&gt;
https://archive.org/details/HiRes_Color_Graphics_on_the_Apple_II_Computer_by_Wozniak&lt;br /&gt;
&lt;br /&gt;
However, for sizecoding you almost never want direct access to the Apple II graphics memory, because the Apple II graphics modes are horrible.  The only fast way to do things is with large lookup tables.  To do hires you need to divide by 7, which as you can imagine is a bit difficult to do compactly on the 6502. Double-hires is even crazier on top of that.  Deater did manage a color-bar style effect in double-hires in 128B, but that was doing some crazy tricks with the firmware BASIC routines, definitely not direct access.&lt;br /&gt;
&lt;br /&gt;
Lores and Hires can each be shown in mixed mode (graphics with text lines at the bottom) or as full-screen graphics.&lt;br /&gt;
The screen memory is not laid out linearly and contains so-called screen holes (https://retrocomputing.stackexchange.com/questions/2534/what-are-the-screen-holes-in-apple-ii-graphics). The GBASCALC ($F847) routine calculates the address of a horizontal line: in: A = row (Y), out: GBASL/GBASH ($26/$27) = address. See also https://www.callapple.org/uncategorized/use-of-apple-ii-color-graphics-in-assembly-language/&lt;br /&gt;
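&lt;br /&gt;
As a small sketch of how GBASCALC is used, the following fragment draws one pair of lo-res blocks. It relies on the same ROM entry points (SETGR at $FB40, GBASCALC at $F847) as the XOR texture example on this page; the row, column and color values are arbitrary choices for illustration:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
	JSR $FB40      ; SETGR: switch to lo-res graphics&lt;br /&gt;
	LDA #10        ; text row 0..23, each row holds two lo-res pixels&lt;br /&gt;
	JSR $F847      ; GBASCALC: row address to GBASL/GBASH ($26/$27)&lt;br /&gt;
	LDY #20        ; column 0..39&lt;br /&gt;
	LDA #$11       ; upper and lower pixel both in color 1&lt;br /&gt;
	STA ($26),Y    ; write into the text/lo-res page&lt;br /&gt;
	RTS&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;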
&lt;br /&gt;
==== Getting something on screen ====&lt;br /&gt;
Here is an example of an XOR texture, created by g0blinish:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
 *=$6000&lt;br /&gt;
 !to &amp;quot;HL.bin&amp;quot;, plain	; set output file and format&lt;br /&gt;
 !cpu 6502		; set processor type&lt;br /&gt;
&lt;br /&gt;
GBASL	=	$26&lt;br /&gt;
GBASH	=	$27&lt;br /&gt;
SETGR    =     $FB40 ; setup LoRes&lt;br /&gt;
GBASCALC = $F847 ; calc Address&lt;br /&gt;
&lt;br /&gt;
CLRTEXT =  $C050 ;display graphics &lt;br /&gt;
SETTEXT =  $C051 ;display text &lt;br /&gt;
&lt;br /&gt;
CLRMIXED = $C052 ;clear mixed mode- enable full graphics &lt;br /&gt;
SETMIXED = $C053 ;enable graphics/text mixed mode &lt;br /&gt;
&lt;br /&gt;
PAGE1 =    $C054 ;select text/graphics page1 &lt;br /&gt;
PAGE2 =    $C055 ;select text/graphics page2 &lt;br /&gt;
&lt;br /&gt;
CLRHIRES = $C056 ;select Lo-res &lt;br /&gt;
SETHIRES = $C057 ;select Hi-res &lt;br /&gt;
&lt;br /&gt;
TMP= $FA&lt;br /&gt;
&lt;br /&gt;
	JSR   SETGR      ;GR&lt;br /&gt;
	BIT CLRMIXED ; full screen&lt;br /&gt;
&lt;br /&gt;
	LDA #0 ; A=0&lt;br /&gt;
	STA TMP ; POKE $FA,A&lt;br /&gt;
&lt;br /&gt;
YLP ;&lt;br /&gt;
	LDA TMP ; A=PEEK($FA)&lt;br /&gt;
;	LSR ; A=A/2&lt;br /&gt;
	JSR GBASCALC&lt;br /&gt;
	LDY #0;Y=0&lt;br /&gt;
&lt;br /&gt;
XLP TYA ; A=Y&lt;br /&gt;
	EOR TMP ; A=A xor PEEK($FA)&lt;br /&gt;
	and #$0F ; A=A and 15&lt;br /&gt;
	TAX ; X=A&lt;br /&gt;
	LDA COLORS,X ;A=PEEK(COLORS+X)&lt;br /&gt;
	STA(GBASL),Y ; POKE PEEK($26)+256*PEEK($27)+Y,A&lt;br /&gt;
	INY ; Y=Y+1&lt;br /&gt;
	CPY #40 ; Y=40?&lt;br /&gt;
	BNE XLP&lt;br /&gt;
	INC TMP ; POKE $FA,PEEK($FA)+1&lt;br /&gt;
	LDA TMP ; A=PEEK($FA)&lt;br /&gt;
	CMP #24 ; A=24?&lt;br /&gt;
	BNE YLP&lt;br /&gt;
	&lt;br /&gt;
M1 JMP M1 ; endless loop, replace with RTS to return to the caller&lt;br /&gt;
&lt;br /&gt;
COLORS ;N*17, pixel format is AAAABBBB, AAAA - upper dot, BBBB - lower dot&lt;br /&gt;
!byte $00,$11,$22,$33,$44,$55,$66,$77&lt;br /&gt;
!byte $88,$99,$AA,$BB,$CC,$DD,$EE,$FF&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sound ===&lt;br /&gt;
Here is an example of using the speaker, based on the following BASIC program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;basic&amp;quot;&amp;gt;&lt;br /&gt;
; 50  POKE 768,V: POKE 769,P - 255 *  INT (P / 256): POKE 800,1 + P / 256&lt;br /&gt;
; 60  CALL 770: RETURN &lt;br /&gt;
; 95  FOR K = 1 TO N: READ V(K),P(K): NEXT K&lt;br /&gt;
; 100  FOR K = 1 TO N:V = V(K):P = P(K)&lt;br /&gt;
; 110  GOSUB 50&lt;br /&gt;
;!byte 173,48,192,136,208,5,206,1,3,240,9,202,208,245,174,0,3,76,2,3,206,32,3,208,240,96&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
 *=$6000&lt;br /&gt;
 !to &amp;quot;HL.bin&amp;quot;, plain	; set output file and format&lt;br /&gt;
 !cpu 6502		; set processor type&lt;br /&gt;
&lt;br /&gt;
;start&lt;br /&gt;
; 95  FOR K = 1 TO N: READ V(K),P(K): NEXT K&lt;br /&gt;
; 100  FOR K = 1 TO N:V = V(K):P = P(K)&lt;br /&gt;
ini:&lt;br /&gt;
	lda #70&lt;br /&gt;
	sta cnt+1&lt;br /&gt;
	lda #music&amp;amp;255&lt;br /&gt;
	sta gotbyte+1&lt;br /&gt;
	lda #music/256&lt;br /&gt;
	sta gotbyte+2&lt;br /&gt;
&lt;br /&gt;
lop:&lt;br /&gt;
;V&lt;br /&gt;
	jsr gotbyte&lt;br /&gt;
	sta L300&lt;br /&gt;
	jsr gotbyte&lt;br /&gt;
;P&lt;br /&gt;
	jsr gotbyte&lt;br /&gt;
	sta L301&lt;br /&gt;
	jsr gotbyte&lt;br /&gt;
	clc&lt;br /&gt;
	adc #1&lt;br /&gt;
	sta L320&lt;br /&gt;
	jsr beep&lt;br /&gt;
	&lt;br /&gt;
	dec cnt+1&lt;br /&gt;
cnt lda #70&lt;br /&gt;
	bne lop&lt;br /&gt;
; 110  GOSUB 50&lt;br /&gt;
; 50  POKE 768,V: POKE 769,P - 255 *  INT (P / 256): POKE 800,1 + P / 256&lt;br /&gt;
; 60  CALL 770: RETURN &lt;br /&gt;
	jmp ini&lt;br /&gt;
gotbyte&lt;br /&gt;
	lda music&lt;br /&gt;
	inc gotbyte+1&lt;br /&gt;
	bne noinch&lt;br /&gt;
	inc gotbyte+2&lt;br /&gt;
noinch&lt;br /&gt;
	rts&lt;br /&gt;
;!byte 173,48,192,136,208,5,206,1,3,240,9,202,208,245,174,0,3,76,2,3,206,32,3,208,240,96&lt;br /&gt;
beep:&lt;br /&gt;
	ldy #1&lt;br /&gt;
	ldx #1&lt;br /&gt;
loc_302:&lt;br /&gt;
		LDA	$C030&lt;br /&gt;
&lt;br /&gt;
loc_305:&lt;br /&gt;
		DEY&lt;br /&gt;
		BNE	loc_30D&lt;br /&gt;
		DEC	L301&lt;br /&gt;
loc_30B:&lt;br /&gt;
		BEQ	loc_316&lt;br /&gt;
&lt;br /&gt;
loc_30D:&lt;br /&gt;
		DEX&lt;br /&gt;
		BNE	loc_305&lt;br /&gt;
		LDX	L300&lt;br /&gt;
		JMP	loc_302&lt;br /&gt;
loc_316:&lt;br /&gt;
		DEC	L320&lt;br /&gt;
		BNE	loc_30B&lt;br /&gt;
		RTS&lt;br /&gt;
L301 !byte 0&lt;br /&gt;
L300 !byte 0&lt;br /&gt;
L320 !byte 0&lt;br /&gt;
music&lt;br /&gt;
 !word 76,192,85,64,96,64,102,64,114,128,114,64,96,64,102,64,114,64,128,64&lt;br /&gt;
 !word 114,64,152,64,171,64,152,512,76,192,85,64,96,64,102,64,114,128,114,64&lt;br /&gt;
 !word 96,64,102,64,114,64,128,64,114,64,152,64,171,64,152,512,85,64,85,64&lt;br /&gt;
 !word 85,64,96,64,144,128,144,64,128,64,76,128,85,64,96,64,144,128,114,64&lt;br /&gt;
 !word 96,64,102,128,114,64,128,64,128,128,114,64,128,64,114,512,85,64,85,64&lt;br /&gt;
 !word 85,64,96,64,144,128,144,64,128,64,76,128,85,64,96,64,144,128,114,64&lt;br /&gt;
 !word 96,64,102,128,114,64,128,64,128,64,128,128,96,64,85,64,96,64,102,64,114,64,114,64&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
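&lt;br /&gt;
The core of the routine above is the access to $C030, which toggles the speaker each time it is read. A stripped-down sketch that plays a single tone could look like this (the delay and length constants are arbitrary values chosen for illustration):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;6502&amp;quot;&amp;gt;&lt;br /&gt;
	LDX #0         ; number of speaker toggles (0 means 256)&lt;br /&gt;
click:&lt;br /&gt;
	LDA $C030      ; every access to $C030 toggles the speaker&lt;br /&gt;
	LDY #100       ; delay between toggles, smaller value = higher pitch&lt;br /&gt;
delay:&lt;br /&gt;
	DEY&lt;br /&gt;
	BNE delay&lt;br /&gt;
	DEX&lt;br /&gt;
	BNE click&lt;br /&gt;
	RTS&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;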
&lt;br /&gt;
=== Additional Resources ===&lt;br /&gt;
* Deater's page on Apple II sizecoding http://www.deater.net/weave/vmwprod/demos/sizecoding.html (with an additional presentation at https://www.youtube.com/watch?v=kJ-EXbnw0kQ )&lt;br /&gt;
* Article on double hi-res http://www.battlestations.zone/2017/04/apple-ii-double-hi-res-from-ground-up.html&lt;br /&gt;
* Applesoft Hi-Res Subroutines : http://hackzapple.org/scripts_php/index.php?menu=5&amp;amp;mod=ASM&amp;amp;sub=AAL&amp;amp;sub2=8112&amp;amp;PHPSESSID=f65fabfd0cdbf56b6bdc0ddac25117c6#a2&lt;br /&gt;
&lt;br /&gt;
== Atari Lynx ==&lt;br /&gt;
The Atari Lynx consists of a 6502-family processor (65SC02) with custom hardware for graphics and sound.&lt;br /&gt;
&lt;br /&gt;
=== Setting up ===&lt;br /&gt;
Setting up your development platform for the Atari Lynx:&lt;br /&gt;
&lt;br /&gt;
* Assembler: -&lt;br /&gt;
* Emulator(s): -&lt;br /&gt;
&lt;br /&gt;
=== Video display ===&lt;br /&gt;
To be added soon.&lt;br /&gt;
&lt;br /&gt;
==== Getting something on screen ====&lt;br /&gt;
To be added soon.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Sound ===&lt;br /&gt;
To be added soon.&lt;br /&gt;
&lt;br /&gt;
==== Make some noise ====&lt;br /&gt;
To be added soon.&lt;br /&gt;
&lt;br /&gt;
=== Additional Resources ===&lt;br /&gt;
Sizecoding resources for the Atari Lynx are sparse:&lt;br /&gt;
* 42Bastian's website (link to be added)&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Memories&amp;diff=504</id>
		<title>Memories</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Memories&amp;diff=504"/>
				<updated>2020-04-20T02:25:48Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
Hello! My name is &amp;quot;HellMood&amp;quot; and this is not a usual wiki style article. It will be written from my point of view, in the first person. This article is all about a tiny MS DOS program called &amp;quot;Memories&amp;quot;. This program has a size of 256 bytes and won the &amp;quot;PC 256 byte&amp;quot; competition of the [https://en.wikipedia.org/wiki/Demoscene demoscene] event [https://2020.revision-party.net/ &amp;quot;Revision&amp;quot;] in 2020, as well as the public choice award. You can watch a video of the output of this program here (https://www.youtube.com/watch?v=Imquk_3oFf4) and a video including live reactions of the online audience and moderators here (https://www.twitch.tv/videos/589179638?t=7h02m40s). You can download the release and comment here (https://www.pouet.net/prod.php?which=85227). This wiki article will give you an in-depth analysis of the program and point out historical references as well as development steps. In case you did not notice, you are now on the &amp;quot;sizecoding wiki&amp;quot;. This will not only help you to understand the inner workings of &amp;quot;Memories&amp;quot;, but also enable you to produce something like this yourself. Feel free to look around! It is recommended that you start with the [http://www.sizecoding.org/wiki/Getting_Started basics of this wiki] first if you're new to sizecoding and/or x86 assembler. While the principles might be easy to grasp, details of the implementation might not.&lt;br /&gt;
&lt;br /&gt;
==Disclaimer==&lt;br /&gt;
This article discusses the submitted DosBox version (256 bytes). The available archive contains versions for FreeDos and Windows XP DOS which, at the time of this writing, don't work on all computers. These alternative versions were included as proof of concept, to show that this program does not only run on the emulator &amp;quot;DosBox&amp;quot;. For the &amp;quot;PC 256 bytes&amp;quot; competition of the event &amp;quot;Revision&amp;quot; in 2020, it was allowed to specify either &amp;quot;FreeDos&amp;quot; or &amp;quot;DosBox&amp;quot; as platform, the latter with a specific configuration. It is in fact possible to modify the DosBox version to work on FreeDos, MS DOS, WinXP and Win98, as the released alternative versions showcase, but this is not the focus of this article. Anyway, safe versions for all platforms and all computers are in the works. Regarding possible further optimizations, I will discuss/explain only the submitted version, although I have already found several spots in the code that can be optimized. As the history of tiny effects shows, almost no code is perfect, and most of it can be brought down further in size. To avoid confusion between different versions, only the submitted version will be explained.&lt;br /&gt;
&lt;br /&gt;
==History of tiny effects==&lt;br /&gt;
[[File:Pouet categories.png|thumb|Size categories on http://www.pouet.net]]&lt;br /&gt;
Sizecoders think in size categories. For MS DOS, these categories are 256b, 128b, 64b and 32b. These are the standards of one of the biggest demoscene archives, http://www.pouet.net. There is no 16b category, although many tiny effects can be implemented in 16 bytes. Almost all of the effects of &amp;quot;Memories&amp;quot; had been coded and optimized by me before, mostly as attempts to shrink an already existing effect, or to do something similar in less space. Reducing the size in this case means reducing it to the next lower 2^N category (or an even lower one). For example, I might have seen an effect done in 33 to 64 bytes and then reduced it to 32 bytes or less. Almost every time I moved an effect to a lower category, I submitted the resulting tiny program to a demoscene party which allowed remote entries in the 256b category and/or published the result on http://www.pouet.net . In this section I will introduce the effects and mention their origins and authors.&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Kasparov.gif|frame|kasparov, 16 bytes]]&lt;br /&gt;
The origin of this effect is my own &amp;quot;Kasparov 16b&amp;quot; from 2018 (https://www.pouet.net/prod.php?which=75912). Apparently I &amp;quot;had this piece lying around, stuck at 17 bytes and not that beautiful, until i found a trick ;)&amp;quot; (quote from the release notes). There was a similar effect in 32 bytes before: &amp;quot;ew&amp;quot; by &amp;quot;headcrash&amp;quot; from 2003 (https://www.pouet.net/prod.php?which=17567). In this case, my design goal was to show &amp;quot;real&amp;quot; chessboards of 8x8 fields with recognizable dark and light squares, as well as the correct orientation of the single chessboards, meaning that the bottom right corner (h1) has to be a [https://en.wikibooks.org/wiki/Chess/Arranging_The_Board white square]. For &amp;quot;Memories&amp;quot;, the effect had to be reimplemented with a different approach to writing to the screen so that it works with the framework. The scrolling direction was also changed, to differ from the &amp;quot;scrolling tilted plane&amp;quot; effect.&lt;br /&gt;
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
The zooming circles were supposed to be entered at a demoparty as a 32 byte intro, but I never actually did it. There is no real predecessor in 64 bytes, because much more complex effects are possible in the 64b category. The zooming circles were the result of desperately trying to reach the 32b category for a circular &amp;quot;tunnel&amp;quot; effect, for which my own record is still 52 bytes (&amp;quot;Neontube&amp;quot;, 2016, https://www.pouet.net/prod.php?which=66808), which in turn was an optimization of the all-time classic 64 byte effect &amp;quot;constant evolution&amp;quot; by ryg/Farbrausch (2003, https://www.pouet.net/prod.php?which=8697). In the zooming circles routine, the distance and the angle are removed/ignored, so that the 32b category could be reached.&lt;br /&gt;
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Floorcast variation.png|thumb|Floorcast, 32 bytes version, variation]]&lt;br /&gt;
The scrolling tilted plane is one of my own releases, &amp;quot;floorcast 32b&amp;quot; from 2018. Floorcasts have their own history in sizecoding: the effect was brought down stepwise from 256 bytes to 32 bytes. The versions differ in the number of planes; some show two planes, others only one. For the &amp;quot;floorcast 32b&amp;quot; release I specifically decided against the &amp;quot;XOR&amp;quot; texture, whereas in &amp;quot;Memories&amp;quot; I used it again, but disguised with another final touch of &amp;quot;AND&amp;quot;.&lt;br /&gt;
* &amp;quot;rain_storm&amp;quot; version 2008 - 256 bytes - https://www.pouet.net/prod.php?which=50479&lt;br /&gt;
* &amp;quot;org_100h&amp;quot; version 2008 - 128 bytes - https://www.pouet.net/prod.php?which=63271&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2013 - 86 bytes - https://www.pouet.net/prod.php?which=61070&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2014 - 64 bytes - https://www.pouet.net/prod.php?which=63305&lt;br /&gt;
* &amp;quot;HellMood&amp;quot; version 2018 - 32 bytes - https://www.pouet.net/prod.php?which=77774&lt;br /&gt;
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Projektbeschreibung.gif|frame|Projektbeschreibung, 32 bytes]]&lt;br /&gt;
The parallax checkerboards were released by me as the 32 byte effect &amp;quot;Projektbeschreibung&amp;quot; in 2018. It was a direct attempt to bring &amp;quot;Follow the light&amp;quot; (https://www.pouet.net/prod.php?which=28543) by &amp;quot;Digimind&amp;quot; (2006) or my own &amp;quot;Lucy&amp;quot; (2014, https://www.pouet.net/prod.php?which=63293) down to 32 bytes. Very helpful inspiration came from &amp;quot;Rrrola&amp;quot;'s &amp;quot;Paralaxa&amp;quot; (32 bytes, 2007, https://www.pouet.net/prod.php?which=32281) and &amp;quot;Sensenstahl&amp;quot;'s &amp;quot;Byteropolis&amp;quot; (2013, https://www.pouet.net/prod.php?which=61075). In fact, Rrrola's rendering approach was already close to the final solution; the code was modified to correct the arrangement of the planes, change the shape from triangles to checkerboards and improve the colors. In &amp;quot;Memories&amp;quot;, the coloring of &amp;quot;Digimind&amp;quot;'s version was used. Also, the effect has been modified to reduce shearing as much as possible.&lt;br /&gt;
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Colpinski.png|thumb|colpinski, 16 bytes]]&lt;br /&gt;
[[File:Rotastic.png|thumb|left|rotastic, 32 bytes]]&lt;br /&gt;
This consists of two effects: the rotozoomer and the sierpinski effect as a texture. The sierpinski effect is based on my own &amp;quot;Colpinski 16b&amp;quot; from 2013 (https://www.pouet.net/prod.php?which=62079), where &amp;quot;frag&amp;quot; of &amp;quot;fsqrt&amp;quot; helped to push the limits. There is no real predecessor to that, because it is known that this effect can be done directly by combining X and Y, instead of using an Iterated Function System or Cellular Automaton approach. The rotozoomer was released by me as the 32b intro &amp;quot;rotastic&amp;quot; in 2017 (https://www.pouet.net/prod.php?which=70112). It is based on the ideas of &amp;quot;ryg&amp;quot; of &amp;quot;Farbrausch&amp;quot; (51 bytes, 2002, https://www.pouet.net/prod.php?which=5015) and &amp;quot;Gargaj&amp;quot; of &amp;quot;Conspiracy&amp;quot; (49 bytes, 2002, https://www.pouet.net/prod.php?which=6088).&lt;br /&gt;
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Intoanew.gif|frame|Into a new era, 64 bytes version]]&lt;br /&gt;
The raycast bent tunnel is a tweaked version of my own 64 bytes &amp;quot;Into a new era&amp;quot; release (2018, https://www.pouet.net/prod.php?which=78044). The custom colors were exchanged for colors of the standard palette, the geometry was tweaked a bit and changed accordingly, so that no depth effect was used for the calculation of texture values. The standalone version of this effect has a size of 50 bytes. The two main inspirations for the 64 bytes version were two 128 byte intros, &amp;quot;Spongy&amp;quot; by &amp;quot;TBC&amp;quot; (2009, https://www.pouet.net/prod.php?which=53871) and &amp;quot;Wolf128&amp;quot; from &amp;quot;Baudsurfer&amp;quot; (2014, https://www.pouet.net/prod.php?which=63518), while the particular algorithm was developed independently.&lt;br /&gt;
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Ocean.png|thumb|Ocean, 64b version]]&lt;br /&gt;
The ocean effect is based on my own 64 bytes release &amp;quot;Ocean&amp;quot; from 2016 (https://www.pouet.net/prod.php?which=66857). The custom color generation and music generation have been stripped from the original, since neither was compatible with the main framework of &amp;quot;Memories&amp;quot; without consuming a LOT of extra space. The special effect of &amp;quot;dawn&amp;quot; happens as a byproduct of the general framework approach, which is explained in the next chapter.&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
The transition between two effects is an effect in itself, which has no real predecessor. It is rather an idea that evolved over the years, and it might have been done by many others in similar ways. In a nutshell: while a frame is calculated, a pseudorandom value is derived from each pixel position, and the time, which determines the effect to use, is offset by this value before being scaled down. This allows the usage of the standard VGA palette (picture, source) rather than creating custom colors for real crossfading, and thus saves space.&lt;br /&gt;
&lt;br /&gt;
==Tiny megademo framework==&lt;br /&gt;
To put many tiny effects together into a &amp;quot;megademo&amp;quot;, they all have to use the same approach and be mostly free of assumptions (about the contents of memory and registers). They also have to use the same timing values and be tuned to a shared timing in general. Preparing the single effects so that they could be included in a framework like this took quite some time and initially required a lot of extra space. It has to be noted that, sadly, some of my most impressive effects (judged by audience reactions and views on social media) could not be included at all, because the approaches could not be combined without a HUGE overhead. Once all effects were streamlined, I could think about &amp;quot;factoring out&amp;quot; common calculations which occur repeatedly, resulting in a few saved bytes. The framework does the following:&lt;br /&gt;
* Setting up screen mode to 320 x 200 pixels in 256 colors&lt;br /&gt;
* Initializing a pointer to the screen&lt;br /&gt;
* Installing a callback for timing and music&lt;br /&gt;
* Mainloop&lt;br /&gt;
** Calculation of X and Y from the current screen location&lt;br /&gt;
*** with the [http://www.sizecoding.org/wiki/General_Coding_Tricks#Obtaining_X_and_Y_without_DIV_.28The_Rrrola_Trick.29 &amp;quot;Rrrola Trick&amp;quot;]&lt;br /&gt;
** P = Pseudorandom value from screen location&lt;br /&gt;
** Offset current time T by a scaled value P'&lt;br /&gt;
** Selection of effect number N according to T'&lt;br /&gt;
** Execution of effect N (X,Y) for current pixel&lt;br /&gt;
*** On entry, XY is in register DX (DL,DH)&lt;br /&gt;
*** On entry, the current time is in register BP&lt;br /&gt;
*** On exit, the pixel color is expected in AL&lt;br /&gt;
** Advance to next pixel&lt;br /&gt;
** Triple diagonal interlacing for smoothing&lt;br /&gt;
** Repeat until frame is complete&lt;br /&gt;
* Set timer to a tempo of ~35 FPS&lt;br /&gt;
* Check keyboard for ESC&lt;br /&gt;
** Quit on ESC, otherwise continue&lt;br /&gt;
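The X/Y step in the list above can be modelled outside of DOS. The following Python sketch is not part of the release; it only mimics the 16-bit multiplication of DI by the constant 0xCCCD that the framework uses. The helper names (rrrola_xy, screen_pointer) are made up for illustration, and the +160 offset assumes that the segment 0xA000-10 shifts the visible screen by 10 paragraphs (160 bytes).&lt;br /&gt;

```python
RRROLA = 0xCCCD  # constant loaded by 'mov ax,0xcccd' in the framework


def screen_pointer(x, y):
    # Hypothetical helper: DI for pixel (x, y), assuming ES = 0xA000-10,
    # i.e. the visible screen starts 160 bytes into the segment.
    return (320 * y + x + 160) % 0x10000


def rrrola_xy(di):
    # Model of 'mul di': DX receives the high word of the 32-bit product.
    dx = (di * RRROLA) // 0x10000
    return dx % 256, dx // 256  # (DL, DH)


if __name__ == '__main__':
    for x, y in [(0, 0), (159, 100), (160, 100), (319, 199)]:
        dl, dh = rrrola_xy(screen_pointer(x, y))
        print(f'pixel ({x:3},{y:3}) -> DL={dl:3} DH={dh:3}')
```

In this model DH equals DI // 320 (a row number) and DL equals DI * 4 // 5 modulo 256, so DL wraps to 0 in the middle of a screen row and can be read as a signed, centered X.&lt;br /&gt;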
===Code of framework===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
	mov al,0x13				; set AL to mode 320*200 in 256 colors&lt;br /&gt;
	int 0x10	 			; call BIOS to set mode&lt;br /&gt;
	xchg bp,ax				; set timing value to 0x13 &lt;br /&gt;
	push 0xa000-10			; write the screen address to register ES&lt;br /&gt;
	pop es					; works in conjunction with Rrrola trick&lt;br /&gt;
	mov ax,0x251c			; parameter for changing timer interrupt&lt;br /&gt;
	mov dl,timer			; address of timer routine, assume DH=1&lt;br /&gt;
	int 0x21				; install timer routine&lt;br /&gt;
top:&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
	call bx					; call the effect&lt;br /&gt;
	stosb					; write the return value and advance&lt;br /&gt;
	inc di					; triple interlace trick for after&lt;br /&gt;
	inc di					; effect and smoothing the animation&lt;br /&gt;
	jnz top					; repeat until the frame is complete&lt;br /&gt;
	mov al,tempo			; set AL to divider for timer&lt;br /&gt;
	out 40h,al				; set timing (dual pass)&lt;br /&gt;
	in al,0x60				; read keyboard&lt;br /&gt;
	dec al					; quit on ESC&lt;br /&gt;
	jnz top					; otherwise repeat loop&lt;br /&gt;
sounds: db 0xc3, 11, 0x93; 0xc3 is MIDI/RET; fx2-s is used as volume&lt;br /&gt;
table: 	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
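The comment &amp;quot;set timing (dual pass)&amp;quot; in the listing above can be read as follows; this is an interpretation, not something the release notes state. The sketch assumes that timer channel 0 takes its 16-bit divisor as low byte followed by high byte, so that the single byte written once per frame ends up in both halves after two frames. The function names are hypothetical; the numbers come from the defines in the complete listing (tempo = 1193182/256/targetFPS with targetFPS = 35).&lt;br /&gt;

```python
PIT_HZ = 1193182   # input clock of the PC timer chip in Hz
TARGET_FPS = 35    # 'targetFPS' in the release source


def tempo_byte(target_fps=TARGET_FPS):
    # Same integer arithmetic as '%define tempo 1193182/256/targetFPS'.
    return PIT_HZ // 256 // target_fps


def effective_rate(byte):
    # Assumption: two consecutive writes of the same byte become the
    # low and the high half of the 16-bit divisor.
    divisor = byte * 256 + byte
    return PIT_HZ / divisor


if __name__ == '__main__':
    b = tempo_byte()
    print('tempo byte:', b)
    print('timer rate in Hz:', round(effective_rate(b), 2))
```

Under these assumptions the timer interrupt fires slightly less than 35 times per second, which fits the &amp;quot;~35 FPS&amp;quot; stated in the list of framework steps.&lt;br /&gt;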
&lt;br /&gt;
==Explanation of effects==&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Array of chessboards.png|thumb|array of chessboards]]&lt;br /&gt;
The simplest effect and a good one to start with. After offsetting the row by the time, the classic XOR pattern is applied. To achieve the impression of a grid of chessboards, all bits but two are set in the color. The real trick lies in shifting to a &amp;quot;good&amp;quot; palette spot. The portion of the frame which is perceived as black isn't actually black; it's the dark part of the [https://en.wikipedia.org/wiki/Video_Graphics_Array#Color_palette standard VGA palette]. Shifting that way also makes the dark and light squares look like they are from an ancient chessboard.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	xchg dx,ax		; get XY into AX&lt;br /&gt;
	sub ax,bp		; subtract time from row&lt;br /&gt;
	xor al,ah		; XOR pattern (x xor y)&lt;br /&gt;
	or al,0xDB		; pattern for array of boards&lt;br /&gt;
	add al,13h		; shift to good palette spot&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
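To see what &amp;quot;all bits but two are set&amp;quot; means for the palette, the five instructions above can be replayed in Python. This is only a sketch of the byte arithmetic; the function name chessboard_color is made up for illustration.&lt;br /&gt;

```python
def chessboard_color(dx, bp):
    # dx: 16-bit XY value from the framework (DL = low byte, DH = high byte)
    ax = (dx - bp) % 0x10000        # xchg dx,ax / sub ax,bp
    al, ah = ax % 256, ax // 256
    al = (al ^ ah) | 0xDB           # xor al,ah / or al,0xDB
    return (al + 0x13) % 256        # add al,13h


if __name__ == '__main__':
    colors = sorted({chessboard_color(dx, 0) for dx in range(0x10000)})
    print(colors)  # [14, 18, 238, 242]
```

Only bits 2 and 5 of the XOR pattern survive the OR with 0xDB, so exactly four palette indices can occur, and the added 0x13 moves two of them past 255 so that they wrap around to low palette entries.&lt;br /&gt;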
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
[[File:Zooming circles.png|thumb|zooming circles]]&lt;br /&gt;
The distance D of a point (X,Y) to the center (0,0) is sqrt(X²+Y²). The framework already makes sure that DL contains a centered X coordinate; DH=Y still has to be centered. Performing a square root on x86 requires quite a bit of code, but it can actually be omitted: with a clever selection of colors, the impression of zooming circles is quite decent even without the square root.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov al,dh		; get Y in AL&lt;br /&gt;
	sub al,100		; align Y vertically&lt;br /&gt;
	imul al			; AX = Y²&lt;br /&gt;
	xchg dx,ax		; Y²/256 in DH, X in AL&lt;br /&gt;
	imul al			; AX = X²&lt;br /&gt;
	add dh,ah		; DH = (X² + Y²)/256&lt;br /&gt;
	mov al,dh		; AL = (X² + Y²)/256&lt;br /&gt;
	add ax,bp		; offset color by time&lt;br /&gt;
	and al,8+16		; select special rings&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
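The routine above can be followed step by step in a small Python model. It is a sketch of the register arithmetic only; the names signed8 and circles_color are hypothetical, and the centered DL is taken as given by the framework.&lt;br /&gt;

```python
from operator import and_


def signed8(v):
    # Interpret a byte as a signed 8-bit value, as IMUL does.
    return v - 256 if v > 127 else v


def circles_color(dl, dh, bp):
    y = signed8((dh - 100) % 256)        # mov al,dh / sub al,100
    y2 = (y * y) % 0x10000               # imul al: AX = Y*Y
    x = signed8(dl)                      # xchg dx,ax: AL = X
    x2 = (x * x) % 0x10000               # imul al: AX = X*X
    d = (y2 // 256 + x2 // 256) % 256    # add dh,ah: squared distance / 256
    al = (d + bp) % 256                  # mov al,dh / add ax,bp (low byte)
    return and_(al, 8 + 16)              # and al,8+16 (bitwise AND)


if __name__ == '__main__':
    # Walk from the center to the right edge along the middle row.
    print([circles_color(x, 100, 0) for x in range(0, 128, 8)])
```

Because the squared distance is used directly, the bands get narrower towards the edges of the screen; adding the time before the mask is what makes the rings move.&lt;br /&gt;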
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Tiltplane.png|thumb|scrolling tilted plane]]&lt;br /&gt;
This effect is achieved as follows. First, a large constant is divided by the row number Y to simulate distance. The resulting value is then used twice: a) scaled by a centered X value and b) offset by the current time. These two results are combined with XOR, and a special bit pattern is selected from the result.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov ax,0x1329	; initialize with constant&lt;br /&gt;
	add dh,al		; preventing divide overflow&lt;br /&gt;
	div dh			; reverse divide AL = C/Y'&lt;br /&gt;
	xchg dx,ax		; DL = C/Y', AL = X&lt;br /&gt;
	imul dl			; AH = CX/Y'&lt;br /&gt;
	sub dx,bp		; DL = C/Y'-T 	&lt;br /&gt;
	xor ah,dl		; AH = (CX/Y') ^ (C/Y'-T)&lt;br /&gt;
	mov al,ah		; move to AL&lt;br /&gt;
	and al,4+8+16	; select special pattern&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
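The remark about &amp;quot;preventing divide overflow&amp;quot; can be checked with a Python model of the routine. This is a sketch under the assumption that DH only takes row values 0 to 204 (the range that the framework's X/Y calculation can produce); signed8 and tilted_plane_color are illustrative names, not part of the release.&lt;br /&gt;

```python
from operator import and_

C = 0x1329  # constant from 'mov ax,0x1329'


def signed8(v):
    # Interpret a byte as a signed 8-bit value, as IMUL does.
    return v - 256 if v > 127 else v


def tilted_plane_color(dl, dh, bp):
    row = (dh + C % 256) % 256                   # add dh,al (AL = 0x29)
    if row == 0 or C // row > 255:
        raise OverflowError('DIV would raise a divide error')
    q, r = C // row, C % row                     # div dh: AL = q, AH = r
    ax = (signed8(dl) * signed8(q)) % 0x10000    # xchg dx,ax / imul dl
    dx = (r * 256 + q - bp) % 0x10000            # sub dx,bp
    ah = (ax // 256) ^ (dx % 256)                # xor ah,dl
    return and_(ah, 4 + 8 + 16)                  # mov al,ah / and al,4+8+16


if __name__ == '__main__':
    worst = max(C // (dh + 0x29) for dh in range(205))
    print('largest quotient on rows 0..204:', worst)
```

With the offset of 0x29, rows 0 to 204 give divisors from 41 to 245 and quotients of at most 119, so the 8-bit DIV cannot overflow in this model.&lt;br /&gt;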
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Paralax.png|thumb|parallax checker boards]]&lt;br /&gt;
This is kind of raycasting with dynamic geometry. Each object plane is split horizontally because of the signed 16 bit multiplication and vertically because of an implicit logic column number operation. Additionally, implicit shearing is applied, so that the edges of the resulting grid (4 &amp;quot;solid&amp;quot; patches, 4 &amp;quot;transparent&amp;quot; patches, alternating) connect. If the ray hits one of the solid patches, the color becomes the iteration number (+palette offset into grayscale), if not, the plane is shifted by the screenpointer and the process is repeated until the maximum iteration number is reached.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cx,bp		; set initial point to time&lt;br /&gt;
	mov bx,-16		; limit to 16 iterations&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di		; offset point by screenpointer&lt;br /&gt;
	mov ax,819		; magic, related to Rrrola constant&lt;br /&gt;
	imul cx			; get X',Y' in DX&lt;br /&gt;
	ror dx,1		; set carry flag on &amp;quot;hit&amp;quot;&lt;br /&gt;
	inc bx			; increment iteration count&lt;br /&gt;
	ja fx3L			; loop until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	lea ax,[bx+31]	; map value to standard gray scale&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Sierpinski rotozoomer.png|thumb|sierpinski rotozoomer]]&lt;br /&gt;
[[File:1 div cos(atan(x)).png|thumb|1/cos(atan(x)), made with www.google.com]]&lt;br /&gt;
Rotozooming normally requires trigonometric functions, or good estimations of them. Let's look at the [https://en.wikipedia.org/wiki/Rotation_matrix#In_two_dimensions usual 2D rotation equation] and interpret it in a special way to get rid of the trigonometry in the calculation:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = x*cos(a) - y*sin(a)&lt;br /&gt;
y' = x*sin(a) + y*cos(a)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
If we add zooming to it, it looks like this:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = z * (x*cos(a) - y*sin(a))&lt;br /&gt;
y' = z * (x*sin(a) + y*cos(a))&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Now let's assume we don't specify z ourselves, but let it be 1/cos(a). The cosine then cancels and only tan(a) remains:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = 1/cos(a) * (x*cos(a) - y*sin(a)) = x - y*tan(a)&lt;br /&gt;
y' = 1/cos(a) * (x*sin(a) + y*cos(a)) = x*tan(a) + y&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Finally, let's substitute tan(a) with our time variable T, since the tangent function rises to infinity the closer the angle gets to 90°. The zoom factor is then z = 1/cos(atan(T)):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = x - y*T&lt;br /&gt;
y' = x*T + y&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
As long as we don't care about specifying the zoom factor and don't control the angle directly, we can now rotate between -90° and +90° without using trigonometric functions. As a byproduct, the zoom factor is now tied to the time T. The function for the zoom factor is shown in the image; it results in zooming from infinitely small to one (original size) and back to infinitely small. Quite a few bytes have been spent on beautifying the effect, correcting the timing offset, speeding up the animation, enlarging the sierpinski pixels and getting decent colors, but I thought it was worth the space.&lt;br /&gt;
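As a quick numeric check of this idea, the following Python sketch compares the trigonometry-free form (x - y*T, x*T + y), which is what the routine below computes before its fixed-point scaling, with an explicit rotation by atan(T) that is zoomed by 1/cos(atan(T)). The function names are made up for illustration, and floating point is used instead of the 8.8 fixed-point values of the real code.&lt;br /&gt;

```python
import math


def rotozoom_no_trig(x, y, t):
    # What the routine computes: no sine or cosine needed.
    return x - y * t, x * t + y


def rotozoom_trig(x, y, t):
    # Reference: rotate by a = atan(t), then zoom by 1/cos(a).
    a = math.atan(t)
    z = 1.0 / math.cos(a)
    return (z * (x * math.cos(a) - y * math.sin(a)),
            z * (x * math.sin(a) + y * math.cos(a)))


if __name__ == '__main__':
    for t in (-8.0, -1.0, 0.0, 0.5, 16.0):
        angle = math.degrees(math.atan(t))
        zoom = 1.0 / math.cos(math.atan(t))
        print(f't={t:5.1f}  angle={angle:6.1f} deg  zoom={zoom:6.3f}')
```

Both functions agree up to rounding, and the zoom factor equals sqrt(1 + T²): it is 1 at T = 0 and grows in both directions, which is why the texture appears largest in the middle of the effect.&lt;br /&gt;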
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	lea cx,[bp-2048]; center time to pass zero&lt;br /&gt;
	sal cx,3		; speed up by factor 8!&lt;br /&gt;
	movzx ax,dh		; get X into AL&lt;br /&gt;
	movsx dx,dl		; get Y into DL&lt;br /&gt;
	mov bx,ax		; save X in BX&lt;br /&gt;
	imul bx,cx		; BX = X*T&lt;br /&gt;
	add bh,dl		; BH = X*T/256+Y&lt;br /&gt;
	imul dx,cx		; DX = Y*T&lt;br /&gt;
	sub al,dh		; AL = X-Y*T/256&lt;br /&gt;
	and al,bh		; AL = (X-Y*T/256)&amp;amp;(X*T/256+Y)&lt;br /&gt;
	and al,252		; thicker sierpinski&lt;br /&gt;
	salc			; set pixel value to black&lt;br /&gt;
	jnz fx4q		; leave black if not sierpinski&lt;br /&gt;
	mov al,0x2A		; otherwise: a nice orange&lt;br /&gt;
	fx4q:&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Raycastbenttunnel.png|thumb|raycast bent tunnel]]&lt;br /&gt;
This is a variation of the [https://www.pouet.net/prod.php?which=78044 raycast tunnel &amp;quot;into a new era&amp;quot;] (see also above). A [https://www.reddit.com/r/tinycode/comments/de7dj9/moving_cubes_with_pathtracing_lighting_and/ detailed explanation] has been posted to reddit for the [https://www.pouet.net/prod.php?which=83204 similar production &amp;quot;Essence&amp;quot;]. The custom colors have been removed, the bend direction is different, and the geometry is a bit more closed, to increase the performance on slower computers and DosBox.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cl,-9		; start with depth 9 (moves backwards)&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx			; save DX, destroyed inside the loop&lt;br /&gt;
		mov al,dh	; Get Y into AL&lt;br /&gt;
		sub al,100	; Centering Y has to be done &amp;quot;manually&amp;quot;.&lt;br /&gt;
		imul cl		; Multiply AL=Y by the current distance, to get a projection(1)&lt;br /&gt;
		xchg ax,dx	; Get X into AL, while saving the result in DX (DH)&lt;br /&gt;
		add al,cl	; add distance to projection, (bend to the right)&lt;br /&gt;
		imul cl		; Multiply AL=X by the current distance, to get a projection(2)&lt;br /&gt;
		mov al,dh	; Get projection(1) in AL&lt;br /&gt;
		xor al,ah	; combine with projection(2)&lt;br /&gt;
		add al,4	; center the walls around 0&lt;br /&gt;
		test al,-8	; check if the wall is hit&lt;br /&gt;
	pop dx			; restore DX&lt;br /&gt;
	loopz fx5L		; repeat until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	sub cx,bp		; offset depth by time&lt;br /&gt;
	xor al,cl		; XOR pattern for texture &lt;br /&gt;
	aam 6			; irregular pattern with MOD 6&lt;br /&gt;
	add al,20		; offset into grayscale palette&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Oceannight.png|thumb|left|oceannight]]&lt;br /&gt;
[[File:Oceanday.png|thumb|right|oceanday]]&lt;br /&gt;
The ocean effect is a wonderful example of a [https://en.wikipedia.org/wiki/Bob_Ross#Style &amp;quot;happy accident&amp;quot;]. If you load a value into the FPU as an integer, store it as a floating point value, and interpret the result as an integer again, you get a cool pattern. Combine that with a reverse divide and you get a neat wavy effect. This works in tight conjunction with the register DX, which has a sign flip at the position we want, so we can easily separate sky and sea. The final touch is the coloring, though. By design of the global framework, the value of AL is determined on function entry: it contains the effect address. With a bit of code shuffling, the color of the sky came &amp;quot;for free&amp;quot;, without using any instructions, as did the &amp;quot;fading to day&amp;quot; color, which is also an effect address. And that is no happy accident. For that reason, the sky color can vary in versions other than the originally submitted DosBox version.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	sub dh,120			; check if pixel is in the sky&lt;br /&gt;
	js fx6q				; quit if that's the case&lt;br /&gt;
	mov [bx+si],dx		; move XY to a memory location&lt;br /&gt;
	fild word [bx+si]	; read memory location as integer&lt;br /&gt;
	fidivr dword [bx+si]; reverse divide by constant&lt;br /&gt;
	fstp dword [bx+si-1]; store result as floating point&lt;br /&gt;
	mov ax,[bx+si]		; get the result into AX&lt;br /&gt;
	add ax,bp			; modify color by time&lt;br /&gt;
	and al,128			; threshold into two bands&lt;br /&gt;
	dec ax				; beautify colors to blue/black&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
For convenience, here is the fading effect from the global framework in isolation. It generates a pseudorandom value from the screen pointer, offsets the time by it, scales the result down and then looks up the effect to call.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
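How this produces a dissolve rather than a hard cut can be seen by counting, for a given time value, how many screen positions already select the next effect. The Python sketch below replays the instructions above for every possible DI; the function names effect_index and share_of_effect are hypothetical, and the returned index is the position in the table, not the effect label.&lt;br /&gt;

```python
RRROLA = 0xCCCD


def effect_index(di, bp):
    low = (di * RRROLA) % 0x10000          # AX after 'mul di'
    p = (low % 256 + low // 256) % 256     # add al,ah / xor ah,ah
    shifted = (p + bp) % 0x10000           # add ax,bp
    return shifted // 512 % 16             # shr ax,9 / and al,15


def share_of_effect(n, bp):
    # Fraction of all 65536 screen pointer values that select table entry n.
    hits = sum(1 for di in range(0x10000) if effect_index(di, bp) == n)
    return hits / 0x10000


if __name__ == '__main__':
    for bp in (256, 320, 384, 448, 511, 512):
        print(bp, share_of_effect(1, bp))
```

In this model the pseudorandom byte is evenly distributed, so during the second half of each 512-tick window the next table entry takes over pixel by pixel at a steady rate, until it owns the whole frame.&lt;br /&gt;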
&lt;br /&gt;
==MIDI music part==&lt;br /&gt;
This code section increments the timing value and produces sound. By choosing channel 3, the &amp;quot;change channel instrument&amp;quot; byte (0xC3) can double as a &amp;quot;RET&amp;quot; instruction. By shifting around the code of the effects, a reasonable value for the volume could be created, which saves another byte. Be aware that this code sequence only works if the MIDI device is already in UART mode; otherwise three more bytes have to be spent. Quite a few people from the audience and sceners told me afterwards that the melody sounds a bit like [https://en.wikipedia.org/wiki/Incantations_(album) Incantations by Mike Oldfield], but it is created by a rather simple approach. Starting with [https://www.pouet.net/prod.php?which=65604 &amp;quot;Hypnoteye&amp;quot; in 2015], I experimented with procedural MIDI and eventually released a [https://www.pouet.net/prod.php?which=66313 small MIDI framework (64 bytes)]. The basic principle is to jump through tonal space with fixed steps, and to map high values back to lower ones (by modulo). With simple combinations of stepwidth and modvalue, interesting effects can be produced: for example stepwidth=3 for diminished chords which are never resolved, likewise stepwidth=4, or stepwidth=6 for tritonal effects. With a well chosen modvalue, these patterns can create progressions. I have not done an appropriate theoretical analysis yet, but rather explored the tonal space and noted when something sounded interesting.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
sounds: db 0xc3, 11, 0x93, fx2-s&lt;br /&gt;
...&lt;br /&gt;
		inc bp				; increment timing value&lt;br /&gt;
		test bp, 7			; play a note every 8th step&lt;br /&gt;
		jnz nomuse			; quit if in between&lt;br /&gt;
		mov dx,0x330		; port number for MIDI&lt;br /&gt;
		mov si,sounds		; address for sound data&lt;br /&gt;
		outsb				; change instrument of channel 3&lt;br /&gt;
		outsb				; to vibraphone&lt;br /&gt;
		outsb				; play a note on channel 3&lt;br /&gt;
		imul ax,bp,-19*32*4	; the magic melody constant&lt;br /&gt;
		shr ax,10			; scale down and implicit &amp;quot;and 63&amp;quot;&lt;br /&gt;
		add al,22			; pitch base is 22&lt;br /&gt;
		out dx,al			; play THIS note on channel 3&lt;br /&gt;
		outsb				; play it with THIS volume&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
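The note calculation is easy to replay outside of DOS. The following Python sketch models only the three arithmetic instructions and the time mask from the listing above; the function name note_for_tick is made up for illustration, and no MIDI output is attempted.&lt;br /&gt;

```python
SCALE_MOD = -19 * 32 * 4   # 'the magic melody constant'
TIME_MASK = 7              # a note is played on every 8th tick
PITCH_BASE = 22


def note_for_tick(bp):
    # Returns the MIDI note number, or None if this tick plays nothing.
    if bp % (TIME_MASK + 1) != 0:        # test bp,7 / jnz nomuse
        return None
    ax = (bp * SCALE_MOD) % 0x10000      # imul ax,bp,-19*32*4 (16-bit result)
    return ax // 1024 + PITCH_BASE       # shr ax,10 / add al,22


if __name__ == '__main__':
    print([note_for_tick(bp) for bp in range(0, 8 * 16, 8)])
```

For ticks that are multiples of 8 this reduces to note = 22 + (-19 * k) mod 64 with k = bp / 8, which is exactly the &amp;quot;fixed step, then modulo&amp;quot; principle described above, here with a step of -19 and a modvalue of 64.&lt;br /&gt;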
&lt;br /&gt;
==Original release code (complete)==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
; &amp;quot;memories&amp;quot; by HellMood/DESiRE&lt;br /&gt;
; the tiny megademo, 256 byte msdos intro&lt;br /&gt;
; shown in April 2020 @ REVISION&lt;br /&gt;
;&lt;br /&gt;
;   (= WILL BE COMMENTED IN DETAIL LATER =)&lt;br /&gt;
;&lt;br /&gt;
; create : nasm.exe memories.asm -fbin -o memories.com&lt;br /&gt;
; CHOOSE YOUR TARGET PLATFORM (compo version is dosbox)&lt;br /&gt;
; be sure to use the dosbox.conf from this archive!&lt;br /&gt;
; only ONE of the defines should be active!&lt;br /&gt;
%define dosbox			; size : 256 bytes&lt;br /&gt;
;%define freedos		; size : 230 bytes&lt;br /&gt;
;%define winxpdos		; size : 263 bytes&lt;br /&gt;
&lt;br /&gt;
; DON'T TOUCH THESE UNLESS YOU KNOW WHAT YOU'RE DOING&lt;br /&gt;
%ifdef winxpdos&lt;br /&gt;
	%define music&lt;br /&gt;
	%define switch_uart&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
	%define safe_segment&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef dosbox&lt;br /&gt;
	%define music&lt;br /&gt;
	;%define safe_dx ; sometimes needed&lt;br /&gt;
%endif&lt;br /&gt;
&lt;br /&gt;
; GLOBAL PARAMETERS, TUNE WITH CARE!&lt;br /&gt;
%define volume 127	; not used on dosbox (optimization)&lt;br /&gt;
%define instrument 11&lt;br /&gt;
%define scale_mod -19*32*4; &lt;br /&gt;
%define time_mask 7&lt;br /&gt;
%define targetFPS 35&lt;br /&gt;
%define tempo 1193182/256/targetFPS		&lt;br /&gt;
%define sierp_color 0x2A&lt;br /&gt;
%define tunnel_base_color 20&lt;br /&gt;
%define tunnel_pattern 6&lt;br /&gt;
%define tilt_plate_pattern 4+8+16&lt;br /&gt;
%define circles_pattern 8+16&lt;br /&gt;
&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov fs,ax&lt;br /&gt;
	mov [fs:0x46c],ax&lt;br /&gt;
%endif&lt;br /&gt;
	mov al,0x13&lt;br /&gt;
	int 0x10	 &lt;br /&gt;
	xchg bp,ax&lt;br /&gt;
	push 0xa000-10&lt;br /&gt;
	pop es&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	mov ax,0x251c&lt;br /&gt;
	%ifdef safe_dx	&lt;br /&gt;
		mov dx,timer	&lt;br /&gt;
	%else ; assume DH=1, mostly true on DosBox&lt;br /&gt;
		mov dl,timer&lt;br /&gt;
	%endif&lt;br /&gt;
	int 0x21&lt;br /&gt;
%endif&lt;br /&gt;
top:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov bp,[fs:0x46c]&lt;br /&gt;
%endif	&lt;br /&gt;
	mov ax,0xcccd&lt;br /&gt;
	mul di&lt;br /&gt;
	add al,ah&lt;br /&gt;
	xor ah,ah&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	shr ax,9&lt;br /&gt;
	and al,15&lt;br /&gt;
	xchg bx,ax&lt;br /&gt;
	mov bh,1&lt;br /&gt;
	mov bl,[byte bx+table]&lt;br /&gt;
	call bx&lt;br /&gt;
	stosb&lt;br /&gt;
	inc di&lt;br /&gt;
	inc di&lt;br /&gt;
	jnz top&lt;br /&gt;
	mov al,tempo&lt;br /&gt;
	out 40h,al&lt;br /&gt;
	in al,0x60&lt;br /&gt;
	dec al&lt;br /&gt;
	jnz top&lt;br /&gt;
sounds:&lt;br /&gt;
	db 0xc3	; is MIDI/RET&lt;br /&gt;
%ifdef music&lt;br /&gt;
	db instrument,0x93&lt;br /&gt;
	%ifdef switch_uart&lt;br /&gt;
		db volume		; without switch, volume is in table&lt;br /&gt;
		db 0x3f &lt;br /&gt;
	%endif&lt;br /&gt;
%endif&lt;br /&gt;
table: ; first index is volume, change order with care!		    					&lt;br /&gt;
	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
stop:&lt;br /&gt;
	pop ax&lt;br /&gt;
	ret&lt;br /&gt;
timer:&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	%ifdef safe_segment&lt;br /&gt;
		push cs&lt;br /&gt;
		pop ds&lt;br /&gt;
	%endif&lt;br /&gt;
		inc bp&lt;br /&gt;
	%ifdef music	&lt;br /&gt;
		test bp, time_mask&lt;br /&gt;
		jnz nomuse&lt;br /&gt;
		mov dx,0x330&lt;br /&gt;
		mov si,sounds&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		imul ax,bp,scale_mod&lt;br /&gt;
		shr ax,10&lt;br /&gt;
		add al,22&lt;br /&gt;
		out dx,al&lt;br /&gt;
		outsb&lt;br /&gt;
		%ifdef switch_uart&lt;br /&gt;
			inc dx&lt;br /&gt;
			outsb&lt;br /&gt;
		%endif&lt;br /&gt;
	%endif&lt;br /&gt;
nomuse:&lt;br /&gt;
	iret&lt;br /&gt;
%endif	&lt;br /&gt;
fx0: ; tilted plane, scrolling&lt;br /&gt;
	mov ax,0x1329&lt;br /&gt;
	add dh,al&lt;br /&gt;
	div dh&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul dl&lt;br /&gt;
	sub dx,bp&lt;br /&gt;
	xor ah,dl&lt;br /&gt;
	mov al,ah&lt;br /&gt;
	and al,tilt_plate_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx2: ; board of chessboards&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	sub ax,bp&lt;br /&gt;
	xor al,ah&lt;br /&gt;
	or al,0xDB&lt;br /&gt;
	add al,13h&lt;br /&gt;
ret&lt;br /&gt;
fx1: ; circles, zooming&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	sub al,100&lt;br /&gt;
	imul al&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul al&lt;br /&gt;
	add dh,ah&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,circles_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx3: ; parallax checkerboards&lt;br /&gt;
	mov cx,bp&lt;br /&gt;
	mov bx,-16&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di&lt;br /&gt;
	mov ax,819&lt;br /&gt;
	imul cx	 &lt;br /&gt;
	ror dx,1	 &lt;br /&gt;
	inc bx	 &lt;br /&gt;
	ja fx3L&lt;br /&gt;
	lea ax,[bx+31]	 &lt;br /&gt;
ret&lt;br /&gt;
fx4: ; sierpinski rotozoomer	&lt;br /&gt;
	lea cx,[bp-2048]&lt;br /&gt;
	sal cx,3&lt;br /&gt;
	movzx ax,dh&lt;br /&gt;
	movsx dx,dl&lt;br /&gt;
	mov bx,ax&lt;br /&gt;
	imul bx,cx&lt;br /&gt;
	add bh,dl&lt;br /&gt;
	imul dx,cx&lt;br /&gt;
	sub al,dh&lt;br /&gt;
	and al,bh&lt;br /&gt;
	and al,0b11111100&lt;br /&gt;
	salc				; VERY slow on dosbox, but ok&lt;br /&gt;
	jnz fx4q&lt;br /&gt;
	mov al,sierp_color&lt;br /&gt;
	fx4q:&lt;br /&gt;
ret&lt;br /&gt;
fx5: ; raycast bent tunnel&lt;br /&gt;
	mov cl,-9&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		sub al,100&lt;br /&gt;
		imul cl&lt;br /&gt;
		xchg ax,dx	&lt;br /&gt;
		add al,cl&lt;br /&gt;
		imul cl&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		xor al,ah&lt;br /&gt;
		add al,4&lt;br /&gt;
		test al,-8&lt;br /&gt;
	pop dx&lt;br /&gt;
	loopz fx5L&lt;br /&gt;
	sub cx,bp&lt;br /&gt;
	xor al,cl&lt;br /&gt;
	aam tunnel_pattern; VERY slow on dosbox, but ok&lt;br /&gt;
	add al,tunnel_base_color&lt;br /&gt;
ret&lt;br /&gt;
fx6: ; ocean night / to day sky&lt;br /&gt;
	sub dh,120&lt;br /&gt;
	js fx6q&lt;br /&gt;
	mov [bx+si],dx&lt;br /&gt;
	fild word [bx+si]&lt;br /&gt;
	fidivr dword [bx+si]&lt;br /&gt;
	fstp dword [bx+si-1]&lt;br /&gt;
	mov ax,[bx+si]&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,128&lt;br /&gt;
	dec ax&lt;br /&gt;
fx6q:&lt;br /&gt;
ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Further tweaks==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==Differences between dos versions==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==Bonus - NFO/ASCII==&lt;br /&gt;
&lt;br /&gt;
  &amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
                                                             art : hammerfist&lt;br /&gt;
         ∂#MW%e                              _d$Ng,&lt;br /&gt;
         'B,  ∂b                   _jM@$QZb,cQ&amp;quot;  )@&lt;br /&gt;
  ,edRB$b,l@   Wk,yGR$KM&amp;amp;$b,     ,dP&amp;quot;     Wl ]bsd%UR8BG6&amp;amp;$@DSyG#ZKM&amp;amp;$b,&lt;br /&gt;
,dP      &amp;quot;T%L  'MGF      &amp;quot;*∂R_   Tg    &amp;quot;*4Zk,#I  YP   W&amp;quot;    7P      &amp;quot;*∂R&lt;br /&gt;
4M   gd@    ^   ∂@   d@b   dQ$#@Z@R3L_    &amp;quot;*GMj  'W      ,gd$   d@b   9Q$#%b&lt;br /&gt;
W#,  `M          Wb  `*  _4P   `Qk  *#N8L   `H5   @b   'QR7YK   `*  _4F&amp;quot;   Qk&lt;br /&gt;
`6@L             dML            '@          ,BK   'M    ∂B  *b,            '#L&lt;br /&gt;
  ^QBb,_     _,4&amp;amp;M∞∂@=,_       _dGL       _gQKM    GL    @k  'Mg,_         _dG,&lt;br /&gt;
    &amp;quot;*BN5W$2#MNP&amp;quot;   &amp;quot;*G3WRM8&amp;amp;B5P&amp;quot;`Y@QNW3Z5P&amp;quot; ∂#$W8BRM3XZN87    &amp;quot;*GW38M%EBDW5P&amp;quot;`&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
                              p r e s e n t s&lt;br /&gt;
&lt;br /&gt;
            4&lt;br /&gt;
           d@,&lt;br /&gt;
         _&amp;amp; `Wl&lt;br /&gt;
      _,aP   &amp;quot;#baedM$#@@K JP*&amp;quot;?ML&lt;br /&gt;
 ,ad@$#P&amp;quot;         ,d@NEWVB&amp;quot;     X,aQPYb,_&lt;br /&gt;
V@Mm,_          ,d@MW#BW'      EMP&amp;quot;   '¶R ,ngBP^fML&lt;br /&gt;
 ¶M@N@y        Y#BNW#M&amp;quot;       J9&amp;quot;      `MQ9&amp;quot;      &amp;quot;MgRBq  ,QBMg,&lt;br /&gt;
  VN#P` ,d@@    `WM@^                   7f         ¶F` 7kY&amp;quot;   ^G  _.eQNE1.&lt;br /&gt;
   ]B _G@MWN$,   `P                     '     4b       QP      ¶w@F*^  ^Qb&lt;br /&gt;
   ]O@NRM#W@MNB,         ;                    ^`      j        JP^       Yl&lt;br /&gt;
  J#NRNWM@#BcT&amp;quot;^        ,A  _J                     _q@                   `X&lt;br /&gt;
 '¶WM#B@WdY`,7        _G#YN#PM                 _,gG&amp;quot;                      M,&lt;br /&gt;
  *BN#WP&amp;quot;  dK       ,Q@NRMB&amp;quot;]9       ,      _,M@Q*                        #A&lt;br /&gt;
   &amp;quot;U^      V@h,   iNBW#NT  J'      J9     s@QN&amp;quot;         _;               'D,&lt;br /&gt;
             ¶RMBv&amp;amp;NMQR@9  .W      .K'     &amp;quot;9`         ,6BA   _JL          ]l&lt;br /&gt;
              Y#NE@W#NRP   #[      `¶8               _d@MW#B_jW#W          BN&lt;br /&gt;
               &amp;quot;GQ@MR#W    QL_      *B            _,p#NBW#NQMG@WY          3Q&lt;br /&gt;
                  &amp;quot;Y@F     ,XW@M%im,_Yb_     _,g5@#MW@QMNE@E@NRMB         ,WM&lt;br /&gt;
                    `  _,gP*&amp;quot;#REM#GB@N#MQbnd@N#M@MW#R8QSB^'WQERM@        ;4NB,&lt;br /&gt;
                     ,GYKL    ¶E#B8R8QSB@M@#BM#W@MNB&amp;quot;`_  ,  &amp;quot;^` N       ,dW@Ql&lt;br /&gt;
                   _Q`'W`*t    '¶@GS#MBQ#E@W#NQBW[     'LvQ_   ,K    _dNABGM#N&lt;br /&gt;
                  ,F   '          `^WAB@QGE9*&amp;quot;9^*@L    jP7FY,  ¶h,_.jWM#BR#GBM,&lt;br /&gt;
                 J;    ,   _                  '       '   &amp;quot;LL  YxE#B8R8QSBNW@W;&lt;br /&gt;
                AP   _,Ag6^          _   J                  ¶A  `&amp;quot;Q#M@MW#R8E#P&lt;br /&gt;
               j@   `&amp;quot;XQW[            'LvK,_      'L_,/      @t    Y#NE@WNR&amp;quot;&lt;br /&gt;
              :M/     9^*@L           jP7F&amp;quot;       _PYKL     _,A;     ¶RSNQ&amp;quot;&lt;br /&gt;
              dKL     '     `        '   &amp;quot;L      &amp;quot;`'W`*t   `&amp;quot;XQb      `W^&lt;br /&gt;
              Q`8t            'L_,/         ,   _   '        9^Q&lt;br /&gt;
             ,W               _PYKL       _,Ag6^             ' W,     _ ,#N&amp;amp;&lt;br /&gt;
             !N  _   J       &amp;quot;`'W`*t     `&amp;quot;XQW[       _  J     N!_JG9^RwQ' *t&lt;br /&gt;
             `W,  'LvK,_        '        _gGB8@L   _   'LvK,_ ,WgB'    V    7L&lt;br /&gt;
         _.,gm&amp;amp;@B&amp;amp;wBZF&amp;quot;                j@'`  &amp;quot;WL _gML  jZd7Yb lN&amp;quot;          dBWl&lt;br /&gt;
      ,g&amp;amp;QB*&amp;quot;^`    `&amp;quot;*G@g, .gR&amp;amp;k,_   ,N&amp;quot;      '@QF  ¶k;gMF  *QvQ     jQ, ,@N@B#,&lt;br /&gt;
   .eQF*`              `Yb@&amp;quot;  &amp;quot;*6Qg,gF     ,   7     XMN&amp;quot;    'MNB,    ^¶QWSER@N;&lt;br /&gt;
 ,gP&amp;quot;           qy,      W'       ^Q'     &amp;amp;L      ,g@W'       `QMEL     `&amp;quot;WBNWP&lt;br /&gt;
g7              ¶9      ,X         M?     9&amp;quot;   _q8MSK           ¶EMt       *@K&lt;br /&gt;
Vh   _,m#L             _AH        le         ,GBDNE9^A,          *@F        NMg&lt;br /&gt;
 ¶L,qQ@ND           _.m@Bl        We      ,gM@B8#Q'   ¶h_                   lWE,&lt;br /&gt;
  W9NHW@`          JWM#B@]        @e     4WR@NGF^      'QL                  dRWl&lt;br /&gt;
   VMd*            &amp;quot;@BE@PM        'N      *UP&amp;quot;           VW,               JRSB;&lt;br /&gt;
  ,@F       j       `¶WK W,        ¶t                     XNt            _A@E#N&lt;br /&gt;
_JP       ,6&amp;amp;         &amp;quot;GLdM         XD,               _.g8NMA@k,_    _,gG#NMGR;&lt;br /&gt;
&amp;quot;Z      .JRER           'VMi     _jNB#W&amp;amp;_         _,j@E@W#Nl ¶MBGMNQGNQMG@QBW9&lt;br /&gt;
 ¶h   ,G@NRMBl            `&amp;quot;   ,d#R@M$F ¶Mg,_.gp&amp;amp;@@NEWVBWBMG  *QMN8R8SBN$E@WF&lt;br /&gt;
  Vb dW#R8QSRb,                 *YM@EQ,_ 'MENBW#NQMG#B@R@MW#l   &amp;quot;BM@QNENRQG'&lt;br /&gt;
   *WGS#MBMNEYL                    `^&amp;quot;*8M@Q@NRM#W@BWSNW@QBF&amp;quot;`     `^*@QBF^ [HFT]&lt;br /&gt;
    ^M@MW#Q9 ^Wt                           `^¶RQ@W8NQGP*`&lt;br /&gt;
     ¶Q#@P     Vk                            lA `&amp;quot;^`&lt;br /&gt;
      Y&amp;quot;       `MA                           J#,&lt;br /&gt;
                *R@,                        ,MQl&lt;br /&gt;
                 Y#Wk,                      GWM8L&lt;br /&gt;
                  W8RQSt,_                 AQ@MR#,&lt;br /&gt;
                  `@M@#SB@Mbm.,_          QNBW#NW&lt;br /&gt;
                    ¶QB8R8SBN$WNRM@#GNtwg@NMQR@B'&lt;br /&gt;
                     *MBQ#8R8QS@NE@WNBW#NQMG@NR;&lt;br /&gt;
                      `WGS#MBQ#R8QSB@NE@W#NQBW9&lt;br /&gt;
                        *OMW@QMNE@E@NRMW@QMB@*&lt;br /&gt;
                          `^&amp;quot;YQW@Q#SB#NE@EGP&lt;br /&gt;
                               `^&amp;quot;*8R@GBQF`&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Memories&amp;diff=503</id>
		<title>Memories</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Memories&amp;diff=503"/>
				<updated>2020-04-20T02:17:56Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
Hello! My name is &amp;quot;HellMood&amp;quot; and this is not a typical wiki-style article. It is written from my point of view, in the first person. This article is all about a tiny MS-DOS program called &amp;quot;Memories&amp;quot;. This program has a size of 256 bytes and won the &amp;quot;PC 256 byte&amp;quot; competition of the [https://en.wikipedia.org/wiki/Demoscene demoscene] event [https://2020.revision-party.net/ &amp;quot;Revision&amp;quot;] in 2020, as well as the public choice award. You can watch a video of the output of this program here (https://www.youtube.com/watch?v=Imquk_3oFf4) and a video including live reactions of the online audience and moderators here (https://www.twitch.tv/videos/589179638?t=7h02m40s). You can download the release and comment here (https://www.pouet.net/prod.php?which=85227). This wiki article will give you an in-depth analysis of the program and point out historical references as well as development steps. In case you did not notice, you are now on the &amp;quot;sizecoding wiki&amp;quot;. It will not only help you understand the inner workings of &amp;quot;Memories&amp;quot;, but also enable you to produce something like this yourself. Feel free to look around! If you're new to sizecoding and/or x86 assembler, it is recommended that you start with the [http://www.sizecoding.org/wiki/Getting_Started basics of this wiki] first. While the principles might be easy to grasp, details of the implementation might not be.&lt;br /&gt;
&lt;br /&gt;
==Disclaimer==&lt;br /&gt;
This article discusses the submitted DosBox version (256 bytes). The available archive contains versions for FreeDos and Windows XP DOS which, at the time of this writing, don't work on all computers. These alternative versions were included as proof of concept, to show that this program does not only run on the emulator &amp;quot;DosBox&amp;quot;. For the &amp;quot;PC 256 bytes&amp;quot; competition of the event &amp;quot;Revision&amp;quot; in 2020, it was allowed to specify either &amp;quot;FreeDos&amp;quot; or &amp;quot;DosBox&amp;quot; as platform, the latter with a specific configuration. It is in fact possible to modify the DosBox version to work on FreeDos, MS-DOS, WinXP and Win98, as the released alternative versions showcase, but this is not the focus of this article. Anyway, safe versions for all platforms and all computers are in the works. Regarding possible further optimizations: I have already found several spots in the code that can be optimized, but to avoid confusion between different versions, only the submitted version will be explained. As the history of tiny effects shows, almost no code is perfect, and most of it can be brought down further in size.&lt;br /&gt;
&lt;br /&gt;
==History of tiny effects==&lt;br /&gt;
[[File:Pouet categories.png|thumb|Size categories on http://www.pouet.net]]&lt;br /&gt;
Sizecoders think in size categories. For MS-DOS, these categories are 256b, 128b, 64b and 32b. These are the standards of one of the biggest demoscene archives, http://www.pouet.net. There is no 16b category, although many tiny effects can be implemented in 16 bytes. Almost all of the effects of &amp;quot;Memories&amp;quot; have been coded and optimized by me before, and most of those implementations were attempts to reduce the size of an already existing effect, or to do something similar in less space. Reducing the size in this case means reducing it to one of the next lower 2^N categories. For example, I might have seen an effect done in 33 to 64 bytes, and then reduced the size to 32 bytes or less. Almost every time I moved an effect to a lower category, I submitted the resulting tiny program to a demoscene party which allowed remote entries in the 256b category and/or published the result to http://www.pouet.net . In this section I will introduce the effects and mention the origins and authors.&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Kasparov.gif|frame|kasparov, 16 bytes]]&lt;br /&gt;
The origin of this effect is my own &amp;quot;Kasparov 16b&amp;quot; from 2018 (https://www.pouet.net/prod.php?which=75912). Apparently I &amp;quot;had this piece lying around, stuck at 17 bytes and not that beautiful, until i found a trick ;)&amp;quot; (quote from the release notes). There was a similar effect in 32 bytes before: &amp;quot;ew&amp;quot; by &amp;quot;headcrash&amp;quot; from 2003 (https://www.pouet.net/prod.php?which=17567). In this case, my design goal was to show &amp;quot;real&amp;quot; chessboards of 8x8 fields with recognizable dark and light squares, as well as the correct orientation of the single chessboards, meaning that the bottom right corner (h1) has to be a [https://en.wikibooks.org/wiki/Chess/Arranging_The_Board white square]. For &amp;quot;Memories&amp;quot;, the effect had to be reimplemented with another approach of writing to the screen to work with the framework. The scrolling direction was also changed to be different from the &amp;quot;scrolling tilted plane&amp;quot; effect.&lt;br /&gt;
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
The zooming circles were supposed to be entered in a demoparty as a 32 bytes intro, but I never actually did it. There is no real predecessor in 64 bytes, because in the 64b category much more complex effects are possible. The zooming circles were the result of desperately trying to reach the 32b category for a circular &amp;quot;tunnel&amp;quot; effect, for which my own record is still 52 bytes (&amp;quot;Neontube&amp;quot;, 2016, https://www.pouet.net/prod.php?which=66808), which in turn was an optimization of the all-time-classic 64 byte effect &amp;quot;constant evolution&amp;quot; from ryg/Farbrausch (2003, https://www.pouet.net/prod.php?which=8697). In the zooming circles routine, the distance and the angle are removed/ignored, so that the 32b category could be reached.&lt;br /&gt;
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Floorcast variation.png|thumb|Floorcast, 32 bytes version, variation]]&lt;br /&gt;
The scrolling tilted plane is one of my own releases, &amp;quot;floorcast 32b&amp;quot; from 2018. Floorcasts have their own history in sizecoding; the effect was brought down stepwise from 256 bytes to 32 bytes. The versions differ in the number of planes: some show two planes, others only one. For the &amp;quot;floorcast 32b&amp;quot; release I specifically decided against the &amp;quot;XOR&amp;quot; texture, whereas in &amp;quot;Memories&amp;quot; I used it again, but disguised with another final touch of &amp;quot;AND&amp;quot;.&lt;br /&gt;
* &amp;quot;rain_storm&amp;quot; version 2008 - 256 bytes - https://www.pouet.net/prod.php?which=50479&lt;br /&gt;
* &amp;quot;org_100h&amp;quot; version 2008 - 128 bytes - https://www.pouet.net/prod.php?which=63271&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2013 - 86 bytes - https://www.pouet.net/prod.php?which=61070&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2014 - 64 bytes - https://www.pouet.net/prod.php?which=63305&lt;br /&gt;
* &amp;quot;HellMood&amp;quot; version 2018 - 32 bytes - https://www.pouet.net/prod.php?which=77774&lt;br /&gt;
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Projektbeschreibung.gif|frame|Projektbeschreibung, 32 bytes]]&lt;br /&gt;
The parallax checkerboards have been released by me as the 32 bytes effect &amp;quot;Projektbeschreibung&amp;quot; in 2018. It was a direct attempt to bring &amp;quot;Follow the light&amp;quot; (https://www.pouet.net/prod.php?which=28543) from &amp;quot;Digimind&amp;quot; (2006) or my own &amp;quot;Lucy&amp;quot; (2014, https://www.pouet.net/prod.php?which=63293) down to 32 bytes. Very helpful inspirations came from &amp;quot;Rrrola&amp;quot;'s &amp;quot;Paralaxa&amp;quot; (32 bytes, 2007, https://www.pouet.net/prod.php?which=32281) and &amp;quot;Sensenstahl&amp;quot;'s &amp;quot;Byteropolis&amp;quot; (2013, https://www.pouet.net/prod.php?which=61075). In fact, Rrrola's rendering approach was already close to the final solution; the code was modified to correct the arrangement of the planes, change the shape from triangles to checkerboards and improve the colors. In &amp;quot;Memories&amp;quot;, the coloring of &amp;quot;Digimind&amp;quot;'s version was used. Also, the effect has been modified to reduce shearing as much as possible.&lt;br /&gt;
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Colpinski.png|thumb|colpinski, 16 bytes]]&lt;br /&gt;
[[File:Rotastic.png|thumb|left|rotastic, 32 bytes]]&lt;br /&gt;
This consists of two effects, the rotozoomer and the sierpinski effect as a texture. The sierpinski effect is based on my own &amp;quot;Colpinski 16b&amp;quot; from 2013 (https://www.pouet.net/prod.php?which=62079), where &amp;quot;frag&amp;quot; of &amp;quot;fsqrt&amp;quot; helped push the limits. There is no real predecessor to that, because it is known that this effect can be done directly by combining X and Y, instead of using an Iterated Function System or Cellular Automaton approach. The rotozoomer was released as the 32b intro &amp;quot;rotastic&amp;quot; by me in 2017 (https://www.pouet.net/prod.php?which=70112). It is based on the ideas of &amp;quot;ryg&amp;quot; of &amp;quot;Farbrausch&amp;quot; (51 bytes, 2002, https://www.pouet.net/prod.php?which=5015) and &amp;quot;Gargaj&amp;quot; of &amp;quot;Conspiracy&amp;quot; (49 bytes, 2002, https://www.pouet.net/prod.php?which=6088).&lt;br /&gt;
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Intoanew.gif|frame|Into a new era, 64 bytes version]]&lt;br /&gt;
The raycast bent tunnel is a tweaked version of my own 64 bytes &amp;quot;Into a new era&amp;quot; release (2018, https://www.pouet.net/prod.php?which=78044). The custom colors were exchanged for colors of the standard palette, the geometry was tweaked a bit and changed accordingly, so that no depth effect was used for the calculation of texture values. The standalone version of this effect has a size of 50 bytes. The two main inspirations for the 64 bytes version were two 128 byte intros, &amp;quot;Spongy&amp;quot; by &amp;quot;TBC&amp;quot; (2009, https://www.pouet.net/prod.php?which=53871) and &amp;quot;Wolf128&amp;quot; from &amp;quot;Baudsurfer&amp;quot; (2014, https://www.pouet.net/prod.php?which=63518), while the particular algorithm was developed independently.&lt;br /&gt;
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Ocean.png|thumb|Ocean, 64b version]]&lt;br /&gt;
The ocean effect is based on my own 64 bytes release &amp;quot;Ocean&amp;quot; from 2016 (https://www.pouet.net/prod.php?which=66857). The custom color generation and music generation have been stripped from the original, since neither was compatible with the main framework of &amp;quot;Memories&amp;quot; without consuming a LOT of extra space. The special effect of &amp;quot;dawn&amp;quot; happens as a byproduct of the general framework approach. This will be explained in the next chapter.&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
The transition between two effects is an effect itself, which has no real predecessor. It is rather an idea that evolved over the years, and might have been done by many others in similar ways. In a nutshell, while calculating a frame, a pseudorandom value is derived from each pixel position, scaled down, and added to the time, which determines the effect to use. Close to a switch-over, some pixels therefore already show the next effect while others still show the current one. This allows the usage of the standard VGA palette (picture, source) rather than creating custom colors for real crossfading and thus saves space.&lt;br /&gt;
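As a sketch, the selection can be modelled in Python. This is not part of the intro: the names effect_index, t and p are made up for illustration, p is assumed to be a pseudorandom byte (0..255) per pixel, and the divisor 512 and the 16 slots are taken from the framework code of the submitted version.&lt;br /&gt;

```python
def effect_index(t, p):
    """Effect slot for one pixel: time t plus per-pixel offset p, scaled down."""
    return ((t + p) % 65536) // 512 % 16  # 16 bit add, divide by 512, keep 4 bits

# Far away from a switch-over, every pixel agrees on the effect ...
print({effect_index(100, p) for p in range(256)})   # {0}
# ... while shortly before one, some pixels already show the next effect.
print({effect_index(400, p) for p in range(256)})   # {0, 1}
```

The closer t gets to a multiple of 512, the more values of p push a pixel over the boundary, which produces the dithered dissolve without touching the palette.&lt;br /&gt;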
&lt;br /&gt;
==Tiny megademo framework==&lt;br /&gt;
For putting many tiny effects together into a &amp;quot;megademo&amp;quot;, they all have to use the same approach and they have to be mostly free of assumptions (about the contents of memory and registers). They also have to use the same timing values and be tuned to a shared timing in general. Preparing the single effects to be includable in a framework like this took quite some time and required a lot of extra space initially. It has to be noted that, sadly, some of my most impressive effects (judged by audience reactions and views on social media) could not be included at all, because the approaches were not combinable without a HUGE overhead. Once all effects were streamlined, I could think about &amp;quot;factoring&amp;quot; out common calculations which occur repeatedly, resulting in a few saved bytes. The framework does the following:&lt;br /&gt;
* Setting up screen mode to 320 x 200 pixels in 256 colors&lt;br /&gt;
* Initializing a pointer to the screen&lt;br /&gt;
* Installing a callback for timing and music&lt;br /&gt;
* Mainloop&lt;br /&gt;
** Calculation of X and Y from the current screen location&lt;br /&gt;
*** with the [http://www.sizecoding.org/wiki/General_Coding_Tricks#Obtaining_X_and_Y_without_DIV_.28The_Rrrola_Trick.29 &amp;quot;Rrrola Trick&amp;quot;]&lt;br /&gt;
** P = Pseudorandom value from screen location&lt;br /&gt;
** Offset current time T by a scaled value P'&lt;br /&gt;
** Selection of effect number N according to T'&lt;br /&gt;
** Execution of effect N (X,Y) for current pixel&lt;br /&gt;
*** On entrance, XY is in register DX (DL,DH)&lt;br /&gt;
*** On entrance, the current time is in register BP&lt;br /&gt;
*** On exit, the pixel color is expected in AL&lt;br /&gt;
** Advance to next pixel&lt;br /&gt;
** Triple diagonal interlacing for smoothing&lt;br /&gt;
** Repeat until frame is complete&lt;br /&gt;
* Set timer to a tempo of ~35 FPS&lt;br /&gt;
* Check keyboard for ESC&lt;br /&gt;
** Quit on ESC, otherwise continue&lt;br /&gt;
=== Code of framework===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
	mov al,0x13				; set AL to mode 320*200 in 256 colors&lt;br /&gt;
	int 0x10	 			; call BIOS to set mode&lt;br /&gt;
	xchg bp,ax				; set timing value to 0x13 &lt;br /&gt;
	push 0xa000-10			; write the screen address to register ES&lt;br /&gt;
	pop es					; works in conjunction with Rrrola trick&lt;br /&gt;
	mov ax,0x251c			; parameter for changing timer interrupt&lt;br /&gt;
	mov dl,timer			; address of timer routine, assume DH=1&lt;br /&gt;
	int 0x21				; install timer routine&lt;br /&gt;
top:&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
	call bx					; call the effect&lt;br /&gt;
	stosb					; write the return value and advance&lt;br /&gt;
	inc di					; triple interlace trick for after&lt;br /&gt;
	inc di					; effect and smoothing the animation&lt;br /&gt;
	jnz top					; repeat until the frame is complete&lt;br /&gt;
	mov al,tempo			; set AL to divider for timer&lt;br /&gt;
	out 40h,al				; set timing (dual pass)&lt;br /&gt;
	in al,0x60				; read keyboard&lt;br /&gt;
	dec al					; quit on ESC&lt;br /&gt;
	jnz top					; otherwise repeat loop&lt;br /&gt;
sounds: db 0xc3, 11, 0x93; 0xc3 is MIDI/RET; fx2-s is used as volume&lt;br /&gt;
table: 	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
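The coordinate step at the top of the main loop (mov ax,0xcccd / mul di) can be checked outside of DOS. The following Python model is only a sketch and not part of the intro; the function name rrrola_xy is made up for illustration. It assumes DI is the 16 bit screen pointer and takes DL and DH from the upper word of the 32 bit product, as the mul instruction does.&lt;br /&gt;

```python
def rrrola_xy(di):
    """Model of 'mov ax,0xcccd / mul di': returns (DL, DH) = (X, Y)."""
    product = di * 0xCCCD          # 16 x 16 -> 32 bit product in DX:AX
    dx = product // 65536          # upper word lands in DX
    return dx % 256, dx // 256     # DL = scaled column, DH = row

# DH is exactly the row number for every possible 16 bit pointer value.
print(all(rrrola_xy(di)[1] == di // 320 for di in range(65536)))  # True
# DL runs from 0 to 255 across one 320 byte row.
print(rrrola_xy(0)[0], rrrola_xy(160)[0], rrrola_xy(319)[0])      # 0 128 255
```

Because ES is set to 0xa000-10, the visible screen starts at DI = 160. The left border therefore corresponds to DL = 128 (-128 when read as a signed byte) and the middle column to DL = 0, which is why X arrives already centered in the effects.&lt;br /&gt;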
&lt;br /&gt;
==Explanation of effects==&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Array of chessboards.png|thumb|array of chessboards]]&lt;br /&gt;
The simplest effect and a good one to start with. After offsetting the row by the time, the classic XOR pattern is applied. To achieve the impression of a grid of chessboards, all bits but two are set in the color. The real trick lies in shifting to a &amp;quot;good&amp;quot; palette spot. The portion of the frame which is perceived as black isn't actually black, it's the dark part of the [https://en.wikipedia.org/wiki/Video_Graphics_Array#Color_palette standard VGA palette]. Shifting that way also makes the dark and light squares look like they are from an ancient chessboard.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	xchg dx,ax		; get XY into AX&lt;br /&gt;
	sub ax,bp		; subtract time from row&lt;br /&gt;
	xor al,ah		; XOR pattern (x xor y)&lt;br /&gt;
	or al,0xDB		; pattern for array of boards&lt;br /&gt;
	add al,13h		; shift to good palette spot&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
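To see which palette entries this routine can produce at all, the five instructions can be modelled in Python. This is a sketch under assumptions: the function name chessboard_color is invented, x and y stand for DL and DH as delivered by the framework, and t stands for the 16 bit time in BP.&lt;br /&gt;

```python
def chessboard_color(x, y, t):
    """Model of the chessboard routine: x = DL, y = DH, t = BP (16 bit time)."""
    ax = (y * 256 + x - t) % 65536   # xchg dx,ax / sub ax,bp
    al = (ax % 256) ^ (ax // 256)    # xor al,ah
    al = al | 0xDB                   # or al,0xDB: only bits 2 and 5 stay free
    return (al + 0x13) % 256         # add al,13h with byte wrap-around

colors = {chessboard_color(x, y, 0) for x in range(256) for y in range(200)}
print(sorted(colors))  # [14, 18, 238, 242]
```

Only four palette indices remain. Bit 2 of the XOR pattern alternates every 4 units and forms the squares, while bit 5 alternates every 32 units and switches between the two color pairs, so each 32 unit block holds 8x8 squares.&lt;br /&gt;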
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
[[File:Zooming circles.png|thumb|zooming circles]]&lt;br /&gt;
The distance D of a point (X,Y) to the center (0,0) is sqrt(X²+Y²). The framework already makes sure that DL contains a centered X coordinate; DH=Y still has to be centered, though. Performing a square root on x86 requires quite a bit of code, but it can actually be omitted. With a clever selection of colors, the impression of zooming circles without a square root is quite decent.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov al,dh		; get Y in AL&lt;br /&gt;
	sub al,100		; align Y vertically&lt;br /&gt;
	imul al			; AL = Y²&lt;br /&gt;
	xchg dx,ax		; Y²/256 in DH, X in AL&lt;br /&gt;
	imul al			; AL = X²&lt;br /&gt;
	add dh,ah		; DH = (X² + Y²)/256&lt;br /&gt;
	mov al,dh		; AL = (X² + Y²)/256&lt;br /&gt;
	add ax,bp		; offset color by time&lt;br /&gt;
	and al,8+16		; select special rings&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
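The same arithmetic can be followed step by step in Python. This is an illustrative model, not the author's code: the names signed8 and circles_color are made up, x and y stand for DL and DH, and t for the time in BP.&lt;br /&gt;

```python
def signed8(v):
    """Interpret a byte (0..255) as a signed 8 bit value, as imul does."""
    return v - 256 if v >= 128 else v

def circles_color(x, y, t):
    """Model of the circles routine: x = DL (centered), y = DH (row), t = BP."""
    y2 = signed8((y - 100) % 256) ** 2      # sub al,100 / imul al
    x2 = signed8(x) ** 2                    # imul al on the old DL
    d = (y2 // 256 + x2 // 256 + t) % 256   # add dh,ah / add ax,bp (low byte)
    return d % 32 // 8 * 8                  # and al,8+16: keep bits 3 and 4

print(circles_color(0, 100, 0))    # 0: screen center at time 0
print(circles_color(64, 100, 0))   # 16: 64*64/256 = 16 units further out
```

Since no square root is taken, the value grows with the squared distance, so the rings get thinner toward the border; adding t shifts all rings at once and creates the zooming impression.&lt;br /&gt;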
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Tiltplane.png|thumb|scrolling tilted plane]]&lt;br /&gt;
This effect is achieved as follows. First, a large constant is divided by the row number Y, to simulate distance. The resulting value is then used twice: a) scaled by a centered X value and b) offset by the current time. These two results are combined with XOR, and a special pattern is selected from the outcome with a final AND.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov ax,0x1329	; initialize with constant&lt;br /&gt;
	add dh,al		; preventing divide overflow&lt;br /&gt;
	div dh			; reverse divide AL = C/Y'&lt;br /&gt;
	xchg dx,ax		; DL = C/Y', AL = X&lt;br /&gt;
	imul dl			; AH = CX/Y'&lt;br /&gt;
	sub dx,bp		; DL = C/Y'-T 	&lt;br /&gt;
	xor ah,dl		; AH = (CX/Y') ^ (C/Y'-T)&lt;br /&gt;
	mov al,ah		; move to AL&lt;br /&gt;
	and al,4+8+16	; select special pattern&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
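The division by the row can be made tangible with a small Python model. It is a sketch only: plane_depth, plane_color and signed8 are invented names, x and y stand for DL and DH, t for the time in BP, and the constants 0x1329 and 0x29 are read from the listing.&lt;br /&gt;

```python
def signed8(v):
    """Interpret a byte (0..255) as a signed 8 bit value, as imul does."""
    return v - 256 if v >= 128 else v

def plane_depth(y):
    """mov ax,0x1329 / add dh,al / div dh: constant divided by the shifted row."""
    return 0x1329 // ((y + 0x29) % 256)

def plane_color(x, y, t):
    """Model of the plane routine: x = DL (centered), y = DH (row), t = BP."""
    depth = plane_depth(y)
    across = (signed8(x) * depth) // 256 % 256   # imul dl: AH of the product
    along = (depth - t) % 256                    # sub dx,bp (low byte)
    return (across ^ along) % 32 // 4 * 4        # xor, then and al,4+8+16

print(plane_depth(0), plane_depth(199))  # 119 20
print(plane_color(0, 0, 0))              # 20
```

Adding 0x29 to the row keeps the divisor at 41 or more, so the quotient always fits into a byte and the division cannot overflow. The quotient shrinks from 119 at the top row to 20 at the bottom row, which is what tilts the plane.&lt;br /&gt;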
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Paralax.png|thumb|parallax checker boards]]&lt;br /&gt;
This is a kind of raycasting with dynamic geometry. Each object plane is split horizontally because of the signed 16 bit multiplication and vertically because of an implicit logic column number operation. Additionally, implicit shearing is applied, so that the edges of the resulting grid (4 &amp;quot;solid&amp;quot; patches, 4 &amp;quot;transparent&amp;quot; patches, alternating) connect. If the ray hits one of the solid patches, the color becomes the iteration number (plus a palette offset into the grayscale range); if not, the plane is shifted by the screen pointer and the process is repeated until the maximum iteration number is reached.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cx,bp		; set initial point to time&lt;br /&gt;
	mov bx,-16		; limit to 16 iterations&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di		; offset point by screenpointer&lt;br /&gt;
	mov ax,819		; magic, related to Rrrola constant&lt;br /&gt;
	imul cx			; get X',Y' in DX&lt;br /&gt;
	ror dx,1		; set carry flag on &amp;quot;hit&amp;quot;&lt;br /&gt;
	inc bx			; increment iteration count&lt;br /&gt;
	ja fx3L			; loop until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	lea ax,[bx+31]	; map value to standard gray scale&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
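The loop logic, including the exit condition of the ja instruction, can be traced with the following Python model. It is a sketch under assumptions and not part of the intro: parallax_color is an invented name, di stands for the screen pointer DI and t for the time in BP, both taken as 16 bit values.&lt;br /&gt;

```python
def parallax_color(di, t):
    """Model of the parallax routine: di = screen pointer DI, t = BP."""
    cx = t
    for bx in range(-15, 1):                 # BX starts at -16, inc bx runs before the test
        cx = (cx + di) % 65536               # add cx,di
        signed_cx = cx - 65536 if cx >= 32768 else cx
        dx = (819 * signed_cx) // 65536      # imul cx: upper word of the product
        if dx % 2 == 1:                      # ror dx,1 moves bit 0 into carry: a hit
            break
    return bx + 31                           # lea ax,[bx+31]: AL is the color

print(parallax_color(81, 0))   # 16: hit in the first iteration
print(parallax_color(0, 0))    # 31: no hit within 16 iterations
```

Early hits give dark grays and late hits or misses give bright ones, so every result stays inside the gray ramp 16..31 of the standard palette.&lt;br /&gt;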
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Sierpinski rotozoomer.png|thumb|sierpinski rotozoomer]]&lt;br /&gt;
[[File:1 div cos(atan(x)).png|thumb|1/cos(atan(x)), made with www.google.com]]&lt;br /&gt;
Rotozooming normally requires trigonometric functions, or good estimations of them. Let's look at the [https://en.wikipedia.org/wiki/Rotation_matrix#In_two_dimensions usual 2D rotation equation] and interpret it in a special way to get rid of the trigonometry in the calculation:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = x*cos(a) - y*sin(a)&lt;br /&gt;
y' = x*sin(a) + y*cos(a)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
If we add zooming to it, it looks like this:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = z * (x*cos(a) - y*sin(a))&lt;br /&gt;
y' = z * (x*sin(a) + y*cos(a))&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Now let's assume we don't specify z ourselves, but let it be 1/cos(a). Multiplied into the brackets, this turns every cos(a) into 1 and every sin(a) into tan(a):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = 1/cos(a) * (x*cos(a) - y*sin(a)) = x - y*tan(a)&lt;br /&gt;
y' = 1/cos(a) * (x*sin(a) + y*cos(a)) = x*tan(a) + y&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Finally, let's substitute tan(a) with our time variable T, since the tangent function rises towards infinity the closer the angle gets to 90°. The zoom factor then becomes z = 1/cos(atan(T)):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = x - y*T&lt;br /&gt;
y' = x*T + y&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
As long as we don't care about specifying the zoom factor and don't directly control the angle, we can now rotate between -90° and +90° without using trigonometric functions. As a byproduct, the zoom factor is now tied to the time T. The function for the zoom factor is shown in the image; it results in zooming from infinitely small to one (original) and back to infinitely small. Quite a few bytes have been spent on beautifying the effect: correcting the timing offset, speeding up the animation, enlarging the sierpinski pixels and getting decent colors, but I thought it was worth the space.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	lea cx,[bp-2048]; center time to pass zero&lt;br /&gt;
	sal cx,3		; speed up by factor 8!&lt;br /&gt;
	movzx ax,dh		; get X into AL&lt;br /&gt;
	movsx dx,dl		; get Y into DL&lt;br /&gt;
	mov bx,ax		; save X in BX&lt;br /&gt;
	imul bx,cx		; BX = X*T&lt;br /&gt;
	add bh,dl		; BH = X*T/256+Y&lt;br /&gt;
	imul dx,cx		; DX = Y*T&lt;br /&gt;
	sub al,dh		; AL = X-Y*T/256&lt;br /&gt;
	and al,bh		; AL = (X-Y*T/256)&amp;amp;(X*T/256+Y)&lt;br /&gt;
	and al,252		; thicker sierpinski&lt;br /&gt;
	salc			; set pixel value to black&lt;br /&gt;
	jnz fx4q		; leave black if not sierpinski&lt;br /&gt;
	mov al,0x2A		; otherwise: a nice orange&lt;br /&gt;
	fx4q:&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Raycastbenttunnel.png|thumb|raycast bent tunnel]]&lt;br /&gt;
This is a variation of the [https://www.pouet.net/prod.php?which=78044 raycast tunnel &amp;quot;into a new era&amp;quot;] (see also above). A [https://www.reddit.com/r/tinycode/comments/de7dj9/moving_cubes_with_pathtracing_lighting_and/ detailed explanation] has been posted to Reddit for the [https://www.pouet.net/prod.php?which=83204 similar production &amp;quot;Essence&amp;quot;]. The custom colors have been removed, the bend direction is different, and the geometry is a bit more closed, to increase the performance on slower computers and DOSBox.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cl,-9		; start with depth 9 (moves backwards)&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx			; save DX, destroyed inside the loop&lt;br /&gt;
		mov al,dh	; Get Y into AL&lt;br /&gt;
		sub al,100	; Centering Y has to be done &amp;quot;manually&amp;quot;.&lt;br /&gt;
		imul cl		; Multiply AL=Y by the current distance, to get a projection(1)&lt;br /&gt;
		xchg ax,dx	; Get X into AL, while saving the result in DX (DH)&lt;br /&gt;
		add al,cl	; add distance to projection, (bend to the right)&lt;br /&gt;
		imul cl		; Multiply AL=X by the current distance, to get a projection(2)&lt;br /&gt;
		mov al,dh	; Get projection(1) in AL&lt;br /&gt;
		xor al,ah	; combine with projection(2)&lt;br /&gt;
		add al,4	; center the walls around 0&lt;br /&gt;
		test al,-8	; check if the wall is hit&lt;br /&gt;
	pop dx			; restore DX&lt;br /&gt;
	loopz fx5L		; repeat until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	sub cx,bp		; offset depth by time&lt;br /&gt;
	xor al,cl		; XOR pattern for texture &lt;br /&gt;
	aam 6			; irregular pattern with MOD 6&lt;br /&gt;
	add al,20		; offset into grayscale palette&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Oceannight.png|thumb|left|oceannight]]&lt;br /&gt;
[[File:Oceanday.png|thumb|right|oceanday]]&lt;br /&gt;
The ocean effect is a wonderful example of a [https://en.wikipedia.org/wiki/Bob_Ross#Style &amp;quot;happy accident&amp;quot;]. If you load a value into the FPU as an integer, store it as a floating point value, and interpret the result as an integer again, you get a cool pattern. Combine that with a reverse divide and you get a neat wavy effect. This works in tight conjunction with the register DX, which has a sign flip at the position we want, so we can easily separate sky and sea. The final touch is the coloring, though. By design of the global framework, the value of AL is determined on function entry: it contains the effect address. With a bit of code shuffling, the color of the sky came &amp;quot;for free&amp;quot;, without using instructions, as did the &amp;quot;fading to day&amp;quot; color, which is also an effect address. And that is no happy accident. For that reason, the sky color can vary in versions other than the originally submitted DOSBox version.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	sub dh,120			; check if pixel is in the sky&lt;br /&gt;
	js fx6q				; quit if that's the case&lt;br /&gt;
	mov [bx+si],dx		; move XY to a memory location&lt;br /&gt;
	fild word [bx+si]	; read memory location as integer&lt;br /&gt;
	fidivr dword [bx+si]; reverse divide by constant&lt;br /&gt;
	fstp dword [bx+si-1]; store result as floating point&lt;br /&gt;
	mov ax,[bx+si]		; get the result into AX&lt;br /&gt;
	add ax,bp			; modify color by time&lt;br /&gt;
	and al,128			; threshold into two bands&lt;br /&gt;
	dec ax				; beautify colors to blue/black&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
For convenience, here is the fading effect from the global framework in isolation. It generates a pseudorandom value from the screen pointer, offsets the time by a scaled version of that value, and then calls the selected effect.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==MIDI music part==&lt;br /&gt;
This code section increments the timing value and produces sound. By choosing channel 3 we can reuse the &amp;quot;change channel instrument&amp;quot; byte (0xC3) as &amp;quot;RET&amp;quot;. By shifting around the code of the effects, a reasonable value for the volume could be created, which saves another byte. Be aware that this code sequence only works if the MIDI device is already in UART mode; otherwise three more bytes have to be spent. Quite a few people from the audience and fellow sceners told me afterwards that the melody sounds a bit like [https://en.wikipedia.org/wiki/Incantations_(album) Incantations by Mike Oldfield], but it is created by a rather simple approach. Starting with [https://www.pouet.net/prod.php?which=65604 &amp;quot;Hypnoteye&amp;quot; in 2015], I experimented with procedural MIDI and eventually released a [https://www.pouet.net/prod.php?which=66313 small MIDI framework (64 bytes)]. The basic principle is to jump through tonal space with fixed steps, and map high values back to lower ones (by modulo). In the code below, a note is played on every 8th tick, so multiplying the time by -19*32*4 and shifting right by 10 moves the pitch by -19 per note, modulo 64. With simple combinations of stepwidth and modvalue, interesting effects can be produced. For example, stepwidth=3 gives diminished chords which are never resolved, likewise stepwidth=4, while stepwidth=6 gives tritonal effects. With a well chosen modvalue, these patterns can create progressions. I have not done a proper theoretical analysis yet; I rather explored the tonal space and noted when something sounded interesting.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
sounds: db 0xc3, 11, 0x93, fx2-s&lt;br /&gt;
...&lt;br /&gt;
		inc bp				; increment timing value&lt;br /&gt;
		test bp, 7			; play a note every 8th step&lt;br /&gt;
		jnz nomuse			; quit if in between&lt;br /&gt;
		mov dx,0x330		; port number for MIDI&lt;br /&gt;
		mov si,sounds		; address of sound data&lt;br /&gt;
		outsb				; change instrument of channel 3&lt;br /&gt;
		outsb				; to vibraphone&lt;br /&gt;
		outsb				; play a note on channel 3&lt;br /&gt;
		imul ax,bp,-19*32*4	; the magic melody constant&lt;br /&gt;
		shr ax,10			; scale down and implicit &amp;quot;and 63&amp;quot;&lt;br /&gt;
		add al,22			; pitch base is 22&lt;br /&gt;
		out dx,al			; play THIS note on channel 3&lt;br /&gt;
		outsb				; play it with THIS volume&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
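Since the sequence above assumes a MIDI device that is already in UART mode, a standalone test program has to switch modes itself. The following is a minimal sketch, not the code from the intro: it assumes an MPU-401 compatible interface at the default ports 0x330 (data) and 0x331 (command), assumes the direction flag is clear, and skips the usual status polling, which may not be acceptable on real hardware. The label midi and the pitch 60 are chosen for illustration only.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
	mov dx,0x331		; MPU-401 command port (assumed default address)&lt;br /&gt;
	mov al,0x3f			; command 0x3F: enter UART mode&lt;br /&gt;
	out dx,al&lt;br /&gt;
	dec dx				; DX = 0x330, the data port&lt;br /&gt;
	mov si,midi			; DS:SI points to the MIDI bytes&lt;br /&gt;
	mov cx,5			; number of bytes to send&lt;br /&gt;
	rep outsb			; send all five bytes to the data port&lt;br /&gt;
	ret					; exit via the int 20h at CS:0000&lt;br /&gt;
midi:					; hypothetical label, not part of the intro&lt;br /&gt;
	db 0xc3,11			; program change on channel 3: instrument 11 (vibraphone)&lt;br /&gt;
	db 0x93,60,127		; note on, channel 3: pitch 60, velocity 127&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;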
&lt;br /&gt;
==Original release code (complete)==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
; &amp;quot;memories&amp;quot; by HellMood/DESiRE&lt;br /&gt;
; the tiny megademo, 256 byte msdos intro&lt;br /&gt;
; shown in April 2020 @ REVISION&lt;br /&gt;
;&lt;br /&gt;
;   (= WILL BE COMMENTED IN DETAIL LATER =)&lt;br /&gt;
;&lt;br /&gt;
; create : nasm.exe memories.asm -fbin -o memories.com&lt;br /&gt;
; CHOOSE YOUR TARGET PLATFORM (compo version is dosbox)&lt;br /&gt;
; be sure to use the dosbox.conf from this archive!&lt;br /&gt;
; only ONE of the defines should be active!&lt;br /&gt;
%define dosbox			; size : 256 bytes&lt;br /&gt;
;%define freedos		; size : 230 bytes&lt;br /&gt;
;%define winxpdos		; size : 263 bytes&lt;br /&gt;
&lt;br /&gt;
; DON'T TOUCH THESE UNLESS YOU KNOW WHAT YOU'RE DOING&lt;br /&gt;
%ifdef winxpdos&lt;br /&gt;
	%define music&lt;br /&gt;
	%define switch_uart&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
	%define safe_segment&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef dosbox&lt;br /&gt;
	%define music&lt;br /&gt;
	;%define safe_dx ; sometimes needed&lt;br /&gt;
%endif&lt;br /&gt;
&lt;br /&gt;
; GLOBAL PARAMETERS, TUNE WITH CARE!&lt;br /&gt;
%define volume 127	; not used on dosbox (optimization)&lt;br /&gt;
%define instrument 11&lt;br /&gt;
%define scale_mod -19*32*4; &lt;br /&gt;
%define time_mask 7&lt;br /&gt;
%define targetFPS 35&lt;br /&gt;
%define tempo 1193182/256/targetFPS		&lt;br /&gt;
%define sierp_color 0x2A&lt;br /&gt;
%define tunnel_base_color 20&lt;br /&gt;
%define tunnel_pattern 6&lt;br /&gt;
%define tilt_plate_pattern 4+8+16&lt;br /&gt;
%define circles_pattern 8+16&lt;br /&gt;
&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov fs,ax&lt;br /&gt;
	mov [fs:0x46c],ax&lt;br /&gt;
%endif&lt;br /&gt;
	mov al,0x13&lt;br /&gt;
	int 0x10	 &lt;br /&gt;
	xchg bp,ax&lt;br /&gt;
	push 0xa000-10&lt;br /&gt;
	pop es&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	mov ax,0x251c&lt;br /&gt;
	%ifdef safe_dx	&lt;br /&gt;
		mov dx,timer	&lt;br /&gt;
	%else ; assume DH=1, mostly true on DosBox&lt;br /&gt;
		mov dl,timer&lt;br /&gt;
	%endif&lt;br /&gt;
	int 0x21&lt;br /&gt;
%endif&lt;br /&gt;
top:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov bp,[fs:0x46c]&lt;br /&gt;
%endif	&lt;br /&gt;
	mov ax,0xcccd&lt;br /&gt;
	mul di&lt;br /&gt;
	add al,ah&lt;br /&gt;
	xor ah,ah&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	shr ax,9&lt;br /&gt;
	and al,15&lt;br /&gt;
	xchg bx,ax&lt;br /&gt;
	mov bh,1&lt;br /&gt;
	mov bl,[byte bx+table]&lt;br /&gt;
	call bx&lt;br /&gt;
	stosb&lt;br /&gt;
	inc di&lt;br /&gt;
	inc di&lt;br /&gt;
	jnz top&lt;br /&gt;
	mov al,tempo&lt;br /&gt;
	out 40h,al&lt;br /&gt;
	in al,0x60&lt;br /&gt;
	dec al&lt;br /&gt;
	jnz top&lt;br /&gt;
sounds:&lt;br /&gt;
	db 0xc3	; is MIDI/RET&lt;br /&gt;
%ifdef music&lt;br /&gt;
	db instrument,0x93&lt;br /&gt;
	%ifdef switch_uart&lt;br /&gt;
		db volume		; without switch, volume is in table&lt;br /&gt;
		db 0x3f &lt;br /&gt;
	%endif&lt;br /&gt;
%endif&lt;br /&gt;
table: ; first index is volume, change order with care!		    					&lt;br /&gt;
	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
stop:&lt;br /&gt;
	pop ax&lt;br /&gt;
	ret&lt;br /&gt;
timer:&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	%ifdef safe_segment&lt;br /&gt;
		push cs&lt;br /&gt;
		pop ds&lt;br /&gt;
	%endif&lt;br /&gt;
		inc bp&lt;br /&gt;
	%ifdef music	&lt;br /&gt;
		test bp, time_mask&lt;br /&gt;
		jnz nomuse&lt;br /&gt;
		mov dx,0x330&lt;br /&gt;
		mov si,sounds&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		imul ax,bp,scale_mod&lt;br /&gt;
		shr ax,10&lt;br /&gt;
		add al,22&lt;br /&gt;
		out dx,al&lt;br /&gt;
		outsb&lt;br /&gt;
		%ifdef switch_uart&lt;br /&gt;
			inc dx&lt;br /&gt;
			outsb&lt;br /&gt;
		%endif&lt;br /&gt;
	%endif&lt;br /&gt;
nomuse:&lt;br /&gt;
	iret&lt;br /&gt;
%endif	&lt;br /&gt;
fx0: ; tilted plane, scrolling&lt;br /&gt;
	mov ax,0x1329&lt;br /&gt;
	add dh,al&lt;br /&gt;
	div dh&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul dl&lt;br /&gt;
	sub dx,bp&lt;br /&gt;
	xor ah,dl&lt;br /&gt;
	mov al,ah&lt;br /&gt;
	and al,tilt_plate_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx2: ; board of chessboards&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	sub ax,bp&lt;br /&gt;
	xor al,ah&lt;br /&gt;
	or al,0xDB&lt;br /&gt;
	add al,13h&lt;br /&gt;
ret&lt;br /&gt;
fx1: ; circles, zooming&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	sub al,100&lt;br /&gt;
	imul al&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul al&lt;br /&gt;
	add dh,ah&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,circles_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx3: ; parallax checkerboards&lt;br /&gt;
	mov cx,bp&lt;br /&gt;
	mov bx,-16&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di&lt;br /&gt;
	mov ax,819&lt;br /&gt;
	imul cx	 &lt;br /&gt;
	ror dx,1	 &lt;br /&gt;
	inc bx	 &lt;br /&gt;
	ja fx3L&lt;br /&gt;
	lea ax,[bx+31]	 &lt;br /&gt;
ret&lt;br /&gt;
fx4: ; sierpinski rotozoomer	&lt;br /&gt;
	lea cx,[bp-2048]&lt;br /&gt;
	sal cx,3&lt;br /&gt;
	movzx ax,dh&lt;br /&gt;
	movsx dx,dl&lt;br /&gt;
	mov bx,ax&lt;br /&gt;
	imul bx,cx&lt;br /&gt;
	add bh,dl&lt;br /&gt;
	imul dx,cx&lt;br /&gt;
	sub al,dh&lt;br /&gt;
	and al,bh&lt;br /&gt;
	and al,0b11111100&lt;br /&gt;
	salc				; VERY slow on dosbox, but ok&lt;br /&gt;
	jnz fx4q&lt;br /&gt;
	mov al,sierp_color&lt;br /&gt;
	fx4q:&lt;br /&gt;
ret&lt;br /&gt;
fx5: ; raycast bent tunnel&lt;br /&gt;
	mov cl,-9&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		sub al,100&lt;br /&gt;
		imul cl&lt;br /&gt;
		xchg ax,dx	&lt;br /&gt;
		add al,cl&lt;br /&gt;
		imul cl&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		xor al,ah&lt;br /&gt;
		add al,4&lt;br /&gt;
		test al,-8&lt;br /&gt;
	pop dx&lt;br /&gt;
	loopz fx5L&lt;br /&gt;
	sub cx,bp&lt;br /&gt;
	xor al,cl&lt;br /&gt;
	aam tunnel_pattern; VERY slow on dosbox, but ok&lt;br /&gt;
	add al,tunnel_base_color&lt;br /&gt;
ret&lt;br /&gt;
fx6: ; ocean night / to day sky&lt;br /&gt;
	sub dh,120&lt;br /&gt;
	js fx6q&lt;br /&gt;
	mov [bx+si],dx&lt;br /&gt;
	fild word [bx+si]&lt;br /&gt;
	fidivr dword [bx+si]&lt;br /&gt;
	fstp dword [bx+si-1]&lt;br /&gt;
	mov ax,[bx+si]&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,128&lt;br /&gt;
	dec ax&lt;br /&gt;
fx6q:&lt;br /&gt;
ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Further tweaks==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==Differences between dos versions==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==Bonus - NFO/ASCII==&lt;br /&gt;
&lt;br /&gt;
  &amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
                                                             art : hammerfist&lt;br /&gt;
         ∂#MW%e                              _d$Ng,&lt;br /&gt;
         'B,  ∂b                   _jM@$QZb,cQ&amp;quot;  )@&lt;br /&gt;
  ,edRB$b,l@   Wk,yGR$KM&amp;amp;$b,     ,dP&amp;quot;     Wl ]bsd%UR8BG6&amp;amp;$@DSyG#ZKM&amp;amp;$b,&lt;br /&gt;
,dP      &amp;quot;T%L  'MGF      &amp;quot;*∂R_   Tg    &amp;quot;*4Zk,#I  YP   W&amp;quot;    7P      &amp;quot;*∂R&lt;br /&gt;
4M   gd@    ^   ∂@   d@b   dQ$#@Z@R3L_    &amp;quot;*GMj  'W      ,gd$   d@b   9Q$#%b&lt;br /&gt;
W#,  `M          Wb  `*  _4P   `Qk  *#N8L   `H5   @b   'QR7YK   `*  _4F&amp;quot;   Qk&lt;br /&gt;
`6@L             dML            '@          ,BK   'M    ∂B  *b,            '#L&lt;br /&gt;
  ^QBb,_     _,4&amp;amp;M∞∂@=,_       _dGL       _gQKM    GL    @k  'Mg,_         _dG,&lt;br /&gt;
    &amp;quot;*BN5W$2#MNP&amp;quot;   &amp;quot;*G3WRM8&amp;amp;B5P&amp;quot;`Y@QNW3Z5P&amp;quot; ∂#$W8BRM3XZN87    &amp;quot;*GW38M%EBDW5P&amp;quot;`&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
                              p r e s e n t s&lt;br /&gt;
&lt;br /&gt;
            4&lt;br /&gt;
           d@,&lt;br /&gt;
         _&amp;amp; `Wl&lt;br /&gt;
      _,aP   &amp;quot;#baedM$#@@K JP*&amp;quot;?ML&lt;br /&gt;
 ,ad@$#P&amp;quot;         ,d@NEWVB&amp;quot;     X,aQPYb,_&lt;br /&gt;
V@Mm,_          ,d@MW#BW'      EMP&amp;quot;   '¶R ,ngBP^fML&lt;br /&gt;
 ¶M@N@y        Y#BNW#M&amp;quot;       J9&amp;quot;      `MQ9&amp;quot;      &amp;quot;MgRBq  ,QBMg,&lt;br /&gt;
  VN#P` ,d@@    `WM@^                   7f         ¶F` 7kY&amp;quot;   ^G  _.eQNE1.&lt;br /&gt;
   ]B _G@MWN$,   `P                     '     4b       QP      ¶w@F*^  ^Qb&lt;br /&gt;
   ]O@NRM#W@MNB,         ;                    ^`      j        JP^       Yl&lt;br /&gt;
  J#NRNWM@#BcT&amp;quot;^        ,A  _J                     _q@                   `X&lt;br /&gt;
 '¶WM#B@WdY`,7        _G#YN#PM                 _,gG&amp;quot;                      M,&lt;br /&gt;
  *BN#WP&amp;quot;  dK       ,Q@NRMB&amp;quot;]9       ,      _,M@Q*                        #A&lt;br /&gt;
   &amp;quot;U^      V@h,   iNBW#NT  J'      J9     s@QN&amp;quot;         _;               'D,&lt;br /&gt;
             ¶RMBv&amp;amp;NMQR@9  .W      .K'     &amp;quot;9`         ,6BA   _JL          ]l&lt;br /&gt;
              Y#NE@W#NRP   #[      `¶8               _d@MW#B_jW#W          BN&lt;br /&gt;
               &amp;quot;GQ@MR#W    QL_      *B            _,p#NBW#NQMG@WY          3Q&lt;br /&gt;
                  &amp;quot;Y@F     ,XW@M%im,_Yb_     _,g5@#MW@QMNE@E@NRMB         ,WM&lt;br /&gt;
                    `  _,gP*&amp;quot;#REM#GB@N#MQbnd@N#M@MW#R8QSB^'WQERM@        ;4NB,&lt;br /&gt;
                     ,GYKL    ¶E#B8R8QSB@M@#BM#W@MNB&amp;quot;`_  ,  &amp;quot;^` N       ,dW@Ql&lt;br /&gt;
                   _Q`'W`*t    '¶@GS#MBQ#E@W#NQBW[     'LvQ_   ,K    _dNABGM#N&lt;br /&gt;
                  ,F   '          `^WAB@QGE9*&amp;quot;9^*@L    jP7FY,  ¶h,_.jWM#BR#GBM,&lt;br /&gt;
                 J;    ,   _                  '       '   &amp;quot;LL  YxE#B8R8QSBNW@W;&lt;br /&gt;
                AP   _,Ag6^          _   J                  ¶A  `&amp;quot;Q#M@MW#R8E#P&lt;br /&gt;
               j@   `&amp;quot;XQW[            'LvK,_      'L_,/      @t    Y#NE@WNR&amp;quot;&lt;br /&gt;
              :M/     9^*@L           jP7F&amp;quot;       _PYKL     _,A;     ¶RSNQ&amp;quot;&lt;br /&gt;
              dKL     '     `        '   &amp;quot;L      &amp;quot;`'W`*t   `&amp;quot;XQb      `W^&lt;br /&gt;
              Q`8t            'L_,/         ,   _   '        9^Q&lt;br /&gt;
             ,W               _PYKL       _,Ag6^             ' W,     _ ,#N&amp;amp;&lt;br /&gt;
             !N  _   J       &amp;quot;`'W`*t     `&amp;quot;XQW[       _  J     N!_JG9^RwQ' *t&lt;br /&gt;
             `W,  'LvK,_        '        _gGB8@L   _   'LvK,_ ,WgB'    V    7L&lt;br /&gt;
         _.,gm&amp;amp;@B&amp;amp;wBZF&amp;quot;                j@'`  &amp;quot;WL _gML  jZd7Yb lN&amp;quot;          dBWl&lt;br /&gt;
      ,g&amp;amp;QB*&amp;quot;^`    `&amp;quot;*G@g, .gR&amp;amp;k,_   ,N&amp;quot;      '@QF  ¶k;gMF  *QvQ     jQ, ,@N@B#,&lt;br /&gt;
   .eQF*`              `Yb@&amp;quot;  &amp;quot;*6Qg,gF     ,   7     XMN&amp;quot;    'MNB,    ^¶QWSER@N;&lt;br /&gt;
 ,gP&amp;quot;           qy,      W'       ^Q'     &amp;amp;L      ,g@W'       `QMEL     `&amp;quot;WBNWP&lt;br /&gt;
g7              ¶9      ,X         M?     9&amp;quot;   _q8MSK           ¶EMt       *@K&lt;br /&gt;
Vh   _,m#L             _AH        le         ,GBDNE9^A,          *@F        NMg&lt;br /&gt;
 ¶L,qQ@ND           _.m@Bl        We      ,gM@B8#Q'   ¶h_                   lWE,&lt;br /&gt;
  W9NHW@`          JWM#B@]        @e     4WR@NGF^      'QL                  dRWl&lt;br /&gt;
   VMd*            &amp;quot;@BE@PM        'N      *UP&amp;quot;           VW,               JRSB;&lt;br /&gt;
  ,@F       j       `¶WK W,        ¶t                     XNt            _A@E#N&lt;br /&gt;
_JP       ,6&amp;amp;         &amp;quot;GLdM         XD,               _.g8NMA@k,_    _,gG#NMGR;&lt;br /&gt;
&amp;quot;Z      .JRER           'VMi     _jNB#W&amp;amp;_         _,j@E@W#Nl ¶MBGMNQGNQMG@QBW9&lt;br /&gt;
 ¶h   ,G@NRMBl            `&amp;quot;   ,d#R@M$F ¶Mg,_.gp&amp;amp;@@NEWVBWBMG  *QMN8R8SBN$E@WF&lt;br /&gt;
  Vb dW#R8QSRb,                 *YM@EQ,_ 'MENBW#NQMG#B@R@MW#l   &amp;quot;BM@QNENRQG'&lt;br /&gt;
   *WGS#MBMNEYL                    `^&amp;quot;*8M@Q@NRM#W@BWSNW@QBF&amp;quot;`     `^*@QBF^ [HFT]&lt;br /&gt;
    ^M@MW#Q9 ^Wt                           `^¶RQ@W8NQGP*`&lt;br /&gt;
     ¶Q#@P     Vk                            lA `&amp;quot;^`&lt;br /&gt;
      Y&amp;quot;       `MA                           J#,&lt;br /&gt;
                *R@,                        ,MQl&lt;br /&gt;
                 Y#Wk,                      GWM8L&lt;br /&gt;
                  W8RQSt,_                 AQ@MR#,&lt;br /&gt;
                  `@M@#SB@Mbm.,_          QNBW#NW&lt;br /&gt;
                    ¶QB8R8SBN$WNRM@#GNtwg@NMQR@B'&lt;br /&gt;
                     *MBQ#8R8QS@NE@WNBW#NQMG@NR;&lt;br /&gt;
                      `WGS#MBQ#R8QSB@NE@W#NQBW9&lt;br /&gt;
                        *OMW@QMNE@E@NRMW@QMB@*&lt;br /&gt;
                          `^&amp;quot;YQW@Q#SB#NE@EGP&lt;br /&gt;
                               `^&amp;quot;*8R@GBQF`&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=502</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=502"/>
				<updated>2020-04-20T02:16:17Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode located at segment A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is contained in a single segment, you can be sloppy, as letting the offset overflow or underflow won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
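&lt;br /&gt;
As a minimal sketch of how little code mode 13h needs, the following program fills the whole segment (including the unused bytes past the 64000 visible pixels), waits for a key and exits.  It is an illustration written for this page rather than code from a particular production, and it is deliberately not size-optimized; the label draw and the choice of pattern are arbitrary, and it assumes the direction flag is clear.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
	mov ax,0013h	; AH=00h (set video mode), AL=13h&lt;br /&gt;
	int 10h&lt;br /&gt;
	push 0a000h		; segment of the mode 13h frame buffer (push imm needs 80186+)&lt;br /&gt;
	pop es&lt;br /&gt;
	xor di,di		; start at offset 0&lt;br /&gt;
draw:&lt;br /&gt;
	mov ax,di		; use the low byte of the offset as the color&lt;br /&gt;
	stosb			; store AL at ES:[DI], advance DI&lt;br /&gt;
	test di,di&lt;br /&gt;
	jnz draw		; stop once DI has wrapped around to 0&lt;br /&gt;
	xor ax,ax		; AH=00h: wait for a keypress&lt;br /&gt;
	int 16h&lt;br /&gt;
	mov ax,0003h	; back to 80x25 text mode&lt;br /&gt;
	int 10h&lt;br /&gt;
	ret				; exit to DOS&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;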
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=CS&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
BP=09xx&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&lt;br /&gt;
flags=xx02&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the value &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
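&lt;br /&gt;
To make the savings concrete, here is a small sketch that leans on two of these defaults.  It assumes AH=00h at entry and 0000h on top of the stack; where that does not hold, the forms named in the comments are the safe alternatives.  The byte counts are the encoded instruction sizes.  Note that the program leaves the screen in mode 13h when it exits.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
	mov al,13h		; 2 bytes, assumes AH=00h (safe form: mov ax,0013h, 3 bytes)&lt;br /&gt;
	int 10h			; 2 bytes, set video mode 13h&lt;br /&gt;
	ret				; 1 byte, assumes 0000h on the stack (safe form: int 20h, 2 bytes)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;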
&lt;br /&gt;
Different DOS versions may differ slightly in the above.  A full chart exists at http://www.fysnet.net/yourhelp.htm to help you check if your specific target has different defaults than the standard settings above.&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the start of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
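&lt;br /&gt;
The setup cost and the signature can be sketched as follows.  This is an illustrative skeleton that relies on nothing but the load address; the label hang and the placement of the stack at 7C00h are choices made for this example, not requirements.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 7c00h&lt;br /&gt;
	xor ax,ax&lt;br /&gt;
	mov ds,ax		; make DS, ES and SS agree with the org above&lt;br /&gt;
	mov es,ax&lt;br /&gt;
	mov ss,ax		; interrupts are inhibited for the following instruction&lt;br /&gt;
	mov sp,7c00h	; stack grows downward, below the boot sector&lt;br /&gt;
	; ... program code goes here ...&lt;br /&gt;
hang:&lt;br /&gt;
	hlt				; sleep until the next interrupt&lt;br /&gt;
	jmp hang&lt;br /&gt;
	times 510-($-$$) db 0	; pad the code to 510 bytes&lt;br /&gt;
	db 0x55,0xAA	; boot sector signature&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;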
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, ECS, Eflags)               &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
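Additionally, some register forms of common opcodes are one byte in 16-bit code: &amp;lt;code&amp;gt;INC reg16&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;DEC reg16&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;PUSH reg16&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;POP reg16&amp;lt;/code&amp;gt; are short for every 16-bit register (so &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt; qualify), while &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt; is an accumulator-only short form.&lt;br /&gt;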
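&lt;br /&gt;
For a feel of the savings, the fragment below pairs some one-byte register forms with longer equivalents.  It is only an encoding comparison, not a meaningful program; the byte values in the comments are the 16-bit encodings NASM is expected to produce, which you can confirm with ndisasm:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
bits 16                         ;encodings below apply to 16-bit code&lt;br /&gt;
&lt;br /&gt;
        xchg    ax,bx           ;93     1 byte, accumulator short form&lt;br /&gt;
        mov     ax,bx           ;89 D8  2 bytes&lt;br /&gt;
        inc     cx              ;41     1 byte, 16-bit register form&lt;br /&gt;
        inc     cl              ;FE C1  2 bytes, 8-bit registers have no short form&lt;br /&gt;
        push    dx              ;52     1 byte&lt;br /&gt;
        pop     es              ;07     1 byte&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that XCHG swaps both registers, so it only replaces MOV when you no longer need the old accumulator value.&lt;br /&gt;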
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then restores text mode and exits.  Although it assembles to only 19 bytes, do not consider this fully optimized; your program's structure may allow some of it to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 from the exit test; assumes AH is still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== How to think like a sizecoder ===&lt;br /&gt;
&lt;br /&gt;
This example framework can be shrunk!  Think a little bit about how you might do it, then check out the example below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        les     bp,[bx]         ;BX=0000, so ES=word at PSP:0002 (top of memory, typically 9FFF); close enough to A000 ;-)&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to ES:xxxx to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
                                ;Don't care if we set text mode, user can just MODE CO80&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By casually disregarding the user experience :-) and fudging the start of video memory, our basic framework is now 13 bytes.&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=501</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=501"/>
				<updated>2020-04-20T02:15:23Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: Added &amp;quot;how to think like a sizecoder&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose framebuffer starts at A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because the 64000-byte buffer fits in a single 64K segment, you can be sloppy: offsets that overshoot or undershoot the visible area simply wrap around within the segment and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
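&lt;br /&gt;
To make the linear layout concrete, here is a minimal sketch (an illustration, not part of the original framework) that plots one pixel.  It assumes the usual start-up state (AX=0000, direction flag clear) and an 80186 or later for the immediate push; the coordinates (160,100) and colour 15 are arbitrary choices.  The pixel at (x,y) lives at offset y*320+x:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     al,13h          ;assumes AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;80186+ (PUSH immediate)&lt;br /&gt;
        pop     es              ;ES = mode 13h screen segment&lt;br /&gt;
        mov     di,100*320+160  ;offset = y*320 + x, here x=160, y=100&lt;br /&gt;
        mov     al,15           ;colour 15 is white in the default palette&lt;br /&gt;
        stosb                   ;write the pixel to ES:[DI] (assumes DF=0)&lt;br /&gt;
        xor     ax,ax           ;AH=00h: BIOS wait-for-key service&lt;br /&gt;
        int     16h             ;pause so the pixel can be seen&lt;br /&gt;
        ret                     ;exit, leaving the screen in mode 13h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Any other pixel is just a different value in DI; there are no banks or bit planes to worry about.&lt;br /&gt;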
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=CS&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
BP=09xx&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&lt;br /&gt;
flags=xx02&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program: &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; pops &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt; into IP, and at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (the bytes CD 20, i.e. &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
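&lt;br /&gt;
A minimal sketch of what that means in practice, assuming the usual start-up stack with 0000h on top (the PUSH/POP pair merely stands in for whatever temporary stack use your program has):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        push    ax              ;any temporary stack use...&lt;br /&gt;
        pop     ax              ;...must be balanced before the final RET&lt;br /&gt;
        ret                     ;C3: pops 0000h into IP, so execution continues at&lt;br /&gt;
                                ;CS:0000, where the PSP holds CD 20 (INT 20h)&lt;br /&gt;
;       int     20h             ;CD 20: the explicit exit, one byte longer&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If the stack is not balanced at the exit point, RET jumps to whatever word happens to be on top instead, so fall back to INT 20h in that case.&lt;br /&gt;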
&lt;br /&gt;
Different DOS versions may differ slightly in the above.  A full chart exists at http://www.fysnet.net/yourhelp.htm to help you check if your specific target has different defaults than the standard settings above.&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS clobbers the registers as it runs before handing over control, so there's not much to count on other than what is set up directly before the boot sector executes:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the start of the boot sector (just past its end), so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
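&lt;br /&gt;
If you want to experiment anyway, the following is a rough skeleton rather than a definitive template.  It assumes NASM's flat binary output; the labels start and hang are illustrative names, and the explicit register setup is there because so little can be relied upon at entry:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 7C00h                       ;BIOS loads the boot sector at 0000:7C00&lt;br /&gt;
bits 16&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        xor     ax,ax           ;segment registers are not guaranteed, so set them&lt;br /&gt;
        mov     ds,ax&lt;br /&gt;
        mov     es,ax&lt;br /&gt;
        mov     ss,ax           ;interrupts are held off until the next instruction&lt;br /&gt;
        mov     sp,7C00h        ;stack grows down from just below the code&lt;br /&gt;
&lt;br /&gt;
        ;your tiny program goes here&lt;br /&gt;
&lt;br /&gt;
hang:&lt;br /&gt;
        hlt                     ;sleep until the next interrupt&lt;br /&gt;
        jmp     hang            ;there is no DOS to return to&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad the code to 510 bytes&lt;br /&gt;
        dw      0AA55h          ;stored little-endian as 55h, AAh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Assemble with nasm -fbin; the output should be exactly 512 bytes.&lt;br /&gt;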
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
|37||AAA         || ||ASCII adjust AL (carry into AH) after addition        &lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)               &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
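Additionally, some register forms of common opcodes are one byte in 16-bit code: &amp;lt;code&amp;gt;INC reg16&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;DEC reg16&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;PUSH reg16&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;POP reg16&amp;lt;/code&amp;gt; are short for every 16-bit register (so &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt; qualify), while &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt; is an accumulator-only short form.&lt;br /&gt;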
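&lt;br /&gt;
As a hedged illustration of how much work a one-byte string opcode does, this sketch fills the screen with a pattern until ESC is pressed.  It is not a canonical routine: it assumes AX=0000 and a clear direction flag at start-up, plus an 80186 or later for the immediate push, and the label fill is an illustrative name:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     al,13h          ;assumes AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;80186+ (PUSH immediate)&lt;br /&gt;
        pop     es              ;ES = mode 13h screen segment&lt;br /&gt;
fill:&lt;br /&gt;
        mov     ax,di           ;colour = low byte of the current offset&lt;br /&gt;
        stosb                   ;AA: store AL at ES:[DI], then DI=DI+1 (DF=0)&lt;br /&gt;
                                ;long form: mov [es:di],al / inc di = 4 bytes&lt;br /&gt;
        in      al,60h          ;read keyboard port; ESC is scancode 1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     fill            ;keep drawing otherwise&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here STOSB does the store and the pointer increment in a single byte, saving three bytes over the explicit pair.&lt;br /&gt;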
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then restores text mode and exits.  Although it assembles to only 19 bytes, do not consider this fully optimized; your program's structure may allow some of it to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 from the exit test; assumes AH is still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== How to think like a sizecoder ===&lt;br /&gt;
&lt;br /&gt;
This example framework can be shrunk!  Think a little bit about how you might do it, then check out the example below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        les     bp,[bx]         ;BX=0000, so ES=word at PSP:0002 (top of memory, typically 9FFF); close enough to A000 ;-)&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
                                ;Don't care if we set text mode, user can just MODE CO80&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By casually disregarding the user experience :-) and fudging the start of video memory, our basic framework is now 13 bytes.&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=500</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=500"/>
				<updated>2020-04-20T02:08:26Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose framebuffer starts at A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because the 64000-byte buffer fits in a single 64K segment, you can be sloppy: offsets that overshoot or undershoot the visible area simply wrap around within the segment and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=CS&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
BP=09xx&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&lt;br /&gt;
flags=xx02&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program: &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; pops &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt; into IP, and at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (the bytes CD 20, i.e. &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
&lt;br /&gt;
Different DOS versions may differ slightly in the above.  A full chart exists at http://www.fysnet.net/yourhelp.htm to help you check if your specific target has different defaults than the standard settings above.&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS clobbers the registers as it runs before handing over control, so there's not much to count on other than what is set up directly before the boot sector executes:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the start of the boot sector (that is, just past its end), so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
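&lt;br /&gt;
A minimal boot sector skeleton might look like the sketch below.  It assumes NASM with flat binary output (&amp;lt;code&amp;gt;-fbin&amp;lt;/code&amp;gt;), and the &amp;lt;code&amp;gt;hang&amp;lt;/code&amp;gt; label is just a placeholder for real code.  The padding line fills the sector up to byte 510 so that the signature lands in the last two bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 7C00h                       ;BIOS loads the boot sector at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
hang:&lt;br /&gt;
        jmp     short hang      ;placeholder: your code goes here&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad with zeros up to offset 510&lt;br /&gt;
        db      55h,0AAh        ;boot signature in the last two bytes&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If the code grows past 510 bytes, the &amp;lt;code&amp;gt;times&amp;lt;/code&amp;gt; count goes negative and NASM refuses to assemble, which doubles as a size check.&lt;br /&gt;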
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
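Additionally, some opcodes have one-byte forms when they operate on a 16-bit register: &amp;lt;code&amp;gt;INC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC&amp;lt;/code&amp;gt; (such as &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt;), &amp;lt;code&amp;gt;PUSH&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;POP&amp;lt;/code&amp;gt;, and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;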
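&lt;br /&gt;
To make the savings concrete, here is a small comparison with the encodings NASM produces in 16-bit mode.  It is an illustrative fragment rather than a complete program, and the two forms in each pair are not always interchangeable, because the one-byte versions touch the whole 16-bit register:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        dec     al              ;FE C8 = 2 bytes&lt;br /&gt;
        dec     ax              ;48    = 1 byte, but a borrow also changes AH&lt;br /&gt;
        mov     ax,bx           ;89 D8 = 2 bytes&lt;br /&gt;
        xchg    ax,bx           ;93    = 1 byte, but BX receives the old AX&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Whether the shorter form is safe depends on what the rest of your program expects to find in the affected registers and flags.&lt;br /&gt;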
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until a keypress is detected, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 due to mainloop exit condition; AH must still be 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
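&lt;br /&gt;
As a hedged sketch of what a loop body could look like, the variant below plots one pixel per pass, using the low byte of the screen offset as the color.  It assumes the direction flag is clear at start so that &amp;lt;code&amp;gt;STOSB&amp;lt;/code&amp;gt; moves forward.  Because &amp;lt;code&amp;gt;mov ax,di&amp;lt;/code&amp;gt; overwrites AH, the exit code loads all of AX instead of just AL:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
&lt;br /&gt;
mainloop:&lt;br /&gt;
        mov     ax,di           ;AL = low byte of the current screen offset&lt;br /&gt;
        stosb                   ;write AL to ES:[DI], then advance DI by 1&lt;br /&gt;
&lt;br /&gt;
        in      al,60h          ;read keyboard port; ESC is scancode 1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     ax,0003h        ;AH was clobbered above, so set AH=00 and AL=03 together&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
DI simply wraps around at the end of the segment, so the loop keeps repainting the same color ramp until ESC is pressed.&lt;br /&gt;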
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Essence&amp;diff=369</id>
		<title>Essence</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Essence&amp;diff=369"/>
				<updated>2019-11-03T05:18:36Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: Created page with &amp;quot;Category:Case Study  Essence by Hellmood  {{#ev:youtube|https://youtu.be/JqQbv12Dp9g}}  As you might have guessed, real path tracing and lighting is (yet) impossible in 64...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Case Study]]&lt;br /&gt;
&lt;br /&gt;
Essence by Hellmood&lt;br /&gt;
&lt;br /&gt;
{{#ev:youtube|https://youtu.be/JqQbv12Dp9g}}&lt;br /&gt;
&lt;br /&gt;
As you might have guessed, real path tracing and lighting are not (yet) possible in 64 bytes of assembler ;) but it is at least possible to give the impression of both.&lt;br /&gt;
&lt;br /&gt;
Code:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
; &amp;quot;Essence&amp;quot; - by HellMood/DESiRE - 5th October 2019&lt;br /&gt;
; 64 bytes msdos intro, showing animated raycasted objects&lt;br /&gt;
; with fake pathtracing and fake lighting while playing&lt;br /&gt;
; ambient MIDI sound, which is coded for DosBox 0.74.&lt;br /&gt;
; On a real MsDos or FreeDos, the demo will work&lt;br /&gt;
; but no sound will be played unless a MIDI capable&lt;br /&gt;
; soundcard is present. On Dosbox, a custom configuration&lt;br /&gt;
; is needed, to provide sufficient emulation power and&lt;br /&gt;
; enable the MIDI UART mode, which saves a few bytes.&lt;br /&gt;
; --------------------------------------------------------------&lt;br /&gt;
; released at https://deadline.untergrund.net/2019/&lt;br /&gt;
; published at https://www.pouet.net/prod.php?which=83204&lt;br /&gt;
; see also : http://www.sizecoding.org/wiki/Main_Page&lt;br /&gt;
; assemble with &amp;quot;nasm.exe&amp;quot; &amp;lt;this&amp;gt; -fbin -o &amp;lt;this&amp;gt;.com&lt;br /&gt;
; --------------------------------------------------------------&lt;br /&gt;
; Set ES to the screen, to perform the &amp;quot;Rrrola Trick&amp;quot;, see&lt;br /&gt;
; http://www.sizecoding.org/wiki/General_Coding_Tricks&lt;br /&gt;
push 0x9FF6&lt;br /&gt;
pop es&lt;br /&gt;
; Set mode to 0x13, +0x80 means not deleting the screen content&lt;br /&gt;
; that is 320x200 pixels in 256 colors&lt;br /&gt;
mov al,93h&lt;br /&gt;
int 10h&lt;br /&gt;
; Setting port to MIDI data port, assuming it is in UART mode&lt;br /&gt;
; 0x3F has to be sent to 0x331 first, if UART mode is not on.&lt;br /&gt;
mov dx,0x330&lt;br /&gt;
; Effectively outputting all the code to the MIDI data port.&lt;br /&gt;
; CX=255 at start, DS=CS, SI=0x100, see MIDI section below.&lt;br /&gt;
rep outsb&lt;br /&gt;
; Setting DS to zero, top stack normally contains the return&lt;br /&gt;
; address. DS needs to be 0 to access a timer later on.&lt;br /&gt;
pop ds&lt;br /&gt;
; CL is the iteration count for a ray, CH is 0 all the time.&lt;br /&gt;
; The value is chosen to generate a blue background texture.&lt;br /&gt;
; Choosing 64 instead would lead to a totally black background &lt;br /&gt;
X: mov cl,63&lt;br /&gt;
; BL is the current depth of a ray, we start with minus(!) 64.&lt;br /&gt;
; We cast rays in negative direction to calculate the point&lt;br /&gt;
; in 3D and the texture color at the same time. if a ray hits&lt;br /&gt;
; an object from this side, the object function has a reasonable&lt;br /&gt;
; texture value, from the other side it would be always black.&lt;br /&gt;
; CL, BL are decoupled because decrementing -128 leads to 127&lt;br /&gt;
; and since we are using signed multiplication for keeping things&lt;br /&gt;
; centered, that would result in very buggy and ugly behaviour.&lt;br /&gt;
; They are also decoupled because of visual beauty: because of&lt;br /&gt;
; the usage of signed 8 bit coordinates, objects close to the&lt;br /&gt;
; projection center are way too coarse and move way too fast.&lt;br /&gt;
mov bl,-64&lt;br /&gt;
; At this point, AL contains the color of the previous pixel.&lt;br /&gt;
; By design of the object formula, the last 4 bits contain the&lt;br /&gt;
; texture value while the 5th bit is always set, which maps it&lt;br /&gt;
; to the 16 color gray scale subtable of the VGA default&lt;br /&gt;
; colors. Other bits may be set, too, so they are masked.&lt;br /&gt;
; https://www.fountainware.com/EXPL/vga_color_palettes.htm&lt;br /&gt;
; Simultaneously, the object function, in combination with the&lt;br /&gt;
; palette subset, creates the impression of lighting from the&lt;br /&gt;
; front top left. The right, bottom and back side appears to&lt;br /&gt;
; be black. Changing the object formula will result in&lt;br /&gt;
; changing the texture, visibility and lighting as well.&lt;br /&gt;
and al,31&lt;br /&gt;
; Outputting the pixel on the screen and increment pointer&lt;br /&gt;
stosb&lt;br /&gt;
; Instead of going pixel by pixel, the following jumps&lt;br /&gt;
; pseudorandomly across the screen; this smooths the&lt;br /&gt;
; animation a lot and looks a bit like pathtracing.&lt;br /&gt;
imul di,byte 117&lt;br /&gt;
; The inner loop for each ray, decrementing BX means&lt;br /&gt;
; advancing the ray in negative direction by 1&lt;br /&gt;
L: dec bx&lt;br /&gt;
; Assign the Rrrola constant to register AX&lt;br /&gt;
mov ax,0xcccd&lt;br /&gt;
; Place the signed coordinates X and Y into DL and DH&lt;br /&gt;
mul di&lt;br /&gt;
; Centering for X is implicitly done by offsetting the segment&lt;br /&gt;
; Centering Y has to be done &amp;quot;manually&amp;quot;. any value can be used&lt;br /&gt;
; as long as it doesn't show the signed overflow on screen.&lt;br /&gt;
mov al,dh&lt;br /&gt;
sbb al,73&lt;br /&gt;
; Multiply AL=Y by the current distance, to get a projection(1)&lt;br /&gt;
imul bl&lt;br /&gt;
; Get X into AL, while saving the result in DX (DH)&lt;br /&gt;
xchg ax,dx&lt;br /&gt;
; Multiply AL=X by the current distance, to get a projection(2)&lt;br /&gt;
imul bl&lt;br /&gt;
; Considering an implicit division by 256, the projected &lt;br /&gt;
; coordinates now reside in DH and AH, while the depth is in BL.&lt;br /&gt;
; the following sequence calculates whether the current 3D&lt;br /&gt;
; position belongs to an object. Objects are normal cubes&lt;br /&gt;
; defined by f(X,Y,Z) = (X &amp;amp; Y &amp;amp; Z &amp;amp; 16 != 0)&lt;br /&gt;
mov al,dh&lt;br /&gt;
; offset X by timer, effectively producing 18.2 FPS&lt;br /&gt;
; http://vitaly_filatov.tripod.com/ng/asm/asm_002.29.html&lt;br /&gt;
add ah,[0x46c]&lt;br /&gt;
and al,ah&lt;br /&gt;
and al,bl&lt;br /&gt;
test al,16&lt;br /&gt;
; the inner loop is done when either the iteration count has&lt;br /&gt;
; reached zero or the function f(X,Y,Z) is true (object hit)&lt;br /&gt;
loopz L&lt;br /&gt;
; the outer loop repeats endlessly&lt;br /&gt;
jmp short X&lt;br /&gt;
; MIDI Data Section, actually code above and memory below is&lt;br /&gt;
; sent to the MIDI data port as well, but it does not matter&lt;br /&gt;
db 0xc0 ; set instrument on channel 0 to the next value&lt;br /&gt;
db 89   ; instrument 89 = Pad2 of general MIDI&lt;br /&gt;
db 0x90 ; play notes on channel 0, minor chord over 4 octaves&lt;br /&gt;
db 28   ; note 1, very deep&lt;br /&gt;
db 127  ; volume 1, maximum value to let the subwoofers shake&lt;br /&gt;
db 59   ; note 2 fitting to note 1  &lt;br /&gt;
db 80   ; volume 2, a bit reduced to not overshadow the bass&lt;br /&gt;
db 67   ; note 3 fitting to notes 1 &amp;amp; 2&lt;br /&gt;
db 65   ; volume 3, even more reduced to fit the other notes&lt;br /&gt;
;      #                   ´                           #&lt;br /&gt;
;      #                   greetings                   #&lt;br /&gt;
;      #     sensenstahl,homecoded,rrrola,frag,T$      #&lt;br /&gt;
;      #     Optimus,Trixter,igor,gentleman,VileR      #&lt;br /&gt;
;      #     Whizart,g0blinish,Rudi,ryg,TomCat    .    #&lt;br /&gt;
;      #     orbitaldecay,wysiwtf,Kuemmel,p01,Lara     #&lt;br /&gt;
;      #     Oscar Toledo,Drift,maugli,Harekiet,etc    #&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Case_Studies&amp;diff=368</id>
		<title>Case Studies</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Case_Studies&amp;diff=368"/>
				<updated>2019-11-03T05:12:06Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Sometimes it can be helpful to examine a tinyprog in detail, seeing what choices were made and why:&lt;br /&gt;
&lt;br /&gt;
:[[MELT.COM]]: We take a cute program from the 1980s and see how far we can crunch it down&lt;br /&gt;
&lt;br /&gt;
:[[m8trix 8b]]: An &amp;quot;Enter The Matrix&amp;quot; screen display in only 8 bytes&lt;br /&gt;
&lt;br /&gt;
:[[4is256]]: &amp;amp;#344;r&amp;amp;#345;ola's Tetris is a full tetris clone with scoring&lt;br /&gt;
&lt;br /&gt;
:[[paint16b]]: Hellmood's paint program in 16 bytes (really)&lt;br /&gt;
&lt;br /&gt;
:[[Indian Spirit]]: An American Indian tapestry with music in 32 bytes&lt;br /&gt;
&lt;br /&gt;
:[[Essence]]: Fakery of path tracing and lighting, with audio, in 64 bytes&lt;br /&gt;
&lt;br /&gt;
== External Case Studies ==&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Brainfuck Brainfuck] is a very tiny language, having only 8 commands and an instruction pointer.  Several compilers and interpreters have been made; Gynvael archived [http://gynvael.coldwind.pl/bf125/ many different versions along with their source code], the smallest of which was 125 bytes.  qkumba took that as inspiration and created his own [http://pferrie.host22.com/misc/tiny/brainfck.htm brainfuck compiler in 100 bytes] (104 for one that is fully compliant).  (Note:  The term &amp;quot;compiler&amp;quot; is used mistakenly a lot in these descriptions; the majority of brainfuck programs are actually interpreters that load brainfuck programs and execute native code token by token.  This one is an exception in that it really does compile the code entirely into native instructions first.)&lt;br /&gt;
&lt;br /&gt;
[http://canonical.org/~kragen/demo/fr-016.html Disassembly of Farbrausch's &amp;quot;fr-016: bytes&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
[http://canonical.org/~kragen/demo/klappquadrat.html A disassembly of the 64-byte version of Klappquadrat]&lt;br /&gt;
&lt;br /&gt;
[https://trixter.oldskool.org/2012/12/17/maze-generation-in-thirteen-bytes/ Maze generation in 10 bytes]&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Indian_Spirit&amp;diff=362</id>
		<title>Indian Spirit</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Indian_Spirit&amp;diff=362"/>
				<updated>2019-04-08T17:35:22Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Tinyprog|title=Indian Spirit|author=Hellmood|bsize=32|link=http://www.pouet.net/prod.php?which=80341}}&lt;br /&gt;
[[Category:Case Study]]&lt;br /&gt;
Indian Spirit implements a 640x400 American Indian cloth tapestry while playing music inspired by that culture.  Original Readme:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&amp;quot;Indian Spirit&amp;quot; - HellMood / DSR&lt;br /&gt;
released just for fun 02/2019&lt;br /&gt;
&lt;br /&gt;
DOSBOX - 32 bytes - PC Speaker&lt;br /&gt;
&lt;br /&gt;
Somewhat inspired by Adoks &amp;quot;Indian&amp;quot; Series&lt;br /&gt;
i coded for one hour to see what i can come up&lt;br /&gt;
with, but in 32 bytes, and with sound. At the&lt;br /&gt;
same time i exploited the &amp;quot;secret mode&amp;quot; 0x69&lt;br /&gt;
which switches to 640x400 in 256 colors, but&lt;br /&gt;
without using the size-unfriendly VESA functions.&lt;br /&gt;
For that reason, this demo only works in DOSBOX&lt;br /&gt;
or on a computer which as the EXACT graphic card&lt;br /&gt;
dosbox emulates. &lt;br /&gt;
&lt;br /&gt;
Youtube Capture :&lt;br /&gt;
https://www.youtube.com/watch?v=TeIkW31Blf0&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        pop     ds&lt;br /&gt;
        cwd&lt;br /&gt;
Y:&lt;br /&gt;
        xor     al,0x69&lt;br /&gt;
        int     0x10&lt;br /&gt;
        mov     dl,cl&lt;br /&gt;
        mov     al,[0x46C]	&lt;br /&gt;
        neg     al&lt;br /&gt;
        xor     dl,al&lt;br /&gt;
        and     al,32+16+4+2&lt;br /&gt;
        or      al,0x41&lt;br /&gt;
        out     0x42,al			&lt;br /&gt;
        out     0x61,al			&lt;br /&gt;
        mov     ah,12&lt;br /&gt;
        loop    Y		&lt;br /&gt;
        inc     bl	&lt;br /&gt;
        jmp short Y&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Indian_Spirit&amp;diff=361</id>
		<title>Indian Spirit</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Indian_Spirit&amp;diff=361"/>
				<updated>2019-04-08T17:33:40Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: Created page with &amp;quot;{{Tinyprog|title=Indian Spirit|author=Hellmood|bsize=32|link=http://www.pouet.net/prod.php?which=80341}} Category:Case Study Indian Spirit implements a 640x400 American In...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Tinyprog|title=Indian Spirit|author=Hellmood|bsize=32|link=http://www.pouet.net/prod.php?which=80341}}&lt;br /&gt;
[[Category:Case Study]]&lt;br /&gt;
Indian Spirit implements a 640x400 American Indian cloth tapestry while playing music inspired by that culture.  Original Readme:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&amp;quot;Indian Spirit&amp;quot; - HellMood / DSR&lt;br /&gt;
released just for fun 02/2019&lt;br /&gt;
&lt;br /&gt;
DOSBOX - 32 bytes - PC Speaker&lt;br /&gt;
&lt;br /&gt;
Somewhat inspired by Adoks &amp;quot;Indian&amp;quot; Series&lt;br /&gt;
i coded for one hour to see what i can come up&lt;br /&gt;
with, but in 32 bytes, and with sound. At the&lt;br /&gt;
same time i exploited the &amp;quot;secret mode&amp;quot; 0x69&lt;br /&gt;
which switches to 640x400 in 256 colors, but&lt;br /&gt;
without using the size-unfriendly VESA functions.&lt;br /&gt;
For that reason, this demo only works in DOSBOX&lt;br /&gt;
or on a computer which as the EXACT graphic card&lt;br /&gt;
dosbox emulates. &lt;br /&gt;
&lt;br /&gt;
Youtube Capture :&lt;br /&gt;
https://www.youtube.com/watch?v=TeIkW31Blf0&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
pop ds&lt;br /&gt;
cwd&lt;br /&gt;
Y:&lt;br /&gt;
xor al,0x69&lt;br /&gt;
int 0x10&lt;br /&gt;
mov dl,cl&lt;br /&gt;
mov al,[0x46C]	&lt;br /&gt;
neg al&lt;br /&gt;
xor dl,al&lt;br /&gt;
and al,32+16+4+2&lt;br /&gt;
or al,0x41&lt;br /&gt;
out 0x42,al			&lt;br /&gt;
out 0x61,al			&lt;br /&gt;
mov ah,12&lt;br /&gt;
loop Y		&lt;br /&gt;
inc bl	&lt;br /&gt;
jmp short Y&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Case_Studies&amp;diff=360</id>
		<title>Case Studies</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Case_Studies&amp;diff=360"/>
				<updated>2019-04-08T17:29:25Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Sometimes it can be helpful to examine a tinyprog in detail, seeing what choices were made and why:&lt;br /&gt;
&lt;br /&gt;
:[[MELT.COM]]: We take a cute program from the 1980s and see how far we can crunch it down&lt;br /&gt;
&lt;br /&gt;
:[[m8trix 8b]]: An &amp;quot;Enter The Matrix&amp;quot; screen display in only 8 bytes&lt;br /&gt;
&lt;br /&gt;
:[[4is256]]: &amp;amp;#344;r&amp;amp;#345;ola's Tetris is a full tetris clone with scoring&lt;br /&gt;
&lt;br /&gt;
:[[paint16b]]: Hellmood's paint program in 16 bytes (really)&lt;br /&gt;
&lt;br /&gt;
:[[Indian Spirit]]: An American Indian tapestry with music in 32 bytes&lt;br /&gt;
&lt;br /&gt;
== External Case Studies ==&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Brainfuck Brainfuck] is a very tiny language, having only 8 commands and an instruction pointer.  Several compilers and interpreters have been made; Gynvael archived [http://gynvael.coldwind.pl/bf125/ many different versions along with their source code], the smallest of which was 125 bytes.  qkumba took that as inspiration and created his own [http://pferrie.host22.com/misc/tiny/brainfck.htm brainfuck compiler in 100 bytes] (104 for one that is fully compliant).  (Note:  The term &amp;quot;compiler&amp;quot; is used mistakenly a lot in these descriptions; the majority of brainfuck programs are actually interpreters that load brainfuck programs and execute native code token by token.  This one is an exception in that it really does compile the code entirely into native instructions first.)&lt;br /&gt;
&lt;br /&gt;
[http://canonical.org/~kragen/demo/fr-016.html Disassembly of Farbrausch's &amp;quot;fr-016: bytes&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
[http://canonical.org/~kragen/demo/klappquadrat.html A disassembly of the 64-byte version of Klappquadrat]&lt;br /&gt;
&lt;br /&gt;
[https://trixter.oldskool.org/2012/12/17/maze-generation-in-thirteen-bytes/ Maze generation in 10 bytes]&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=347</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=347"/>
				<updated>2018-03-27T15:13:51Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* .COM file defaults */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose framebuffer starts at A000:0000.  Each byte is a pixel, and the graphics buffer is linear (offset = y*320 + x), so it is extremely easy to program for.  Because the 64,000-byte buffer fits inside a single 64K segment, you can be sloppy: an offset that runs past either end of the visible area still lands inside segment A000 and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
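&lt;br /&gt;
As a rough illustration of the linear layout, here is a minimal sketch that plots a single pixel with one byte write.  It assumes the usual DOS start-up state (AX=0000) and an 80186 or later CPU for the immediate push; the coordinates and color are arbitrary choices for the example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     al,13h          ;assumes AH=00 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        mov     di,100*320+160  ;offset = y*320 + x for pixel (160,100)&lt;br /&gt;
        mov     byte [es:di],0Fh ;color 15 is white in the default palette&lt;br /&gt;
        xor     ax,ax           ;AH=00: wait for a keypress&lt;br /&gt;
        int     16h&lt;br /&gt;
        mov     ax,0003h        ;back to 80x25 text mode&lt;br /&gt;
        int     10h&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is written for clarity, not size; a real tinyprog would fold most of the setup and cleanup away.&lt;br /&gt;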
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=CS&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
BP=09xx&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&lt;br /&gt;
flags=xx02&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program: it pops that zero and jumps to &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt;, where the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (the bytes CDh 20h = &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;) is stored. See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
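&lt;br /&gt;
To make the saving concrete, here is a small sketch comparing the common ways to exit a .COM program.  Only the first instruction ever runs; the others are included purely so you can compare sizes in a disassembly.  It assumes the stack has not been disturbed since program start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        ret                     ;C3: 1 byte, pops 0000h and lands on the int 20h in the PSP&lt;br /&gt;
&lt;br /&gt;
;Alternatives, shown for size comparison only (never reached):&lt;br /&gt;
        int     20h             ;CD 20: 2 bytes&lt;br /&gt;
        mov     ax,4C00h        ;B8 00 4C&lt;br /&gt;
        int     21h             ;CD 21: 5 bytes in total for this pair&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your code leaves extra words on the stack, the RET trick no longer works unless you pop them first.&lt;br /&gt;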
&lt;br /&gt;
Different DOS versions may differ slightly in the above.  A full chart exists at http://www.fysnet.net/yourhelp.htm to help you check if your specific target has different defaults than the standard settings above.&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the start of the boot sector (that is, just past its end), so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
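&lt;br /&gt;
For reference, a boot sector skeleton in NASM might look like the sketch below.  The register setup reflects the point above that little can be relied on at entry; the choice of 7C00h for the stack and the label names are assumptions made for this example, not requirements:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
bits 16&lt;br /&gt;
org 7C00h                       ;the BIOS loads the sector at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        xor     ax,ax           ;don't trust incoming segment registers&lt;br /&gt;
        mov     ds,ax&lt;br /&gt;
        mov     es,ax&lt;br /&gt;
        mov     ss,ax&lt;br /&gt;
        mov     sp,7C00h        ;assumed choice: stack grows down from below the code&lt;br /&gt;
&lt;br /&gt;
        ;your program goes here (DL still holds the boot drive number)&lt;br /&gt;
&lt;br /&gt;
hang:&lt;br /&gt;
        hlt                     ;sleep until the next interrupt&lt;br /&gt;
        jmp     hang            ;there is no DOS to return to&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad the code area to 510 bytes&lt;br /&gt;
        dw      0AA55h          ;stored as 55h, AAh: the boot signature&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Assemble it with &amp;lt;code&amp;gt;nasm boot.asm -fbin -o boot.bin&amp;lt;/code&amp;gt; (the file names are placeholders) and check that the output is exactly 512 bytes.&lt;br /&gt;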
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
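&lt;br /&gt;
As an example of how much work the string opcodes pack into a byte or two, the following sketch fills the whole mode 13h screen with a two-byte &amp;lt;code&amp;gt;REP STOSB&amp;lt;/code&amp;gt;.  It assumes AX=0000 and a clear direction flag at program start, plus an 80186 or later CPU; the fill color is arbitrary:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     al,13h          ;assumes AH=00 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES:DI is the STOS destination&lt;br /&gt;
        xor     di,di           ;start at the top-left pixel&lt;br /&gt;
        mov     cx,320*200      ;64000 pixels to write&lt;br /&gt;
        mov     al,01h          ;color 1 is blue in the default palette&lt;br /&gt;
        rep     stosb           ;F3 AA: store AL at ES:[DI], advance DI, repeat CX times&lt;br /&gt;
        xor     ax,ax           ;AH=00: wait for a keypress&lt;br /&gt;
        int     16h&lt;br /&gt;
        mov     ax,0003h        ;back to 80x25 text mode&lt;br /&gt;
        int     10h&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;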
&lt;br /&gt;
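Additionally, there are register forms of opcodes that are one byte.  &amp;lt;code&amp;gt;INC&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;DEC&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;PUSH&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;POP&amp;lt;/code&amp;gt; are one byte for any 16-bit general register (for example &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt;), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt; is one byte as well.&lt;br /&gt;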
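&lt;br /&gt;
A quick way to see the difference is to assemble a few of these forms side by side and compare the encodings in ndisasm.  The listing below is only a size comparison, not a meaningful program; the byte values in the comments are the standard 16-bit encodings:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
bits 16&lt;br /&gt;
&lt;br /&gt;
        inc     ax              ;40     1 byte&lt;br /&gt;
        inc     al              ;FE C0  2 bytes: 8-bit registers have no short form&lt;br /&gt;
        dec     cx              ;49     1 byte&lt;br /&gt;
        push    si              ;56     1 byte&lt;br /&gt;
        pop     di              ;5F     1 byte&lt;br /&gt;
        xchg    ax,bx           ;93     1 byte&lt;br /&gt;
        mov     ax,bx           ;89 D8  2 bytes: use XCHG if BX's old value isn't needed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;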
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then restores text mode and exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 on loop exit; AH is still 00 only if your code left it alone&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=346</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=346"/>
				<updated>2018-03-27T15:07:30Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* .COM file defaults */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose framebuffer starts at A000:0000.  Each byte is a pixel, and the graphics buffer is linear (offset = y*320 + x), so it is extremely easy to program for.  Because the 64,000-byte buffer fits inside a single 64K segment, you can be sloppy: an offset that runs past either end of the visible area still lands inside segment A000 and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program: it pops that zero and jumps to &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt;, where the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (the bytes CDh 20h = &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;) is stored. See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
&lt;br /&gt;
Different DOS versions may differ slightly in the above.  A full chart exists at http://www.fysnet.net/yourhelp.htm to help you check if your specific target has different defaults than the standard settings above.&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the start of the boot sector (that is, just past its end), so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, some register forms of common opcodes are one byte: &amp;lt;code&amp;gt;INC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC&amp;lt;/code&amp;gt; of any 16-bit register (such as &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt;), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up mode 13h, loops until ESC is pressed, then restores text mode and exits.  Although it assembles to only 19 bytes, don't consider it fully optimized; your program's structure may allow some of it to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 from the exit test; assumes AH is still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=340</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=340"/>
				<updated>2016-08-28T19:35:16Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* A sample framework */ My bad, had wrong assembler settings&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode located at segment A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is contained in a single segment, you can be sloppy: offsets wrap around within the 64K segment, so running past either end of the 64000-byte screen won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
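&lt;br /&gt;
A minimal sketch of how simple the linear layout is, with arbitrary coordinates and color chosen purely for illustration: the offset of pixel (x,y) is y*320+x, so plotting one white pixel near the middle of the screen could look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ax,0013h        ;AH=00 (set video mode), AL=13h&lt;br /&gt;
        int     10h             ;switch to mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to the screen segment&lt;br /&gt;
        mov     di,100*320+160  ;offset = y*320 + x, here x=160 and y=100&lt;br /&gt;
        mov     byte [es:di],15 ;color 15 is white in the default palette&lt;br /&gt;
        ret                     ;exit, leaving mode 13h active&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;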
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; holds the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
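&lt;br /&gt;
A minimal sketch of that exit trick, assuming the usual case where the word on top of the stack is 0000h; the whole program below assembles to a single byte:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        ret                     ;pops 0000h into IP; CS:0000h holds int 20h, which terminates&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;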
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the end of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
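&lt;br /&gt;
A rough sketch of the boot sector layout in NASM, assuming a flat binary build (&amp;lt;code&amp;gt;-fbin&amp;lt;/code&amp;gt;) and that the code runs with a segment base of 0000; the &amp;lt;code&amp;gt;start&amp;lt;/code&amp;gt; loop is only a placeholder for your own code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 7C00h                       ;boot sector is loaded at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        hlt                     ;placeholder: your code goes here&lt;br /&gt;
        jmp     start           ;keep halting after each interrupt&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad the code area out to 510 bytes&lt;br /&gt;
        db      55h,0AAh        ;signature in the last two bytes of the sector&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;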
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, some register forms of common opcodes are one byte: &amp;lt;code&amp;gt;INC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC&amp;lt;/code&amp;gt; of any 16-bit register (such as &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt;), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
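&lt;br /&gt;
As a rough size comparison in 16-bit code (a sketch for illustration only; the pairs below are not drop-in replacements for each other, since they affect different registers), run the assembled output through ndisasm to see the difference:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
bits 16&lt;br /&gt;
&lt;br /&gt;
        inc     ax              ;40h      1 byte&lt;br /&gt;
        inc     al              ;FEh C0h  2 bytes&lt;br /&gt;
        dec     ax              ;48h      1 byte&lt;br /&gt;
        dec     al              ;FEh C8h  2 bytes&lt;br /&gt;
        xchg    ax,bx           ;93h      1 byte, swaps AX and BX&lt;br /&gt;
        mov     ax,bx           ;2 bytes, copies BX into AX&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;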
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up mode 13h, loops until ESC is pressed, then restores text mode and exits.  Although it assembles to only 19 bytes, don't consider it fully optimized; your program's structure may allow some of it to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 from the exit test; assumes AH is still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=339</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=339"/>
				<updated>2016-08-28T05:46:27Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* A sample framework */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode located at segment A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is contained in a single segment, you can be sloppy: offsets wrap around within the 64K segment, so running past either end of the 64000-byte screen won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; holds the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the end of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, some register forms of common opcodes are only one byte: &amp;lt;code&amp;gt;INC reg16&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC reg16&amp;lt;/code&amp;gt; (such as &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt;), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 after the ESC test; assumes AH is still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=338</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=338"/>
				<updated>2016-08-28T05:44:40Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* 1-byte opcodes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode located at A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it fits within a single segment, you can be sloppy: an offset that runs past either end of the visible area won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
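&lt;br /&gt;
As a minimal sketch of how simple the addressing is, the offset of pixel (x,y) is y*320+x.  The fragment below plots a single white pixel near the middle of the screen, waits for a key, and exits.  The coordinates and colour are arbitrary choices for illustration, and no attempt has been made to optimize it for size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ax,0013h        ;AH=00 (set mode), AL=13h&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        mov     di,100*320+160  ;offset = y*320 + x, here x=160 and y=100&lt;br /&gt;
        mov     al,0Fh          ;colour 0Fh is white in the default palette&lt;br /&gt;
        stosb                   ;store AL to ES:[DI], plotting the pixel&lt;br /&gt;
        xor     ax,ax           ;AH=00: BIOS wait-for-keypress function&lt;br /&gt;
        int     16h             ;block until a key is pressed&lt;br /&gt;
        mov     ax,0003h        ;AH=00 (set mode), AL=03h&lt;br /&gt;
        int     10h             ;switch back to text mode&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;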
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;.  That lets the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; exit your program: it pops &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt; and jumps to &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt;, where the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;) is stored. See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the end of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
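&lt;br /&gt;
The layout described above can be sketched in NASM as follows.  This is only an illustrative skeleton, not a tuned one: the labels are placeholders, and it sets all of AX because the .COM register defaults do not apply at boot:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
bits 16                         ;boot sectors start in 16-bit real mode&lt;br /&gt;
org 7C00h                       ;the boot sector is loaded at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     ax,0013h        ;AH=00 (set mode), AL=13h&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
&lt;br /&gt;
hang:&lt;br /&gt;
        jmp     short hang      ;nothing to return to, so spin forever&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad the code out to exactly 510 bytes&lt;br /&gt;
        dw      0AA55h          ;stored little-endian as 55h, AAh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Assembled with &amp;lt;code&amp;gt;-fbin&amp;lt;/code&amp;gt;, the output should be exactly 512 bytes.&lt;br /&gt;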
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
|37||AAA         || ||ASCII adjust AL (carry into AH) after addition        &lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, some register forms of common opcodes are only one byte: &amp;lt;code&amp;gt;INC reg16&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC reg16&amp;lt;/code&amp;gt; (such as &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt;), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
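&lt;br /&gt;
To see the difference these short forms make, here is a small sketch comparing them with their longer equivalents.  The byte values in the comments are what NASM is expected to emit for 16-bit code; confirm them with ndisasm on your own build.  Note that &amp;lt;code&amp;gt;XCHG&amp;lt;/code&amp;gt; only replaces &amp;lt;code&amp;gt;MOV&amp;lt;/code&amp;gt; when you don't care what ends up in the other register:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     bx,ax           ;89 C3  2 bytes: copy AX into BX&lt;br /&gt;
        xchg    ax,bx           ;93     1 byte:  swap AX and BX&lt;br /&gt;
        inc     al              ;FE C0  2 bytes: 8-bit register form&lt;br /&gt;
        inc     ax              ;40     1 byte:  16-bit register form&lt;br /&gt;
        dec     al              ;FE C8  2 bytes: 8-bit register form&lt;br /&gt;
        dec     ax              ;48     1 byte:  16-bit register form&lt;br /&gt;
        ret                     ;C3     1 byte:  exit the .COM file&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;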
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 after the ESC test; assumes AH is still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=337</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=337"/>
				<updated>2016-08-28T05:44:22Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* 1-byte opcodes */ Rewrote my own terrible wording&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode located at A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it fits within a single segment, you can be sloppy: an offset that runs past either end of the visible area won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;.  That lets the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; exit your program: it pops &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt; and jumps to &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt;, where the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;) is stored. See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the end of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a design philosophy that attempts to create instructions that perform multiple steps.  As such, there are some single instructions that perform complex actions, such as the string opcodes (&amp;lt;code&amp;gt;LODS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MOVS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CMPS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;SCAS&amp;lt;/code&amp;gt;).  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
|37||AAA         || ||ASCII adjust AL (carry into AH) after addition        &lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, ECS, Eflags)               &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, some register forms of opcodes are one byte: &amp;lt;code&amp;gt;INC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC&amp;lt;/code&amp;gt; of any 16-bit register (such as &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt;), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then restores text mode and exits.  Although it assembles to only 19 bytes, do not consider this fully optimized; your program's structure may allow some of it to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 due to mainloop exit condition; AH assumed still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=336</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=336"/>
				<updated>2016-08-28T05:40:15Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Know your environment */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
=== A simple graphics mode ===&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose buffer starts at A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because the buffer is contained in a single segment, you can be sloppy: an offset that runs past the visible 64000 bytes, or wraps around, stays inside the video segment and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
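&lt;br /&gt;
As a minimal sketch of how little code a pixel takes (assuming mode 13h is already active, ES has been pointed at A000h, and the direction flag is clear; the coordinates and color are arbitrary illustration values):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        ;assumption: mode 13h is set and ES=A000h&lt;br /&gt;
        mov     di,100*320+160  ;offset = y*320 + x, here x=160, y=100&lt;br /&gt;
        mov     al,15           ;color 15 is white in the default palette&lt;br /&gt;
        stosb                   ;store AL at ES:[DI], advance DI&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;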
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program: it pops that word and continues at &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt;, where the word &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;) is stored. See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
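&lt;br /&gt;
To illustrate, a .COM program that does nothing but exit can plausibly be a single byte.  This is only a sketch, and it assumes the stack still holds its initial &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt; word:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        ret                     ;pops 0000h, so execution continues at CS:0000h (int 20h)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your code pushes anything without popping it again, the word on top of the stack is no longer 0000h and this exit will not work.&lt;br /&gt;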
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the start of the boot sector (just past its end), so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
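&lt;br /&gt;
A boot sector skeleton along these lines might look as follows.  This is a sketch rather than a tested image: it assumes the BIOS starts execution at 0000:7C00 as listed above, and the endless loop is only a placeholder for your own code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
bits 16&lt;br /&gt;
org 7C00h                       ;assumption: loaded and started at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        ;your code goes here; count on DL (boot drive) and little else&lt;br /&gt;
        jmp     start           ;placeholder: spin forever&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad the code area to 510 bytes&lt;br /&gt;
        db      55h,0AAh        ;boot signature in the last two bytes&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Assembled with &amp;lt;code&amp;gt;nasm -fbin&amp;lt;/code&amp;gt;, this should produce a file of exactly 512 bytes.&lt;br /&gt;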
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, which is a design philosophy that intentionally attempts to create many instructions that perform multiple steps.  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, ECS, Eflags)               &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, some register forms of opcodes are one byte: &amp;lt;code&amp;gt;INC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC&amp;lt;/code&amp;gt; of any 16-bit register (such as &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt;), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
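&lt;br /&gt;
For comparison, here are the encodings these forms should assemble to in 16-bit code, next to two longer alternatives.  The choice of BX as the second register is arbitrary:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        inc     ax              ;40       1 byte&lt;br /&gt;
        dec     ax              ;48       1 byte&lt;br /&gt;
        xchg    ax,bx           ;93       1 byte, swaps AX and BX&lt;br /&gt;
        mov     ax,bx           ;89 D8    2 bytes, when a plain copy is used instead&lt;br /&gt;
        inc     al              ;FE C0    2 bytes, the 8-bit form has no short encoding&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So if you do not care about the old contents of AX, &amp;lt;code&amp;gt;XCHG&amp;lt;/code&amp;gt; can stand in for a &amp;lt;code&amp;gt;MOV&amp;lt;/code&amp;gt; and save a byte.&lt;br /&gt;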
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then restores text mode and exits.  Although it assembles to only 19 bytes, do not consider this fully optimized; your program's structure may allow some of it to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 due to mainloop exit condition; AH assumed still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=General_Coding_Tricks&amp;diff=332</id>
		<title>General Coding Tricks</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=General_Coding_Tricks&amp;diff=332"/>
				<updated>2016-08-26T01:13:42Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* The Rrrola Trick (Obtaining X and Y without DIV) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Data is code, code is data ==&lt;br /&gt;
&lt;br /&gt;
Code is nothing more than data that the CPU interprets.  For example, consider this multi-byte instruction:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        mov ah,37h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This assembles to &amp;lt;code&amp;gt;B4 37&amp;lt;/code&amp;gt;.  &amp;lt;code&amp;gt;B4&amp;lt;/code&amp;gt; by itself isn't interesting, but &amp;lt;code&amp;gt;37&amp;lt;/code&amp;gt; happens to be the opcode for &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt;.  Let's say you had this code before a loop, and you needed to perform &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt; at the top of the loop.  Rather than put &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt; at the top of the loop, you can reuse the opcode that will already be there as part of the &amp;lt;code&amp;gt;mov ah,37h&amp;lt;/code&amp;gt; that comes before it.  Just jump directly into the middle of the &amp;lt;code&amp;gt;mov ah,37h&amp;lt;/code&amp;gt;, which will get interpreted and executed as &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
&lt;br /&gt;
label:&lt;br /&gt;
        mov ah,37h&lt;br /&gt;
        ;misc. stuff&lt;br /&gt;
        loop label+1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;+1&amp;lt;/code&amp;gt; specifies the jump should go to 1 byte past the actual location.&lt;br /&gt;
&lt;br /&gt;
=== Reuse ===&lt;br /&gt;
&lt;br /&gt;
You can use opcodes hidden in your existing data.  For example, .COM files can end with &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;, which is opcode &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt;.  If you already have a &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; somewhere else in your code, even as part of data, point a jump you need anyway (such as a conditional exit) at that pre-existing &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; instead of adding a &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;.&lt;br /&gt;
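&lt;br /&gt;
A sketch of this idea, using a hypothetical program that needs a conditional exit anyway.  The &amp;lt;code&amp;gt;mov bx,0C300h&amp;lt;/code&amp;gt; is purely illustrative; in practice the &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; byte would come from whatever your code or data already contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     bx,0C300h       ;assembles to BB 00 C3; the last byte is also the opcode of RET&lt;br /&gt;
mainloop:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        in      al,60h          ;read keyboard port&lt;br /&gt;
        dec     al              ;ESC (scancode 1) becomes 0&lt;br /&gt;
        jz      start+2         ;exit by jumping onto the C3 byte, no separate RET needed&lt;br /&gt;
        jmp     short mainloop  ;otherwise keep going&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;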
&lt;br /&gt;
== If your environment holds you back, change it ==&lt;br /&gt;
&lt;br /&gt;
The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default MCGA palette] is fairly horrible, but there can be size advantages to changing it:  while setting a new palette costs bytes, the new palette arrangement could save you headaches down the road.  For example, if your code is calculating pixel colors that fall into goofy ranges, rather than constantly adjust the colors to sane ranges (i.e. aligned to powers of 2), just set the palette so that values falling into those ranges look the way you want.  (This assumes you have very small ways of redefining the palette, of course.)&lt;br /&gt;
&lt;br /&gt;
The above is maybe not the best example.  Rewrites to this section are welcome.&lt;br /&gt;
&lt;br /&gt;
== Need a constant? ==&lt;br /&gt;
&lt;br /&gt;
If you need a constant value but you're out of space, search your assembled code for a byte value you can use.&lt;br /&gt;
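&lt;br /&gt;
One way this can pay off is when the constant is used as a memory operand.  The sketch below assumes SI still holds its start value of 0100h, so &amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; is the first byte of the program; the multiplication itself is just an illustrative use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;B0 13: the byte at offset 100h is B0h (176)&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        ;...later, assuming SI is still 0100h...&lt;br /&gt;
        mul     byte [si]       ;AX = AL * B0h in 2 bytes, no register load needed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;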
&lt;br /&gt;
== A smaller way to point to Mode 13's screen segment ==&lt;br /&gt;
&lt;br /&gt;
Rather than &amp;lt;code&amp;gt;mov ah,0a0h; mov es,ax&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;push word 0a000h; pop es&amp;lt;/code&amp;gt;, try this 2-byte wonder:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
les bp,[bx]&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This sets ES=9FFF, only one away from A000.  You can write to the screen with ES: this way as long as you are aware the segment is one paragraph (16 bytes) behind, so just increase your offset by 16 if you need exact placement.&lt;br /&gt;
&lt;br /&gt;
How does this work?  At the start of execution of a .COM file, BX=0 and DS=CS. The contents of the .COM file get loaded to offset 0x100 in that segment, but before that sits the PSP (program segment prefix), which DOS populates with information about the loaded program and other info. The PSP starts with &amp;lt;code&amp;gt;CD 20&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;INT 20h&amp;lt;/code&amp;gt;, which exits the program), so that's what gets loaded into BP.  The next word, which gets loaded into ES, is the number of the last free conventional memory segment, typically 0x9fff (but it can be something different if parts of the upper memory range are either not installed or allocated).&lt;br /&gt;
&lt;br /&gt;
'''Warning:''' This trick doesn't always work.  On FreeDOS, this can set ES=9FE0 and there is something resident at that location that can screw up the system after normal program exit if you overwrite it.&lt;br /&gt;
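&lt;br /&gt;
Put together, plotting a pixel with this trick could look like the sketch below.  It assumes ES really ends up as 9FFFh, and the pixel position and color are arbitrary:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
&lt;br /&gt;
        les     bp,[bx]         ;BX=0 at start: BP=20CDh, ES=word at PSP offset 2 (9FFFh assumed)&lt;br /&gt;
        mov     al,13h&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        mov     di,100*320+160+16 ;pixel x=160, y=100; +16 because ES is one paragraph low&lt;br /&gt;
        mov     al,15           ;white in the default palette&lt;br /&gt;
        stosb                   ;9FFF:DI is the same address as A000:(DI-16)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;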
&lt;br /&gt;
== Accessing the timer tick for free ==&lt;br /&gt;
&lt;br /&gt;
If using a 386+, FS=0 at .COM start.  So, &amp;lt;code&amp;gt;FS:[046C]&amp;lt;/code&amp;gt; gets you the BIOS timer tick variable (0040:006C), which you could use for timing/pacing, or as a random seed. Some environments, especially EMS/XMS programs, can modify the FS register, so it can't always be assumed to be 0000h. In that case, &amp;lt;code&amp;gt;POP DS&amp;lt;/code&amp;gt; right after the start (the word on top of the stack at .COM start is 0000h) and then accessing &amp;lt;code&amp;gt;[046C]&amp;lt;/code&amp;gt; does the trick at equal size.&lt;br /&gt;
&lt;br /&gt;
== Looping twice ==&lt;br /&gt;
&lt;br /&gt;
If you need to repeat a section of code that doesn't modify the carry flag, and you know the carry flag is clear on entry, you can run it twice (that is, loop back once) in only 3 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
looping:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        cmc&lt;br /&gt;
        jc      looping&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Looping three times ==&lt;br /&gt;
&lt;br /&gt;
If you need to repeat a section of code and you have a register whose value is zero and can be incremented freely, or whose value is -1 and can be decremented freely, you can run it three times (that is, loop back twice) in only 3 bytes. Note that the parity flag only reflects the low 8 bits of the result:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
looping:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        inc     bx ;dec bx starting from -1 works with the same jpo: 0FEh and 0FDh have odd parity, 0FCh has even parity&lt;br /&gt;
        jpo     looping ;1 (01b) and 2 (10b) have odd parity, 3 (11b) has even parity&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Obtaining X and Y without DIV (The Rrrola Trick) ==&lt;br /&gt;
&lt;br /&gt;
In [[Output#Outputting_in_mode_13h_.28320x200.29|320x200 mode]], instead of constructing X and Y from the screen pointer DI with DIV, you can get a decent estimate by multiplying the screen pointer by 0xCCCD. After &amp;lt;code&amp;gt;MUL DI&amp;lt;/code&amp;gt;, the row (Y) is in DH (or DX as a 16-bit value, with DL as its fraction) and the column (X, scaled to 0..255) is in DL (or DL and AH together as a 16-bit value). The idea is to interpret DI as a kind of 16-bit fraction in the range [0,1], from start to end of the segment. Multiplying this number in [0,1] by 65536 / 320 = 204.8 gives the row before the decimal point and, again as a kind of fraction, the column after the decimal point. The constant 0xCCCD is 204.8 * 256 rounded to the nearest integer (52428.8 ~ 52429 = 0xCCCD). As long as the 16-bit representations are used, there is no precision loss.&lt;br /&gt;
&lt;br /&gt;
This is adapted from [http://www.pouet.net/prod.php?which=53816 &amp;quot;Puls&amp;quot; by Rrrola], where X and Y are modified directly on the stack by performing &amp;lt;code&amp;gt;add dword[di],0000CCCDh&amp;lt;/code&amp;gt; on each pixel iteration, which requires 7 bytes of code. The vertical alignment is corrected by giving said DWORD on the stack a good starting value before each frame, which requires 2 additional bytes. The two approaches are too different to compare directly, but they share the core idea of [http://www.pouet.net/topic.php?which=8791&amp;amp;page=8#c411796 multiplying by &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt;], so &amp;quot;Rrrola's trick&amp;quot; is an appropriate term to use.&lt;br /&gt;
&lt;br /&gt;
== Use the entire register for a smaller opcode form ==&lt;br /&gt;
&lt;br /&gt;
1. As you know, &amp;lt;code&amp;gt;add cl,1&amp;lt;/code&amp;gt; assembles to 3 bytes while &amp;lt;code&amp;gt;inc cl&amp;lt;/code&amp;gt; assembles to 2 bytes. If CH does not matter (or you know that it won't be affected), use &amp;lt;code&amp;gt;inc cx&amp;lt;/code&amp;gt; instead, which takes only 1 byte in 16-bit code. This is no real trick, but such things are easily overlooked, and the 2 saved bytes can be invested wisely.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=General_Coding_Tricks&amp;diff=331</id>
		<title>General Coding Tricks</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=General_Coding_Tricks&amp;diff=331"/>
				<updated>2016-08-26T01:13:16Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Various (small) Hints */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Data is code, code is data ==&lt;br /&gt;
&lt;br /&gt;
Code is nothing more than data that the CPU interprets.  For example, consider this multi-byte instruction:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        mov ah,37h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This assembles to &amp;lt;code&amp;gt;B4 37&amp;lt;/code&amp;gt;.  &amp;lt;code&amp;gt;B4&amp;lt;/code&amp;gt; by itself isn't interesting, but &amp;lt;code&amp;gt;37&amp;lt;/code&amp;gt; happens to be the opcode for &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt;.  Let's say you had this code before a loop, and you needed to perform &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt; at the top of the loop.  Rather than put a separate &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt; at the top of the loop, you can reuse the opcode that is already there as part of the &amp;lt;code&amp;gt;mov ah,37h&amp;lt;/code&amp;gt; that comes before it.  Just jump directly into the middle of the &amp;lt;code&amp;gt;mov ah,37h&amp;lt;/code&amp;gt;, which will get interpreted and executed as &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
&lt;br /&gt;
label:&lt;br /&gt;
        mov ah,37h&lt;br /&gt;
        ;misc. stuff&lt;br /&gt;
        loop label+1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;+1&amp;lt;/code&amp;gt; specifies the jump should go to 1 byte past the actual location.&lt;br /&gt;
&lt;br /&gt;
=== Reuse ===&lt;br /&gt;
&lt;br /&gt;
You can use opcodes hidden in your existing data.  For example, .COM files can end with &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;, which is opcode &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt;.  If you already have a &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; somewhere else in your code, even as part of data, just &amp;lt;code&amp;gt;JMP&amp;lt;/code&amp;gt; to that pre-existing &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; instead of adding a &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;.&lt;br /&gt;
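&lt;br /&gt;
A minimal sketch of this, under assumptions that are not from the text above: the label &amp;lt;code&amp;gt;frame&amp;lt;/code&amp;gt;, the ESC check and the loop body are made up for illustration. &amp;lt;code&amp;gt;add bx,ax&amp;lt;/code&amp;gt; assembles to &amp;lt;code&amp;gt;01 C3&amp;lt;/code&amp;gt; in NASM, so its second byte can double as the exit &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        org     100h&lt;br /&gt;
frame:&lt;br /&gt;
        add     bx,ax           ;01 C3: the second byte doubles as RET&lt;br /&gt;
        in      al,60h          ;read the keyboard scancode&lt;br /&gt;
        dec     al              ;ESC has scancode 1&lt;br /&gt;
        jz      frame+1         ;ESC pressed: exit through the hidden C3&lt;br /&gt;
        inc     cx              ;(hypothetical) rest of the loop body&lt;br /&gt;
        jmp     short frame&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here &amp;lt;code&amp;gt;jz frame+1&amp;lt;/code&amp;gt; costs 2 bytes, where jumping around a separate &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; would cost 3.&lt;br /&gt;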
&lt;br /&gt;
== If your environment holds you back, change it ==&lt;br /&gt;
&lt;br /&gt;
The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default MCGA palette] is fairly horrible, but there can be size advantages to changing it: while setting a new palette costs bytes, the new palette arrangement can save you more bytes down the road.  For example, if your code calculates pixel colors that fall into goofy ranges, then rather than constantly adjusting the colors to sane ranges (i.e. aligned to powers of 2), just set the palette so that values falling into those ranges look the way you want.  (This assumes you have a very small way of redefining the palette, of course.)&lt;br /&gt;
&lt;br /&gt;
The above is maybe not the best example.  Rewrites to this section are welcome.&lt;br /&gt;
&lt;br /&gt;
== Need a constant? ==&lt;br /&gt;
&lt;br /&gt;
If you need a constant value but you're out of space, search your assembled code for a byte value you can use.&lt;br /&gt;
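&lt;br /&gt;
For example, an operand that is already part of your code can double as a memory constant. A minimal sketch, assuming SI=0100h at .COM start (common under DOS, but not guaranteed) and a program that happens to begin with &amp;lt;code&amp;gt;mov al,13h&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        org     100h&lt;br /&gt;
        mov     al,13h          ;B0 13: the word at [si] is now 13B0h&lt;br /&gt;
        int     10h             ;set mode 13h (assumes AH=0 at start)&lt;br /&gt;
        add     dx,[si]         ;03 14: adds 13B0h in only 2 bytes&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Writing &amp;lt;code&amp;gt;add dx,13B0h&amp;lt;/code&amp;gt; instead would take 4 bytes (&amp;lt;code&amp;gt;81 C2 B0 13&amp;lt;/code&amp;gt;).&lt;br /&gt;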
&lt;br /&gt;
== A smaller way to point to Mode 13's screen segment ==&lt;br /&gt;
&lt;br /&gt;
Rather than &amp;lt;code&amp;gt;mov ah,0a0h; mov es,ax&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;push word 0a000h; pop es&amp;lt;/code&amp;gt;, try this 2-byte wonder:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
les bp,[bx]&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This sets ES=9FFF, only one away from A000.  You can write to the screen with ES: this way as long as you are aware the segment is one paragraph (16 bytes) behind, so just increase your offset by 16 if you need exact placement.&lt;br /&gt;
&lt;br /&gt;
How does this work?  At the start of execution of a .COM file, BX=0 and DS=CS. The contents of the .COM file get loaded to offset 0x100 in that segment, but loaded before that is the PSP (program segment prefix), which DOS populates with information about the loaded program. &amp;lt;code&amp;gt;LES BP,[BX]&amp;lt;/code&amp;gt; reads two words from DS:0000: the first goes into BP, the second into ES. The PSP starts with &amp;lt;code&amp;gt;CD 20&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;INT 20h&amp;lt;/code&amp;gt;, which exits the program), so that's what gets loaded into BP.  The next word is the number of the last free conventional memory segment, typically 0x9fff (but it can be something different if parts of the upper memory range are either not installed or already allocated).&lt;br /&gt;
&lt;br /&gt;
'''Warning:''' This trick doesn't always work.  On FreeDOS, this can set ES=9FE0, and there is something resident at that location that can screw up the system after normal program exit if you overwrite it.&lt;br /&gt;
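&lt;br /&gt;
A short usage sketch, assuming the PSP word really is 9FFFh (see the warning above); the pixel position and color are arbitrary choices for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        org     100h&lt;br /&gt;
        les     bp,[bx]         ;C4 2F: ES = word at DS:0002, typically 9FFFh&lt;br /&gt;
        mov     al,13h&lt;br /&gt;
        int     10h             ;set mode 13h (assumes AH=0 at start)&lt;br /&gt;
        mov     byte [es:16],15 ;9FFF:0010 = A000:0000, the top-left pixel&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;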
&lt;br /&gt;
== Accessing the timer tick for free ==&lt;br /&gt;
&lt;br /&gt;
If using a 386+, FS=0 at .COM start.  So, &amp;lt;code&amp;gt;FS:[046C]&amp;lt;/code&amp;gt; gets you the BIOS timer tick variable (0040:006C), which you could use for timing/pacing, or as a random seed. Some environments, especially EMS/XMS programs, can modify the FS register, so it can't always be assumed to be 0000h. In that case, &amp;lt;code&amp;gt;POP DS&amp;lt;/code&amp;gt; right after the start (the word on top of the stack at .COM start is 0000h) and then accessing &amp;lt;code&amp;gt;[046C]&amp;lt;/code&amp;gt; does the trick at equal size.&lt;br /&gt;
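&lt;br /&gt;
A minimal sketch of the &amp;lt;code&amp;gt;POP DS&amp;lt;/code&amp;gt; variant, which simply waits for the next timer tick (about 55 ms at most). The label name is made up, and the sketch assumes the usual 0000h word on top of the stack at .COM start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        org     100h&lt;br /&gt;
        pop     ds              ;DS = 0000h, taken from the return word on the stack&lt;br /&gt;
        mov     al,[046Ch]      ;low byte of the tick counter at 0040:006C&lt;br /&gt;
wait_tick:&lt;br /&gt;
        cmp     al,[046Ch]      ;has the 18.2 Hz timer advanced?&lt;br /&gt;
        je      wait_tick       ;not yet: keep waiting&lt;br /&gt;
        int     20h             ;exit; a plain RET no longer works here&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Keep in mind that after &amp;lt;code&amp;gt;POP DS&amp;lt;/code&amp;gt; the return word is gone from the stack, and DS no longer points at your own code and data.&lt;br /&gt;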
&lt;br /&gt;
== Looping twice ==&lt;br /&gt;
&lt;br /&gt;
If you need to repeat a section of code that doesn't modify the carry flag, and you know the carry flag is clear on entry, you can run it twice (that is, loop back once) in only 3 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
looping:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        cmc&lt;br /&gt;
        jc      looping&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Looping three times ==&lt;br /&gt;
&lt;br /&gt;
If you need to repeat a section of code and you have a register whose value is zero and can be incremented freely, or whose value is -1 and can be decremented freely, you can run it three times (that is, loop back twice) in only 3 bytes. Note that the parity flag only reflects the low 8 bits of the result:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
looping:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        inc     bx ;dec bx starting from -1 works with the same jpo: 0FEh and 0FDh have odd parity, 0FCh has even parity&lt;br /&gt;
        jpo     looping ;1 (01b) and 2 (10b) have odd parity, 3 (11b) has even parity&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== The Rrrola Trick (Obtaining X and Y without DIV) ==&lt;br /&gt;
&lt;br /&gt;
In [[Output#Outputting_in_mode_13h_.28320x200.29|320x200 mode]], instead of constructing X and Y from the screen pointer DI with DIV, you can get a decent estimate by multiplying the screen pointer by 0xCCCD. After &amp;lt;code&amp;gt;MUL DI&amp;lt;/code&amp;gt;, the row (Y) is in DH (or DX as a 16-bit value, with DL as its fraction) and the column (X, scaled to 0..255) is in DL (or DL and AH together as a 16-bit value). The idea is to interpret DI as a kind of 16-bit fraction in the range [0,1], from start to end of the segment. Multiplying this number in [0,1] by 65536 / 320 = 204.8 gives the row before the decimal point and, again as a kind of fraction, the column after the decimal point. The constant 0xCCCD is 204.8 * 256 rounded to the nearest integer (52428.8 ~ 52429 = 0xCCCD). As long as the 16-bit representations are used, there is no precision loss.&lt;br /&gt;
&lt;br /&gt;
This is adapted from [http://www.pouet.net/prod.php?which=53816 &amp;quot;Puls&amp;quot; by Rrrola], where X and Y are modified directly on the stack by performing &amp;lt;code&amp;gt;add dword[di],0000CCCDh&amp;lt;/code&amp;gt; on each pixel iteration, which requires 7 bytes of code. The vertical alignment is corrected by giving said DWORD on the stack a good starting value before each frame, which requires 2 additional bytes. The two approaches are too different to compare directly, but they share the core idea of [http://www.pouet.net/topic.php?which=8791&amp;amp;page=8#c411796 multiplying by &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt;], so &amp;quot;Rrrola's trick&amp;quot; is an appropriate term to use.&lt;br /&gt;
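&lt;br /&gt;
The multiplication can be sketched as a complete, if plain, program. The XOR coloring, the label name and the ESC check are illustrative additions and not part of the trick itself; the sketch also assumes AH=0 and a clear direction flag at .COM start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        org     100h&lt;br /&gt;
        push    0a000h&lt;br /&gt;
        pop     es              ;ES = mode 13h screen segment&lt;br /&gt;
        mov     al,13h&lt;br /&gt;
        int     10h             ;set mode 13h&lt;br /&gt;
pixel:&lt;br /&gt;
        mov     ax,0CCCDh&lt;br /&gt;
        mul     di              ;DX:AX = DI*0CCCDh: DH = row, DL = column*256/320&lt;br /&gt;
        mov     al,dh&lt;br /&gt;
        xor     al,dl           ;color = Y xor X&lt;br /&gt;
        stosb                   ;plot the pixel and advance DI&lt;br /&gt;
        in      al,60h          ;read the keyboard scancode&lt;br /&gt;
        dec     al              ;ESC has scancode 1&lt;br /&gt;
        jnz     pixel           ;no ESC: next pixel&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;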
&lt;br /&gt;
== Use the entire register for a smaller opcode form ==&lt;br /&gt;
&lt;br /&gt;
1. As you know, &amp;lt;code&amp;gt;add cl,1&amp;lt;/code&amp;gt; assembles to 3 bytes while &amp;lt;code&amp;gt;inc cl&amp;lt;/code&amp;gt; assembles to 2 bytes. If CH does not matter (or you know that it won't be affected), use &amp;lt;code&amp;gt;inc cx&amp;lt;/code&amp;gt; instead, which takes only 1 byte in 16-bit code. This is no real trick, but such things are easily overlooked, and the 2 saved bytes can be invested wisely.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=329</id>
		<title>Floating-point Opcodes</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=329"/>
				<updated>2016-08-24T19:18:10Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The [https://en.wikipedia.org/wiki/Floating-point_unit FPU] offers a lot of complex and powerful floating-point operations, like &amp;lt;code&amp;gt;SIN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;COS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ATAN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;SQRT&amp;lt;/code&amp;gt;, etc. ([http://www.website.masmforum.com/tutorials/fptute/appen1.htm SIMPLY FPU] by Raymond Filiatreault has a compact overview of all FPU commands for the curious.)  Use of the FPU can drastically increase what you can do in a tiny production while simultaneously keeping code size down.  All x86 CPUs can use the FPU, although older CPUs (80486sx and earlier) need the FPU physically installed in the FPU socket.&lt;br /&gt;
&lt;br /&gt;
This guide assumes intermediate to advanced-level proficiency with sizecoding.&lt;br /&gt;
&lt;br /&gt;
== FPU Basics ==&lt;br /&gt;
&lt;br /&gt;
Communicating with the FPU works quite differently from ordinary CPU code and takes a bit of getting used to, so we'll start with a simple example.  This is what [[Output#Basic_Animation_and_user_interaction|our code example]] looks like originally:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter	          &amp;lt;-- REPLACE THIS WITH FPU CODE&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
And this is how the code looks if we replace the marked instruction with FPU code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get &amp;quot;column&amp;quot; in AX&lt;br /&gt;
&lt;br /&gt;
fninit				; init FPU first&lt;br /&gt;
mov		[si],ax		; write first addend to a memory location&lt;br /&gt;
fild	word [si]	; F(pu) I(nteger) L(oa)D a WORD from memory location to the FPU stack&lt;br /&gt;
mov		[si],di		; write second addend to a memory location&lt;br /&gt;
fiadd	word [si]	; Directly add the word in the memory location to the top of the FPU stack&lt;br /&gt;
fist	word [si]	; F(pu) I(nteger) ST(ore) the result into a memory location&lt;br /&gt;
mov		ax,[si]		; Get the word from the memory location into AX&lt;br /&gt;
&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Obviously that made the program bigger instead of smaller, but the point of this exercise is to illustrate how to work with the FPU.)  Looking at our changes, you get a sense of what usual interaction with the FPU is:&lt;br /&gt;
# &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt;: Initialize the FPU&lt;br /&gt;
# transfer values from CPU registers to memory location(s)&lt;br /&gt;
# transfer values from memory location(s) onto FPU stack&lt;br /&gt;
# do the actual calculations using the FPU (more on this soon)&lt;br /&gt;
# transfer result from the FPU stack into memory location(s)&lt;br /&gt;
# transfer result from memory location(s) back into registers&lt;br /&gt;
&lt;br /&gt;
That is a lot of extra code for a single integer addition, but that's a simple example; ''once more complex floating point operations are involved, it starts to pay off.''&lt;br /&gt;
&lt;br /&gt;
== Optimizing with the FPU ==&lt;br /&gt;
&lt;br /&gt;
=== Distance function example ===&lt;br /&gt;
&lt;br /&gt;
For more advanced FPU operation, let's start from scratch with an unoptimized program which plots the distance of each pixel to the screen center as color, in 49 bytes.&lt;br /&gt;
&lt;br /&gt;
[[File:Distance to center example.png|thumb|Distance to center]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h			&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
fninit					; -	&lt;br /&gt;
						; it's useful to comment what's on the&lt;br /&gt;
						; stack after each FPU operation&lt;br /&gt;
						; to not get lost ;) start is : empty (-)&lt;br /&gt;
X:&lt;br /&gt;
xor 	dx,dx			; reset the high word before division&lt;br /&gt;
mov 	bx,320			; 320 columns&lt;br /&gt;
mov 	ax,di			; get screen pointer in AX&lt;br /&gt;
div 	bx				; construct Y,X from screen pointer into AX,DX (quotient, remainder)&lt;br /&gt;
sub 	ax,100			; subtract the origin&lt;br /&gt;
sub 	dx,160			; = (160,100) ... center of 320x200 screen	&lt;br /&gt;
mov 	[si],ax			; move Y into a memory location&lt;br /&gt;
fild 	word [si]		; Y&lt;br /&gt;
fmul 	st0				; Y²&lt;br /&gt;
mov 	[si],dx			; move X into a memory location&lt;br /&gt;
fild 	word [si]		; X Y²&lt;br /&gt;
fmul 	st0				; X² Y²&lt;br /&gt;
faddp 	st1,st0			; X²+Y² (add and pop, same size as fadd, leaves nothing behind on the FPU stack)&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	word [si]		; -&lt;br /&gt;
mov 	ax,[si]			; get the result from memory&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A few words on this:&lt;br /&gt;
* The FPU registers (st0, st1, ...) are organized as a stack. When you load something onto the FPU, everything else is implicitly moved one location further away from the top. Some FPU instructions work only on the top of the stack, others allow explicit operands naming arbitrary FPU registers.&lt;br /&gt;
* Depending on what you do, sometimes &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt; can be omitted. Real hardware will refuse to work more often than emulators, but it's always worth a try.&lt;br /&gt;
* Accessing memory in a size-efficient way can be a real pain. The safest way is to reference absolute memory locations (e.g. &amp;lt;code&amp;gt;[1234]&amp;lt;/code&amp;gt;), but that takes two bytes more per instruction than referencing memory with &amp;lt;code&amp;gt;[BX]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+DI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[DI]&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;[BX+DI]&amp;lt;/code&amp;gt;. When working with the FPU and this ''classic'' approach of FPU communication, you have to design your code flow so that one or more of these locations are available.&lt;br /&gt;
* Memory is always accessed relative to the segment register &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; unless you use a segment override. When accessing memory with &amp;lt;code&amp;gt;[BP+??]&amp;lt;/code&amp;gt;, be aware that the memory is accessed relative to the segment register &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (see [https://courses.engr.illinois.edu/ece390/books/artofasm/CH04/CH04-2.html#HEADING2-35 Register Indirect Addressing Modes]).&lt;br /&gt;
* There are a few conventions which help you identify FPU commands. &amp;quot;i&amp;quot; stands for integer (WORD or DWORD) and &amp;quot;p&amp;quot; means &amp;quot;pop stack afterwards&amp;quot;, so &amp;lt;code&amp;gt;FST&amp;lt;/code&amp;gt; means just &amp;quot;store&amp;quot; while &amp;lt;code&amp;gt;FISTP&amp;lt;/code&amp;gt; means &amp;quot;store as integer, then pop the stack&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Stack addressing, &amp;quot;Rrrolas Trick&amp;quot;, alignment optimization, Aspect Ratio ===&lt;br /&gt;
&lt;br /&gt;
Now let's unleash the state of the art sizecoding arsenal onto this, to bring it down to 37 bytes (40 bytes with aspect correction)&lt;br /&gt;
&lt;br /&gt;
[[File:Distant to center optimized.png|thumb|Distance to center : optimized]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h - 70		; modified to center to 160,100&lt;br /&gt;
aas						; aspect ratio constant part&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
X:&lt;br /&gt;
mov 	ax,0xCCCD		; perform the famous...&lt;br /&gt;
mul		di				; ... Rrrola trick =)&lt;br /&gt;
sub 	dh,[si]			; align vertically&lt;br /&gt;
pusha 					; push all registers on stack&lt;br /&gt;
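fild 	word [bx-8]		; Y (saved DX: row in the high byte)&lt;br /&gt;
fmul 	st0				; Y²&lt;br /&gt;
fild 	word [bx-9]		; X Y² (saved DL in the high byte)&lt;br /&gt;
fmul	dword [bx+si]	; aspect ratio correction&lt;br /&gt;
fmul 	st0				; X² Y²&lt;br /&gt;
faddp 	st1,st0			; X²+Y² (add and pop, same size as fadd, leaves nothing behind on the FPU stack)&lt;br /&gt;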
fsqrt					; R&lt;br /&gt;
fistp 	dword [bx-5]	; -&lt;br /&gt;
popa					; pop all registers from stack&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The resulting image is almost identical to the former. Let's go through this step by step:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;push 	0a000h - 70&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of aligning horizontally with &amp;lt;code&amp;gt;sub	dx,160&amp;lt;/code&amp;gt; we can code this implicitly by moving our segment register ten units - that is 10 * 16 = 160 pixels - to the left (see [https://en.wikipedia.org/wiki/X86_memory_segmentation#Real_mode Real Mode Addressing]). By subtracting further multiples of 20 units (20 * 16 = 320 pixels, that is one line each) we can shift the visible screen towards the top to fine-tune the vertical alignment; here 70 = 10 + 3 * 20. As long as this shift is no more than 4 lines ''(65536 / 320 - 200 = 4.8)'' there is no further visual impact.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the high byte of a constant: its opcode &amp;lt;code&amp;gt;0x3F&amp;lt;/code&amp;gt; is placed so that the four code bytes at &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;, since BX=0) read as ~1.24 when interpreted as a 32-bit float. The high byte of the segment value pushed for &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; is part of that constant too, so it also matters. Check for yourself with the [http://www.h-schmidt.net/FloatConverter/IEEE754.html IEEE 754 Converter].&lt;br /&gt;
* &amp;lt;code&amp;gt;mov ax,0xCCCD&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; (The &amp;quot;Rrrola trick&amp;quot;)&lt;br /&gt;
Instead of constructing X and Y from the screen pointer &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;DIV&amp;lt;/code&amp;gt;, you can get a decent estimate by multiplying the screen pointer by &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt;. Afterwards the row (Y) is in &amp;lt;code&amp;gt;DH&amp;lt;/code&amp;gt; (or DX as a 16-bit value, with DL as its fraction) and the column (X, scaled to 0..255) is in &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; (or DL and AH together as a 16-bit value). The idea is to interpret &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; as a kind of 16-bit fraction in the range &amp;lt;code&amp;gt;[0,1]&amp;lt;/code&amp;gt;, from start to end of the segment. Multiplying this number in [0,1] by 65536 / 320 = 204.8 gives the row before the decimal point and, again as a kind of fraction, the column after the decimal point. The constant &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; is 204.8 * 256 rounded to the nearest integer (52428.8 ~ 52429 = 0xCCCD). As long as the 16-bit representations are used, there is no precision loss.&lt;br /&gt;
* &amp;lt;code&amp;gt;sub dh,[si]&amp;lt;/code&amp;gt;&lt;br /&gt;
The instruction at &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;push &amp;lt;word&amp;gt;&amp;lt;/code&amp;gt; and has the opcode &amp;lt;code&amp;gt;0x68&amp;lt;/code&amp;gt; which is 104 in decimal. Combined with the fine tuned vertical alignment above ( ~4 lines) this results in (virtually) subtracting 100 for perfect vertical alignment. This is one byte shorter than &amp;lt;code&amp;gt;sub dh,100&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &amp;lt;code&amp;gt;pusha / popa&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of going the ''classical'' way of communicating with the FPU, we push all the registers, read/write values with memory addressing to/from the FPU, then pop all registers again. This works when &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; = &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP&amp;lt;/code&amp;gt; is &amp;quot;close enough&amp;quot; to &amp;lt;code&amp;gt;BX&amp;lt;/code&amp;gt; (initially zero and kept that way) to allow &amp;lt;code&amp;gt;[BX+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; addressing. It comes with the special benefit of implicit 8-bit shifts. One serious drawback is loss of precision, since the registers &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;AH&amp;lt;/code&amp;gt; &amp;quot;lose connection&amp;quot; when using &amp;lt;code&amp;gt;PUSHA&amp;lt;/code&amp;gt;: they are no longer adjacent in memory (see the order of registers in the [http://x86.renejeschke.de/html/file_module_x86_id_270.html PUSHA/PUSHAD documentation]).&lt;br /&gt;
* &amp;lt;code&amp;gt;fild word [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;fistp dword [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the so-called &amp;quot;stack addressing&amp;quot;. We assume that &amp;lt;code&amp;gt;BX=0&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP=0xFFFE&amp;lt;/code&amp;gt; at start, so we know where the registers are in memory after &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt; (AX at [BX-4], CX at [BX-6], DX at [BX-8] etc.). It's important to realize that we work with signed 16-bit values now, in the full range of [-32768,32767]. That is also why we need &amp;lt;code&amp;gt;DWORD&amp;lt;/code&amp;gt; when storing the result: &amp;lt;code&amp;gt;sqrt(x²+y²)&amp;lt;/code&amp;gt; exceeds the signed 16-bit range for quite a few value pairs. Note that the odd offsets (bx-9, bx-5) already perform implicit 8-bit shifts: they straddle two saved registers, so the value is read or written shifted by one byte.&lt;br /&gt;
*&amp;lt;code&amp;gt;fmul dword [bx+si]&amp;lt;/code&amp;gt;&lt;br /&gt;
With the &amp;quot;Rrrola&amp;quot; trick above, the row number is 204 at maximum, while the column can't be greater than 256. This results in a wrong aspect ratio, but it can be fixed almost completely with this two-byte instruction (+ one byte for the &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; instruction): 256 * 1.24 = 317.44, which is quite close to 320. If the aspect ratio does not matter for the desired effect, these three bytes can be shaved off. &lt;br /&gt;
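&lt;br /&gt;
The stack addressing described above can be tried in isolation. This is a minimal sketch, not taken from the program above: the input value -300 is made up, and it assumes &amp;lt;code&amp;gt;BX=0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;SP=0xFFFE&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; = &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; at .COM start. It squares DX on the FPU and receives the 32-bit result in two registers, without any explicit &amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt; to or from memory:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;fninit					; - (known FPU state, empty stack)&lt;br /&gt;
mov		dx,-300			; hypothetical input value&lt;br /&gt;
pusha					; AX at [bx-4], CX at [bx-6], DX at [bx-8]&lt;br /&gt;
fild	word [bx-8]		; -300&lt;br /&gt;
fmul	st0				; 90000&lt;br /&gt;
fistp	dword [bx-6]	; - (low word over saved CX, high word over saved AX)&lt;br /&gt;
popa					; AX:CX = 90000 = 0x00015F90&lt;br /&gt;
ret						; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;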
&lt;br /&gt;
=== The &amp;quot;Tunnel&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
Now let's add some features:&lt;br /&gt;
* extract angle as opposed to the distance and combine both&lt;br /&gt;
* reverse divide the distance to create the &amp;quot;tunnel&amp;quot; effect&lt;br /&gt;
* animate with smooth steps along the distance&lt;br /&gt;
* improve on the colors with subselecting from the standard palette&lt;br /&gt;
* quit the program on ESC&lt;br /&gt;
&lt;br /&gt;
This results in the following program with a size of 63 bytes:&lt;br /&gt;
&lt;br /&gt;
[[File:Tunnel effect neontube.png|thumb|tunnel effect neontube]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0xa000 - 10 - 3 * 20	; video base - 3.5 lines&lt;br /&gt;
or 		al, 0x13				; mode 13h = 320 x 200 in 256 colors&lt;br /&gt;
pop 	es						; get aligned video memory base&lt;br /&gt;
int 	0x10					; switch videomode&lt;br /&gt;
X: &lt;br /&gt;
sub		dh, [si]				; vertical alignment&lt;br /&gt;
pusha							; push all registers on stack&lt;br /&gt;
fild 	word	[bx-9]			; fpustack :  x&lt;br /&gt;
fild 	word	[bx-8]			; fpustack :  y  x&lt;br /&gt;
fpatan							; fpustack :  arc&lt;br /&gt;
fst 	st1						; fpustack :  arc  arc&lt;br /&gt;
fcos							; fpustack :  cos(arc)  arc&lt;br /&gt;
fimul	dword	[si]			; fpustack :  l*cos(arc)  arc&lt;br /&gt;
fidiv	word	[bx-8]			; fpustack :  l*cos(arc)/y  arc&lt;br /&gt;
fiadd	word	[bp+si]			; fpustack :  l*cos(arc)/y+offset  arc&lt;br /&gt;
fistp	dword	[bx-7]			; fpustack :  arc&lt;br /&gt;
fimul	word	[byte si+val]	; fpustack :  scaled_arc&lt;br /&gt;
fistp	word	[bx-5]			; fpustack :  -&lt;br /&gt;
popa							; pop all registers from stack&lt;br /&gt;
xor 	al, cl					; XOR scaled_arc with distance&lt;br /&gt;
and 	al, 16 + 8 + 2			; sub selecting palette part&lt;br /&gt;
stosb							; writing to screen&lt;br /&gt;
mov 	ax, 0xCCCD				; Performing the famous&lt;br /&gt;
mul 	di						; Rrrola trick&lt;br /&gt;
jo 		X						; next frame check&lt;br /&gt;
add 	word [bp+si], byte 23	; change offset smoothly&lt;br /&gt;
in 		al, 0x60				; check for ...&lt;br /&gt;
dec 	ax						; ...ESC key&lt;br /&gt;
jnz 	X						; otherwise continue&lt;br /&gt;
ret								; quit program&lt;br /&gt;
val:	dw 6519 				; n = 160 * 256 / pi / 2 ; 0x1977&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Many other tiny tunnel effects have been coded, so it is highly recommended to check out the documented source code of [http://web.archive.org/web/20050216133244/http://www.farb-rausch.de/ryg/tunnel.asm &amp;quot;Constant Evolution&amp;quot; by ryg/Farbrausch] and the [http://www.pouet.net/prod.php?which=29412 &amp;quot;Heart shaped tunnel&amp;quot; from Lord Kelvin], both with a size of 64 bytes. While &amp;quot;Constant Evolution&amp;quot; takes a slightly different route than the example here (''classic'' FPU communication, ''classic'' X Y construction, sqrt(x²+y²) instead of using &amp;lt;code&amp;gt;fcos&amp;lt;/code&amp;gt;+&amp;lt;code&amp;gt;fimul&amp;lt;/code&amp;gt;), &amp;quot;Heart shaped tunnel&amp;quot; uses no FPU at all. &lt;br /&gt;
&lt;br /&gt;
The takeaways from this example are:&lt;br /&gt;
&lt;br /&gt;
* Loading a constant from the code with some degrees of freedom&lt;br /&gt;
&amp;lt;code&amp;gt;fimul dword [si]&amp;lt;/code&amp;gt; multiplies by a 32-bit integer that serves as the dividend for the tunnel effect. The highest byte of this constant lies in our code: it is the opcode of &amp;lt;code&amp;gt;or al, 0x13&amp;lt;/code&amp;gt;. This instruction puts &amp;lt;code&amp;gt;0x13&amp;lt;/code&amp;gt; into &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;, and since there are a lot of ways to achieve this, you can directly change the appearance of the tunnel by replacing this instruction with one of the following: &amp;lt;code&amp;gt;mov al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;add al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;xor al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;adc al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;sbb al,0xED&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sub al,0xED&amp;lt;/code&amp;gt;. In this special case, the instruction can also be swapped with &amp;lt;code&amp;gt;pop es&amp;lt;/code&amp;gt; to gain a further degree of freedom.&lt;br /&gt;
&lt;br /&gt;
* normal loading of a constant which can't be reused as opcode&lt;br /&gt;
Although it's the ultimate goal to not even use a single extra byte for constants, sometimes the required sequence simply does not appear in the code. In this case, a constant is needed to convert the angle from the range [-pi,pi] to the color space in a way that no gaps appear while stepping from 359° to 0°. In the last line at &amp;lt;code&amp;gt;val:&amp;lt;/code&amp;gt; the value &amp;lt;code&amp;gt;160&amp;lt;/code&amp;gt; in the comment is 32 * 5, where 5 is the number of &amp;quot;spiral arms&amp;quot; the tunnel effect has. The 8bit shift (*256) increases precision. It turns out that 16bit precision is enough to get a decent &amp;quot;gap closer&amp;quot; for values obtained by &amp;lt;code&amp;gt;fpatan&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* operating directly on an indirect memory location without offset&lt;br /&gt;
The location &amp;lt;code&amp;gt;[bp+si]&amp;lt;/code&amp;gt; is used as the animation variable, while both participating registers are kept fixed (the address is &amp;lt;code&amp;gt;0x0A??&amp;lt;/code&amp;gt; and therefore way above our code). Since we work with 16bit values and the top 8 bits measure whole pixels, the instruction &amp;lt;code&amp;gt;add word [bp+si], byte 23&amp;lt;/code&amp;gt; allows for sub pixel precision in animation, while occupying 3 bytes of space. Depending on the target hardware, this value 23 can be increased/decreased to achieve faster/smoother animation.&lt;br /&gt;
&lt;br /&gt;
* Optimizing the check for the next frame&lt;br /&gt;
Normally, a check like &amp;lt;code&amp;gt;test di,di&amp;lt;/code&amp;gt; followed directly by a branch is necessary. The approach used here allows branching directly after &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;jo&amp;lt;/code&amp;gt;, since the overflow flag is set for every pixel of a frame except two. This saves two bytes, but also requires adjustment of the animation constant, because the animation constant is now added twice per frame. A further benefit is that in one of these two cases, &amp;lt;code&amp;gt;AX&amp;lt;/code&amp;gt; is zero, which saves a further byte on the following ESC check (&amp;lt;code&amp;gt;dec ax&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;dec al&amp;lt;/code&amp;gt;).&lt;br /&gt;
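&lt;br /&gt;
The claim that the overflow flag stays clear only twice per frame is easy to verify with a few lines of arithmetic. The following sketch is not part of the intro; it uses Python purely as a calculator, and the helper name &amp;lt;code&amp;gt;mul_di&amp;lt;/code&amp;gt; is made up for this illustration. It assumes that &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; sets the overflow (and carry) flag exactly when the upper half of the product in &amp;lt;code&amp;gt;DX&amp;lt;/code&amp;gt; is non-zero:&lt;br /&gt;
&lt;br /&gt;
```python
def mul_di(di):
    """Model of 16-bit `mul di` with AX = 0xCCCD: returns (DX, AX, OF)."""
    product = 0xCCCD * di              # full 32-bit product DX:AX
    dx, ax = divmod(product, 0x10000)  # split into upper and lower word
    overflow = dx != 0                 # OF = CF = 1 when DX is non-zero
    return dx, ax, overflow


# All values DI can take while sweeping once over the 64 KiB segment
no_overflow = [di for di in range(0x10000) if not mul_di(di)[2]]
print(no_overflow)    # DI values for which `jo X` falls through
print(mul_di(0)[1])   # AX left behind when DI is 0
```
&lt;br /&gt;
Only two values of &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; let the jump fall through, and only the first of them leaves &amp;lt;code&amp;gt;AX&amp;lt;/code&amp;gt; at zero, which is the case in which &amp;lt;code&amp;gt;dec ax&amp;lt;/code&amp;gt; can stand in for &amp;lt;code&amp;gt;dec al&amp;lt;/code&amp;gt;.&lt;br /&gt;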
&lt;br /&gt;
=== Size optimizing the &amp;quot;Tunnel&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
Now if we abandon all the comfort, alignment, smoothness and convenience, and optimize this straight for size, we end up with a 52 byte version. This does not include the possible exclusion of color tuning (2 bytes), after all the effect is supposed to look at least somewhat appealing ;)&lt;br /&gt;
&lt;br /&gt;
[[File:Neontube 52b optimized.png|thumb|Neontube 52b optimized]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov		al,0x13				; mode 13h = 320 x 200 in 256 colors&lt;br /&gt;
int		0x10				; switch videomode&lt;br /&gt;
X: &lt;br /&gt;
or		al, [bp+si]			; *illusion* - executed ONCE&lt;br /&gt;
xor		al, 0x68			; *illusion* - executed ONCE&lt;br /&gt;
mov		dx, 0x79F			; *illusion* - executed ONCE&lt;br /&gt;
pusha						; push all registers on stack&lt;br /&gt;
fild 	word	[bx-9]		; x&lt;br /&gt;
fild 	word	[bx-8]		; y x&lt;br /&gt;
fpatan						; arc&lt;br /&gt;
fst 	st1					; arc arc&lt;br /&gt;
fcos						; cos(arc) arc&lt;br /&gt;
fimul	dword	[si]		; l*cos(arc) arc&lt;br /&gt;
fidiv	word	[bx-8]		; l*cos(arc)/x arc&lt;br /&gt;
fistp	dword	[bx-4]		; arc&lt;br /&gt;
fimul	word	[bx]		; scaled_arc&lt;br /&gt;
fistp	word	[bx-5]		; -&lt;br /&gt;
popa						; pop all registers from stack&lt;br /&gt;
sub		ah, [bp+si]			; animation along distance&lt;br /&gt;
xor		al, ah				; XOR scaled_arc with distance&lt;br /&gt;
and		al, 16 + 8 + 4		; sub palette selection&lt;br /&gt;
stosb						; write to screen, advance DI&lt;br /&gt;
mov		ax, 0xCCCD			; the famous&lt;br /&gt;
mul		di					; Rrrola trick&lt;br /&gt;
jmp 	short X-1			; *ODD* jump into &amp;quot;int 0x10&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Since this code contains an &amp;quot;odd jump&amp;quot; into the middle of the instruction &amp;lt;code&amp;gt;int 0x10&amp;lt;/code&amp;gt;, it helps to disassemble the code from address &amp;lt;code&amp;gt;0x103&amp;lt;/code&amp;gt; on, until both code paths realign at &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt;:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;adc 	[bp+si],cl		; decrement framecounter&lt;br /&gt;
add 	dh,[si]			; vertical alignment&lt;br /&gt;
push 	word 0x9FBA		; video base - 3.5 lines&lt;br /&gt;
pop 	es				; get aligned video memory&lt;br /&gt;
						; continues with &amp;quot;pusha&amp;quot;&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
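As before, the &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; instruction sets the overflow flag - and the carry flag - for every pixel of a frame except two. &amp;lt;code&amp;gt;CL&amp;lt;/code&amp;gt; remains &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt;, unchanged from the start, so &amp;lt;code&amp;gt;adc [bp+si],cl&amp;lt;/code&amp;gt; adds &amp;lt;code&amp;gt;0x100&amp;lt;/code&amp;gt; (no change to the byte) whenever the carry flag is set, and &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; (minus one) when it is clear. This effectively decrements the framecounter twice per frame. &lt;br /&gt;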
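&lt;br /&gt;
The behaviour of &amp;lt;code&amp;gt;adc [bp+si],cl&amp;lt;/code&amp;gt; can be sketched in the same calculator style. This is an illustration only, not code from the intro: &amp;lt;code&amp;gt;adc_byte&amp;lt;/code&amp;gt; is a hypothetical helper that models an 8-bit add-with-carry, and the starting value of the counter is arbitrary.&lt;br /&gt;
&lt;br /&gt;
```python
def adc_byte(dest, src, carry):
    """Model of 8-bit `adc dest, src`: dest + src + CF, truncated to 8 bits."""
    return (dest + src + carry) % 0x100


CL = 0xFF        # value the text assumes for CL throughout
counter = 0x40   # arbitrary starting value of the frame counter byte

unchanged = adc_byte(counter, CL, 1)    # CF set by `mul di` (the usual case)
decremented = adc_byte(counter, CL, 0)  # CF clear (two pixels per frame)
print(hex(unchanged), hex(decremented))
```
&lt;br /&gt;
With the carry set the addition wraps around to the old value, with the carry clear it acts as a decrement.&lt;br /&gt;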
&lt;br /&gt;
The factor for scaling the arc is now taken from the location &amp;lt;code&amp;gt;[bx]&amp;lt;/code&amp;gt;, which holds &amp;lt;code&amp;gt;0x20CD&amp;lt;/code&amp;gt; (the &amp;lt;code&amp;gt;int 0x20&amp;lt;/code&amp;gt; at the start of the PSP). Calculating the number of &amp;quot;spirals&amp;quot; backwards from this number (*pi*2/256/32) gives about 6.44, which is close enough to 6.5 that the visual gap at the top is barely noticeable, apart from the miscoloring, which can be fixed by removing the 16 from &amp;lt;code&amp;gt;and al, 16 + 8 + 4&amp;lt;/code&amp;gt;.&lt;br /&gt;
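&lt;br /&gt;
The backwards calculation can be reproduced with a short sketch. Python is used here only as a calculator and is not part of the original; &amp;lt;code&amp;gt;spiral_arms&amp;lt;/code&amp;gt; is a made-up helper name that simply inverts the formula n = arms * 32 * 256 / pi / 2 from the comment at &amp;lt;code&amp;gt;val:&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
```python
import math


def spiral_arms(scale):
    """Number of spiral arms implied by a 16-bit arc scale factor."""
    return scale * 2 * math.pi / 256 / 32


print(round(spiral_arms(6519), 2))    # explicit constant of the 63 byte version
print(round(spiral_arms(0x20CD), 2))  # word found at [bx] in the 52 byte version
print(round(160 * 256 / math.pi / 2)) # forward calculation for 5 arms
```
&lt;br /&gt;
The first value comes out as a whole number of arms, the second does not, which is where the small gap at the top of the 52 byte version comes from.&lt;br /&gt;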
&lt;br /&gt;
It's noteworthy that the location &amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; does not contain a good offset anymore, since &amp;lt;code&amp;gt;push &amp;lt;word&amp;gt;&amp;lt;/code&amp;gt; moved away from the top, to help create *illusion* code that not only realigns soon in the second code path but also does not modify registers and memory environment in a hindering way (only &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DX&amp;lt;/code&amp;gt; are modified)&lt;br /&gt;
&lt;br /&gt;
The takeaway here is to study the ModR/M bytes of instructions that are very likely to appear in tiny intros, as well as using their constants as code. &amp;lt;code&amp;gt;int 0x10&amp;lt;/code&amp;gt; will almost always be in your code, so an &amp;lt;code&amp;gt;adc byte[??], 8bit_reg&amp;lt;/code&amp;gt; is always hidden inside it, too.&lt;br /&gt;
&lt;br /&gt;
Finally, you might have noticed that these tunnels don't &amp;quot;spin&amp;quot;. For the cost of two bytes this can be easily added. &amp;lt;code&amp;gt;add		al, [bp+si]&amp;lt;/code&amp;gt; before applying XOR in the last example would offset the &amp;lt;code&amp;gt;arc&amp;lt;/code&amp;gt; with the framecounter and produce a spinning effect. For a coherent visual experience the sub palette selection must be changed to &amp;lt;code&amp;gt;and al, 8 + 4&amp;lt;/code&amp;gt;, too.&lt;br /&gt;
&lt;br /&gt;
== FPU Tricks ==&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=328</id>
		<title>SizeCoding:About</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=328"/>
				<updated>2016-08-24T19:08:59Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* History */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== History ==&lt;br /&gt;
&lt;br /&gt;
sizecoding.org grew from a conversation between [[User:Trixter|Trixter]], [[User:HellMood|HellMood]], and [[User:qkumba|qkumba]] when it was determined there was no central repository for sizecoding information.  Trixter had inquired because VileR was dipping his toes into the lake that is sizecoding, and wanted to point him to a starting point for resources suitable for beginners.  None existed, so Trixter stood up sizecoding.org, and all three of them contributed information and examples to the launch, with Hellmood contributing the bulk of the initial load and guides.&lt;br /&gt;
&lt;br /&gt;
Many thanks to the entire demoscene for knowledge and inspiration.&lt;br /&gt;
&lt;br /&gt;
== Q&amp;amp;A ==&lt;br /&gt;
&lt;br /&gt;
'''Why a Wiki?'''  Wikis have the lowest barrier to entry for contributing to information stores.  We encourage the entire demoscene and assembler programming community to contribute.&lt;br /&gt;
&lt;br /&gt;
'''Is this being backed up somewhere?'''  Yes.  A few key members have access to the weekly PHP and database dumps.&lt;br /&gt;
&lt;br /&gt;
'''Doesn't in4k cover this material?'''  [https://in4k.github.io/ in4k] mostly concentrates on 1k/4k productions, across all environments (Windows, JavaScript, WebGL, etc.).  SizeCoding is much more targeted, and concentrates mainly on 256b, x86-only coding.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=327</id>
		<title>Floating-point Opcodes</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=327"/>
				<updated>2016-08-24T19:02:33Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Distance function example */ Chose better-formatted link&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The FPU offers a lot of complex floating-point operations not available to classic x86 CPUs, like &amp;lt;code&amp;gt;SIN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;COS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ATAN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;SQRT&amp;lt;/code&amp;gt;, etc. ([http://www.website.masmforum.com/tutorials/fptute/appen1.htm SIMPLY FPU] by Raymond Filiatreault has a compact overview of all FPU commands for the curious.)  &lt;br /&gt;
&lt;br /&gt;
== FPU Basics ==&lt;br /&gt;
&lt;br /&gt;
Communicating with the FPU is rather unusual and takes a bit of getting used to, so we'll start with a simple example.  This is what [[Output#Basic_Animation_and_user_interaction|our code example]] looks like originally:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter	          &amp;lt;-- REPLACE THIS WITH FPU CODE&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
And this is how the code looks if we replace the instruction with FPU code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get &amp;quot;column&amp;quot; in AX&lt;br /&gt;
&lt;br /&gt;
fninit				; init FPU first&lt;br /&gt;
mov		[si],ax		; write first addend to a memory location&lt;br /&gt;
fild	word [si]	; F(pu) I(nteger) L(oa)D a WORD from memory location to the FPU stack&lt;br /&gt;
mov		[si],di		; write second addend to a memory location&lt;br /&gt;
fiadd	word [si]	; Directly add the word in the memory location to the top of the FPU stack&lt;br /&gt;
fist	word [si]	; F(pu) I(nteger) ST(ore) the result into a memory location&lt;br /&gt;
mov		ax,[si]		; Get the word from the memory location into AX&lt;br /&gt;
&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Obviously that made the program bigger instead of smaller, but the point of this exercise is to illustrate how to work with the FPU.)  Looking at our changes, you get a sense of what usual interaction with the FPU is:&lt;br /&gt;
# &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt;: Initialize the FPU&lt;br /&gt;
# transfer values from CPU registers to memory location(s)&lt;br /&gt;
# transfer values from memory location(s) onto FPU stack&lt;br /&gt;
# do the actual calculations using the FPU (more on this soon)&lt;br /&gt;
# transfer result from the FPU stack into memory location(s)&lt;br /&gt;
# transfer result from memory location(s) back into registers&lt;br /&gt;
&lt;br /&gt;
That is a lot of extra code for a single integer addition, but that's a simple example; ''once more complex floating point operations are involved, it starts to pay off.''&lt;br /&gt;
&lt;br /&gt;
== Optimizing with the FPU ==&lt;br /&gt;
&lt;br /&gt;
=== Distance function example ===&lt;br /&gt;
&lt;br /&gt;
For more advanced FPU operation, let's start from scratch with an unoptimized program which plots the distance of each pixel to the screen center as color, in 49 bytes.&lt;br /&gt;
&lt;br /&gt;
[[File:Distance to center example.png|thumb|Distance to center]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h			&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
fninit					; -	&lt;br /&gt;
						; it's useful to comment what's on the&lt;br /&gt;
						; stack after each FPU operation&lt;br /&gt;
						; to not get lost ;) start is : empty (-)&lt;br /&gt;
X:&lt;br /&gt;
xor 	dx,dx			; reset the high word before division&lt;br /&gt;
mov 	bx,320			; 320 columns&lt;br /&gt;
mov 	ax,di			; get screen pointer in AX&lt;br /&gt;
div 	bx				; construct Y,X from screen pointer into AX,DX&lt;br /&gt;
sub 	ax,100			; subtract the origin&lt;br /&gt;
sub 	dx,160			; = (160,100) ... center of 320x200 screen	&lt;br /&gt;
mov 	[si],ax			; move Y into a memory location&lt;br /&gt;
fild 	word [si]		; Y&lt;br /&gt;
fmul 	st0				; Y²&lt;br /&gt;
mov 	[si],dx			; move X into a memory location&lt;br /&gt;
fild 	word [si]		; X Y²&lt;br /&gt;
fmul 	st0				; X² Y²&lt;br /&gt;
faddp 	st1,st0			; X²+Y² (add and pop, so the FPU stack does not fill up)&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	word [si]		; -&lt;br /&gt;
mov 	ax,[si]			; get the result from memory&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A few words on this : &lt;br /&gt;
* The FPU registers (st0, st1, ...) are organized as a stack. When you load something to the FPU, everything else is implicitly moved one location further away from the top. Some FPU instructions work only on the top, others allow explicit parametrization with arbitrary FPU registers.&lt;br /&gt;
* Depending on what you do, sometimes &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt; can be omitted. Real hardware will refuse to work more often than emulators, but it's always worth a try.&lt;br /&gt;
* Accessing memory in a size-efficient way can be a real pain. The safest way is to reference absolute memory locations (e.g. &amp;lt;code&amp;gt;[1234]&amp;lt;/code&amp;gt;) but that's two bytes more per instruction than referencing memory with &amp;lt;code&amp;gt;[BX]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+DI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[DI]&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;[BX+DI]&amp;lt;/code&amp;gt;. When working with the FPU and this ''classic'' approach of FPU communication, you have to design your codeflow to have one or more of these locations available.&lt;br /&gt;
* Accessing the memory is always with regard to the segment register &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; unless you perform segment overrides. When accessing memory with &amp;lt;code&amp;gt;[BP+??]&amp;lt;/code&amp;gt; be aware that the memory is accessed with regard to the segment register &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (see [https://courses.engr.illinois.edu/ece390/books/artofasm/CH04/CH04-2.html#HEADING2-35 Register Indirect Addressing Modes])&lt;br /&gt;
* There are a few conventions which help you identify FPU commands. &amp;quot;i&amp;quot; stands for integer (WORD or DWORD), &amp;quot;p&amp;quot; means &amp;quot;pop stack afterwards&amp;quot;, so &amp;lt;code&amp;gt;FST&amp;lt;/code&amp;gt; means just &amp;quot;store&amp;quot; while &amp;lt;code&amp;gt;FISTP&amp;lt;/code&amp;gt; means &amp;quot;store as integer, then pop the stack&amp;quot;&lt;br /&gt;
&lt;br /&gt;
=== Stack addressing, &amp;quot;Rrrolas Trick&amp;quot;, alignment optimization, Aspect Ratio ===&lt;br /&gt;
&lt;br /&gt;
Now let's unleash the state of the art sizecoding arsenal onto this, to bring it down to 37 bytes (40 bytes with aspect correction)&lt;br /&gt;
&lt;br /&gt;
[[File:Distant to center optimized.png|thumb|Distance to center : optimized]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h - 70		; modified to center to 160,100&lt;br /&gt;
aas						; aspect ratio constant part&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
X:&lt;br /&gt;
mov 	ax,0xCCCD		; perform the famous...&lt;br /&gt;
mul		di				; ... Rrrola trick =)&lt;br /&gt;
sub 	dh,[si]			; align vertically&lt;br /&gt;
pusha 					; push all registers on stack&lt;br /&gt;
fild 	word [bx-8]		; Y&lt;br /&gt;
fmul 	st0				; Y²&lt;br /&gt;
fild 	word [bx-9]		; X Y²&lt;br /&gt;
fmul	dword [bx+si]	; aspect ratio correction&lt;br /&gt;
fmul 	st0				; X² Y²&lt;br /&gt;
faddp 	st1,st0			; X²+Y² (add and pop, so the FPU stack does not fill up)&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	dword [bx-5]	; -&lt;br /&gt;
popa					; pop all registers from stack&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The resulting image is almost identical to the former. Let's go through this step by step:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;push 	0a000h - 70&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of aligning horizontally with &amp;lt;code&amp;gt;sub	dx,160&amp;lt;/code&amp;gt; we can code this implicitly by moving our segment register ten units - that is 10 * 16 = 160 pixels - to the left (see [https://en.wikipedia.org/wiki/X86_memory_segmentation#Real_mode Real Mode Addressing]). By subtracting further multiples of 20 units - that is 320 pixels each - we can shift the visible screen towards the top, to finetune vertical alignment. As long as this shift is no more than 4 lines ''( 65536 / 320 - 200 = 4.8 )'' there is no further visual impact.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the high byte of a constant, placed in a way that &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt; resolves to ~1.24 when read as 32bit float. The last byte of segment &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; is also of importance. Check yourself with the [http://www.h-schmidt.net/FloatConverter/IEEE754.html IEEE 754 Converter]&lt;br /&gt;
* &amp;lt;code&amp;gt;mov ax,0xCCCD&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; (The &amp;quot;Rrrola trick&amp;quot;)&lt;br /&gt;
Instead of constructing X and Y from the screen pointer &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;DIV&amp;lt;/code&amp;gt; you can get a decent estimate by multiplying the screen pointer by &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; and reading Y and X from the 8bit registers &amp;lt;code&amp;gt;DH&amp;lt;/code&amp;gt; (+DL as 16bit value) and &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; (+AH as 16bit value). The idea is to interpret &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; as a kind of 16 bit fixed-point fraction in the range &amp;lt;code&amp;gt;[0,1]&amp;lt;/code&amp;gt;, from start to end. Multiplying this number in [0,1] by 65536 / 320 = 204.8 yields the row in the integer part and, again as a kind of fraction, the column in the fractional part. The representation &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; is the nearest rounding of 204.8 * 256 ( = 52428.8 ~ 52429 = 0xCCCD). As long as the 16 bit representations are used, there is no precision loss.&lt;br /&gt;
* &amp;lt;code&amp;gt;sub dh,[si]&amp;lt;/code&amp;gt;&lt;br /&gt;
The instruction at &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;push &amp;lt;word&amp;gt;&amp;lt;/code&amp;gt; and has the opcode &amp;lt;code&amp;gt;0x68&amp;lt;/code&amp;gt; which is 104 in decimal. Combined with the fine tuned vertical alignment above ( ~4 lines) this results in (virtually) subtracting 100 for perfect vertical alignment. This is one byte shorter than &amp;lt;code&amp;gt;sub dh,100&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &amp;lt;code&amp;gt;pusha / popa&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of going the ''classical'' way of communicating with the FPU, we push all the registers, read/write values with memory addressing to/from the FPU, then pop all registers again. This works when &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; = &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP&amp;lt;/code&amp;gt; is &amp;quot;close enough&amp;quot; to &amp;lt;code&amp;gt;BX&amp;lt;/code&amp;gt; (initially zero and kept that way) to allow &amp;lt;code&amp;gt;[BX+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; addressing. It comes with the special benefit of implicit 8bit shifts. One serious drawback is loss of precision, since the registers &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;AH&amp;lt;/code&amp;gt; &amp;quot;lose connection&amp;quot; when using &amp;lt;code&amp;gt;PUSHA&amp;lt;/code&amp;gt; (see the order of registers: [http://x86.renejeschke.de/html/file_module_x86_id_270.html PUSHA/PUSHAD documentation]).&lt;br /&gt;
*&amp;lt;code&amp;gt;fild word [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; &amp;amp; *&amp;lt;code&amp;gt;fistp dword [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the so called &amp;quot;stack addressing&amp;quot;. We assume that &amp;lt;code&amp;gt;BX=0&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP=0xFFFE&amp;lt;/code&amp;gt; at start, so we know where the registers are in memory after &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt; (AX at [BX-4], CX at [BX-6] etc.). It's important to realize that we work with signed 16 bit values now, in the full range of [-32768,32767]. That is also why we need &amp;lt;code&amp;gt;DWORD&amp;lt;/code&amp;gt; when storing the result : &amp;lt;code&amp;gt;sqrt(x²+y²)&amp;lt;/code&amp;gt; exceeds the signed 16bit range for quite some value pairs. Note that there are already implicit 8bit shifts (bx-9,bx-5)&lt;br /&gt;
*&amp;lt;code&amp;gt;fmul dword [bx+si]&amp;lt;/code&amp;gt;&lt;br /&gt;
With the &amp;quot;Rrrola&amp;quot; trick above, the row number is 204 at maximum, and the column cannot exceed 255. This results in a wrong aspect ratio, but it can almost completely be fixed with this two byte instruction (+ one byte for the &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; instruction) : 256 * 1.24 = 317.44 which is quite close to 320. If the aspect ratio does not matter for the desired effect, these three bytes can be shaved off. &lt;br /&gt;
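&lt;br /&gt;
To see how well the &amp;quot;Rrrola trick&amp;quot; from the list above tracks a real division, the multiplication can be replayed outside of assembly. The sketch below is not from the original article; Python serves only as a calculator, and the function names &amp;lt;code&amp;gt;rrrola&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;reference&amp;lt;/code&amp;gt; are made up for this illustration. It compares &amp;lt;code&amp;gt;DH&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; after &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; with the row and the column (rescaled from 0..319 to 0..255) obtained from an exact division by 320:&lt;br /&gt;
&lt;br /&gt;
```python
def rrrola(di):
    """Return (DH, DL) after `mov ax,0xCCCD` / `mul di`."""
    dx = (0xCCCD * di) // 0x10000   # DX = upper 16 bits of the 32-bit product
    return divmod(dx, 0x100)        # DH = row, DL = column scaled to 0..255


def reference(di):
    """Row and rescaled column from an exact division by 320."""
    row, col = divmod(di, 320)
    return row, col * 256 // 320


# Compare both methods for every possible screen pointer
mismatches = sum(1 for di in range(0x10000) if rrrola(di) != reference(di))
print("mismatches:", mismatches)
print("last pixel:", rrrola(0xFFFF))
```
&lt;br /&gt;
In this model the 8bit results agree with the division for every value of &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt;; the row of the last pixel is 204, as stated above.&lt;br /&gt;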
&lt;br /&gt;
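The aspect ratio constant can be decoded without an online converter. The following sketch is an illustration only, with Python as a calculator; it assumes the first four bytes of the listing are &amp;lt;code&amp;gt;68 BA 9F&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push 0x9FBA&amp;lt;/code&amp;gt;, i.e. 0xa000 - 70) followed by &amp;lt;code&amp;gt;3F&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;), which is what &amp;lt;code&amp;gt;fmul dword [bx+si]&amp;lt;/code&amp;gt; reads at offset 0x100:&lt;br /&gt;
&lt;br /&gt;
```python
import struct

# Bytes at 0x100: push 0x9FBA (68 BA 9F) followed by aas (3F)
code = bytes([0x68, 0xBA, 0x9F, 0x3F])

# x86 stores the dword little-endian, so reverse the bytes and decode
# them as a big-endian IEEE 754 single precision float
factor = struct.unpack("!f", code[::-1])[0]

print(round(factor, 3))        # the aspect ratio factor
print(round(256 * factor, 1))  # effective width after the correction
```
&lt;br /&gt;
With exactly these bytes the factor comes out slightly above the rounded ~1.24 used in the text, so the corrected width lands even closer to 320.&lt;br /&gt;
&lt;br /&gt;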
=== The &amp;quot;Tunnel&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
Now let's add some features:&lt;br /&gt;
* extract angle as opposed to the distance and combine both&lt;br /&gt;
* reverse divide the distance to create the &amp;quot;tunnel&amp;quot; effect&lt;br /&gt;
* animate with smooth steps along the distance&lt;br /&gt;
* improve on the colors with subselecting from the standard palette&lt;br /&gt;
* quit the program on ESC&lt;br /&gt;
&lt;br /&gt;
This results in the following program with a size of 63 bytes :&lt;br /&gt;
&lt;br /&gt;
[[File:Tunnel effect neontube.png|thumb|tunnel effect neontube]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0xa000 - 10 - 3 * 20	; video base - 3.5 lines&lt;br /&gt;
or 		al, 0x13				; mode 13h = 320 x 200 in 256 colors&lt;br /&gt;
pop 	es						; get aligned video memory base&lt;br /&gt;
int 	0x10					; switch videomode&lt;br /&gt;
X: &lt;br /&gt;
sub		dh, [si]				; vertical alignment&lt;br /&gt;
pusha							; push all registers on stack&lt;br /&gt;
fild 	word	[bx-9]			; fpustack :  x&lt;br /&gt;
fild 	word	[bx-8]			; fpustack :  y  x&lt;br /&gt;
fpatan							; fpustack :  arc&lt;br /&gt;
fst 	st1						; fpustack :  arc  arc&lt;br /&gt;
fcos							; fpustack :  cos(arc)  arc&lt;br /&gt;
fimul	dword	[si]			; fpustack :  l*cos(arc)  arc&lt;br /&gt;
fidiv	word	[bx-8]			; fpustack :  l*cos(arc)/x  arc&lt;br /&gt;
fiadd	word	[bp+si]			; fpustack :  l*cos(arc)/x+offset  arc&lt;br /&gt;
fistp	dword	[bx-7]			; fpustack :  arc&lt;br /&gt;
fimul	word	[byte si+val]	; fpustack :  scaled_arc&lt;br /&gt;
fistp	word	[bx-5]			; fpustack :  -&lt;br /&gt;
popa							; pop all registers from stack&lt;br /&gt;
xor 	al, cl					; XOR scaled_arc with distance&lt;br /&gt;
and 	al, 16 + 8 + 2			; sub selecting palette part&lt;br /&gt;
stosb							; writing to screen&lt;br /&gt;
mov 	ax, 0xCCCD				; Performing the famous&lt;br /&gt;
mul 	di						; Rrrola trick&lt;br /&gt;
jo 		X						; next frame check&lt;br /&gt;
add 	word [bp+si], byte 23	; change offset smoothly&lt;br /&gt;
in 		al, 0x60				; check for ...&lt;br /&gt;
dec 	ax						; ...ESC key&lt;br /&gt;
jnz 	X						; otherwise continue&lt;br /&gt;
ret								; quit program&lt;br /&gt;
val:	dw 6519 				; n = 160 * 256 / pi / 2 ; 0x1977&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Many other tiny tunnel effects have been coded, so it is highly recommended to check out the documented source code of [http://web.archive.org/web/20050216133244/http://www.farb-rausch.de/ryg/tunnel.asm &amp;quot;Constant Evolution&amp;quot; by ryg/Farbrausch] and the [http://www.pouet.net/prod.php?which=29412 &amp;quot;Heart shaped tunnel&amp;quot; from Lord Kelvin], both with a size of 64 bytes. While &amp;quot;Constant Evolution&amp;quot; takes a slightly different route than the example here (''classic'' FPU communication, ''classic'' X Y construction, sqrt(x²+y²) instead of using &amp;lt;code&amp;gt;fcos&amp;lt;/code&amp;gt;+&amp;lt;code&amp;gt;fimul&amp;lt;/code&amp;gt;), &amp;quot;Heart shaped tunnel&amp;quot; uses no FPU at all. &lt;br /&gt;
&lt;br /&gt;
The takeaways from this example are:&lt;br /&gt;
&lt;br /&gt;
* Loading a constant from the code with some degrees of freedom&lt;br /&gt;
&amp;lt;code&amp;gt;fimul dword [si]&amp;lt;/code&amp;gt; multiplies by a 32 bit integer that serves as the dividend for the tunnel effect. The highest byte of this constant lies in our code: it is the opcode of &amp;lt;code&amp;gt;or al, 0x13&amp;lt;/code&amp;gt;. This instruction puts &amp;lt;code&amp;gt;0x13&amp;lt;/code&amp;gt; into &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;, and since there are many ways to achieve this, the appearance of the tunnel can be changed directly by replacing this instruction with one of the following: &amp;lt;code&amp;gt;mov al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;add al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;xor al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;adc al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;sbb al,0xED&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sub al,0xED&amp;lt;/code&amp;gt;. In this special case, the instruction can also be swapped with &amp;lt;code&amp;gt;pop es&amp;lt;/code&amp;gt; to gain a further degree of freedom.&lt;br /&gt;
&lt;br /&gt;
* normal loading of a constant which can't be reused as opcode&lt;br /&gt;
Although it's the ultimate goal to not even use a single extra byte for constants, sometimes the required sequence simply does not appear in the code. In this case, a constant is needed to convert the angle from the range [-pi,pi] to the color space in a way that no gaps appear while stepping from 359° to 0°. In the last line at &amp;lt;code&amp;gt;val:&amp;lt;/code&amp;gt; the value &amp;lt;code&amp;gt;160&amp;lt;/code&amp;gt; in the comment is 32 * 5, where 5 is the number of &amp;quot;spiral arms&amp;quot; the tunnel effect has. The 8bit shift (*256) increases precision. It turns out that 16bit precision is enough to get a decent &amp;quot;gap closer&amp;quot; for values obtained by &amp;lt;code&amp;gt;fpatan&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* operating directly on an indirect memory location without offset&lt;br /&gt;
The location &amp;lt;code&amp;gt;[bp+si]&amp;lt;/code&amp;gt; is used as the animation variable, while both participating registers are kept fixed (the address is &amp;lt;code&amp;gt;0x0A??&amp;lt;/code&amp;gt; and therefore way above our code). Since we work with 16bit values and the top 8 bits measure whole pixels, the instruction &amp;lt;code&amp;gt;add word [bp+si], byte 23&amp;lt;/code&amp;gt; allows for sub pixel precision in animation, while occupying 3 bytes of space. Depending on the target hardware, this value 23 can be increased/decreased to achieve faster/smoother animation.&lt;br /&gt;
&lt;br /&gt;
* Optimizing the check for the next frame&lt;br /&gt;
Normally, a check like &amp;lt;code&amp;gt;test di,di&amp;lt;/code&amp;gt; followed directly by a branch is necessary. The approach used here allows branching directly after &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;jo&amp;lt;/code&amp;gt;, since the overflow flag is set for every pixel of a frame except two. This saves two bytes, but also requires adjustment of the animation constant, because the animation constant is now added twice per frame. A further benefit is that in one of these two cases, &amp;lt;code&amp;gt;AX&amp;lt;/code&amp;gt; is zero, which saves a further byte on the following ESC check (&amp;lt;code&amp;gt;dec ax&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;dec al&amp;lt;/code&amp;gt;).&lt;br /&gt;
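&lt;br /&gt;
The &amp;quot;degrees of freedom&amp;quot; from the first takeaway can be made concrete by computing the 32 bit integer that &amp;lt;code&amp;gt;fimul dword [si]&amp;lt;/code&amp;gt; would read for each candidate instruction. This is a sketch under stated assumptions, not part of the original: Python is used as a calculator, &amp;lt;code&amp;gt;dividend&amp;lt;/code&amp;gt; is a made-up helper name, the first three bytes are taken to be &amp;lt;code&amp;gt;68 BA 9F&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push 0x9FBA&amp;lt;/code&amp;gt;), and the opcode bytes are the standard one-byte encodings of the accumulator forms with an 8bit immediate.&lt;br /&gt;
&lt;br /&gt;
```python
# First instruction of the 63 byte listing: push 0x9FBA, encoded as 68 BA 9F
PUSH_BYTES = bytes([0x68, 0xBA, 0x9F])

# Opcode byte of each candidate instruction named in the text
OPCODES = {
    "or  al, imm8": 0x0C,
    "mov al, imm8": 0xB0,
    "add al, imm8": 0x04,
    "xor al, imm8": 0x34,
    "adc al, imm8": 0x14,
    "sbb al, imm8": 0x1C,
    "sub al, imm8": 0x2C,
}


def dividend(opcode):
    """Signed 32-bit integer that `fimul dword [si]` reads at offset 0x100."""
    raw = PUSH_BYTES + bytes([opcode])   # little-endian: opcode is the top byte
    return int.from_bytes(raw, "little", signed=True)


for name, opcode in OPCODES.items():
    print(f"{name}  {dividend(opcode):11d}")
```
&lt;br /&gt;
Each opcode yields a different magnitude. Note that &amp;lt;code&amp;gt;0xB0&amp;lt;/code&amp;gt; has its top bit set, so with &amp;lt;code&amp;gt;mov al,0x13&amp;lt;/code&amp;gt; the dword is read as a negative number.&lt;br /&gt;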
&lt;br /&gt;
=== Size optimizing the &amp;quot;Tunnel&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
Now if we abandon all the comfort, alignment, smoothness and convenience, and optimize this straight for size, we end up with a 52 byte version. This does not include the possible exclusion of color tuning (2 bytes), after all the effect is supposed to look at least somewhat appealing ;)&lt;br /&gt;
&lt;br /&gt;
[[File:Neontube 52b optimized.png|thumb|Neontube 52b optimized]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov		al,0x13				; mode 13h = 320 x 200 in 256 colors&lt;br /&gt;
int		0x10				; switch videomode&lt;br /&gt;
X: &lt;br /&gt;
or		al, [bp+si]			; *illusion* - executed ONCE&lt;br /&gt;
xor		al, 0x68			; *illusion* - executed ONCE&lt;br /&gt;
mov		dx, 0x79F			; *illusion* - executed ONCE&lt;br /&gt;
pusha						; push all registers on stack&lt;br /&gt;
fild 	word	[bx-9]		; x&lt;br /&gt;
fild 	word	[bx-8]		; y x&lt;br /&gt;
fpatan						; arc&lt;br /&gt;
fst 	st1					; arc arc&lt;br /&gt;
fcos						; cos(arc) arc&lt;br /&gt;
fimul	dword	[si]		; l*cos(arc) arc&lt;br /&gt;
fidiv	word	[bx-8]		; l*cos(arc)/x arc&lt;br /&gt;
fistp	dword	[bx-4]		; arc&lt;br /&gt;
fimul	word	[bx]		; scaled_arc&lt;br /&gt;
fistp	word	[bx-5]		; -&lt;br /&gt;
popa						; pop all registers from stack&lt;br /&gt;
sub		ah, [bp+si]			; animation along distance&lt;br /&gt;
xor		al, ah				; XOR scaled_arc with distance&lt;br /&gt;
and		al, 16 + 8 + 4		; sub palette selection&lt;br /&gt;
stosb						; write to screen, advance DI&lt;br /&gt;
mov		ax, 0xCCCD			; the famous&lt;br /&gt;
mul		di					; Rrrola trick&lt;br /&gt;
jmp 	short X-1			; *ODD* jump into &amp;quot;int 0x10&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Since this code contains an &amp;quot;odd jump&amp;quot; into the middle of the instruction &amp;lt;code&amp;gt;int 0x10&amp;lt;/code&amp;gt;, it helps to disassemble the code from address &amp;lt;code&amp;gt;0x103&amp;lt;/code&amp;gt; on, until both code paths realign at &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt;:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;adc 	[bp+si],cl		; decrement framecounter&lt;br /&gt;
add 	dh,[si]			; vertical alignment&lt;br /&gt;
push 	word 0x9FBA		; video base - 3.5 lines&lt;br /&gt;
pop 	es				; get aligned video memory&lt;br /&gt;
						; continues with &amp;quot;pusha&amp;quot;&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Like before, the &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; instruction sets the overflow flag - and the carry flag - in all but two cases per frame. &amp;lt;code&amp;gt;CL&amp;lt;/code&amp;gt; keeps its start value &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt;, so with the carry set &amp;lt;code&amp;gt;adc [bp+si],cl&amp;lt;/code&amp;gt; adds &amp;lt;code&amp;gt;0xFF + 1&amp;lt;/code&amp;gt; and leaves the byte unchanged, while with the carry clear it adds &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt;, i.e. -1. Effectively, the framecounter is decremented twice per frame.&lt;br /&gt;
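&lt;br /&gt;
To see this in isolation, here is a minimal standalone sketch. The label &amp;lt;code&amp;gt;counter&amp;lt;/code&amp;gt;, its start value and the explicit &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt; / &amp;lt;code&amp;gt;clc&amp;lt;/code&amp;gt; are made up for illustration and are not part of the intro, where the carry comes from &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; instead:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov cl,0xFF			; the value CL is assumed to hold in the intro&lt;br /&gt;
stc					; carry set, the common case after MUL DI&lt;br /&gt;
adc [counter],cl	; adds 0xFF + 1 = 0x100, the byte stays 5&lt;br /&gt;
clc					; carry clear, the rare case&lt;br /&gt;
adc [counter],cl	; adds 0xFF = -1, the byte becomes 4&lt;br /&gt;
ret					; quit&lt;br /&gt;
counter:&lt;br /&gt;
db 5				; hypothetical framecounter&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;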
&lt;br /&gt;
The factor for scaling the arc is now taken from the location &amp;lt;code&amp;gt;[bx]&amp;lt;/code&amp;gt;, which contains &amp;lt;code&amp;gt;0x20CD&amp;lt;/code&amp;gt;. Calculating the number of &amp;quot;spirals&amp;quot; backwards from this number (*pi*2/256/32) gives about 6.44, which is close enough to 6.5 that the visual gap at the top is barely noticeable - apart from the miscoloring, which can be fixed by removing the 16 from &amp;lt;code&amp;gt;and al, 16 + 8 + 4&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
It's noteworthy that the location &amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; does not contain a good offset anymore, since &amp;lt;code&amp;gt;push &amp;lt;word&amp;gt;&amp;lt;/code&amp;gt; has moved away from the top. This helps to create *illusion* code that not only realigns quickly in the second code path, but also does not modify registers or memory in a hindering way (only &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DX&amp;lt;/code&amp;gt; are modified).&lt;br /&gt;
&lt;br /&gt;
The takeaway here is to study the modbytes of instructions that are very likely to appear in tiny intros, and to consider using their constants as code. &amp;lt;code&amp;gt;int 0x10&amp;lt;/code&amp;gt; will almost always be in your code, so there is always an &amp;lt;code&amp;gt;adc byte[??], 8bit_reg&amp;lt;/code&amp;gt; hidden in between, too.&lt;br /&gt;
&lt;br /&gt;
Finally, you might have noticed that these tunnels don't &amp;quot;spin&amp;quot;. At the cost of two bytes, this can easily be added: &amp;lt;code&amp;gt;add al, [bp+si]&amp;lt;/code&amp;gt; before applying XOR in the last example offsets the &amp;lt;code&amp;gt;arc&amp;lt;/code&amp;gt; with the framecounter and produces a spinning effect. For a coherent visual experience, the sub palette selection must be changed to &amp;lt;code&amp;gt;and al, 8 + 4&amp;lt;/code&amp;gt;, too.&lt;br /&gt;
&lt;br /&gt;
== FPU Tricks ==&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Output&amp;diff=325</id>
		<title>Output</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Output&amp;diff=325"/>
				<updated>2016-08-21T04:02:39Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Basic animation and user interaction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Outputting to the screen ==&lt;br /&gt;
&lt;br /&gt;
First, be aware of the [http://img.tfd.com/cde/MEMMAP.GIF MSDOS memory layout]&lt;br /&gt;
&lt;br /&gt;
=== Outputting in Textmode (80x25) ===&lt;br /&gt;
&lt;br /&gt;
==== Hello World / High Level function ====&lt;br /&gt;
&lt;br /&gt;
Here's an obligatory &amp;quot;Hello World&amp;quot; program in text mode, using a [http://www.ctyme.com/intr/rb-2562.htm &amp;quot;high level&amp;quot; MS-DOS function]. With a small optimization already included (using &amp;lt;code&amp;gt;XCHG BP,AX&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;MOV AH,09h&amp;lt;/code&amp;gt;), this snippet is 20 bytes in size.&lt;br /&gt;
&lt;br /&gt;
[[File:Hello world.png|thumb|Hello World!]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
org 100h			; we start at CS:100h&lt;br /&gt;
xchg 	bp,ax		; already a trick, puts 09h into AH&lt;br /&gt;
mov		dx,text		; DX expects the address of a $ terminated string&lt;br /&gt;
int 	21h			; call the DOS function (AH = 09h)&lt;br /&gt;
ret					; quit&lt;br /&gt;
text:&lt;br /&gt;
db 'Hello World!$'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Of course, this gets shorter with each byte you remove from the text itself. Now let's look into arbitrary screen access. Right after the start of your program you are in mode 3, that is 80x25 text in 16 colors (see the [http://www.columbia.edu/~em36/wpdos/videomodes.txt Video Modes List]). [[File:Drawchar example.png|thumb|draw char example]] So, to show something on the screen, you need to set a segment register to 0xB800, then write values into this segment. &lt;br /&gt;
&lt;br /&gt;
==== Low level access ====&lt;br /&gt;
&lt;br /&gt;
The following three snippets showcase how to draw a red smiley in three different ways. All example snippets are meant to be standalone programs, starting with the first instruction and nothing before it. The target coordinate (40,12) is about the middle of the screen. We need a multiplier of 2, since one char needs two bytes in memory (char and color are one byte each). The high byte 0x04 means red (4) on black (0), while the 0x01 is the first ASCII char - a smiley.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop ds&lt;br /&gt;
mov bx,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
mov [bx],ax&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop es&lt;br /&gt;
mov di,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
stosw&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
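&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov bx,ss                ; remember the original stack segment&lt;br /&gt;
push 0xb800&lt;br /&gt;
pop ss                   ; SS = text mode video memory&lt;br /&gt;
mov sp,(80*12+40)*2+2    ; PUSH decrements SP by 2 before it writes&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
push ax                  ; stores char and color at SS:SP&lt;br /&gt;
mov ss,bx                ; restore SS, so INT pushes into our own segment&lt;br /&gt;
int 0x20&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;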
&lt;br /&gt;
You might notice that the ''push &amp;lt;word&amp;gt;'' + ''pop seg_reg'' combination is always the same and occupies four bytes altogether. If correct alignment is not important to you and you really just want ''any'' pointer to the screen, there is another way to get a valid one:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
les bx,[si]&lt;br /&gt;
nop&lt;br /&gt;
stosb&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That's also four bytes, but it already has the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; opcode (for putting something onto the screen) integrated and even leaves one slot free for another one-byte instruction. It works because &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; initially points to the start of our code, so &amp;lt;code&amp;gt;les bx,[si]&amp;lt;/code&amp;gt; loads its own four code bytes: the first two go into &amp;lt;code&amp;gt;BX&amp;lt;/code&amp;gt;, the last two (the free slot, here &amp;lt;code&amp;gt;090h&amp;lt;/code&amp;gt;, and the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; opcode &amp;lt;code&amp;gt;0AAh&amp;lt;/code&amp;gt;) go into &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;. After the first instruction, the segment register &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; therefore contains the value &amp;lt;code&amp;gt;0AA90h&amp;lt;/code&amp;gt;. If you repeatedly write to the screen with &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;, you will eventually reach the &amp;lt;code&amp;gt;0B800h&amp;lt;/code&amp;gt; segment and chars will appear on the screen. With a careful selection of the free one-byte opcode you can also reintroduce some alignment. This also works with the &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt; opcode &amp;lt;code&amp;gt;0ABh&amp;lt;/code&amp;gt;.&lt;br /&gt;
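&lt;br /&gt;
To make the address arithmetic visible, the following sketch spends three extra bytes on an explicit &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; (which defeats the size gain, so it is meant for experimenting only). It assumes &amp;lt;code&amp;gt;SI = 0x100&amp;lt;/code&amp;gt; at start and reuses the red smiley from the snippets above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
les bx,[si]		; C4 1C : loads BX = 0x1CC4 and ES = 0xAA90 from our own code&lt;br /&gt;
nop				; 90    : free slot, ends up as the low byte of ES&lt;br /&gt;
stosb			; AA    : ends up as the high byte of ES (and stores AL at ES:DI)&lt;br /&gt;
mov di,(0xB800-0xAA90)*16+(80*12+40)*2	; same linear address as B800:(80*12+40)*2&lt;br /&gt;
mov ax,0x0401	; red smiley&lt;br /&gt;
stosw			; draw it at (40,12)&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The difference between the two segments is 0x0D70 paragraphs, so the visible text page starts at offset 0xD700 relative to &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;.&lt;br /&gt;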
&lt;br /&gt;
==== Alternative high level functions  ====&lt;br /&gt;
&lt;br /&gt;
Besides the direct way of accessing memory, there are also other ways of bringing chars to the screen, for example:&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-4124.htm INT 29h]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2558.htm INT 21h AH=6]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2562.htm INT 21h AH=9]&lt;br /&gt;
&lt;br /&gt;
=== Outputting in mode 13h (320x200) ===&lt;br /&gt;
&lt;br /&gt;
==== Basic pixel output ====&lt;br /&gt;
&lt;br /&gt;
The videomemory for mode 13h is located at segment 0xA000, so you need to assign this value to a segment register. Also, after the start of your program you are normally still in textmode, so you need to switch to the videomode. The following snippet does both:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13 &lt;br /&gt;
int 0x10     ; AH = 0 means : set video mode to AL = 0x13 (320 x 200 pixels in 256 colors)&lt;br /&gt;
push 0xA000  ; put value on the stack&lt;br /&gt;
pop es       ; pop the top stack value into segment register ES&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You're free to use any of the following segment register / opcode combinations to write to the screen:&lt;br /&gt;
* &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
Let's add some code that actually draws something on the screen. The following program occupies 23 bytes and draws a fullscreen XOR texture:&lt;br /&gt;
[[File:Mode13h-example-xor.png|left|bottom|thumb|mode13h-example-xor]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
push 0xa000&lt;br /&gt;
pop es&lt;br /&gt;
X: cwd			; &amp;quot;clear&amp;quot; DX (if AH &amp;lt; 0x80)&lt;br /&gt;
mov ax,di		; get screen position into AX&lt;br /&gt;
mov bx,320		; get screen width into BX&lt;br /&gt;
div bx			; divide, to get row and column&lt;br /&gt;
xor ax,dx		; the famous XOR pattern&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
stosb			; finally, draw to the screen&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note that there is a different way of preparing the segment register, instead of :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
you can also do :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov ah,0xA0&lt;br /&gt;
mov es,ax&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
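Both variations occupy 4 bytes, but the latter also runs on processors where ''push &amp;lt;word&amp;gt;'' is not available (8086/8088). Note that it only yields exactly 0xA000 if &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt; is zero at that point; otherwise the segment is 0xA0?? and the screen content is offset accordingly.&lt;br /&gt;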
&lt;br /&gt;
==== Alternative way of pixel plotting and optimization ====&lt;br /&gt;
&lt;br /&gt;
Now let's optimize the snippet. First, we can adapt the &amp;quot;LES&amp;quot; trick from the textmode section. We just exchange &lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
with:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;les bx,[bx]&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
to save two bytes. This works because BX is 0x0000 at start and thus accesses the region ''before'' our code, which is called the [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix]. The two bytes that are put into the segment register ES are bytes 2 and 3 = ''&amp;quot;Segment of the first byte beyond the memory allocated to the program&amp;quot;'', which is usually 0x9FFF. That is just one off from our desired 0xA000. Unfortunately, that means a 16 pixel offset, so if screen alignment matters to you, you can't use this optimization. Also, said two bytes are not always 0x9FFF; for example, if resident programs are located above the ''&amp;quot;memory allocated to the program&amp;quot;'' (FreeDOS), their content is overwritten if we take their base as our video memory base.&lt;br /&gt;
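&lt;br /&gt;
As a small sketch of that 16 pixel offset: if ES really ends up as 0x9FFF (an assumption, see the caveat above), offset 16 in that segment is the first pixel of the screen. The color value 15 and the key wait via &amp;lt;code&amp;gt;int 0x16&amp;lt;/code&amp;gt; are additions for illustration only:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov al,0x13&lt;br /&gt;
int 0x10		; switch to mode 13h (AH is assumed to be 0 at start)&lt;br /&gt;
les bx,[bx]		; BX = 0x20CD, ES = word at PSP offset 2 (assumed 0x9FFF)&lt;br /&gt;
mov di,16		; 9FFF:0010 is the same linear address as A000:0000&lt;br /&gt;
mov al,15		; white in the default palette&lt;br /&gt;
stosb			; plot the top left pixel&lt;br /&gt;
xor ax,ax&lt;br /&gt;
int 0x16		; wait for a key press&lt;br /&gt;
ret				; quit (the video mode stays at 13h)&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;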
&lt;br /&gt;
Second, we can use an alternative way of putting pixels to the screen, subfunction AH = 0x0C of int 0x10. Also, instead of constructing row and column from the screen pointer, we can use some interesting properties of the screenwidth regarding logical operations. This results in the following 16 byte program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             ; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov al,0x13&lt;br /&gt;
X: int 0x10		; set video mode AND draw pixel&lt;br /&gt;
inc cx			; increment column&lt;br /&gt;
mov ax,cx		; get column in AH&lt;br /&gt;
xor al,ah		; the famous XOR pattern&lt;br /&gt;
mov ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first optimization is the double usage of the same &amp;quot;int 0x10&amp;quot; for setting the videomode and drawing the pixel. The subfunction AH = 0x0C expects row and column in DX and CX. Since the screenwidth is 320, which is 5 * 64, we can ignore the row and just work with the column, as long as we use logical operations and only bits 0-6 of the result. The subfunction AH = 0x0C allows for unbounded column values in CX (up to 65535) and correctly &amp;quot;wraps&amp;quot; them internally without an error.&lt;br /&gt;
&lt;br /&gt;
The major drawback of the &amp;quot;subfunction AH = 0x0C&amp;quot; approach is the performance loss. While DosBox and many emulators perform just fine, real hardware will draw much slower, depending on the Video BIOS.&lt;br /&gt;
&lt;br /&gt;
==== Basic animation and user interaction ====&lt;br /&gt;
&lt;br /&gt;
Now let's add the convenient check for the ESC key and also add a simple animation. The &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; register is used as frame counter and incremented after the pixel counter &amp;lt;code&amp;gt;CX&amp;lt;/code&amp;gt; ran through all 65536 values via &amp;lt;code&amp;gt;LOOP&amp;lt;/code&amp;gt;. This frame counter is then added to the column. The resulting program is now 25 bytes in size :&lt;br /&gt;
&lt;br /&gt;
[[File:Xor anim example.gif|thumb]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
( ↑ This example is the blueprint in the [[Floating-point_Opcodes#FPU_Basics| FPU Basics Section]].)&lt;br /&gt;
&lt;br /&gt;
== Producing sound ==&lt;br /&gt;
&lt;br /&gt;
=== MIDI notes ===&lt;br /&gt;
&lt;br /&gt;
Creating sounds with MIDI requires a bit more preparation, but once you're familiar with it, it's even simpler than PC Speaker sound, because you basically don't have to ''create'' the sound, you just have to ''trigger'' it. To start, you need to know that there are a lot of [https://en.wikipedia.org/wiki/General_MIDI different instruments] and a [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined way of communication]. Think of the MIDI interface as a keyboard: you tell it which button/key you want to press, which knob to twist, and sometimes how hard. By default, the active instrument is the [https://en.wikipedia.org/wiki/Piano Acoustic Grand Piano].&lt;br /&gt;
&lt;br /&gt;
==== Single piano note ====&lt;br /&gt;
&lt;br /&gt;
Let's start off with a simple example, playing a single note on the piano:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh		;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h	;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 		;	send !&lt;br /&gt;
dec 	dx			;	MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,	90h		;	send note on channel ZERO - command&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
mov 	al,	56h		;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al		; 	send !&lt;br /&gt;
mov 	al,	67h		;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
ret					; 	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In short: you turn your keyboard on (switching to UART mode), then press a KEY with a certain VOLUME on channel ZERO, then exit. Besides switching to UART mode, all this communication uses the port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt;. This example will work on DosBox but not on Windows XP NTVDM: for [http://www.pouet.net/topic.php?which=10720&amp;amp;page=1 still unclear reasons], the NTVDM emulation delays the note until it receives a second one. The simplest way of at least hearing something is to repeatedly play notes, like in the following example :&lt;br /&gt;
&lt;br /&gt;
==== Repeated piano notes ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh			;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h		;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 			;	send !&lt;br /&gt;
dec 	dx				;	MIDI Data Port ( = 330h )&lt;br /&gt;
main:&lt;br /&gt;
mov 	al,	90h			;	send note on channel 0 - command&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
mov 	al,	56h			;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al			; 	send !&lt;br /&gt;
mov 	al,	67h			;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,	[fs:0x46c]	;	read timer&lt;br /&gt;
test 	al,	3			;	skip 3 values&lt;br /&gt;
jnz 	_wait			;&lt;br /&gt;
inc 	byte [fs:0x46c]	; 	inc manually to prevent retrigger&lt;br /&gt;
in 		al,	0x60		;	check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			;	no? repeat&lt;br /&gt;
ret						;	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
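↑ This is the previous example, enriched with synchronization against the timer and a check for the ESC key. It relies on &amp;lt;code&amp;gt;FS&amp;lt;/code&amp;gt; being zero, so that &amp;lt;code&amp;gt;[fs:0x46c]&amp;lt;/code&amp;gt; addresses the BIOS timer tick counter at 0040:006C. It works on both DosBox and Windows XP NTVDM and plays a note on the piano repeatedly.&lt;br /&gt;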
&lt;br /&gt;
==== Repeated notes of other instruments ====&lt;br /&gt;
&lt;br /&gt;
While hitting one key repeatedly is not really interesting in general, it can produce decent results when doing it with the right instrument activated, like it was done with the &amp;quot;French Horn&amp;quot; in [https://www.youtube.com/watch?v=tsbxdjP9l50 Timelord (by Baudsurfer)]. Apart from just changing the instrument, let's also optimize a little bit on the size: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
start:	&lt;br /&gt;
mov 	si,data			; init pointer for outsb&lt;br /&gt;
mov 	dx,330h			; change to data port&lt;br /&gt;
mov 	cl,5			; play our music data&lt;br /&gt;
rep 	outsb			; (see below at &amp;quot;data&amp;quot; label)&lt;br /&gt;
inc 	dx				; switch to control port&lt;br /&gt;
outsb					; change to mode &amp;quot;UART&amp;quot;&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,[fs:0x46c]	; read timer value&lt;br /&gt;
cmp	 	al,bl			; wait until...&lt;br /&gt;
jz 		_wait			; ...timer value changed&lt;br /&gt;
xchg 	bx,ax			; save old timer value&lt;br /&gt;
in 		al,0x60			; check for ...&lt;br /&gt;
dec 	al				; ... ESC key&lt;br /&gt;
jnz 	start			; otherwise : repeat &lt;br /&gt;
dec 	dx				; switch to data port again&lt;br /&gt;
outsb					; stop all ... &lt;br /&gt;
outsb					; ... notes played ...&lt;br /&gt;
outsb					; ... on channel 3&lt;br /&gt;
data: 	&lt;br /&gt;
db		0c3h			; change instrument on channel 3&lt;br /&gt;
						; (is also &amp;quot;RET&amp;quot; for program quit)&lt;br /&gt;
db		60				; to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h				; play note on channel 3&lt;br /&gt;
db		35				; deep &amp;quot;b&amp;quot; = note number 35&lt;br /&gt;
db		127				; play with volume = 127&lt;br /&gt;
db		3fh				; change mode to &amp;quot;UART&amp;quot;&lt;br /&gt;
db		0b3h			; control change on channel 3&lt;br /&gt;
db		123				; Channel Mode Message &amp;quot;All Notes Off&amp;quot;&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example with a changed instrument, the MIDI data structured into a data section, the output optimized by using &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;out dx,al&amp;lt;/code&amp;gt;, and a special command at the end to turn ''All Notes Off''. This is necessary for all instruments which don't stop by themselves. In all the previous examples, we sent the &amp;quot;NOTE ON&amp;quot; command (&amp;lt;code&amp;gt;9Xh&amp;lt;/code&amp;gt;), but not the corresponding &amp;quot;NOTE OFF&amp;quot; command (&amp;lt;code&amp;gt;8Xh&amp;lt;/code&amp;gt;). Also, the note is now played on channel &amp;lt;code&amp;gt;03h&amp;lt;/code&amp;gt;, since the command byte for changing an instrument on channel 3 is &amp;lt;code&amp;gt;0C3h&amp;lt;/code&amp;gt;, which is also &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; and can be reused. If this looks complicated at first, always remember: it's just sending [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined commands] to a single port.&lt;br /&gt;
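&lt;br /&gt;
For comparison, an explicit &amp;quot;NOTE OFF&amp;quot; for the very first piano example could look like the following sketch. It is not size optimized; the key wait via &amp;lt;code&amp;gt;int 16h&amp;lt;/code&amp;gt; and the release velocity of zero are choices made for this illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	dx,331h		; MIDI Control Port&lt;br /&gt;
mov 	al,3Fh		; set UART mode - command&lt;br /&gt;
out 	dx,al&lt;br /&gt;
dec 	dx			; MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,90h		; NOTE ON, channel 0&lt;br /&gt;
out 	dx,al&lt;br /&gt;
mov 	al,56h		; data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,al&lt;br /&gt;
mov 	al,67h		; data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,al&lt;br /&gt;
xor 	ax,ax&lt;br /&gt;
int 	16h			; wait for a key press (DX is preserved)&lt;br /&gt;
mov 	al,80h		; NOTE OFF, channel 0&lt;br /&gt;
out 	dx,al&lt;br /&gt;
mov 	al,56h		; data byte 1 : the same KEY&lt;br /&gt;
out 	dx,al&lt;br /&gt;
mov 	al,0		; data byte 2 : release velocity&lt;br /&gt;
out 	dx,al&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;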
&lt;br /&gt;
==== The drum channel ====&lt;br /&gt;
&lt;br /&gt;
Now that you're aware that there are different channels (overall: 16) to play notes on, how would you like a channel &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt; specifically for 'Drums'? [http://www.voidaudio.net/percussion.html Ten different drumsets] with dozens of samples are available out of the box. By default, the &amp;quot;Standard Kit&amp;quot; is active. The following example plays a track of drum notes repeatedly, while further optimizing for size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
aas						; 3fh = &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
cwd						; 99h = &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
db 		42,38,42,35		; the drum notes (hihat, snare, hihat, kick)&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
outsb					; send &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
mov		al,[fs:0x46c]	; read timer&lt;br /&gt;
test	al,3&lt;br /&gt;
jnz 	main			; skip 3 values&lt;br /&gt;
inc 	byte [fs:0x46c] ; inc manually to prevent retrigger&lt;br /&gt;
&lt;br /&gt;
inc 	bx				; increment note counter&lt;br /&gt;
and 	bl,3			; truncate to 4 notes&lt;br /&gt;
mov 	al,[bx+si]		; read the drumnote (see above)&lt;br /&gt;
&lt;br /&gt;
out 	dx,al			; send the drum&lt;br /&gt;
mov 	al,127			; set volume to maximum&lt;br /&gt;
out 	dx,al			; send volume&lt;br /&gt;
&lt;br /&gt;
in 		al,0x60			; check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			; no? repeat&lt;br /&gt;
ret						; otherwise quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In contrast to the previous example, the data section is now at the start. That means it's executed as code! This is dangerous of course, but it also saves the bytes for assigning the &amp;lt;code&amp;gt;DATA&amp;lt;/code&amp;gt; offset to &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;. Once the two initial &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instructions have incremented &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;, it stays fixed, and further reading from the drum data is done with &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;. Unless you know exactly what you are doing, don't use that kind of &amp;quot;executing data&amp;quot; optimization! In this special case, &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;CWD&amp;lt;/code&amp;gt; do no harm, and the drum notes &amp;lt;code&amp;gt;42,38,42,35&amp;lt;/code&amp;gt; are carefully crafted and arranged to resemble the instruction &amp;lt;code&amp;gt;SUB AH,[232Ah]&amp;lt;/code&amp;gt;, which does no harm either.&lt;br /&gt;
&lt;br /&gt;
==== Creating basic sound effects in 16 bytes ====&lt;br /&gt;
&lt;br /&gt;
In the [https://en.wikipedia.org/wiki/General_MIDI#Sound_Effects MIDI repertoire], there are already some sound effects available. With the &amp;quot;data execution&amp;quot; optimization above, let's fire a gunshot in 16 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas&lt;br /&gt;
les 	di,[bx-0x6C]&lt;br /&gt;
xor 	al,127&lt;br /&gt;
mov 	dx,0x331&lt;br /&gt;
outsb&lt;br /&gt;
dec 	dx&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first three instructions don't do anything (they do, but we don't care), it's just MIDI data.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
the command for switching to &amp;quot;UART&amp;quot; mode, for sending to port &amp;lt;code&amp;gt;0x331&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;les di,[bx-0x6C]&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0xc4&amp;lt;/code&amp;gt; (change instrument on channel 4), &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; (change it to &amp;quot;Gunshot&amp;quot;), &amp;lt;code&amp;gt;0x94&amp;lt;/code&amp;gt; (play note on channel 4)&lt;br /&gt;
* &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0x34&amp;lt;/code&amp;gt; (play THIS note), &amp;lt;code&amp;gt;0x7f&amp;lt;/code&amp;gt; (play it THAT loud, 127 is also the allowed maximum)&lt;br /&gt;
&lt;br /&gt;
The rest of the code basically just sends the MIDI data to the interface and exits. You can change the kind of sound effect by modifying the modbyte of the second instruction (change BX to BP or SI etc.). Changing the volume is simpler: change the byte value of &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; to any value between 0 and 127.&lt;br /&gt;
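&lt;br /&gt;
As a sketch of such a modification: replacing BX with SI in the second instruction turns its modbyte from &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; into &amp;lt;code&amp;gt;0x7C&amp;lt;/code&amp;gt;, which, going by the [https://en.wikipedia.org/wiki/General_MIDI#Sound_Effects sound effects list], should select &amp;quot;Telephone Ring&amp;quot; instead of &amp;quot;Gunshot&amp;quot;. Like the original, this variation assumes that &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; points to the start of the code and that &amp;lt;code&amp;gt;CH&amp;lt;/code&amp;gt; is zero:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas						; 3Fh : set UART mode&lt;br /&gt;
les 	di,[si-0x6C]	; C4h 7Ch 94h : change instrument on channel 4 to 7Ch, play note on channel 4&lt;br /&gt;
xor 	al,127			; 34h 7Fh : note number and volume&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send 3Fh&lt;br /&gt;
dec 	dx				; MIDI Data Port&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb			; send the remaining five bytes&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;les&amp;lt;/code&amp;gt; only changes &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt;, so the pointer used by &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; stays intact.&lt;br /&gt;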
&lt;br /&gt;
==== Procedural MIDI music generation in 64 bytes ====&lt;br /&gt;
&lt;br /&gt;
With all the above you should now be able to follow the next snippet [http://www.pouet.net/prod.php?which=66313 Descent OST], a small framework for procedural MIDI sound generation in 64 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; &amp;quot;Descent OST&amp;quot;, a 62 byte MIDI music player for MSDOS&lt;br /&gt;
; created by HellMood/DESiRE (C)2015&lt;br /&gt;
; this is the extracted music routine used in &amp;quot;Descent&amp;quot; &lt;br /&gt;
; it is a procedural MIDI algorithm which sticks a&lt;br /&gt;
; subroutine to the DOS timer (interrupt 0x1C)&lt;br /&gt;
; the registered routine is called ~18.2 times per second&lt;br /&gt;
&lt;br /&gt;
; developed for use with &amp;quot;NASM&amp;quot;, &lt;br /&gt;
; see http://sourceforge.net/projects/nasm/files/&lt;br /&gt;
&lt;br /&gt;
%define		 	rhythmPattern			0b11&lt;br /&gt;
; with &amp;quot;rhythmPattern&amp;quot;, you define how often a note is played&lt;br /&gt;
; generally, higher values and values containing many &amp;quot;ones&amp;quot;&lt;br /&gt;
; in binary representation, will result in faster play&lt;br /&gt;
; for example &amp;quot;0b11&amp;quot; will play every 4th note&lt;br /&gt;
%define 		baseInstrument			9&lt;br /&gt;
; defines the number of the first instrument used.&lt;br /&gt;
; see http://www.midi.org/techspecs/gm1sound.php for a full list&lt;br /&gt;
; keep in mind, that there are only a few instrument blocks&lt;br /&gt;
; whose sounds stop after a while. You won't get good results&lt;br /&gt;
; from strings etc. just a mess of overlayed sounds&lt;br /&gt;
%define 		numInstruments			7&lt;br /&gt;
; defines how many instruments are used. keep in mind, that&lt;br /&gt;
; &amp;quot;rhythmPattern&amp;quot; has influence on the picked instrument. the instruments&lt;br /&gt;
; from 9 to 9+7 are called &amp;quot;chromatic percussion&amp;quot;&lt;br /&gt;
%define 		noteStep 				5&lt;br /&gt;
; defines the basic difference from one note to the next. recommended&lt;br /&gt;
; values here are (mainly) 3,4 and 5 for music theoretic reasons&lt;br /&gt;
; but feel free to play around =)&lt;br /&gt;
%define 		noteRange				12&lt;br /&gt;
; after adding the noteStep, the note value is &amp;quot;mod&amp;quot;ded with&lt;br /&gt;
; the &amp;quot;noteRange&amp;quot;. 12 means octave, which results in very harmonic&lt;br /&gt;
; scales&lt;br /&gt;
%define 		noteSpread				3&lt;br /&gt;
; the third step spreads the notes over the tonal spectrum, you may&lt;br /&gt;
; want to keep &amp;quot;noteSpread&amp;quot; * &amp;quot;noteRange&amp;quot; round about 30-60.&lt;br /&gt;
%define 		baseNote				40&lt;br /&gt;
; the general tone height of everything. some instruments don't play&lt;br /&gt;
; arbitrary deep notes correctly, and too high notes cause ear bleeding&lt;br /&gt;
; adjust with care ;)&lt;br /&gt;
&lt;br /&gt;
; WARNING : after exiting the program, the timer interrupt is still active&lt;br /&gt;
; i strongly recommend to reboot or restart DOSBOX!&lt;br /&gt;
&lt;br /&gt;
; ADVICE : Yes, there are music- and math-related things going on here&lt;br /&gt;
; if you're not into music theory, circle of fifths, and the like, it may be&lt;br /&gt;
; better to just play around with the parameters, rather than understanding them&lt;br /&gt;
; just change stuff slowly, and eventually you will get &amp;quot;there&amp;quot;&lt;br /&gt;
; wherever that is ;)&lt;br /&gt;
&lt;br /&gt;
org 0x100&lt;br /&gt;
xchg cx,ax					; set our second counter to zero&lt;br /&gt;
mov dx,music			&lt;br /&gt;
mov ax,0x251C				; mode &amp;quot;0x25&amp;quot; , &amp;quot;0x1C&amp;quot; = change address of timer interrupt&lt;br /&gt;
int 0x21					; see http://mprolab.teipir.gr/vivlio80X86/dosints.pdf&lt;br /&gt;
S:&lt;br /&gt;
in ax,0x60					; wait for &amp;quot;ESC&amp;quot; press, then exit&lt;br /&gt;
dec al						; music plays on anyway, this is just for&lt;br /&gt;
jnz S						; keeping the music exactly as in &amp;quot;Descent&amp;quot;&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
music:&lt;br /&gt;
inc bx						; increment our first counter (starts at zero)&lt;br /&gt;
test bl,byte rhythmPattern	; play a note every 4th time tick&lt;br /&gt;
jnz nomusic					; otherwise do nothing&lt;br /&gt;
mov dx,0x331&lt;br /&gt;
mov al,0x3F&lt;br /&gt;
out dx,al&lt;br /&gt;
dec dx&lt;br /&gt;
mov al,0xC0					; change instrument on channel 0...&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx					&lt;br /&gt;
aam byte numInstruments&lt;br /&gt;
add al,byte baseInstrument	; ...to this instrument&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,0x90					; play note on channel 0 ...&lt;br /&gt;
out dx,al&lt;br /&gt;
add cl,byte noteStep			 &lt;br /&gt;
mov al,cl&lt;br /&gt;
aam byte noteRange&lt;br /&gt;
imul ax,noteSpread&lt;br /&gt;
add al,baseNote				; ... play THIS note&lt;br /&gt;
out dx,al&lt;br /&gt;
neg al						; (play deeper notes louder = add bass)&lt;br /&gt;
add al,127+39				; ... play it THAT loud&lt;br /&gt;
out dx,al&lt;br /&gt;
nomusic:&lt;br /&gt;
iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
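&lt;br /&gt;
The warning in the listing above can be avoided at the cost of some bytes. The following is a minimal sketch, not part of the original intro: it assumes the standard DOS services &amp;lt;code&amp;gt;int 21h&amp;lt;/code&amp;gt; function 35h (get interrupt vector, returned in ES:BX) and function 25h (set interrupt vector from DS:DX), and the labels &amp;lt;code&amp;gt;waitkey&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;handler&amp;lt;/code&amp;gt; are made up for illustration. It saves the old timer vector, installs a handler, waits for ESC, and puts the old vector back before exiting:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 0x100&lt;br /&gt;
mov ax,0x351C				; DOS: get current int 0x1C vector, returned in ES:BX&lt;br /&gt;
int 0x21&lt;br /&gt;
push es						; remember the old segment ...&lt;br /&gt;
push bx						; ... and the old offset on the stack&lt;br /&gt;
mov dx,handler				; DS:DX = our handler (DS = CS in a .COM file)&lt;br /&gt;
mov ax,0x251C				; DOS: set int 0x1C vector&lt;br /&gt;
int 0x21&lt;br /&gt;
waitkey:&lt;br /&gt;
in al,0x60					; read keyboard scancode&lt;br /&gt;
dec al						; ESC has scancode 1&lt;br /&gt;
jnz waitkey&lt;br /&gt;
pop dx						; old offset&lt;br /&gt;
pop ds						; old segment&lt;br /&gt;
mov ax,0x251C				; DOS: set int 0x1C vector back to DS:DX&lt;br /&gt;
int 0x21&lt;br /&gt;
ret							; return to prompt, timer hook removed&lt;br /&gt;
handler:&lt;br /&gt;
iret						; placeholder: the music routine would go here&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that function 35h overwrites ES and BX, so a real intro would have to account for that before using either register.&lt;br /&gt;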
&lt;br /&gt;
=== PC Speaker ===&lt;br /&gt;
&lt;br /&gt;
Producing sound with the PC speaker is easy: you program a system timer (PIT channel 2) to the desired frequency, then connect that timer to the speaker. [http://wiki.osdev.org/PC_Speaker The PC Speaker Article] from the OSDEV Wiki has the details. An example of a tiny intro that uses PC speaker music is [http://www.pouet.net/prod.php?which=67833 SpeaCore].&lt;br /&gt;
&lt;br /&gt;
==== Basic example with melody pattern ====&lt;br /&gt;
&lt;br /&gt;
A heavily optimized and dirty way of producing sound with the speaker is this 12-byte snippet (the sound routine from [http://www.pouet.net/prod.php?which=67829 the tiny intro &amp;quot;darkweb&amp;quot;]):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;hlt				; sync to timer1&lt;br /&gt;
inc bx			; increment our counter&lt;br /&gt;
mov ax,bx		; work with a copy&lt;br /&gt;
or al,0x4B      ; melody pattern + 2 LSB for speaker link&lt;br /&gt;
out 0x42,al		; set new countdown for timer2 (two passes)&lt;br /&gt;
out 0x61,al		; link timer2 to PC speaker (2 LSBs are 1)&lt;br /&gt;
jmp si			; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Instead of sending the low and high byte of the divisor in direct succession, we do it the &amp;quot;two pass&amp;quot; way, writing one byte per loop iteration. That reduces the number of possible frequencies to 255, which is still good enough for some rough sounds. Linking the timer to the PC speaker might not be obvious: normally you would read the value of port 0x61, set the two least significant bits, and write the value back. You can skip all of this if you simply send the same &amp;quot;two pass&amp;quot; value you just used for the timer, provided that value already has its two least significant bits set (''or al,0x4B'' ensures this). Be aware that port 0x61 does many things apart from connecting the timer to the speaker. A useful resource for ports in general is the [http://bochs.sourceforge.net/techspec/PORTS.LST Bochs Ports List]; for port 0x61 it displays:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
''0061	w	KB controller port B (ISA, EISA)   (PS/2 port A is at 0092)&lt;br /&gt;
&lt;br /&gt;
		system control port for compatibility with 8255&lt;br /&gt;
&lt;br /&gt;
		bit 7	(1= IRQ 0 reset )&lt;br /&gt;
&lt;br /&gt;
		bit 6-4    reserved&lt;br /&gt;
&lt;br /&gt;
		bit 3 = 1  channel check enable&lt;br /&gt;
&lt;br /&gt;
		bit 2 = 1  parity check enable&lt;br /&gt;
&lt;br /&gt;
		'''bit 1 = 1  speaker data enable'''&lt;br /&gt;
&lt;br /&gt;
		'''bit 0 = 1  timer 2 gate to speaker enable''' ''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So if you experience strange things with highly optimized PC speaker output, revert to the safe way. The described way works on real hardware and in DOSBox. Unfortunately, both Oracle VirtualBox with MS-DOS 6.22 and Windows XP NTVDM seem not to emulate the PC speaker properly (investigation and citation needed here!)&lt;br /&gt;
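&lt;br /&gt;
For reference, the &amp;quot;safe way&amp;quot; could look like the following sketch. It assumes the usual PIT programming sequence (control word 0xB6 written to port 0x43 selects channel 2, low byte then high byte, square wave mode) and an input clock of about 1193182 Hz; the divisor 1193 is an arbitrary choice that gives a tone of roughly 1 kHz:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0xB6		; control word: channel 2, lobyte/hibyte access, mode 3 (square wave)&lt;br /&gt;
out 0x43,al		; send it to the PIT command port&lt;br /&gt;
mov ax,1193		; divisor: 1193182 Hz / 1193 = about 1000 Hz&lt;br /&gt;
out 0x42,al		; low byte of divisor&lt;br /&gt;
mov al,ah&lt;br /&gt;
out 0x42,al		; high byte of divisor&lt;br /&gt;
in al,0x61		; read the current state of port 0x61&lt;br /&gt;
or al,3			; set only the two speaker bits&lt;br /&gt;
out 0x61,al		; speaker on, all other bits untouched&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To silence the speaker again, read port 0x61, clear the two lowest bits (&amp;lt;code&amp;gt;and al,0xFC&amp;lt;/code&amp;gt;) and write the value back. This is considerably larger than the optimized variant, which is the price of leaving unrelated bits alone.&lt;br /&gt;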
&lt;br /&gt;
==== Simple deep sound in 8 bytes ====&lt;br /&gt;
&lt;br /&gt;
One of the smallest possible PC speaker sound generators might be this 8-byte snippet:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h -&amp;gt; AL = 0xFF&lt;br /&gt;
out 42h,al		; change divisor of timer2 to 0xFFFF&lt;br /&gt;
out 42h,al		; resulting in a very low frequency&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Note: This may fail on actual hardware, as there might not be time for the bus to settle between the consecutive &amp;lt;code&amp;gt;out 42h,al&amp;lt;/code&amp;gt; statements.)&lt;br /&gt;
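&lt;br /&gt;
If the back-to-back port writes turn out to be a problem, a common workaround on old hardware is a short jump between them, which acts as a small I/O delay. This is only a sketch of that idea and grows the snippet to 10 bytes; whether the delay is needed, or sufficient, depends on the machine:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h, so AL becomes 0xFF&lt;br /&gt;
out 42h,al		; low byte of divisor&lt;br /&gt;
jmp short $+2	; jump to the next instruction: a brief I/O delay&lt;br /&gt;
out 42h,al		; high byte of divisor&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;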
&lt;br /&gt;
=== COVOX output (aka LPT DAC) ===&lt;br /&gt;
&lt;br /&gt;
It is possible to output to an LPT-connected DAC (&amp;quot;[https://en.wikipedia.org/wiki/Covox_Speech_Thing COVOX]&amp;quot;) in a tinyprog.  A proof-of-concept example is [http://www.pouet.net/prod.php?which=57991 Express Train 125] which uses COVOX for sound generation.  &lt;br /&gt;
&lt;br /&gt;
This method follows the [http://countercomplex.blogspot.com/2011/10/algorithmic-symphonies-from-one-line-of.html &amp;quot;audio from one line of C code&amp;quot;] style of sound generation.  [http://www.pouet.net/topic.php?which=8357&amp;amp;page=1 A pouet discussion] exists for more background information.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=316</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=316"/>
				<updated>2016-08-20T17:55:15Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Boot sector defaults */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose framebuffer starts at A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is confined to a single segment, you can be sloppy: an offset that overflows or underflows stays within the segment and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
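&lt;br /&gt;
As a small illustration of the linear layout, the pixel at (x, y) lives at offset y*320+x.  The sketch below is a complete .COM program that plots one white pixel in the middle of the screen; the coordinates, the colour 15, and the register choices are arbitrary, and it sets AX explicitly rather than relying on start-up defaults:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ax,13h          ;AH=00 set video mode, AL=13h&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;requires 80186 or higher&lt;br /&gt;
        pop     es              ;ES = mode 13h screen segment&lt;br /&gt;
        mov     bx,100          ;y = 100&lt;br /&gt;
        imul    di,bx,320       ;DI = y*320 (requires 80186 or higher)&lt;br /&gt;
        add     di,160          ;DI = y*320 + x, with x = 160&lt;br /&gt;
        mov     al,15           ;colour 15 is white in the default palette&lt;br /&gt;
        stosb                   ;write the pixel to ES:[DI]&lt;br /&gt;
        xor     ax,ax           ;AH=00: BIOS wait for keypress&lt;br /&gt;
        int     16h&lt;br /&gt;
        mov     ax,3            ;back to 80x25 text mode&lt;br /&gt;
        int     10h&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;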
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the value &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
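&lt;br /&gt;
A minimal sketch of this exit trick: the program below prints a single character through DOS function 02h (&amp;lt;code&amp;gt;int 21h&amp;lt;/code&amp;gt;, character in DL) and then leaves with a plain &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;.  The character is an arbitrary choice:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ah,02h          ;DOS function 02h: write character in DL&lt;br /&gt;
        mov     dl,'A'          ;character to print&lt;br /&gt;
        int     21h&lt;br /&gt;
        ret                     ;pops 0000h, jumps to CS:0000h, which holds int 20h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This only works while the stack is balanced; if the program has pushed values it did not pop, &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; jumps somewhere else.&lt;br /&gt;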
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the end of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file -- it has to spend bytes setting things up like a .COM file for the rest of the code to work.  &lt;br /&gt;
&lt;br /&gt;
If you still want to target boot sectors, your reward is 510 bytes available for program code (the last two bytes must be 0x55, 0xAA to be [http://thestarman.pcministry.com/asm/mbr/PartTables.htm recognized as a boot sector]).&lt;br /&gt;
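&lt;br /&gt;
A boot sector skeleton in NASM might look like the following sketch.  The label name and the endless loop are placeholders; the padding idiom fills the file up to offset 510 so that the signature lands in the last two bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 7C00h                       ;the BIOS loads the boot sector at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        ;your code goes here; DL holds the boot drive number&lt;br /&gt;
        jmp     $               ;placeholder: loop forever&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad with zeros up to offset 510&lt;br /&gt;
        db      55h,0AAh        ;boot sector signature&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Assembled with &amp;lt;code&amp;gt;-fbin&amp;lt;/code&amp;gt; as in the workflow above, the output file should be exactly 512 bytes.&lt;br /&gt;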
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, which is a design philosophy that intentionally attempts to create many instructions that perform multiple steps.  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 2          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
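Additionally, many register forms of common opcodes are one byte: &amp;lt;code&amp;gt;INC reg16&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;DEC reg16&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;PUSH reg16&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;POP reg16&amp;lt;/code&amp;gt; work with any 16-bit general register, and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt; swaps any 16-bit register with AX.&lt;br /&gt;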
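&lt;br /&gt;
To make the savings concrete, here are a few typical swaps, with sizes for 16-bit code.  This is an illustrative listing rather than a program to run; note that each short form has a side effect or precondition (given in the comments) that the longer form does not, so it is only a drop-in replacement when that is harmless:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        mov     ax,bx           ;2 bytes&lt;br /&gt;
        xchg    ax,bx           ;1 byte, but BX receives the old AX&lt;br /&gt;
&lt;br /&gt;
        xor     dx,dx           ;2 bytes&lt;br /&gt;
        cwd                     ;1 byte, DX=0000 only if AX is below 8000h&lt;br /&gt;
&lt;br /&gt;
        mov     al,[si]         ;2 bytes&lt;br /&gt;
        inc     si              ;1 byte&lt;br /&gt;
        lodsb                   ;1 byte for both, assumes the direction flag is clear&lt;br /&gt;
&lt;br /&gt;
        add     cx,1            ;3 bytes&lt;br /&gt;
        inc     cx              ;1 byte, but leaves the carry flag unchanged&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;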
&lt;br /&gt;
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until a keypress is detected, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 due to mainloop exit condition; assumes AH is still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Main_Page&amp;diff=315</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Main_Page&amp;diff=315"/>
				<updated>2016-08-20T17:47:52Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Welcome to SizeCoding.org! ==&lt;br /&gt;
&lt;br /&gt;
'''SizeCoding.org is a wiki dedicated to the art of creating very tiny programs for the 80x86 family of CPUs.'''  By &amp;quot;very tiny programs&amp;quot;, we mean programs that are '''256 bytes or less in size''', typically created by members of the [https://en.wikipedia.org/wiki/Demoscene demoscene] as a show of programming skill.  These tiny programs are measured by their total size in opcode bytes, and they are usually presented as executable .COM files to be run in pure DOS, a DOS VM running inside another operating system, or an emulator that can run DOS such as DOSBox.  &lt;br /&gt;
&lt;br /&gt;
'''Despite their tiny size, these programs are able to produce amazing graphical displays, playable games, and sometimes music.'''  There are even some surprisingly effective programs in '''16 bytes''' or fewer, such as [https://trixter.oldskool.org/2012/12/17/maze-generation-in-thirteen-bytes/ a maze generator in 10 bytes], a [[Paint16b|mouse-driven paint program in 16 bytes]], and an [[M8trix_8b|&amp;quot;Enter the Matrix&amp;quot; screensaver]] in only 8 bytes. &lt;br /&gt;
&lt;br /&gt;
'''The intent of this wiki is to teach x86 assembler programmers the various techniques used to create tiny demoscene intros.'''  While these techniques can be used for other applications (boot sectors, BIOS and firmware code, etc.), the information presented here is firmly oriented towards the demoscene.  Practicality and common sense are sometimes thrown out the window just to shave a single byte -- consider yourself warned.&lt;br /&gt;
&lt;br /&gt;
This wiki is divided into the following major sections:&lt;br /&gt;
&lt;br /&gt;
*[[Getting Started]] - What do you need to know before starting your first tinyprog?&lt;br /&gt;
&lt;br /&gt;
*[[Techniques|Tips, Tricks, and Techniques]] - The meat of the wiki.  Lists of small opcodes, default environment settings, size optimization basics, and a ton of random tips.&lt;br /&gt;
&lt;br /&gt;
*[[Case Studies]] - Analysis of existing award-winning tiny programs, with comments on what choices were made and why.&lt;br /&gt;
&lt;br /&gt;
*[[Resources|Additional Resources]] - Discussion threads, competitions, repositories, and other external references that can aid you in your quest.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=314</id>
		<title>SizeCoding:About</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=314"/>
				<updated>2016-08-20T17:43:24Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== History ==&lt;br /&gt;
&lt;br /&gt;
sizecoding.org grew from a conversation between [[User:Trixter|Trixter]], [[User:HellMood|HellMood]], and [[User:qkumba|qkumba]] when it was determined there was no central repository for sizecoding information.  Trixter had inquired because VileR was dipping his toes into the lake that is sizecoding, and wanted to point him to a starting point for resources suitable for beginners.  None existed, so Trixter stood up sizecoding.org, and all three of them contributed information and examples to the initial launch.&lt;br /&gt;
&lt;br /&gt;
Many thanks to the entire demoscene for knowledge and inspiration.&lt;br /&gt;
&lt;br /&gt;
== Q&amp;amp;A ==&lt;br /&gt;
&lt;br /&gt;
'''Why a Wiki?'''  Wikis have the lowest barrier to entry for contributing to information stores.  We encourage the entire demoscene and assembler programming community to contribute.&lt;br /&gt;
&lt;br /&gt;
'''Is this being backed up somewhere?'''  Yes.  A few key members have access to the weekly PHP and database dumps.&lt;br /&gt;
&lt;br /&gt;
'''Doesn't in4k cover this material?'''  [https://in4k.github.io/ in4k] mostly concentrates on 1k/4k productions, across all environments (Windows, javascript, webgl, etc.).  SizeCoding is much more targeted, and concentrates mainly on 256b, x86-only coding.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=313</id>
		<title>SizeCoding:About</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=313"/>
				<updated>2016-08-20T17:42:46Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== History ==&lt;br /&gt;
&lt;br /&gt;
sizecoding.org grew from a conversation between [[User:Trixter|Trixter]], [[User:HellMood|HellMood]], and [[User:qkumba|qkumba]] when it was determined there was no central repository for sizecoding information.  Trixter had inquired because VileR was dipping his toes into the lake that is sizecoding, and wanted to point him to a starting point for resources suitable for beginners.  None existed, so Trixter stood up sizecoding.org, and all three of them contributed information and examples to the initial launch.&lt;br /&gt;
&lt;br /&gt;
Many thanks to the entire demoscene for knowledge and inspiration.&lt;br /&gt;
&lt;br /&gt;
== Q&amp;amp;A ==&lt;br /&gt;
&lt;br /&gt;
'''Why a Wiki?'''  Wikis have the lowest barrier to entry for contributing to information stores.  We encourage the entire demoscene and assembler programming community to contribute.&lt;br /&gt;
&lt;br /&gt;
'''Doesn't in4k cover this material?'''  [https://in4k.github.io/ in4k] mostly concentrates on 1k/4k productions, across all environments (Windows, javascript, webgl, etc.).  SizeCoding is much more targeted, and concentrates mainly on 256b, x86-only coding.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=312</id>
		<title>SizeCoding:About</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=312"/>
				<updated>2016-08-20T17:35:22Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Doesn't in4k cover this material? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== History ==&lt;br /&gt;
&lt;br /&gt;
sizecoding.org grew from a conversation between [[User:Trixter|Trixter]], [[User:HellMood|HellMood]], and [[User:qkumba|qkumba]] when it was determined there was no central repository for sizecoding information.  Trixter had inquired because VileR was dipping his toes into the lake that is sizecoding, and wanted to point him to a starting point for resources suitable for beginners.  None existed, so Trixter stood up sizecoding.org, and all three of them contributed information and examples to the initial launch.&lt;br /&gt;
&lt;br /&gt;
Many thanks to the entire demoscene for knowledge and inspiration.&lt;br /&gt;
&lt;br /&gt;
== Why a Wiki? ==&lt;br /&gt;
&lt;br /&gt;
Wikis have the lowest barrier to entry for contributing to information stores.  We encourage the entire demoscene and assembler programming community to contribute.&lt;br /&gt;
&lt;br /&gt;
== Doesn't in4k cover this material? ==&lt;br /&gt;
&lt;br /&gt;
[https://in4k.github.io/ in4k] mostly concentrates on 1k and larger productions, across all environments (Windows, javascript, webgl, etc.).  SizeCoding is much more targeted, and concentrates on 512b and lower, x86 only.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=311</id>
		<title>SizeCoding:About</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=311"/>
				<updated>2016-08-20T17:27:36Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== History ==&lt;br /&gt;
&lt;br /&gt;
sizecoding.org grew from a conversation between [[User:Trixter|Trixter]], [[User:HellMood|HellMood]], and [[User:qkumba|qkumba]] when it was determined there was no central repository for sizecoding information.  Trixter had inquired because VileR was dipping his toes into the lake that is sizecoding, and wanted to point him to a starting point for resources suitable for beginners.  None existed, so Trixter stood up sizecoding.org, and all three of them contributed information and examples to the initial launch.&lt;br /&gt;
&lt;br /&gt;
Many thanks to the entire demoscene for knowledge and inspiration.&lt;br /&gt;
&lt;br /&gt;
== Why a Wiki? ==&lt;br /&gt;
&lt;br /&gt;
Wikis have the lowest barrier to entry for contributing to information stores.  We encourage the entire demoscene and assembler programming community to contribute.&lt;br /&gt;
&lt;br /&gt;
== Doesn't in4k cover this material? ==&lt;br /&gt;
&lt;br /&gt;
in4k mostly concentrates on 1k and larger productions, across all environments (Windows, javascript, webgl, etc.).  SizeCoding is much more targeted, and concentrates on 512b and lower, x86 only.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=309</id>
		<title>SizeCoding:About</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=309"/>
				<updated>2016-08-20T17:20:19Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== History ==&lt;br /&gt;
&lt;br /&gt;
sizecoding.org grew from a conversation between [[User:Trixter|Trixter]], [[User:HellMood|HellMood]], and [[User:qkumba|qkumba]] when it was determined there was no central repository for sizecoding information.  Trixter had inquired because VileR was dipping his toes into the lake that is sizecoding, and wanted to point him to a starting point for resources suitable for beginners.  None existed, so Trixter stood up sizecoding.org, and all three of them contributed information and examples to the initial launch.&lt;br /&gt;
&lt;br /&gt;
Many thanks to the entire demoscene for knowledge and inspiration.&lt;br /&gt;
&lt;br /&gt;
== Why a Wiki? ==&lt;br /&gt;
&lt;br /&gt;
Wikis have the lowest barrier to entry for contributing to information stores.  We encourage the entire demoscene and assembler programming community to contribute.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=M8trix_8b&amp;diff=307</id>
		<title>M8trix 8b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=M8trix_8b&amp;diff=307"/>
				<updated>2016-08-20T16:42:00Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Case Study]]&lt;br /&gt;
{{Tinyprog|title=m8trix 8b|author=Hellmood|bsize=8|link=http://www.pouet.net/prod.php?which=63126}}&lt;br /&gt;
Actually, it has been optimized to 7 bytes (!), but let's start with the original.  m8trix 8b started out as these 8 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
org 100h&lt;br /&gt;
&lt;br /&gt;
S: &lt;br /&gt;
les bx,[si]             ; sets ES to the screen, assume si = 0x100&lt;br /&gt;
                        ; 0x101 is SBB AL,9F and changes the char&lt;br /&gt;
                        ; without the carry flag, there would be&lt;br /&gt;
                        ; no animation ;)&lt;br /&gt;
lahf                    ; gets 0x02 (green) in the first run&lt;br /&gt;
                        ; afterwards, it is not called again&lt;br /&gt;
                        ; because of alignment ;)&lt;br /&gt;
stosw                   ; print the green char ...&lt;br /&gt;
                        ; (is also 0xAB9F and works as segment)&lt;br /&gt;
inc di                  ; and skip one character cell&lt;br /&gt;
inc di                  ;&lt;br /&gt;
jmp short S+1           ; repeat on 0x101 &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(This case study is unfinished)&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Main_Page&amp;diff=306</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Main_Page&amp;diff=306"/>
				<updated>2016-08-20T16:41:43Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Welcome to SizeCoding.org! */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Welcome to SizeCoding.org! ==&lt;br /&gt;
&lt;br /&gt;
'''SizeCoding.org is a wiki dedicated to the art of creating very tiny programs for the 80x86 family of CPUs.'''  By &amp;quot;very tiny programs&amp;quot;, we mean programs that are '''256 bytes or less in size''', typically created by members of the [https://en.wikipedia.org/wiki/Demoscene demoscene] as a show of programming skill.  These tiny programs are measured by their total size in opcode bytes, and are usually presented as executable .COM files to be run in pure DOS, a DOS VM running inside another operating system, or an emulator that can run DOS such as DOSBox.  &lt;br /&gt;
&lt;br /&gt;
'''Despite their tiny size, these programs are able to produce amazing graphical displays, playable games, and sometimes music.'''  There are even some surprisingly effective programs in '''16 bytes''' or less, such as [https://trixter.oldskool.org/2012/12/17/maze-generation-in-thirteen-bytes/ a maze generator in 10 bytes], a [[Paint16b|mouse-driven paint program in 16 bytes]], and an [[M8trix_8b|&amp;quot;Enter the Matrix&amp;quot; screensaver]] in only 8 bytes. &lt;br /&gt;
&lt;br /&gt;
'''The intent of this wiki is to teach x86 assembler programmers the various techniques used to create tiny demoscene intros.'''  While these techniques can be used for other applications (boot sectors, BIOS and firmware code, etc.), the information presented here is firmly oriented towards the demoscene.  Practicality and common sense are sometimes thrown out the window just to shave a single byte -- consider yourself warned.&lt;br /&gt;
&lt;br /&gt;
This wiki is divided into the following major sections:&lt;br /&gt;
&lt;br /&gt;
:[[Getting Started]] - What do you need to know before starting your first tinyprog?&lt;br /&gt;
:[[Techniques|Tips, Tricks, and Techniques]] - The meat of the wiki.  Lists of small opcodes, default environment settings, size optimization basics, and a ton of random tips.&lt;br /&gt;
:[[Case Studies]] - Analysis of existing award-winning tiny programs, with comments on what choices were made and why.&lt;br /&gt;
:[[Resources|Additional Resources]] - Discussion threads, competitions, repositories, and other external references that can aid you in your quest.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=305</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=305"/>
				<updated>2016-08-20T16:30:04Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* A sample framework */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose buffer starts at A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is confined to a single segment, you can be sloppy: an offset that overflows or underflows just wraps around within the segment and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
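&lt;br /&gt;
As a rough sketch of how little code a single pixel takes, the snippet below sets mode 13h and plots one white pixel at x=160, y=100.  The coordinates, the colour, and the wait-for-key ending are illustrative choices only, and nothing here is size-optimized:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ax,0013h        ;AH=00 (set video mode), AL=13h&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        mov     di,100*320+160  ;linear offset = y*320 + x&lt;br /&gt;
        mov     byte [es:di],0Fh ;colour 15 is white in the default palette&lt;br /&gt;
        xor     ax,ax           ;AH=00: BIOS wait for keypress&lt;br /&gt;
        int     16h&lt;br /&gt;
        mov     ax,0003h        ;back to 80x25 text mode&lt;br /&gt;
        int     10h&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;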
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the value &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
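&lt;br /&gt;
A minimal sketch of that exit path, assuming the usual DOS behaviour described above (a zero word on top of the stack and &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt; at the start of the PSP).  The message and the &amp;lt;code&amp;gt;msg&amp;lt;/code&amp;gt; label are made up for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ah,09h          ;DOS function 09h: print '$'-terminated string&lt;br /&gt;
        mov     dx,msg          ;DS already equals CS in a .COM file&lt;br /&gt;
        int     21h&lt;br /&gt;
        ret                     ;pops 0000h, so execution continues at CS:0000h (int 20h)&lt;br /&gt;
&lt;br /&gt;
msg:    db      'bye$'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; only works if the stack is balanced when it is reached; anything left pushed will be used as the return address instead.&lt;br /&gt;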
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer sits just past the end of the 512-byte boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file.&lt;br /&gt;
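&lt;br /&gt;
For comparison, here is a hedged sketch of a boot sector skeleton.  The padding to 510 bytes and the &amp;lt;code&amp;gt;0AA55h&amp;lt;/code&amp;gt; signature are not covered by the list above; they are the conventional requirements of PC BIOSes before a sector is treated as bootable.  The character printed and the label names are illustrative:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 7C00h                       ;BIOS loads the sector at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        xor     bx,bx           ;BH=page 0; registers can't be assumed here&lt;br /&gt;
        mov     ax,0E21h        ;AH=0Eh teletype output, AL='!'&lt;br /&gt;
        int     10h             ;print one character&lt;br /&gt;
hang:&lt;br /&gt;
        hlt                     ;idle until the next interrupt&lt;br /&gt;
        jmp     short hang      ;there is no DOS to return to&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad the sector to 510 bytes&lt;br /&gt;
        dw      0AA55h          ;boot signature in the last two bytes&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note the explicit register setup and the lack of a one-byte exit, both of which a .COM file gets for free.&lt;br /&gt;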
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a philosophy that deliberately provides many instructions that each perform multiple steps.  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, Eflags)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, there are accumulator-optimized (AX) forms of opcodes that are one byte, such as &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
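&lt;br /&gt;
A small comparison of those one-byte forms against their general-purpose equivalents.  The byte values in the comments are what a recent NASM is expected to emit in 16-bit mode; check them with &amp;lt;code&amp;gt;ndisasm&amp;lt;/code&amp;gt; on your own build:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
bits 16&lt;br /&gt;
&lt;br /&gt;
        mov     ax,bx           ;89 D8     2 bytes&lt;br /&gt;
        xchg    ax,bx           ;93        1 byte, but BX receives the old AX&lt;br /&gt;
        add     ax,1            ;83 C0 01  3 bytes&lt;br /&gt;
        inc     ax              ;40        1 byte&lt;br /&gt;
        sub     ax,1            ;83 E8 01  3 bytes&lt;br /&gt;
        dec     ax              ;48        1 byte&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The short forms are not drop-in replacements: &amp;lt;code&amp;gt;XCHG&amp;lt;/code&amp;gt; clobbers the other register, and &amp;lt;code&amp;gt;INC&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;DEC&amp;lt;/code&amp;gt; leave the carry flag untouched, unlike &amp;lt;code&amp;gt;ADD&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;SUB&amp;lt;/code&amp;gt;.&lt;br /&gt;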
&lt;br /&gt;
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until a keypress is detected, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 due to mainloop exit condition; assumes AH is still 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=304</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=304"/>
				<updated>2016-08-20T16:29:42Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: Reverted edits by Trixter (talk) to last revision by HellMood&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose buffer starts at A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is confined to a single segment, you can be sloppy: an offset that overflows or underflows just wraps around within the segment and won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the value &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer sits just past the end of the 512-byte boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file.&lt;br /&gt;
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a philosophy that deliberately provides many instructions that each perform multiple steps.  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, Eflags)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, several register forms are one byte: &amp;lt;code&amp;gt;INC reg16&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC reg16&amp;lt;/code&amp;gt; (in 16-bit code), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 due to mainloop exit condition; AH must still be 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
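&lt;br /&gt;
As one possible way to fill in the placeholder in the skeleton above, the sketch below draws a static XOR pattern.  It is only an illustration and not part of the original framework: it assumes the usual start-up values (AH=00, direction flag clear), and it recovers the coordinates from DI with a &amp;lt;code&amp;gt;DIV&amp;lt;/code&amp;gt; for clarity rather than for size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;assumes AH=00 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;80186+&lt;br /&gt;
        pop     es              ;ES = mode 13h screen segment&lt;br /&gt;
&lt;br /&gt;
mainloop:&lt;br /&gt;
        mov     ax,di           ;DI = y*320 + x (offsets past 63999 are off-screen)&lt;br /&gt;
        xor     dx,dx           ;DX:AX = DI for the division&lt;br /&gt;
        mov     bx,320&lt;br /&gt;
        div     bx              ;AX = y (at most 204), DX = x&lt;br /&gt;
        xor     al,dl           ;color = low byte of y XOR low byte of x&lt;br /&gt;
        stosb                   ;write AL to ES:[DI], DI advances and wraps&lt;br /&gt;
&lt;br /&gt;
        in      al,60h          ;ESC check, as in the skeleton&lt;br /&gt;
        dec     al&lt;br /&gt;
        jnz     mainloop&lt;br /&gt;
        mov     al,03           ;AH is still 00 here because y fits in AL after the DIV&lt;br /&gt;
        int     10h             ;back to text mode&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Polling the keyboard after every pixel is slow but keeps the loop structure identical to the skeleton; a size-optimized version would likely avoid the &amp;lt;code&amp;gt;DIV&amp;lt;/code&amp;gt; entirely.&lt;br /&gt;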
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=303</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=303"/>
				<updated>2016-08-20T16:29:19Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* A sample framework */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode located at segment A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is contained to a single segment, you can be sloppy, as overwriting or underwriting the offset value won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
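&lt;br /&gt;
As a rough illustration of the linear layout, a pixel at (x, y) lives at offset y*320 + x in segment A000h.  The sketch below plots a single pixel; the coordinates (160,100) and the color 0Fh are arbitrary values chosen for the example, and the code is written for clarity rather than size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     ax,0013h        ;AH=00h (set video mode), AL=13h&lt;br /&gt;
        int     10h&lt;br /&gt;
        push    word 0A000h     ;80186+&lt;br /&gt;
        pop     es              ;ES = screen segment&lt;br /&gt;
        mov     di,100*320+160  ;offset = y*320 + x, here (160,100)&lt;br /&gt;
        mov     al,0Fh          ;color index 15 (white in the default palette)&lt;br /&gt;
        stosb                   ;write AL to ES:[DI] (assumes direction flag is clear)&lt;br /&gt;
        xor     ax,ax           ;AH=00h: wait for a keypress&lt;br /&gt;
        int     16h&lt;br /&gt;
        mov     ax,0003h        ;back to 80x25 text mode&lt;br /&gt;
        int     10h&lt;br /&gt;
        ret                     ;exit via the int 20h at CS:0000&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;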
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the value &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
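&lt;br /&gt;
A minimal sketch of how these defaults get exploited, assuming the typical start-up values listed above (which are not guaranteed on every DOS environment).  The byte counts in the comments are for the 16-bit encodings:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
        mov     al,13h          ;2 bytes (B0 13); relies on AH=00 at start&lt;br /&gt;
                                ;mov ax,0013h would be 3 bytes (B8 13 00)&lt;br /&gt;
        int     10h             ;set mode 13h&lt;br /&gt;
        ret                     ;1 byte (C3); pops 0000h and lands on int 20h at CS:0000&lt;br /&gt;
                                ;mov ax,4C00h / int 21h would be 5 bytes&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that this fragment exits while still in mode 13h, and that &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; only works as an exit if the stack is balanced when it executes.&lt;br /&gt;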
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the end of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file.&lt;br /&gt;
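&lt;br /&gt;
For comparison, a boot sector skeleton might look like the sketch below.  It assumes a legacy BIOS that loads the sector at 0000:7C00 and checks for the 55h AAh signature in the last two bytes; the explicit register setup is there because register contents cannot be relied on at entry, which is where the extra bytes tend to come from:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 7C00h                       ;BIOS loads the sector at 0000:7C00&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        xor     ax,ax&lt;br /&gt;
        mov     ds,ax           ;don't trust DS at entry&lt;br /&gt;
        mov     ax,0013h        ;AH is not known to be 00 here, so load all of AX&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;80186+&lt;br /&gt;
        pop     es              ;ES = mode 13h screen segment&lt;br /&gt;
&lt;br /&gt;
hang:&lt;br /&gt;
        hlt                     ;nothing to return to, so idle between interrupts&lt;br /&gt;
        jmp     short hang&lt;br /&gt;
&lt;br /&gt;
        times 510-($-$$) db 0   ;pad to 510 bytes&lt;br /&gt;
        dw      0AA55h          ;boot signature (bytes 55h, AAh)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;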
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a philosophy that deliberately provides many instructions that each perform multiple steps.  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, several register forms are one byte: &amp;lt;code&amp;gt;INC reg16&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DEC reg16&amp;lt;/code&amp;gt; (in 16-bit code), and the accumulator-optimized &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
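&lt;br /&gt;
To illustrate the saving, the fragment below compares byte counts for a few near-equivalent operations.  The encodings shown are for 16-bit code, and the register choices are arbitrary:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        mov     bx,ax           ;2 bytes (89 C3): copy AX to BX&lt;br /&gt;
        xchg    ax,bx           ;1 byte  (93): swaps instead, fine if the old BX is dead&lt;br /&gt;
        inc     al              ;2 bytes (FE C0)&lt;br /&gt;
        inc     ax              ;1 byte  (40): same AL result, but a wrap from FFh carries into AH&lt;br /&gt;
        dec     cx              ;1 byte  (49)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The one-byte forms are not always drop-in replacements: &amp;lt;code&amp;gt;XCHG&amp;lt;/code&amp;gt; changes both registers, and the 16-bit &amp;lt;code&amp;gt;INC&amp;lt;/code&amp;gt; sets flags from the whole word rather than the low byte.&lt;br /&gt;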
&lt;br /&gt;
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until ESC is pressed, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 due to mainloop exit condition; AH must still be 00&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=302</id>
		<title>Getting Started</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Getting_Started&amp;diff=302"/>
				<updated>2016-08-20T16:27:47Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* A sample framework */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Words of warning ==&lt;br /&gt;
&lt;br /&gt;
Sizecoding assumes a basic level of assembler knowledge.  You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding.  Also, don't assume sizecoding is ''normal'' -- shaving bytes is a black art that should be kept far, far away from normal programming targets.  People sizecode for fun, not profit!&lt;br /&gt;
&lt;br /&gt;
== Tools and Workflows ==&lt;br /&gt;
&lt;br /&gt;
Most people use NASM (it's free) and DOSBox for testing.  A simple workflow:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
nasm main.asm -fbin -o main.com&lt;br /&gt;
ndisasm -o100h main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some crazy sizecoders like writing tinyprogs on very old hardware.  For that, a86 is a good choice for an assembler, as it assembles source directly to a .COM file in one pass.  Even on the first 8086 systems, you can use debug.com to check the assembly and size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
a86 main.asm&lt;br /&gt;
debug main.com&lt;br /&gt;
dir main.com&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Know your environment ==&lt;br /&gt;
&lt;br /&gt;
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode located at segment A000:0000.  Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for.  Because it is contained to a single segment, you can be sloppy, as overwriting or underwriting the offset value won't damage anything.  The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default palette] leaves a little to be desired, so if you have room, you may want to change it.&lt;br /&gt;
&lt;br /&gt;
=== .COM file defaults ===&lt;br /&gt;
&lt;br /&gt;
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code.  On most (but not all) DOS environments, the following registers have these default values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AX=0000&lt;br /&gt;
BX=0000&lt;br /&gt;
CX=00FF&lt;br /&gt;
DX=Same as CS register&lt;br /&gt;
SI=0100&lt;br /&gt;
DI=FFFE&lt;br /&gt;
SP=FFFC (DOS child process) or FFFE (the default)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because .COM files only support 64K executables, &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; are all set to the same value as &amp;lt;code&amp;gt;CS&amp;lt;/code&amp;gt;.  The rest can't be counted on for any specific value, except that &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; is mostly 09??h so you can usually count on the high byte being &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Usually, the top of the stack contains &amp;lt;code&amp;gt;0000h&amp;lt;/code&amp;gt;. That allows for using the one-byte instruction &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; to exit your program, since at location &amp;lt;code&amp;gt;CS:0000h&amp;lt;/code&amp;gt; there is the value &amp;lt;code&amp;gt;20CDh&amp;lt;/code&amp;gt; (= &amp;lt;code&amp;gt;int 20h&amp;lt;/code&amp;gt;). See [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix].&lt;br /&gt;
&lt;br /&gt;
=== Boot sector defaults ===&lt;br /&gt;
&lt;br /&gt;
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:&lt;br /&gt;
* The boot sector is loaded at 0000:7C00&lt;br /&gt;
* DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00&lt;br /&gt;
* The stack pointer is 512 bytes beyond the end of the boot sector, so SP is likely 7E00h&lt;br /&gt;
&lt;br /&gt;
This is why most sizecoders target .COM files, and is also why [https://github.com/nanochess/Toledo-Atomchess Toledo Atomchess] is 9 bytes larger if loaded from boot sector than from a .COM file.&lt;br /&gt;
&lt;br /&gt;
=== 1-byte opcodes ===&lt;br /&gt;
The 80x86 family was originally a [https://en.wikipedia.org/wiki/Complex_instruction_set_computing CISC] design, a philosophy that deliberately provides many instructions that each perform multiple steps.  In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family.  Here's a handy chart (segments and prefixes omitted):&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Opcode !! Mnemonic !! Arch !! Description !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| 37      || AAA     ||         || ASCII adjust AL (carry into AH) after addition&lt;br /&gt;
|-&lt;br /&gt;
|3F||AAS         || ||ASCII adjust AL (borrow from AH) after subtraction    &lt;br /&gt;
|-&lt;br /&gt;
|98||CBW         || ||Convert byte into word (AH = top bit of AL)           &lt;br /&gt;
|-&lt;br /&gt;
|99||CDQ         ||80386+||Convert dword to qword (EDX = top bit of EAX)         &lt;br /&gt;
|-&lt;br /&gt;
|F8||CLC         || ||Clear carry flag                                      &lt;br /&gt;
|-&lt;br /&gt;
|FC||CLD         || ||Clear direction flag so SI and DI will increment      &lt;br /&gt;
|-&lt;br /&gt;
|FA||CLI         || ||Clear interrupt enable flag; interrupts disabled      &lt;br /&gt;
|-&lt;br /&gt;
|F5||CMC         || ||Complement carry flag                                 &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPS mb,mb  || ||Compare bytes [SI] - ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPS mv,mv  || ||Compare words [SI] - ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A6||CMPSB       || ||Compare bytes DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSD       ||80386+||Compare dwords DS:[SI] - ES:[DI], advance SI,DI       &lt;br /&gt;
|-&lt;br /&gt;
|A7||CMPSW       || ||Compare words DS:[SI] - ES:[DI], advance SI,DI        &lt;br /&gt;
|-&lt;br /&gt;
|99||CWD         || ||Convert word to doubleword (DX = top bit of AX)       &lt;br /&gt;
|-&lt;br /&gt;
|98||CWDE        ||80386+||Sign-extend word AX to doubleword EAX                 &lt;br /&gt;
|-&lt;br /&gt;
|27||DAA         || ||Decimal adjust AL after addition                      &lt;br /&gt;
|-&lt;br /&gt;
|2F||DAS         || ||Decimal adjust AL after subtraction                   &lt;br /&gt;
|-&lt;br /&gt;
|F4||HLT         || ||Halt||Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast&lt;br /&gt;
|-&lt;br /&gt;
|EC||IN AL,DX    || ||Input byte from port DX into AL                       &lt;br /&gt;
|-&lt;br /&gt;
|ED||IN eAX,DX   || ||Input word from port DX into eAX                     &lt;br /&gt;
|-&lt;br /&gt;
|6C||INS rmb,DX  ||80186+||Input byte from port DX into [DI], advance DI         &lt;br /&gt;
|-&lt;br /&gt;
|6D||INS rmv,DX  ||80186+||Input word from port DX into [DI], advance DI        &lt;br /&gt;
|-&lt;br /&gt;
|6C||INSB        ||80186+||Input byte from port DX into ES:[DI], advance DI      &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSD        ||80386+||Input dword from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|6D||INSW        ||80186+||Input word from port DX into ES:[DI], advance DI     &lt;br /&gt;
|-&lt;br /&gt;
|CC||INT 3       || ||Interrupt 3 (trap to debugger)||If performing very many CALLs to a single procedure, could make it INT 3&lt;br /&gt;
|-&lt;br /&gt;
|CE||INTO        || ||Interrupt 4 if overflow flag is 1                     &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRET        || ||Interrupt return (far return and pop flags)           &lt;br /&gt;
|-&lt;br /&gt;
|CF||IRETD       ||80386+||Interrupt return (pop EIP, CS, EFLAGS)                &lt;br /&gt;
|-&lt;br /&gt;
|9F||LAHF        || ||Load: AH = flags  SF ZF xx AF xx PF xx CF             &lt;br /&gt;
|-&lt;br /&gt;
|C9||LEAVE       ||80186+||Set SP to BP, then POP BP (reverses previous ENTER)   &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODS mb     || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODS mv     || ||Load word [SI] into eAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AC||LODSB       || ||Load byte [SI] into AL, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSD       ||80386+||Load dword [SI] into EAX, advance SI                  &lt;br /&gt;
|-&lt;br /&gt;
|AD||LODSW       || ||Load word [SI] into AX, advance SI                    &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVS mb,mb  || ||Move byte [SI] to ES:[DI], advance SI,DI              &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVS mv,mv  || ||Move word [SI] to ES:[DI], advance SI,DI             &lt;br /&gt;
|-&lt;br /&gt;
|A4||MOVSB       || ||Move byte DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSD       ||80386+||Move dword DS:[SI] to ES:[DI], advance SI,DI          &lt;br /&gt;
|-&lt;br /&gt;
|A5||MOVSW       || ||Move word DS:[SI] to ES:[DI], advance SI,DI           &lt;br /&gt;
|-&lt;br /&gt;
|90||NOP         || ||No Operation                                          &lt;br /&gt;
|-&lt;br /&gt;
|EE||OUT DX,AL   || ||Output byte AL to port number DX                      &lt;br /&gt;
|-&lt;br /&gt;
|EF||OUT DX,eAX  || ||Output word eAX to port number DX                     &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTS DX,rmb ||80186+||Output byte [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTS DX,rmv ||80186+||Output word [SI] to port number DX, advance SI        &lt;br /&gt;
|-&lt;br /&gt;
|6E||OUTSB       ||80186+||Output byte DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSD       ||80386+||Output dword DS:[SI] to port number DX, advance SI    &lt;br /&gt;
|-&lt;br /&gt;
|6F||OUTSW       ||80186+||Output word DS:[SI] to port number DX, advance SI     &lt;br /&gt;
|-&lt;br /&gt;
|1F||POP DS      || ||Set DS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|07||POP ES      || ||Set ES to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|17||POP SS      || ||Set SS to top of stack, increment SP by 2             &lt;br /&gt;
|-&lt;br /&gt;
|61||POPA        ||80186+||Pop DI,SI,BP,x ,BX,DX,CX,AX (SP value is ignored)     &lt;br /&gt;
|-&lt;br /&gt;
|61||POPAD       ||80386+||Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP ign.)          &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPF        || ||Set flags register to top of stack, increment SP by 2 &lt;br /&gt;
|-&lt;br /&gt;
|9D||POPFD       ||80386+||Set eflags reg to top of stack, incr SP by 4          &lt;br /&gt;
|-&lt;br /&gt;
|0E||PUSH CS     || ||Set [SP-2] to CS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|1E||PUSH DS     || ||Set [SP-2] to DS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|06||PUSH ES     || ||Set [SP-2] to ES, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|16||PUSH SS     || ||Set [SP-2] to SS, then decrement SP by 2              &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHA       ||80186+||Push AX,CX,DX,BX,original SP,BP,SI,DI                 &lt;br /&gt;
|-&lt;br /&gt;
|60||PUSHAD      ||80386+||Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI         &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHF       || ||Set [SP-2] to flags register, then decrement SP by 2  &lt;br /&gt;
|-&lt;br /&gt;
|9C||PUSHFD      ||80386+||Set [SP-4] to eflags reg, then decr SP by 4           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RET         || ||Return to caller (near or far, depending on PROC)     &lt;br /&gt;
|-&lt;br /&gt;
|CB||RETF        || ||Return to far caller (pop offset, then seg)           &lt;br /&gt;
|-&lt;br /&gt;
|C3||RETN        || ||Return to near caller (pop offset only)               &lt;br /&gt;
|-&lt;br /&gt;
|9E||SAHF        || ||Store AH into flags  SF ZF xx AF xx PF xx CF          &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCAS mb     || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCAS mv     || ||Compare words eAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AE||SCASB       || ||Compare bytes AL - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASD       ||80386+||Compare dwords EAX - ES:[DI], advance DI              &lt;br /&gt;
|-&lt;br /&gt;
|AF||SCASW       || ||Compare words AX - ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|36||SS          || ||Use SS segment for the following memory reference     &lt;br /&gt;
|-&lt;br /&gt;
|F9||STC         || ||Set carry flag                                        &lt;br /&gt;
|-&lt;br /&gt;
|FD||STD         || ||Set direction flag so SI and DI will decrement        &lt;br /&gt;
|-&lt;br /&gt;
|FB||STI         || ||Set interrupt enable flag, interrupts enabled         &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOS mb     || ||Store AL to byte [DI], advance DI                     &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOS mv     || ||Store eAX to word [DI], advance DI                    &lt;br /&gt;
|-&lt;br /&gt;
|AA||STOSB       || ||Store AL to byte ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSD       ||80386+||Store EAX to dword ES:[DI], advance DI                &lt;br /&gt;
|-&lt;br /&gt;
|AB||STOSW       || ||Store AX to word ES:[DI], advance DI                  &lt;br /&gt;
|-&lt;br /&gt;
|9B||WAIT        || ||Wait until floating-point operation is completed      &lt;br /&gt;
|-&lt;br /&gt;
|D7||XLAT        || ||Set AL to memory byte DS:[BX + unsigned AL]           &lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Additionally, there are accumulator-optimized (AX) forms of opcodes that are one byte, such as &amp;lt;code&amp;gt;DEC AX&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;INC AX&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;XCHG reg,AX&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
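As a rough illustration of the one-byte forms mentioned above, the sketch below lists a few of them with their encodings in the comments. The register choices are arbitrary and only for illustration, and the sizes assume 16-bit code.&lt;br /&gt;
&lt;br /&gt;
```nasm
bits 16
        inc     ax              ; 40h, one byte
        dec     ax              ; 48h, one byte
        xchg    bx,ax           ; 93h, one byte: BX gets AX, AX gets the old BX
        mov     bx,ax           ; two bytes: BX gets AX, AX is preserved
```
&lt;br /&gt;
If the old value of AX is no longer needed, the &amp;lt;code&amp;gt;XCHG&amp;lt;/code&amp;gt; form does the same job as the &amp;lt;code&amp;gt;MOV&amp;lt;/code&amp;gt; in one byte less.&lt;br /&gt;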
&lt;br /&gt;
== A sample framework ==&lt;br /&gt;
&lt;br /&gt;
Want to just dive in and see what happens?  Here's a skeleton that sets up Mode 13h, loops until a keypress is detected, then exits.  Although it assembles to only 19 bytes, do not consider this fully and finally optimized; your program's structure may allow some of this to be reduced further.  But it's a good place to start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h                        ;specify .COM file&lt;br /&gt;
&lt;br /&gt;
start:&lt;br /&gt;
        mov     al,13h          ;AX=0000 at program start&lt;br /&gt;
        int     10h             ;init mode 13h&lt;br /&gt;
        push    word 0A000h     ;Requires 80186 or higher to PUSH IMMED&lt;br /&gt;
        pop     es              ;ES now points to mode 13h screen segment&lt;br /&gt;
        &lt;br /&gt;
mainloop:&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
;This is where you do your mega-amazing tiny program.&lt;br /&gt;
;Write 8-bit values to A000:0000 to draw some pixels.&lt;br /&gt;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&lt;br /&gt;
&lt;br /&gt;
        ;Check for user wanting to leave by pressing ESC&lt;br /&gt;
        in      al,60h          ;read whatever is at keyboard port; looking for ESC which is #1&lt;br /&gt;
        dec     al              ;if ESC, AL now 0&lt;br /&gt;
        jnz     mainloop        ;fall through if 0, jump otherwise&lt;br /&gt;
        mov     al,03           ;AL=00 from the exit test; AX=0003 only if AH is still 0&lt;br /&gt;
        int     10h             ;Switch back to text mode as a convenience&lt;br /&gt;
        ret                     ;.COM files can exit with RET&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Where to go from here? ==&lt;br /&gt;
&lt;br /&gt;
[[Output#Outputting_in_Textmode_.2880x25.29| Hello World! console output]] shows how to output text.&lt;br /&gt;
&lt;br /&gt;
[[Techniques|Tips, Tricks, and Techniques]] can help you with ideas on optimizing your next production, or help you design while you're writing it.&lt;br /&gt;
&lt;br /&gt;
Some [[Case Studies]] are provided that illustrate and explain some of the choices made when sizecoding.&lt;br /&gt;
&lt;br /&gt;
Can't find what you need?  Check our list of [[Resources|external resources]].&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=General_Coding_Tricks&amp;diff=299</id>
		<title>General Coding Tricks</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=General_Coding_Tricks&amp;diff=299"/>
				<updated>2016-08-20T01:31:39Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* The Rrrola Trick (Obtaining X and Y without DIV) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Data is code, code is data ==&lt;br /&gt;
&lt;br /&gt;
Code is nothing more than data that the CPU interprets.  For example, consider this multi-byte instruction:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        mov ah,37h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This assembles to &amp;lt;code&amp;gt;B4 37&amp;lt;/code&amp;gt;.  &amp;lt;code&amp;gt;B4&amp;lt;/code&amp;gt; by itself isn't interesting, but &amp;lt;code&amp;gt;37&amp;lt;/code&amp;gt; happens to be the opcode for &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt;.  Suppose this instruction sits just before a loop, and you need to perform &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt; at the top of that loop.  Rather than adding a separate &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt;, you can reuse the &amp;lt;code&amp;gt;37&amp;lt;/code&amp;gt; byte that is already there as the operand of &amp;lt;code&amp;gt;mov ah,37h&amp;lt;/code&amp;gt;.  Just jump directly into the middle of the &amp;lt;code&amp;gt;mov ah,37h&amp;lt;/code&amp;gt;, and that byte will be interpreted and executed as &amp;lt;code&amp;gt;AAA&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
&lt;br /&gt;
label:&lt;br /&gt;
        mov ah,37h&lt;br /&gt;
        ;misc. stuff&lt;br /&gt;
        loop label+1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;+1&amp;lt;/code&amp;gt; makes the jump land 1 byte past &amp;lt;code&amp;gt;label&amp;lt;/code&amp;gt;, on the &amp;lt;code&amp;gt;37&amp;lt;/code&amp;gt; operand byte.&lt;br /&gt;
&lt;br /&gt;
=== Reuse ===&lt;br /&gt;
&lt;br /&gt;
You can use opcodes hidden in your existing data.  For example, .COM files can end with &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;, which is opcode &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt;.  If you already have a &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; somewhere else in your code, even as part of data, just &amp;lt;code&amp;gt;JMP&amp;lt;/code&amp;gt; to that pre-existing &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; instead of adding a &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
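A minimal sketch of this kind of reuse follows. It assumes the program happens to need the constant &amp;lt;code&amp;gt;0C300h&amp;lt;/code&amp;gt; in BX anyway (a hypothetical value, picked only because its encoding ends in &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt;) and that it needs a conditional exit on ESC.&lt;br /&gt;
&lt;br /&gt;
```nasm
org 100h
start:
        mov     bx,0C300h       ; assembles to BB 00 C3; the last byte is also RET
mainloop:
        ; ... effect code goes here (hypothetical) ...
        in      al,60h          ; read keyboard scancode
        dec     al              ; ESC (scancode 1) becomes 0
        jz      start+2         ; on ESC, jump onto the C3 byte inside the MOV
        jmp     short mainloop
```
&lt;br /&gt;
The conditional jump is needed in any case, so pointing it at the embedded &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; avoids spending a separate byte on &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;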
== If your environment holds you back, change it ==&lt;br /&gt;
&lt;br /&gt;
The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default MCGA palette] is fairly horrible, but there can be size advantages to changing it:  While setting a new palette costs bytes, the new palette arrangement could save you headaches down the road.  For example, if your code is calculating pixel colors that fall into goofy ranges, rather than constantly adjusting the colors to sane ranges (i.e. aligned to powers of 2), just set the palette so that values falling into those ranges look the way you want.  (This assumes you have very small ways of redefining the palette, of course.)&lt;br /&gt;
&lt;br /&gt;
The above is maybe not the best example.  Rewrites to this section are welcome.&lt;br /&gt;
&lt;br /&gt;
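One compact way to redefine the whole palette is to stream values to the VGA DAC ports. The fragment below is only a sketch: it assumes mode 13h is already set and a standard 6-bit VGA DAC, and the label name and the gray ramp are illustrative choices rather than anything prescribed by this page.&lt;br /&gt;
&lt;br /&gt;
```nasm
bits 16
        mov     dx,3C8h         ; VGA DAC write-index port
        mov     al,0
        out     dx,al           ; start writing at color index 0
        inc     dx              ; 3C9h = DAC data port
palette:
        out     dx,al           ; red component
        out     dx,al           ; green component
        out     dx,al           ; blue component (index then advances by itself)
        inc     al
        jnz     palette         ; 256 entries in total
```
&lt;br /&gt;
Because the DAC only uses the low 6 bits of each component, this particular loop yields a gray ramp that repeats every 64 colors.&lt;br /&gt;
&lt;br /&gt;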
== Need a constant? ==&lt;br /&gt;
&lt;br /&gt;
If you need a constant value but you're out of space, search your assembled code for a byte value you can use.&lt;br /&gt;
&lt;br /&gt;
== A smaller way to point to Mode 13's screen segment ==&lt;br /&gt;
&lt;br /&gt;
Rather than &amp;lt;code&amp;gt;mov ah,0a0h; mov es,ax&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;push word 0a000h; pop es&amp;lt;/code&amp;gt;, try this 2-byte wonder:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
les bp,[bx]&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This sets ES=9FFF, only one away from A000.  You can write to the screen with ES: this way as long as you are aware the segment is one paragraph (16 bytes) behind, so just increase your offset by 16 if you need exact placement.&lt;br /&gt;
&lt;br /&gt;
How does this work?  At the start of execution of a .COM file, BX=0 and DS=CS. The contents of the .COM file are loaded at offset 0x100 in that segment; the 256 bytes before that hold the PSP (program segment prefix), which DOS populates with information about the loaded program. The PSP starts with &amp;lt;code&amp;gt;CD 20&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;INT 20h&amp;lt;/code&amp;gt;, which exits the program), so that's what gets loaded into BP.  The next word, which is loaded into ES, is the number of the last free conventional memory segment, typically 0x9FFF (but it can be something different if parts of the upper memory range are either not installed or already allocated).&lt;br /&gt;
&lt;br /&gt;
'''Warning:''' This trick doesn't always work.  On FreeDOS, this can set ES=9FE0, and there is something resident at that location that can screw up the system after normal program exit if you overwrite it.&lt;br /&gt;
&lt;br /&gt;
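The offset correction can be sketched as a tiny standalone program. It assumes the PSP word really is 9FFFh on the target system and that BX is still 0 after the mode switch; the color and the key wait are only there to make the result visible.&lt;br /&gt;
&lt;br /&gt;
```nasm
org 100h
        mov     al,13h
        int     10h             ; mode 13h (AH=0 assumed at start)
        les     bp,[bx]         ; ES = word at PSP offset 2 (assumed to be 9FFFh)
        mov     di,16           ; 9FFF:0010 and A000:0000 are the same byte
        mov     al,4            ; color 4 (red in the default palette)
        stosb                   ; plot the top-left pixel
        mov     ah,0
        int     16h             ; wait for a key so the pixel can be seen
        ret
```
&lt;br /&gt;
The arithmetic behind the comment: 9FFFh * 16 + 10h = A0000h, which is the linear address of A000:0000.&lt;br /&gt;
&lt;br /&gt;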
== Accessing the timer tick for free ==&lt;br /&gt;
&lt;br /&gt;
If using a 386+, FS=0 at .COM start.  So, &amp;lt;code&amp;gt;FS:[046C]&amp;lt;/code&amp;gt; gets you the timer tick variable (the BIOS tick count at 0040:006C), which you could use for timing/pacing, or as a random seed. Some environments, especially with EMS/XMS programs loaded, can modify the FS register, so it can't always be assumed to be 0000h. In that case, &amp;lt;code&amp;gt;POP DS&amp;lt;/code&amp;gt; right at the start (the word on top of the stack is 0000h) followed by accessing &amp;lt;code&amp;gt;[046C]&amp;lt;/code&amp;gt; does the trick at equal size.&lt;br /&gt;
&lt;br /&gt;
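The two variants can be compared side by side. This is only a sketch: it assumes the register and stack state at .COM start described above, and loading the tick into AX is an arbitrary choice.&lt;br /&gt;
&lt;br /&gt;
```nasm
bits 16
        ; Variant 1 (386+), assuming FS=0 at program start:
        mov     ax,[fs:046Ch]   ; 4 bytes: FS prefix + 3-byte MOV

        ; Variant 2, used as the very first instructions of the program:
        pop     ds              ; the word on top of the stack at .COM start is 0000h
        mov     ax,[046Ch]      ; 1 + 3 bytes, same total size
```
&lt;br /&gt;
With the second variant, DS no longer points at your own segment, and the zero word that a final &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; relies on has been consumed, so the rest of the program has to account for both.&lt;br /&gt;
&lt;br /&gt;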
== Looping twice ==&lt;br /&gt;
&lt;br /&gt;
If you need to run a section of code twice, the section doesn't modify the carry flag, and you know the carry flag is clear on entry, you can loop back once in only 3 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
looping:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        cmc&lt;br /&gt;
        jc      looping&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Looping three times ==&lt;br /&gt;
&lt;br /&gt;
If you need to run a section of code three times and you have a register whose value is zero and can be incremented freely, or whose value is -1 and can be decremented freely, you can loop back twice in only 3 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
looping:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        inc     bx ;DEC from -1 works with the same JPO: FEh and FDh have odd parity, FCh has even&lt;br /&gt;
        jpo     looping ;1 (01b) and 2 (10b) have odd parity, 3 (11b) has even parity&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== The Rrrola Trick (Obtaining X and Y without DIV) ==&lt;br /&gt;
&lt;br /&gt;
In [[Output#Outputting_in_mode_13h_.28320x200.29|320x200 mode]], instead of deriving X and Y from the screen pointer DI with DIV, you can get a decent estimate by multiplying the screen pointer by 0xCCCD and reading Y and X from the 8-bit registers DH (or DH:DL as a 16-bit value) and DL (or DL:AH as a 16-bit value). The idea is to interpret DI as a kind of 16-bit fixed-point fraction in the range [0,1], from start to end. Multiplying this number in [0,1] by 65536 / 320 = 204.8 yields the row in the integer part and, again as a kind of fraction, the column in the fractional part. The constant 0xCCCD is the nearest rounding of 204.8 * 256 ( = 52428.8 ~ 52429 = 0xCCCD). As long as the 16-bit representations are used, there is no precision loss.&lt;br /&gt;
&lt;br /&gt;
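The multiplication described above can be sketched as a complete program. This is a minimal illustration, not Rrrola's original code: the XOR color formula and the ESC check are assumptions chosen only to make the result visible and the program terminable.&lt;br /&gt;
&lt;br /&gt;
```nasm
org 100h
        mov     al,13h
        int     10h             ; mode 13h (AH=0 assumed at start)
        push    word 0A000h
        pop     es
pixel:
        mov     ax,0CCCDh
        mul     di              ; DX:AX = DI * 0CCCDh
        mov     al,dh           ; DH = row (Y)
        xor     al,dl           ; DL = column scaled to the range 0..255
        stosb                   ; write the pixel, advance DI
        in      al,60h          ; read keyboard scancode
        dec     al              ; ESC (scancode 1) becomes 0
        jnz     pixel
        ret
```
&lt;br /&gt;
For the 64000 visible offsets, DH equals DI / 320 rounded down, while DL is the column multiplied by 256 / 320 = 0.8 (rounded down), so X covers the full 8-bit range instead of 0..319.&lt;br /&gt;
&lt;br /&gt;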
This is adapted from [http://www.pouet.net/prod.php?which=53816 &amp;quot;Puls&amp;quot; by Rrrola] where X and Y are directly modified on the stack by performing &amp;lt;code&amp;gt;add dword[di],0000CCCDh&amp;lt;/code&amp;gt; on each pixel iteration, which requires 7 bytes of code. The vertical alignment correction is handled by choosing a good starting value for said DWORD on the stack before each frame, which requires 2 additional bytes. The two approaches are too different to compare directly, but they share the core idea of [http://www.pouet.net/topic.php?which=8791&amp;amp;page=8#c411796 multiplying with &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt;], so &amp;quot;Rrrola's trick&amp;quot; is an appropriate term to use.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=298</id>
		<title>SizeCoding:About</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=SizeCoding:About&amp;diff=298"/>
				<updated>2016-08-20T01:28:41Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;sizecoding.org grew from a conversation between [[User:Trixter|Trixter]], [[User:HellMood|HellMood]], and [[User:qkumba|qkumba]] when it was determined there was no central repository for sizecoding information.  Trixter had inquired because VileR was dipping his toes into the lake that is sizecoding, and wanted to point him to a starting point for resources suitable for beginners.  None existed, so Trixter stood up sizecoding.org, and all three of them contributed information and examples to the initial launch.&lt;br /&gt;
&lt;br /&gt;
Many thanks to the entire demoscene for knowledge and inspiration.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Output&amp;diff=297</id>
		<title>Output</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Output&amp;diff=297"/>
				<updated>2016-08-20T01:27:31Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Simple deep sound in 8 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Outputting to the screen ==&lt;br /&gt;
&lt;br /&gt;
First, be aware of the [http://img.tfd.com/cde/MEMMAP.GIF MS-DOS memory layout].&lt;br /&gt;
&lt;br /&gt;
=== Outputting in Textmode (80x25) ===&lt;br /&gt;
&lt;br /&gt;
==== Hello World / High Level function ====&lt;br /&gt;
&lt;br /&gt;
Here's an obligatory &amp;quot;Hello World&amp;quot; program in text mode, using a [http://www.ctyme.com/intr/rb-2562.htm &amp;quot;high level&amp;quot; MS-DOS function]. With a small optimization already included (using &amp;lt;code&amp;gt;XCHG BP,AX&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;MOV AH,09h&amp;lt;/code&amp;gt;), this snippet is 20 bytes in size.&lt;br /&gt;
&lt;br /&gt;
[[File:Hello world.png|thumb|Hello World!]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
org 100h			; we start at CS:100h&lt;br /&gt;
xchg 	bp,ax		; already a trick, puts 09h into AH&lt;br /&gt;
mov		dx,text		; DX expects the address of a $ terminated string&lt;br /&gt;
int 	21h			; call the DOS function (AH = 09h)&lt;br /&gt;
ret					; quit&lt;br /&gt;
text:&lt;br /&gt;
db 'Hello World!$'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Of course, this gets shorter with each byte you remove from the text itself. Now let's look into arbitrary screen access. Right after the start of your program you are in mode 3, that is 80x25 in 16 colors. See the [http://www.columbia.edu/~em36/wpdos/videomodes.txt Video Modes List] [[File:Drawchar example.png|thumb|draw char example]] So, to show something on the screen, you would need to set a segment register to 0xB800, then write values into this segment. &lt;br /&gt;
&lt;br /&gt;
==== Low level access ====&lt;br /&gt;
&lt;br /&gt;
The following three snippets showcase how to draw a red smiley in three different ways. All example snippets are meant to be standalone programs, starting with the first instruction and nothing before it. The target coordinate (40,12) is about the middle of the screen. We need a multiplier of 2 since one char needs two bytes in memory (char and color are one byte each). The high byte 0x04 means red (4) on black (0), while the low byte 0x01 is the first ASCII char, a smiley.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop ds&lt;br /&gt;
mov bx,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
mov [bx],ax&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop es&lt;br /&gt;
mov di,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
stosw&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
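&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov bx,ss            ; keep the original stack segment in BX&lt;br /&gt;
push 0xb800&lt;br /&gt;
pop ss               ; SS now points at text-mode video memory&lt;br /&gt;
mov sp,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
push ax              ; stores AX at SS:[SP-2], one character cell to the left&lt;br /&gt;
mov ss,bx            ; restore SS (a POP SS here would pop the word just written)&lt;br /&gt;
int 0x20&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;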
&lt;br /&gt;
You might notice that the ''push &amp;lt;word&amp;gt;'' + ''pop seg_reg'' combination is always the same and occupies four bytes altogether. If correct alignment is not important to you and you really just want ''any'' pointer to the screen, there is another way to get a valid one:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
les bx,[si]&lt;br /&gt;
nop&lt;br /&gt;
stosb&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That's also four bytes, but it already has the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; opcode (for putting something onto the screen) integrated and even one slot free for another one-byte-instruction. It works because &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; initially points to the start of our code, and &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; has the hexadecimal representation of &amp;lt;code&amp;gt;0AAh&amp;lt;/code&amp;gt;. After the first command, the segment register &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; contains the value &amp;lt;code&amp;gt;0AA90h&amp;lt;/code&amp;gt;. If you repeatedly write something to the screen with &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; you will eventually reach the &amp;lt;code&amp;gt;0B800h&amp;lt;/code&amp;gt; segment and chars will appear on the screen. With a careful selection of the free one-byte-opcode you can also reintroduce some alignment. This works also with the &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt; opcode &amp;lt;code&amp;gt;0ABh&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Alternative high level functions  ====&lt;br /&gt;
&lt;br /&gt;
Besides the direct way of accessing memory, there are also other ways of getting characters onto the screen, for example:&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-4124.htm INT 29h]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2558.htm INT 21h AH=6]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2562.htm INT 21h AH=9]&lt;br /&gt;
&lt;br /&gt;
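As a small example of one of these alternatives, the following sketch prints a single character through &amp;lt;code&amp;gt;INT 29h&amp;lt;/code&amp;gt;, which takes the character in AL. The letter is an arbitrary choice.&lt;br /&gt;
&lt;br /&gt;
```nasm
org 100h
        mov     al,'A'          ; character to print
        int     29h             ; fast console output of AL
        ret                     ; quit
```
&lt;br /&gt;
No string terminator or pointer setup is needed, which can make this cheaper than &amp;lt;code&amp;gt;INT 21h AH=9&amp;lt;/code&amp;gt; when only a few characters are printed.&lt;br /&gt;
&lt;br /&gt;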
=== Outputting in mode 13h (320x200) ===&lt;br /&gt;
&lt;br /&gt;
==== Basic pixel output ====&lt;br /&gt;
&lt;br /&gt;
The videomemory for mode 13h is located at segment 0xA000, so you need to assign this value to a segment register. Also, after the start of your program you are normally still in textmode, so you need to switch to the videomode. The following snippet does both:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13 &lt;br /&gt;
int 0x10     ; AH = 0 means : set video mode to AL = 0x13 (320 x 200 pixels in 256 colors)&lt;br /&gt;
push 0xA000  ; put value on the stack&lt;br /&gt;
pop es       ; pop the top stack value into segment register ES&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You're free to use any of the segment register / opcode combinations to write to the screen&lt;br /&gt;
* &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
Let's add some code that actually draws something on the screen. The following program occupies 23 bytes and draws a fullscreen XOR texture:&lt;br /&gt;
[[File:Mode13h-example-xor.png|left|bottom|thumb|mode13h-example-xor]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
push 0xa000&lt;br /&gt;
pop es&lt;br /&gt;
X: cwd			; &amp;quot;clear&amp;quot; DX (if AH &amp;lt; 0x80)&lt;br /&gt;
mov ax,di		; get screen position into AX&lt;br /&gt;
mov bx,320		; get screen width into BX&lt;br /&gt;
div bx			; divide, to get row and column&lt;br /&gt;
xor ax,dx		; the famous XOR pattern&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
stosb			; finally, draw to the screen&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note that there is a different way of preparing the segment register, instead of :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
you can also do :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov ah,0xA0&lt;br /&gt;
mov es,ax&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
both variations occupy 4 bytes, but the latter is executable on processor architectures where ''push &amp;lt;word&amp;gt;'' is not available.&lt;br /&gt;
&lt;br /&gt;
==== Alternative way of pixel plotting and optimization ====&lt;br /&gt;
&lt;br /&gt;
Now let's optimize on the snippet. First, we can adapt the &amp;quot;LES&amp;quot; trick from the textmode section. We just exchange &lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
with:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;les bx,[bx]&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
to save two bytes. This works because BX is 0x0000 at start and thus the instruction reads from the region ''before'' our code, which is called the [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix]. The two bytes that are put into the segment register ES are bytes 2 and 3  = ''&amp;quot;Segment of the first byte beyond the memory allocated to the program&amp;quot;'', which is usually 0x9FFF. That is just off by one from our desired 0xA000. Unfortunately that means a 16 pixel offset, so if screen alignment matters to you, you can't use this optimization. Also, said two bytes are not always 0x9FFF; for example, if resident programs sit above the ''&amp;quot;memory allocated to the program&amp;quot;'' (FreeDOS), their content is overwritten if we take their base as our video memory base.&lt;br /&gt;
&lt;br /&gt;
Second, we can use an alternative way of putting pixels to the screen, subfunction AH = 0x0C of int 0x10. Also, instead of constructing row and column from the screen pointer, we can use some interesting properties of the screenwidth regarding logical operations. This results in the following 16 byte program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             ; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov al,0x13&lt;br /&gt;
X: int 0x10		; set video mode AND draw pixel&lt;br /&gt;
inc cx			; increment column&lt;br /&gt;
mov ax,cx		; get column in AH&lt;br /&gt;
xor al,ah		; the famous XOR pattern&lt;br /&gt;
mov ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first optimization is the double usage of the same &amp;quot;int 0x10&amp;quot; for both setting the video mode and drawing the pixel. The subfunction AH = 0x0C expects row and column in DX and CX. Since the screen width is 320, which is 5 * 64, we can ignore the row and just work with the column, as long as we use logical operations and only bits 0-6 of the result. The subfunction AH = 0x0C allows for unbounded column values in CX (up to 65535) and correctly &amp;quot;wraps&amp;quot; them internally without an error.&lt;br /&gt;
&lt;br /&gt;
The major drawback of the &amp;quot;subfunction AH = 0x0C&amp;quot; approach is the performance loss. While DOSBox and many emulators perform just fine, real hardware will draw much, much slower, depending on the video BIOS.&lt;br /&gt;
&lt;br /&gt;
==== Basic Animation and user interaction ====&lt;br /&gt;
&lt;br /&gt;
Now let's add the convenient check for the ESC key and also add a simple animation. The &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; register is used as frame counter and incremented after the pixel counter &amp;lt;code&amp;gt;CX&amp;lt;/code&amp;gt; ran through all 65536 values via &amp;lt;code&amp;gt;LOOP&amp;lt;/code&amp;gt;. This frame counter is then added to the column. The resulting program is now 25 bytes in size :&lt;br /&gt;
&lt;br /&gt;
[[File:Xor anim example.gif|thumb]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Producing sound ==&lt;br /&gt;
&lt;br /&gt;
=== MIDI notes ===&lt;br /&gt;
&lt;br /&gt;
Creating sounds with MIDI requires a bit more preparation, but once you're familiar with it, it's even simpler than PC Speaker sound, because you basically don't have to ''create'' the sound, you just have to ''trigger'' it. To start, you need to know that there are a lot of [https://en.wikipedia.org/wiki/General_MIDI different instruments] and a [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined way of communication]. Think of the MIDI interface as a keyboard: you tell it which button/key you want to press, which knob to twist, and sometimes how hard.&lt;br /&gt;
&lt;br /&gt;
==== Single piano note ====&lt;br /&gt;
&lt;br /&gt;
Let's start off with a simple example, playing a single note on the piano:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh		;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h	;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 		;	send !&lt;br /&gt;
dec 	dx			;	MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,	90h		;	send note on channel ZERO - command&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
mov 	al,	56h		;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al		; 	send !&lt;br /&gt;
mov 	al,	67h		;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
ret					; 	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In short: you turn your keyboard on (switching to UART mode), then press a KEY with a certain VOLUME on channel ZERO, then exit. Besides switching to UART mode, all this communication uses the port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt;. This example will work on DosBox but not on Windows XP NTVDM: for [http://www.pouet.net/topic.php?which=10720&amp;amp;page=1 still unclear reasons], the NTVDM emulation delays the note until it receives a second one. The simplest way of at least hearing something is to repeatedly play notes, like in the following example :&lt;br /&gt;
&lt;br /&gt;
==== Repeated piano notes ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh			;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h		;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 			;	send !&lt;br /&gt;
dec 	dx				;	MIDI Data Port ( = 330h )&lt;br /&gt;
main:&lt;br /&gt;
mov 	al,	90h			;	send note on channel 0 - command&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
mov 	al,	56h			;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al			; 	send !&lt;br /&gt;
mov 	al,	67h			;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,	[fs:0x46c]	;	read timer&lt;br /&gt;
test 	al,	3			;	skip 3 values&lt;br /&gt;
jnz 	_wait			;&lt;br /&gt;
inc 	byte [fs:0x46c]	; 	inc manually to prevent retrigger&lt;br /&gt;
in 		al,	0x60		;	check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			;	no? repeat&lt;br /&gt;
ret						;	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example, enriched with synchronizing against the timer and checking for the ESC key. It works on both DosBox and Windows XP NTVDM and plays a note on the Piano repeatedly.&lt;br /&gt;
&lt;br /&gt;
==== Repeated notes of other instruments ====&lt;br /&gt;
&lt;br /&gt;
While hitting one key repeatedly is not really interesting in general, it can produce decent results when doing it with the right instrument activated, like it was done with the &amp;quot;French Horn&amp;quot; in [https://www.youtube.com/watch?v=tsbxdjP9l50 Timelord (by Baudsurfer)]. Apart from just changing the instrument, let's also optimize a little bit on the size: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
start:	&lt;br /&gt;
mov 	si,data			; init pointer for outsb&lt;br /&gt;
mov 	dx,330h			; change to data port&lt;br /&gt;
mov 	cl,5			; play our music data&lt;br /&gt;
rep 	outsb			; (see below at &amp;quot;data&amp;quot; label)&lt;br /&gt;
inc 	dx				; switch to control port&lt;br /&gt;
outsb					; change to mode &amp;quot;UART&amp;quot;&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,[fs:0x46c]	; read timer value&lt;br /&gt;
cmp	 	al,bl			; wait until...&lt;br /&gt;
jz 		_wait			; ...timer value changed&lt;br /&gt;
xchg 	bx,ax			; save old timer value&lt;br /&gt;
in 		al,0x60			; check for ...&lt;br /&gt;
dec 	al				; ... ESC key&lt;br /&gt;
jnz 	start			; otherwise : repeat &lt;br /&gt;
dec 	dx				; switch to data port again&lt;br /&gt;
outsb					; stop all ... &lt;br /&gt;
outsb					; ... notes played ...&lt;br /&gt;
outsb					; ... on channel 3&lt;br /&gt;
data: 	&lt;br /&gt;
db		0c3h			; change instrument on channel 3&lt;br /&gt;
						; (is also &amp;quot;RET&amp;quot; for program quit)&lt;br /&gt;
db		60				; to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h				; play note on channel 3&lt;br /&gt;
db		35				; deep &amp;quot;b&amp;quot; = note number 35&lt;br /&gt;
db		127				; play with volume = 127&lt;br /&gt;
db		3fh				; change mode to &amp;quot;UART&amp;quot;&lt;br /&gt;
db		0b3h			; control change on channel 3&lt;br /&gt;
db		123				; Channel Mode Message &amp;quot;All Notes Off&amp;quot;&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example with a different instrument. The MIDI data is now grouped in a data section, the output uses &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;out dx,al&amp;lt;/code&amp;gt;, and the program finishes with a special command that turns ''All Notes Off''. This is necessary for all instruments that don't stop by themselves: in all the previous examples we sent the &amp;quot;NOTE ON&amp;quot; command (&amp;lt;code&amp;gt;9Xh&amp;lt;/code&amp;gt;) but never the corresponding &amp;quot;NOTE OFF&amp;quot; command (&amp;lt;code&amp;gt;8Xh&amp;lt;/code&amp;gt;). Also, the note is now played on channel &amp;lt;code&amp;gt;03h&amp;lt;/code&amp;gt;, because the command byte for changing the instrument on channel 3 is &amp;lt;code&amp;gt;0C3h&amp;lt;/code&amp;gt;, which is also the opcode of &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; and can be reused. If this looks complicated at first, remember that it's just sending [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined commands] to a single port.&lt;br /&gt;
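&lt;br /&gt;
For a single held note, the matching &amp;quot;NOTE OFF&amp;quot; message can also be sent explicitly instead of ''All Notes Off''. The following is a minimal sketch, not taken from any production: it assumes an MPU-401 compatible interface at port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt; as in the examples above, the labels &amp;lt;code&amp;gt;noteon&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;noteoff&amp;lt;/code&amp;gt; are made up for illustration, and it is written for clarity rather than size.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	dx,331h			; MIDI control port&lt;br /&gt;
mov 	al,3Fh			; &amp;quot;set UART mode&amp;quot; command&lt;br /&gt;
out 	dx,al&lt;br /&gt;
dec 	dx				; MIDI data port (330h)&lt;br /&gt;
mov 	si,noteon		; init pointer for outsb&lt;br /&gt;
mov 	cx,5&lt;br /&gt;
rep 	outsb			; send instrument change + NOTE ON&lt;br /&gt;
xor 	ax,ax			; AH = 0 : BIOS &amp;quot;wait for key press&amp;quot;&lt;br /&gt;
int 	16h				; the note sounds until a key is pressed&lt;br /&gt;
mov 	cl,3			; CH is still zero after REP&lt;br /&gt;
rep 	outsb			; SI now points at noteoff : send NOTE OFF&lt;br /&gt;
ret						; quit&lt;br /&gt;
noteon:&lt;br /&gt;
db		0c3h,60			; channel 3 : change instrument to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h,35,127		; NOTE ON, channel 3, note 35, volume 127&lt;br /&gt;
noteoff:&lt;br /&gt;
db		83h,35,0		; NOTE OFF, channel 3, same note number&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;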
&lt;br /&gt;
==== The drum channel ====&lt;br /&gt;
&lt;br /&gt;
Now that you know there are several channels to play notes on (16 in total), how about channel &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt;, which is reserved for drums? The following example plays a track of drum notes repeatedly, while further optimizing for size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
aas						; 3fh = &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
cwd						; 99h = &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
db 		42,38,42,35		; the drum notes (42 = hihat, 38 = snare, 35 = kick)&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
outsb					; send &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
mov		al,[fs:0x46c]	; read timer&lt;br /&gt;
test	al,3&lt;br /&gt;
jnz 	main			; skip 3 values&lt;br /&gt;
inc 	byte [fs:0x46c] ; inc manually to prevent retrigger&lt;br /&gt;
&lt;br /&gt;
inc 	bx				; increment note counter&lt;br /&gt;
and 	bl,3			; truncate to 4 notes&lt;br /&gt;
mov 	al,[bx+si]		; read the drumnote (see above)&lt;br /&gt;
&lt;br /&gt;
out 	dx,al			; send the drum&lt;br /&gt;
mov 	al,127			; set volume to maximum&lt;br /&gt;
out 	dx,al			; send volume&lt;br /&gt;
&lt;br /&gt;
in 		al,0x60			; check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			; no? repeat&lt;br /&gt;
ret						; otherwise quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In contrast to the previous example, the data section is now at the start. That means it is executed as code! This is dangerous, of course, but it saves the bytes needed for assigning the &amp;lt;code&amp;gt;DATA&amp;lt;/code&amp;gt; offset to &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;, which already points to the start of the program. After the two initial &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instructions have incremented &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; twice, it stays fixed at the first drum note, and the drum data is read with &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;. The status byte &amp;lt;code&amp;gt;99h&amp;lt;/code&amp;gt; is sent only once; the note/volume pairs that follow rely on MIDI running status. Unless you know exactly what you are doing, don't use this kind of &amp;quot;executing data&amp;quot; optimization! In this special case &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;CWD&amp;lt;/code&amp;gt; do no harm, and the drum notes &amp;lt;code&amp;gt;42,38,42,35&amp;lt;/code&amp;gt; are carefully chosen and arranged to form the instruction &amp;lt;code&amp;gt;SUB AH,[232Ah]&amp;lt;/code&amp;gt;, which does no harm either.&lt;br /&gt;
&lt;br /&gt;
==== Creating basic sound effects in 16 bytes ====&lt;br /&gt;
&lt;br /&gt;
In the [https://en.wikipedia.org/wiki/General_MIDI#Sound_Effects MIDI repertoire], there are already some sound effects available. With the &amp;quot;data execution&amp;quot; optimization above, let's fire a gunshot in 16 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas&lt;br /&gt;
les 	di,[bx-0x6C]&lt;br /&gt;
xor 	al,127&lt;br /&gt;
mov 	dx,0x331&lt;br /&gt;
outsb&lt;br /&gt;
dec 	dx&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first three instructions are not there for their effect (they do execute, but we don't care about the result); their bytes are the MIDI data:&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt; assembles to &amp;lt;code&amp;gt;0x3F&amp;lt;/code&amp;gt;, the command for switching to &amp;quot;UART&amp;quot; mode, which is sent to port &amp;lt;code&amp;gt;0x331&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;les di,[bx-0x6C]&amp;lt;/code&amp;gt; assembles to &amp;lt;code&amp;gt;0xc4&amp;lt;/code&amp;gt; (change instrument on channel 4), &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; (change it to &amp;quot;Gunshot&amp;quot;), &amp;lt;code&amp;gt;0x94&amp;lt;/code&amp;gt; (play note on channel 4)&lt;br /&gt;
* &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; assembles to &amp;lt;code&amp;gt;0x34&amp;lt;/code&amp;gt; (play THIS note), &amp;lt;code&amp;gt;0x7f&amp;lt;/code&amp;gt; (play it THAT loud, 127 is also the allowed maximum)&lt;br /&gt;
&lt;br /&gt;
The rest of the code just sends the MIDI data to the interface and exits. You can change the kind of sound effect by modifying the ModR/M byte of the second instruction (change BX to BP, SI, etc.), since that byte doubles as the instrument number. Changing the volume is simpler: replace the immediate value in &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; with any value between 0 and 127.&lt;br /&gt;
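&lt;br /&gt;
As a hedged illustration of the register swap: with &amp;lt;code&amp;gt;BP&amp;lt;/code&amp;gt; in place of &amp;lt;code&amp;gt;BX&amp;lt;/code&amp;gt;, the second byte of the instruction becomes &amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;, which should select &amp;quot;Applause&amp;quot; in the General MIDI list (&amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; gives &amp;lt;code&amp;gt;0x7C&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; gives &amp;lt;code&amp;gt;0x7D&amp;lt;/code&amp;gt;). The volume of 100 is an arbitrary choice for this sketch. Like the original, it assumes the usual start values &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; = 100h and &amp;lt;code&amp;gt;CH&amp;lt;/code&amp;gt; = 0:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas						; 0x3F : &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
les 	di,[bp-0x6C]	; 0xC4 0x7E 0x94 : instrument 0x7E on channel 4, then &amp;quot;play note&amp;quot;&lt;br /&gt;
xor 	al,100			; 0x34 0x64 : note 0x34, volume 100&lt;br /&gt;
mov 	dx,0x331&lt;br /&gt;
outsb					; send 0x3F to the control port&lt;br /&gt;
dec 	dx&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb			; send the remaining five bytes to the data port&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;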
&lt;br /&gt;
==== Procedural MIDI music generation in 64 bytes ====&lt;br /&gt;
&lt;br /&gt;
With all the above, you should now be able to follow the next snippet, [http://www.pouet.net/prod.php?which=66313 Descent OST], a small framework for procedural MIDI sound generation that fits in 64 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; &amp;quot;Descent OST&amp;quot;, a 62 byte MIDI music player for MSDOS&lt;br /&gt;
; created by HellMood/DESiRE (C)2015&lt;br /&gt;
; this is the extracted music routine used in &amp;quot;Descent&amp;quot; &lt;br /&gt;
; it is a procedural MIDI algorithm which sticks a&lt;br /&gt;
; subroutine to the DOS timer (interrupt 0x1C)&lt;br /&gt;
; the registered routine is called ~18.2 times per second&lt;br /&gt;
&lt;br /&gt;
; developed for use with &amp;quot;NASM&amp;quot;, &lt;br /&gt;
; see http://sourceforge.net/projects/nasm/files/&lt;br /&gt;
&lt;br /&gt;
%define		 	rhythmPattern			0b11&lt;br /&gt;
; with &amp;quot;rhythmPattern&amp;quot;, you define how often a note is played&lt;br /&gt;
; generally, higher values and values containing many &amp;quot;ones&amp;quot;&lt;br /&gt;
; in binary representation, will result in faster play&lt;br /&gt;
; for example &amp;quot;0b11&amp;quot; will play every 4th note&lt;br /&gt;
%define 		baseInstrument			9&lt;br /&gt;
; defines the number of the first instrument used.&lt;br /&gt;
; see http://www.midi.org/techspecs/gm1sound.php for a full list&lt;br /&gt;
; keep in mind, that there are only a few instrument blocks&lt;br /&gt;
; whose sounds stop after a while. You won't get good results&lt;br /&gt;
; from strings etc. just a mess of overlayed sounds&lt;br /&gt;
%define 		numInstruments			7&lt;br /&gt;
; defines how many instrument are used. keep in mind, that &amp;quot;rhythm-&lt;br /&gt;
; Pattern&amp;quot; has influence on the picked instrument. the instruments&lt;br /&gt;
; from 9 to 9+7 are called &amp;quot;chromatic percussion&amp;quot;&lt;br /&gt;
%define 		noteStep 				5&lt;br /&gt;
; defines the basic difference from one note to the next. recommended&lt;br /&gt;
; values here are (mainly) 3,4 and 5 for music theoretic reasons&lt;br /&gt;
; but feel free to play around =)&lt;br /&gt;
%define 		noteRange				12&lt;br /&gt;
; after adding the noteStep, the note value is &amp;quot;mod&amp;quot;ded with&lt;br /&gt;
; the &amp;quot;noteRange&amp;quot;. 12 means octave, which results in very harmonic&lt;br /&gt;
; scales&lt;br /&gt;
%define 		noteSpread				3&lt;br /&gt;
; the third step spreads the notes over the tonal spectrum, you may&lt;br /&gt;
; want to keep &amp;quot;noteSpread&amp;quot; * &amp;quot;noteRange&amp;quot; round about 30-60.&lt;br /&gt;
%define 		baseNote				40&lt;br /&gt;
; the general tone height of everything. some instruments don't play&lt;br /&gt;
; arbitrary deep notes correctly, and too high notes cause ear bleeding&lt;br /&gt;
; adjust with care ;)&lt;br /&gt;
&lt;br /&gt;
; WARNING : after exiting the program, the timer interrupt is still active&lt;br /&gt;
; i strongly recommend to reboot or restart DOSBOX!&lt;br /&gt;
&lt;br /&gt;
; ADVICE : Yes, there are music- and math-related things going on here&lt;br /&gt;
; if you're not into music theory, circle of fifths, and the like, it may be&lt;br /&gt;
; better to just play around with the parameters, rather than understanding them&lt;br /&gt;
; just change stuff slowly, and eventually you will get &amp;quot;there&amp;quot;&lt;br /&gt;
; wherever that is ;)&lt;br /&gt;
&lt;br /&gt;
org 0x100&lt;br /&gt;
xchg cx,ax					; set our second counter to zero&lt;br /&gt;
mov dx,music			&lt;br /&gt;
mov ax,0x251C				; mode &amp;quot;0x25&amp;quot; , &amp;quot;0x1C&amp;quot; = change address of timer interrupt&lt;br /&gt;
int 0x21					; see http://mprolab.teipir.gr/vivlio80X86/dosints.pdf&lt;br /&gt;
S:&lt;br /&gt;
in ax,0x60					; wait for &amp;quot;ESC&amp;quot; press, then exit&lt;br /&gt;
dec al						; music plays on anyway, this is just for&lt;br /&gt;
jnz S						; keeping the music exactly as in &amp;quot;Descent&amp;quot;&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
music:&lt;br /&gt;
inc bx						; increment our first counter (starts at zero)&lt;br /&gt;
test bl,byte rhythmPattern	; play a note every 4th time tick&lt;br /&gt;
jnz nomusic					; otherwise do nothing&lt;br /&gt;
mov dx,0x331&lt;br /&gt;
mov al,0x3F&lt;br /&gt;
out dx,al&lt;br /&gt;
dec dx&lt;br /&gt;
mov al,0xC0					; change instrument on channel 0...&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx					&lt;br /&gt;
aam byte numInstruments&lt;br /&gt;
add al,byte baseInstrument	; ...to this instrument&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,0x90					; play note on channel 0 ...&lt;br /&gt;
out dx,al&lt;br /&gt;
add cl,byte noteStep			 &lt;br /&gt;
mov al,cl&lt;br /&gt;
aam byte noteRange&lt;br /&gt;
imul ax,noteSpread&lt;br /&gt;
add al,baseNote				; ... play THIS note&lt;br /&gt;
out dx,al&lt;br /&gt;
neg al						; (play deeper notes louder = add bass)&lt;br /&gt;
add al,127+39				; ... play it THAT loud&lt;br /&gt;
out dx,al&lt;br /&gt;
nomusic:&lt;br /&gt;
iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
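&lt;br /&gt;
The listing warns that the timer interrupt stays hooked after the program exits. One possible remedy, at the cost of extra bytes, is to save the previous handler and restore it before quitting. The following sketch is not part of &amp;quot;Descent OST&amp;quot;; the label &amp;lt;code&amp;gt;tick&amp;lt;/code&amp;gt; is a hypothetical placeholder for the music routine, and the DOS functions &amp;lt;code&amp;gt;AH = 0x35&amp;lt;/code&amp;gt; (get interrupt address) and &amp;lt;code&amp;gt;AH = 0x25&amp;lt;/code&amp;gt; (set interrupt address) do the work:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 0x100&lt;br /&gt;
mov ax,0x351C				; get address of timer interrupt 0x1C into ES:BX&lt;br /&gt;
int 0x21&lt;br /&gt;
push es						; remember the previous handler&lt;br /&gt;
push bx&lt;br /&gt;
mov dx,tick&lt;br /&gt;
mov ax,0x251C				; change address of timer interrupt to DS:DX&lt;br /&gt;
int 0x21&lt;br /&gt;
S:&lt;br /&gt;
in al,0x60					; wait for &amp;quot;ESC&amp;quot; press&lt;br /&gt;
dec al&lt;br /&gt;
jnz S&lt;br /&gt;
pop dx						; DS:DX = previous handler&lt;br /&gt;
pop ds&lt;br /&gt;
mov ax,0x251C				; put the previous handler back&lt;br /&gt;
int 0x21&lt;br /&gt;
ret							; return to prompt (RET does not depend on DS)&lt;br /&gt;
tick:&lt;br /&gt;
iret						; placeholder for the music routine&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;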
&lt;br /&gt;
=== PC Speaker ===&lt;br /&gt;
&lt;br /&gt;
Producing sound with the PC speaker is incredibly easy. Basically, you set a system timer to the desired frequency, then connect this timer to the speaker. [http://wiki.osdev.org/PC_Speaker The PC Speaker article] on the OSDev Wiki has the details. An example of a tiny intro that uses PC speaker music is [http://www.pouet.net/prod.php?which=67833 SpeaCore].&lt;br /&gt;
&lt;br /&gt;
==== Basic example with melody pattern ====&lt;br /&gt;
&lt;br /&gt;
A very optimized and dirty variant of producing sound with the speaker is this 12 byte snippet (sound routine from [http://www.pouet.net/prod.php?which=67829 the tiny intro &amp;quot;darkweb&amp;quot;]):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;hlt				; wait for the next interrupt (normally the timer tick)&lt;br /&gt;
inc bx			; increment our counter&lt;br /&gt;
mov ax,bx		; work with a copy&lt;br /&gt;
or al,0x4B      ; melody pattern + 2 LSB for speaker link&lt;br /&gt;
out 0x42,al		; set new countdown for timer2 (two passes)&lt;br /&gt;
out 0x61,al		; link timer2 to PC speaker (2 LSBs are 1)&lt;br /&gt;
jmp si			; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Instead of sending the low and high byte of our divisor directly in succession, we do it the &amp;quot;two pass&amp;quot; way: each loop iteration sends only one byte. That reduces the number of possible frequencies to 255, which is still good enough for some rough sounds. Linking the timer to the PC speaker might not be obvious: normally you would read the value of port 0x61, set the two least significant bits, and write the value back. You can skip all of this if you simply send the same value you just used for the timer, provided its two least significant bits are already set (''or al,0x4B'' ensures this). Be aware that port 0x61 does many things apart from connecting the timer to the speaker. A useful resource for ports in general is the [http://bochs.sourceforge.net/techspec/PORTS.LST Bochs Ports List]; for port 0x61 it shows:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
''0061	w	KB controller port B (ISA, EISA)   (PS/2 port A is at 0092)&lt;br /&gt;
&lt;br /&gt;
		system control port for compatibility with 8255&lt;br /&gt;
&lt;br /&gt;
		bit 7	(1= IRQ 0 reset )&lt;br /&gt;
&lt;br /&gt;
		bit 6-4    reserved&lt;br /&gt;
&lt;br /&gt;
		bit 3 = 1  channel check enable&lt;br /&gt;
&lt;br /&gt;
		bit 2 = 1  parity check enable&lt;br /&gt;
&lt;br /&gt;
		'''bit 1 = 1  speaker data enable'''&lt;br /&gt;
&lt;br /&gt;
		'''bit 0 = 1  timer 2 gate to speaker enable''' ''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So if you experience strange things with highly optimized PC speaker output, revert to the safe way. The approach described here works on real hardware and in DosBox. Unfortunately, both Oracle VirtualBox with MS-DOS 6.22 and Windows XP NTVDM seem not to emulate the PC speaker properly (investigation and citation needed here!).&lt;br /&gt;
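&lt;br /&gt;
The &amp;quot;safe way&amp;quot; might look like the following sketch. It is an illustration under assumptions rather than code from any of the linked intros: it programs timer 2 explicitly through the timer control port &amp;lt;code&amp;gt;0x43&amp;lt;/code&amp;gt;, uses an arbitrary tone of roughly 440 Hz, changes only the two lowest bits of port &amp;lt;code&amp;gt;0x61&amp;lt;/code&amp;gt;, and switches the speaker off again after a key press:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov al,0xB6			; control word: timer2, low byte then high byte, square wave&lt;br /&gt;
out 0x43,al&lt;br /&gt;
mov ax,1193182/440	; divisor = timer input clock (about 1.19 MHz) / frequency&lt;br /&gt;
out 0x42,al			; low byte of the divisor&lt;br /&gt;
mov al,ah&lt;br /&gt;
out 0x42,al			; high byte of the divisor&lt;br /&gt;
in al,0x61			; read the current value of port 0x61&lt;br /&gt;
or al,3				; set only the 2 LSBs&lt;br /&gt;
out 0x61,al			; link timer2 to PC speaker&lt;br /&gt;
xor ax,ax			; AH = 0 : BIOS &amp;quot;wait for key press&amp;quot;&lt;br /&gt;
int 0x16&lt;br /&gt;
in al,0x61&lt;br /&gt;
and al,0xFC			; clear the 2 LSBs again&lt;br /&gt;
out 0x61,al			; speaker off&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;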
&lt;br /&gt;
==== Simple deep sound in 8 bytes ====&lt;br /&gt;
&lt;br /&gt;
One of the smallest possible ways to generate PC speaker sound might be this 8 byte snippet:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h -&amp;gt; AL = 0xFF&lt;br /&gt;
out 42h,al		; change divisor of timer2 to 0xFFFF&lt;br /&gt;
out 42h,al		; resulting in a very low frequency&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Note: This may fail on actual hardware, as there might not be time for the bus to settle between the consecutive &amp;lt;code&amp;gt;out 42h,al&amp;lt;/code&amp;gt; statements.)&lt;br /&gt;
&lt;br /&gt;
=== COVOX output (aka LPT DAC) ===&lt;br /&gt;
&lt;br /&gt;
It is possible to output to an LPT-connected DAC (&amp;quot;[https://en.wikipedia.org/wiki/Covox_Speech_Thing COVOX]&amp;quot;) in a tinyprog.  A proof-of-concept example is [http://www.pouet.net/prod.php?which=57991 Express Train 125] which uses COVOX for sound generation.  &lt;br /&gt;
&lt;br /&gt;
This method follows the [http://countercomplex.blogspot.com/2011/10/algorithmic-symphonies-from-one-line-of.html &amp;quot;audio from one line of C code&amp;quot;] style of sound generation.  [http://www.pouet.net/topic.php?which=8357&amp;amp;page=1 A pouet discussion] exists for more background information.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Output&amp;diff=296</id>
		<title>Output</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Output&amp;diff=296"/>
				<updated>2016-08-20T01:27:10Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Simple deep sound in 8 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Outputting to the screen ==&lt;br /&gt;
&lt;br /&gt;
First, be aware of the [http://img.tfd.com/cde/MEMMAP.GIF MSDOS memory layout]&lt;br /&gt;
&lt;br /&gt;
=== Outputting in Textmode (80x25) ===&lt;br /&gt;
&lt;br /&gt;
==== Hello World / High Level function ====&lt;br /&gt;
&lt;br /&gt;
Here's an obligatory &amp;quot;Hello World&amp;quot; program in text mode, using a [http://www.ctyme.com/intr/rb-2562.htm &amp;quot;high level&amp;quot; MS-DOS function]. With a small optimization already included (using &amp;lt;code&amp;gt;XCHG BP,AX&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;MOV AH,09h&amp;lt;/code&amp;gt;), this snippet is 20 bytes in size.&lt;br /&gt;
&lt;br /&gt;
[[File:Hello world.png|thumb|Hello World!]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
org 100h			; we start at CS:100h&lt;br /&gt;
xchg 	bp,ax		; already a trick, puts 09h into AH&lt;br /&gt;
mov		dx,text		; DX expects the address of a $ terminated string&lt;br /&gt;
int 	21h			; call the DOS function (AH = 09h)&lt;br /&gt;
ret					; quit&lt;br /&gt;
text:&lt;br /&gt;
db 'Hello World!$'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Of course, this gets shorter with each byte you remove from the text itself. Now let's look into arbitrary screen access. Right after the start of your program you are in mode 3, that is, 80x25 text in 16 colors (see the [http://www.columbia.edu/~em36/wpdos/videomodes.txt Video Modes List]). [[File:Drawchar example.png|thumb|draw char example]] So, to show something on the screen, you need to set a segment register to 0xB800 and then write values into this segment. &lt;br /&gt;
&lt;br /&gt;
==== Low level access ====&lt;br /&gt;
&lt;br /&gt;
The following three snippets show how to draw a red smiley in three different ways. All example snippets are meant to be standalone programs, starting with the first instruction and nothing before it. The target coordinate (40,12) is about the middle of the screen. We need a multiplier of 2 since each character occupies two bytes in memory (one byte for the character, one for the color). The high byte 0x04 means red (4) on black (0), while 0x01 is ASCII character 1, a smiley.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop ds&lt;br /&gt;
mov bx,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
mov [bx],ax&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop es&lt;br /&gt;
mov di,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
stosw&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push ss&lt;br /&gt;
push 0xb800&lt;br /&gt;
pop ss&lt;br /&gt;
mov sp,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
push ax&lt;br /&gt;
pop ss&lt;br /&gt;
int 0x20&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You might notice that the ''push &amp;lt;word&amp;gt;'' + ''pop seg_reg'' combination is always the same and occupies four bytes altogether. If correct alignment is not important to you and you really just want ''any'' pointer to the screen, there is another way to get a valid one:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
les bx,[si]&lt;br /&gt;
nop&lt;br /&gt;
stosb&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That's also four bytes, but it already includes the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; opcode (for putting something onto the screen) and even leaves one slot free for another one-byte instruction. It works because &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; initially points to the start of our code, so &amp;lt;code&amp;gt;les bx,[si]&amp;lt;/code&amp;gt; loads &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; from the third and fourth byte of the program, and &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; has the opcode &amp;lt;code&amp;gt;0AAh&amp;lt;/code&amp;gt;. After the first instruction, the segment register &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; contains the value &amp;lt;code&amp;gt;0AA90h&amp;lt;/code&amp;gt;. If you repeatedly write to the screen with &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;, you will eventually reach the &amp;lt;code&amp;gt;0B800h&amp;lt;/code&amp;gt; segment and characters will appear on the screen. With a careful choice of the free one-byte opcode you can also reintroduce some alignment. This also works with the &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt; opcode &amp;lt;code&amp;gt;0ABh&amp;lt;/code&amp;gt;.&lt;br /&gt;
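&lt;br /&gt;
A minimal sketch of the &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt; variant follows. It assumes the usual DOS start values (&amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; = 100h, &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; = 0FFFEh, direction flag clear) and simply fills the whole 64 KB window starting at segment &amp;lt;code&amp;gt;0AB90h&amp;lt;/code&amp;gt;, which contains the text screen, with red smileys. Since it overwrites everything in that window, it is best tried in an emulator such as DosBox:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
les bx,[si]			; bytes 0C4h 1Ch : ES is loaded from the next two bytes&lt;br /&gt;
nop					; 90h : low byte of ES&lt;br /&gt;
stosw				; 0ABh : high byte of ES, so ES = 0AB90h&lt;br /&gt;
mov ax,0x0401		; red smiley&lt;br /&gt;
mov cx,0x8000		; 32768 words = 64 KB&lt;br /&gt;
rep stosw			; passes through the text screen at 0B800h on its way&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;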
&lt;br /&gt;
==== Alternative high level functions  ====&lt;br /&gt;
&lt;br /&gt;
Besides accessing memory directly, there are other ways of putting characters on the screen, for example:&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-4124.htm INT 29h]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2558.htm INT 21h AH=6]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2562.htm INT 21h AH=9]&lt;br /&gt;
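&lt;br /&gt;
As a rough sketch of the first two entries (register usage as documented on the linked pages; the characters are arbitrary examples):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov al,'A'&lt;br /&gt;
int 29h				; fast console output : prints the char in AL&lt;br /&gt;
mov dl,'B'&lt;br /&gt;
mov ah,6&lt;br /&gt;
int 21h				; direct console output : prints the char in DL (DL = 0FFh would request input instead)&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;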
&lt;br /&gt;
=== Outputting in mode 13h (320x200) ===&lt;br /&gt;
&lt;br /&gt;
==== Basic pixel output ====&lt;br /&gt;
&lt;br /&gt;
The videomemory for mode 13h is located at segment 0xA000, so you need to assign this value to a segment register. Also, after the start of your program you are normally still in textmode, so you need to switch to the videomode. The following snippet does both:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13 &lt;br /&gt;
int 0x10     ; AH = 0 means : set video mode to AL = 0x13 (320 x 200 pixels in 256 colors)&lt;br /&gt;
push 0xA000  ; put value on the stack&lt;br /&gt;
pop es       ; pop the top stack value into segment register ES&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You're free to use any of the segment register / opcode combinations to write to the screen:&lt;br /&gt;
* &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
Let's add some code that actually draws something on the screen. The following program occupies 23 bytes and draws a fullscreen XOR texture:&lt;br /&gt;
[[File:Mode13h-example-xor.png|left|bottom|thumb|mode13h-example-xor]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
push 0xa000&lt;br /&gt;
pop es&lt;br /&gt;
X: cwd			; &amp;quot;clear&amp;quot; DX (if AH &amp;lt; 0x80)&lt;br /&gt;
mov ax,di		; get screen position into AX&lt;br /&gt;
mov bx,320		; get screen width into BX&lt;br /&gt;
div bx			; divide, to get row and column&lt;br /&gt;
xor ax,dx		; the famous XOR pattern&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
stosb			; finally, draw to the screen&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note that there is a different way of preparing the segment register, instead of :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
you can also do :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov ah,0xA0&lt;br /&gt;
mov es,ax&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
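Both variations occupy 4 bytes, but the latter also runs on processors where ''push &amp;lt;word&amp;gt;'' is not available (it was introduced with the 80186). Note that it leaves &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt; untouched, so &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; ends up as 0xA000 plus whatever &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt; contains.&lt;br /&gt;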
&lt;br /&gt;
==== Alternative way of pixel plotting and optimization ====&lt;br /&gt;
&lt;br /&gt;
Now let's optimize on the snippet. First, we can adapt the &amp;quot;LES&amp;quot; trick from the textmode section. We just exchange &lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
with:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;les bx,[bx]&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
to save two bytes. This works because BX is 0x0000 at start, so the instruction reads from the region ''before'' our code, the [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix]. The two bytes that are loaded into the segment register ES are bytes 2 and 3  = ''&amp;quot;Segment of the first byte beyond the memory allocated to the program&amp;quot;'', which is usually 0x9FFF. That is just one short of our desired 0xA000. Unfortunately that means a 16 pixel offset, so if screen alignment matters to you, you can't use this optimization. Also, said two bytes are not always 0x9FFF; for example, if resident programs sit above the ''&amp;quot;memory allocated to the program&amp;quot;'' (FreeDOS), their content is overwritten if we take their base as our video memory base.&lt;br /&gt;
&lt;br /&gt;
Second, we can use an alternative way of putting pixels to the screen, subfunction AH = 0x0C of int 0x10. Also, instead of constructing row and column from the screen pointer, we can use some interesting properties of the screenwidth regarding logical operations. This results in the following 16 byte program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             ; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov al,0x13&lt;br /&gt;
X: int 0x10		; set video mode AND draw pixel&lt;br /&gt;
inc cx			; increment column&lt;br /&gt;
mov ax,cx		; get column in AH&lt;br /&gt;
xor al,ah		; the famous XOR pattern&lt;br /&gt;
mov ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first optimization is the double use of the same &amp;quot;int 0x10&amp;quot; for setting the video mode and drawing the pixel. The subfunction AH = 0x0C expects the row in DX and the column in CX. Since the screen width is 320, which is 5 * 64, we can ignore the row and just work with the column, as long as we use logical operations and only use bits 0-5 of the result (values modulo 64 repeat exactly five times per line). The subfunction AH = 0x0C allows for unbounded column values in CX (up to 65535) and correctly &amp;quot;wraps&amp;quot; them internally without an error.&lt;br /&gt;
&lt;br /&gt;
The major drawback of the &amp;quot;subfunction AH = 0x0C&amp;quot; approach is performance loss. While DosBox and many emulators perform just fine, real hardware will draw much much slower based on the Video BIOS.&lt;br /&gt;
&lt;br /&gt;
==== Basic Animation and user interaction ====&lt;br /&gt;
&lt;br /&gt;
Now let's add a convenient check for the ESC key and a simple animation. The &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; register serves as the frame counter; it is incremented each time the pixel counter &amp;lt;code&amp;gt;CX&amp;lt;/code&amp;gt; has run through all 65536 values via &amp;lt;code&amp;gt;LOOP&amp;lt;/code&amp;gt;. The frame counter is added to the column before the pattern is computed. The resulting program is 25 bytes in size:&lt;br /&gt;
&lt;br /&gt;
[[File:Xor anim example.gif|thumb]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Producing sound ==&lt;br /&gt;
&lt;br /&gt;
=== MIDI notes ===&lt;br /&gt;
&lt;br /&gt;
Creating sounds with MIDI requires a bit more preparation, but once you're familiar with it, it's even simpler than PC Speaker sound, because you basically don't have to ''create'' the sound, you just have to ''trigger'' it. To start, you need to know that there are a lot of [https://en.wikipedia.org/wiki/General_MIDI different instruments] and a [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined way of communication]. Imagine the MIDI interface as a keyboard: you tell it which button/key you want to press, which knob to twist, and sometimes, how hard.&lt;br /&gt;
&lt;br /&gt;
==== Single piano note ====&lt;br /&gt;
&lt;br /&gt;
Let's start off with a simple example, playing a single note on the piano:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh		;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h	;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 		;	send !&lt;br /&gt;
dec 	dx			;	MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,	90h		;	send note on channel ZERO - command&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
mov 	al,	56h		;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al		; 	send !&lt;br /&gt;
mov 	al,	67h		;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
ret					; 	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In short: you turn your keyboard on (switching to UART mode), then press a KEY with a certain VOLUME on channel ZERO, then exit. Besides switching to UART mode, all this communication uses the port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt;. This example will work on DosBox but not on Windows XP NTVDM: for [http://www.pouet.net/topic.php?which=10720&amp;amp;page=1 still unclear reasons], the NTVDM emulation delays the note until it receives a second one. The simplest way of at least hearing something is to repeatedly play notes, like in the following example :&lt;br /&gt;
&lt;br /&gt;
==== Repeated piano notes ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh			;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h		;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 			;	send !&lt;br /&gt;
dec 	dx				;	MIDI Data Port ( = 330h )&lt;br /&gt;
main:&lt;br /&gt;
mov 	al,	90h			;	send note on channel 0 - command&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
mov 	al,	56h			;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al			; 	send !&lt;br /&gt;
mov 	al,	67h			;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,	[fs:0x46c]	;	read timer&lt;br /&gt;
test 	al,	3			;	skip 3 values&lt;br /&gt;
jnz 	_wait			;&lt;br /&gt;
inc 	byte [fs:0x46c]	; 	inc manually to prevent retrigger&lt;br /&gt;
in 		al,	0x60		;	check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			;	no? repeat&lt;br /&gt;
ret						;	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example, enriched with synchronizing against the timer and checking for the ESC key. It works on both DosBox and Windows XP NTVDM and plays a note on the Piano repeatedly.&lt;br /&gt;
&lt;br /&gt;
==== Repeated notes of other instruments ====&lt;br /&gt;
&lt;br /&gt;
While hitting one key repeatedly is not really interesting in general, it can produce decent results with the right instrument activated, as was done with the &amp;quot;French Horn&amp;quot; in [https://www.youtube.com/watch?v=tsbxdjP9l50 Timelord (by Baudsurfer)]. Apart from just changing the instrument, let's also optimize a little bit on the size: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
start:	&lt;br /&gt;
mov 	si,data			; init pointer for outsb&lt;br /&gt;
mov 	dx,330h			; change to data port&lt;br /&gt;
mov 	cl,5			; play our music data&lt;br /&gt;
rep 	outsb			; (see below at &amp;quot;data&amp;quot; label)&lt;br /&gt;
inc 	dx				; switch to control port&lt;br /&gt;
outsb					; change to mode &amp;quot;UART&amp;quot;&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,[fs:0x46c]	; read timer value&lt;br /&gt;
cmp	 	al,bl			; wait until...&lt;br /&gt;
jz 		_wait			; ...timer value changed&lt;br /&gt;
xchg 	bx,ax			; save old timer value&lt;br /&gt;
in 		al,0x60			; check for ...&lt;br /&gt;
dec 	al				; ... ESC key&lt;br /&gt;
jnz 	start			; otherwise : repeat &lt;br /&gt;
dec 	dx				; switch to data port again&lt;br /&gt;
outsb					; stop all ... &lt;br /&gt;
outsb					; ... notes played ...&lt;br /&gt;
outsb					; ... on channel 3&lt;br /&gt;
data: 	&lt;br /&gt;
db		0c3h			; change instrument on channel 3&lt;br /&gt;
						; (is also &amp;quot;RET&amp;quot; for program quit)&lt;br /&gt;
db		60				; to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h				; play note on channel 3&lt;br /&gt;
db		35				; deep &amp;quot;b&amp;quot; = note number 35&lt;br /&gt;
db		127				; play with volume = 127&lt;br /&gt;
db		3fh				; change mode to &amp;quot;UART&amp;quot;&lt;br /&gt;
db		0b3h			; control change on channel 3&lt;br /&gt;
db		123				; Channel Mode Message &amp;quot;All Notes Off&amp;quot;&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example with a changed instrument, the MIDI data structured into a data section, the output optimized by using &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;out dx,al&amp;lt;/code&amp;gt;, and a special command at the end to turn ''All Notes Off''. This is necessary for all instruments which don't stop by themselves. In all the previous examples, we sent the &amp;quot;NOTE ON&amp;quot; command (&amp;lt;code&amp;gt;9Xh&amp;lt;/code&amp;gt;), but not the corresponding &amp;quot;NOTE OFF&amp;quot; command (&amp;lt;code&amp;gt;8Xh&amp;lt;/code&amp;gt;). Also, the note is now played on channel &amp;lt;code&amp;gt;03h&amp;lt;/code&amp;gt;, since the command byte for changing an instrument on channel 3 is &amp;lt;code&amp;gt;0C3h&amp;lt;/code&amp;gt;, which is also the opcode of &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; and can be reused. If this looks complicated at first, always remember: it's just sending [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined commands] to a single port.&lt;br /&gt;
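&lt;br /&gt;
For completeness, here is a minimal sketch that pairs a NOTE ON with its NOTE OFF on channel 0. It is not size-optimized; the labels noteon and noteoff and the wait for a keypress via int 0x16 are choices made for this illustration only:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	al,3Fh			; set UART mode - command&lt;br /&gt;
mov 	dx,331h			; MIDI Control Port&lt;br /&gt;
out 	dx,al			; send !&lt;br /&gt;
dec 	dx				; MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	si,noteon		; pointer for outsb&lt;br /&gt;
mov 	cx,3&lt;br /&gt;
rep 	outsb			; send 90h 56h 67h : note on&lt;br /&gt;
xor 	ax,ax			; AH = 0 : wait for a keypress&lt;br /&gt;
int 	16h&lt;br /&gt;
mov 	cx,3			; SI already points to noteoff&lt;br /&gt;
rep 	outsb			; send 80h 56h 00h : note off&lt;br /&gt;
ret						; quit&lt;br /&gt;
noteon:&lt;br /&gt;
db		90h,56h,67h		; note on, channel 0, KEY, VOLUME&lt;br /&gt;
noteoff:&lt;br /&gt;
db		80h,56h,00h		; note off, channel 0, same KEY, release velocity 0&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;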
&lt;br /&gt;
==== The drum channel ====&lt;br /&gt;
&lt;br /&gt;
Now that you're aware that there are different channels (16 overall) to play notes on, how would you like a channel &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt; specifically for 'Drums'? The following example plays a track of drum notes repeatedly, while further optimizing for size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
aas						; 3fh = &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
cwd						; 99h = &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
db 		42,38,42,35		; the drum notes (kick, snare, hihat)&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
outsb					; send &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
mov		al,[fs:0x46c]	; read timer&lt;br /&gt;
test	al,3&lt;br /&gt;
jnz 	main			; skip 3 values&lt;br /&gt;
inc 	byte [fs:0x46c] ; inc manually to prevent retrigger&lt;br /&gt;
&lt;br /&gt;
inc 	bx				; increment note counter&lt;br /&gt;
and 	bl,3			; truncate to 4 notes&lt;br /&gt;
mov 	al,[bx+si]		; read the drumnote (see above)&lt;br /&gt;
&lt;br /&gt;
out 	dx,al			; send the drum&lt;br /&gt;
mov 	al,127			; set volume to maximum&lt;br /&gt;
out 	dx,al			; send volume&lt;br /&gt;
&lt;br /&gt;
in 		al,0x60			; check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			; no? repeat&lt;br /&gt;
ret						; otherwise quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In contrast to the previous example, the data section is now at the start. That means it's executed as code! This is dangerous of course, but it also saves the bytes for assigning the &amp;lt;code&amp;gt;DATA&amp;lt;/code&amp;gt; offset to &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;. After the two initial &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instructions have incremented &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; twice, it stays fixed and further reading from the drum data is done with &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;. Unless you know exactly what you are doing, don't use that kind of &amp;quot;executing data&amp;quot; optimization! In this special case &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;CWD&amp;lt;/code&amp;gt; do no harm, and the drum notes &amp;lt;code&amp;gt;42,38,42,35&amp;lt;/code&amp;gt; are carefully crafted and arranged to form the instruction &amp;lt;code&amp;gt;SUB AH,[232Ah]&amp;lt;/code&amp;gt;, which does no harm either.&lt;br /&gt;
&lt;br /&gt;
==== Creating basic sound effects in 16 bytes ====&lt;br /&gt;
&lt;br /&gt;
In the [https://en.wikipedia.org/wiki/General_MIDI#Sound_Effects MIDI repertoire], there are already some sound effects available. With the &amp;quot;data execution&amp;quot; optimization above, let's fire a gunshot in 16 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas&lt;br /&gt;
les 	di,[bx-0x6C]&lt;br /&gt;
xor 	al,127&lt;br /&gt;
mov 	dx,0x331&lt;br /&gt;
outsb&lt;br /&gt;
dec 	dx&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first three instructions don't do anything (they do, but we don't care), it's just MIDI data.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
the command for switching to &amp;quot;UART&amp;quot; mode, for sending to port &amp;lt;code&amp;gt;0x331&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;les di,[bx-0x6C]&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0xc4&amp;lt;/code&amp;gt; (change instrument on channel 4), &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; (change it to &amp;quot;Gunshot&amp;quot;), &amp;lt;code&amp;gt;0x94&amp;lt;/code&amp;gt; (play note on channel 4)&lt;br /&gt;
* &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0x34&amp;lt;/code&amp;gt; (play THIS note), &amp;lt;code&amp;gt;0x7f&amp;lt;/code&amp;gt; (play it THAT loud, 127 is also the allowed maximum)&lt;br /&gt;
&lt;br /&gt;
The rest of the code basically just sends the MIDI data to the interface and exits. You can change the kind of sound effect by modifying the ModR/M byte of the second instruction (change BX to BP or SI etc.). Changing the volume is simpler: change the byte value of &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; to any value between 0 and 127.&lt;br /&gt;
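&lt;br /&gt;
As an illustration of this variation, the following sketch swaps BX for BP, which turns the second data byte from 0x7F into 0x7E. Assuming the General MIDI numbering counted from zero (124 Telephone Ring, 125 Helicopter, 126 Applause, 127 Gunshot), this should play applause instead of the gunshot. Like the original, it relies on SI pointing to the start of the code and on CH being zero at program start:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas						; 0x3F : set UART mode - command&lt;br /&gt;
les 	di,[bp-0x6C]	; 0xC4 0x7E 0x94 : instrument 0x7E on channel 4, then note on&lt;br /&gt;
						; [si-0x6C] gives 0x7C, [di-0x6C] gives 0x7D, [bx-0x6C] gives 0x7F&lt;br /&gt;
xor 	al,127			; 0x34 0x7F : note number and volume&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send the UART mode command&lt;br /&gt;
dec 	dx				; MIDI Data Port&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb			; send the five data bytes&lt;br /&gt;
ret						; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;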
&lt;br /&gt;
==== Procedural MIDI music generation in 64 bytes ====&lt;br /&gt;
&lt;br /&gt;
With all the above you should now be able to follow the next snippet, [http://www.pouet.net/prod.php?which=66313 Descent OST], a small framework for procedural MIDI sound generation in 64 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; &amp;quot;Descent OST&amp;quot;, a 62 byte MIDI music player for MSDOS&lt;br /&gt;
; created by HellMood/DESiRE (C)2015&lt;br /&gt;
; this is the extracted music routine used in &amp;quot;Descent&amp;quot; &lt;br /&gt;
; it is a procedural MIDI algorithm which sticks a&lt;br /&gt;
; subroutine to the DOS timer (interrupt 0x1C)&lt;br /&gt;
; the registered routine is called ~18.2 times per second&lt;br /&gt;
&lt;br /&gt;
; developed for use with &amp;quot;NASM&amp;quot;, &lt;br /&gt;
; see http://sourceforge.net/projects/nasm/files/&lt;br /&gt;
&lt;br /&gt;
%define		 	rhythmPattern			0b11&lt;br /&gt;
; with &amp;quot;rhythmPattern&amp;quot;, you define how often a note is played&lt;br /&gt;
; generally, higher values and values containing many &amp;quot;ones&amp;quot;&lt;br /&gt;
; in binary representation, will result in faster play&lt;br /&gt;
; for example &amp;quot;0b11&amp;quot; will play every 4th note&lt;br /&gt;
%define 		baseInstrument			9&lt;br /&gt;
; defines the number of the first instrument used.&lt;br /&gt;
; see http://www.midi.org/techspecs/gm1sound.php for a full list&lt;br /&gt;
; keep in mind, that there are only a few instrument blocks&lt;br /&gt;
; whose sounds stop after a while. You won't get good results&lt;br /&gt;
; from strings etc. just a mess of overlaid sounds&lt;br /&gt;
%define 		numInstruments			7&lt;br /&gt;
; defines how many instruments are used. keep in mind, that&lt;br /&gt;
; &amp;quot;rhythmPattern&amp;quot; has influence on the picked instrument. the instruments&lt;br /&gt;
; from 9 to 9+7 are called &amp;quot;chromatic percussion&amp;quot;&lt;br /&gt;
%define 		noteStep 				5&lt;br /&gt;
; defines the basic difference from one note to the next. recommended&lt;br /&gt;
; values here are (mainly) 3,4 and 5 for music theoretic reasons&lt;br /&gt;
; but feel free to play around =)&lt;br /&gt;
%define 		noteRange				12&lt;br /&gt;
; after adding the noteStep, the note value is &amp;quot;mod&amp;quot;ded with&lt;br /&gt;
; the &amp;quot;noteRange&amp;quot;. 12 means octave, which results in very harmonic&lt;br /&gt;
; scales&lt;br /&gt;
%define 		noteSpread				3&lt;br /&gt;
; the third step spreads the notes over the tonal spectrum, you may&lt;br /&gt;
; want to keep &amp;quot;noteSpread&amp;quot; * &amp;quot;noteRange&amp;quot; round about 30-60.&lt;br /&gt;
%define 		baseNote				40&lt;br /&gt;
; the general tone height of everything. some instruments don't play&lt;br /&gt;
; arbitrary deep notes correctly, and too high notes cause ear bleeding&lt;br /&gt;
; adjust with care ;)&lt;br /&gt;
&lt;br /&gt;
; WARNING : after exiting the program, the timer interrupt is still active&lt;br /&gt;
; i strongly recommend to reboot or restart DOSBOX!&lt;br /&gt;
&lt;br /&gt;
; ADVICE : Yes, there are music- and math-related things going on here&lt;br /&gt;
; if you're not into music theory, cycle of fifths, and the like, it may be&lt;br /&gt;
; better to just play around with the parameters, rather than understanding them&lt;br /&gt;
; just change stuff slowly, and eventually you will get &amp;quot;there&amp;quot;&lt;br /&gt;
; wherever that is ;)&lt;br /&gt;
&lt;br /&gt;
org 0x100&lt;br /&gt;
xchg cx,ax					; set our second counter to zero&lt;br /&gt;
mov dx,music			&lt;br /&gt;
mov ax,0x251C				; mode &amp;quot;0x25&amp;quot; , &amp;quot;0x1C&amp;quot; = change address of timer interrupt&lt;br /&gt;
int 0x21					; see http://mprolab.teipir.gr/vivlio80X86/dosints.pdf&lt;br /&gt;
S:&lt;br /&gt;
in ax,0x60					; wait for &amp;quot;ESC&amp;quot; press, then exit&lt;br /&gt;
dec al						; music plays on anyway, this is just for&lt;br /&gt;
jnz S						; keeping the music exactly as in &amp;quot;Descent&amp;quot;&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
music:&lt;br /&gt;
inc bx						; increment our first counter (starts at zero)&lt;br /&gt;
test bl,byte rhythmPattern	; play a note every 4th time tick&lt;br /&gt;
jnz nomusic					; otherwise do nothing&lt;br /&gt;
mov dx,0x331&lt;br /&gt;
mov al,0x3F&lt;br /&gt;
out dx,al&lt;br /&gt;
dec dx&lt;br /&gt;
mov al,0xC0					; change instrument on channel 0...&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx					&lt;br /&gt;
aam byte numInstruments&lt;br /&gt;
add al,byte baseInstrument	; ...to this instrument&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,0x90					; play note on channel 0 ...&lt;br /&gt;
out dx,al&lt;br /&gt;
add cl,byte noteStep			 &lt;br /&gt;
mov al,cl&lt;br /&gt;
aam byte noteRange&lt;br /&gt;
imul ax,noteSpread&lt;br /&gt;
add al,baseNote				; ... play THIS note&lt;br /&gt;
out dx,al&lt;br /&gt;
neg al						; (play deeper notes louder = add bass)&lt;br /&gt;
add al,127+39				; ... play it THAT loud&lt;br /&gt;
out dx,al&lt;br /&gt;
nomusic:&lt;br /&gt;
iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== PC Speaker ===&lt;br /&gt;
&lt;br /&gt;
Producing sound with the PC speaker is incredibly easy. Basically, you set a system timer to a desired frequency, then connect this timer to the speaker. [http://wiki.osdev.org/PC_Speaker The PC Speaker Article] from OSDEV Wiki has the details about it. An example of a tiny intro that uses PC speaker music is [http://www.pouet.net/prod.php?which=67833 SpeaCore].&lt;br /&gt;
&lt;br /&gt;
==== Basic example with melody pattern ====&lt;br /&gt;
&lt;br /&gt;
A very optimized and dirty variant of producing sound with the speaker is this 12 byte snippet (sound routine from [http://www.pouet.net/prod.php?which=67829 the tiny intro &amp;quot;darkweb&amp;quot;]):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;hlt				; sync to timer1&lt;br /&gt;
inc bx			; increment our counter&lt;br /&gt;
mov ax,bx		; work with a copy&lt;br /&gt;
or al,0x4B      ; melody pattern + 2 LSB for speaker link&lt;br /&gt;
out 0x42,al		; set new countdown for timer2 (two passes)&lt;br /&gt;
out 0x61,al		; link timer2 to PC speaker (2 LSBs are 1)&lt;br /&gt;
jmp si			; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Instead of sending the low and high byte of our divisor directly in succession, we do it the &amp;quot;two pass&amp;quot; way: each pass through the loop sends a single byte. That reduces the number of possible frequencies to 255, which is still good enough for some rough sounds. Linking the timer to the PC speaker might not be obvious: normally you would read the value of port 0x61, set the two least significant bits and write the value back. You can save on all of this if you just send the &amp;quot;two pass&amp;quot; value which you just used for the timer, provided that value already has the two least significant bits set (''or al,0x4B'' does this). Be aware that port 0x61 does many things apart from just connecting the timer to the speaker. A useful resource for ports in general is the [http://bochs.sourceforge.net/techspec/PORTS.LST Bochs Ports List]; for port 0x61 it displays:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
''0061	w	KB controller port B (ISA, EISA)   (PS/2 port A is at 0092)&lt;br /&gt;
&lt;br /&gt;
		system control port for compatibility with 8255&lt;br /&gt;
&lt;br /&gt;
		bit 7	(1= IRQ 0 reset )&lt;br /&gt;
&lt;br /&gt;
		bit 6-4    reserved&lt;br /&gt;
&lt;br /&gt;
		bit 3 = 1  channel check enable&lt;br /&gt;
&lt;br /&gt;
		bit 2 = 1  parity check enable&lt;br /&gt;
&lt;br /&gt;
		'''bit 1 = 1  speaker data enable'''&lt;br /&gt;
&lt;br /&gt;
		'''bit 0 = 1  timer 2 gate to speaker enable''' ''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So if you experience strange things with highly optimized PC speaker output, revert to the safe way. The described way works with real hardware and DosBox. Unfortunately, both Oracle VirtualBox with MS-DOS 6.22 and Windows XP NTVDM seem not to properly emulate the PC speaker (Investigation and citation needed here!)&lt;br /&gt;
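&lt;br /&gt;
The safe way could be sketched as follows. This is a plain, not size-optimized example along the lines of the OSDEV article: it programs timer 2 explicitly through the command port 0x43, sends both divisor bytes, and only touches the two lowest bits of port 0x61. The pitch of roughly 440 Hz and the wait for a keypress via int 0x16 are arbitrary choices for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	al,0xB6				; timer 2, low byte then high byte, square wave&lt;br /&gt;
out 	0x43,al				; send to timer command port&lt;br /&gt;
mov 	ax,1193182/440		; divisor : timer clock (about 1.193182 MHz) / 440 Hz&lt;br /&gt;
out 	0x42,al				; low byte of divisor&lt;br /&gt;
mov 	al,ah&lt;br /&gt;
out 	0x42,al				; high byte of divisor&lt;br /&gt;
in 		al,0x61				; read current port value&lt;br /&gt;
or 		al,3				; set the 2 LSBs, keep all other bits&lt;br /&gt;
out 	0x61,al				; link timer2 to PC speaker&lt;br /&gt;
xor 	ax,ax				; AH = 0 : wait for a keypress&lt;br /&gt;
int 	0x16&lt;br /&gt;
in 		al,0x61&lt;br /&gt;
and 	al,0xFC				; clear the 2 LSBs again&lt;br /&gt;
out 	0x61,al				; speaker off&lt;br /&gt;
ret							; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;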
&lt;br /&gt;
==== Simple deep sound in 8 bytes ====&lt;br /&gt;
&lt;br /&gt;
One of the smallest possible PC speaker sound generators might be this 8 byte snippet:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h -&amp;gt; AL = 0xFF&lt;br /&gt;
out 42h,al		; change divisor of timer2 to 0xFFFF&lt;br /&gt;
out 42h,al		; resulting in a very low frequency&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Note: This may fail on actual hardware, as there might not be time for the bus to settle between the consecutive &amp;lt;code&amp;gt;out 42h,al&amp;lt;/code&amp;gt; statements.)&lt;br /&gt;
&lt;br /&gt;
=== COVOX output (aka LPT DAC) ===&lt;br /&gt;
&lt;br /&gt;
It is possible to output to an LPT-connected DAC (&amp;quot;[https://en.wikipedia.org/wiki/Covox_Speech_Thing COVOX]&amp;quot;) in a tinyprog.  A proof-of-concept example is [http://www.pouet.net/prod.php?which=57991 Express Train 125] which uses COVOX for sound generation.  &lt;br /&gt;
&lt;br /&gt;
This method follows the [http://countercomplex.blogspot.com/2011/10/algorithmic-symphonies-from-one-line-of.html &amp;quot;audio from one line of C code&amp;quot;] style of sound generation.  [http://www.pouet.net/topic.php?which=8357&amp;amp;page=1 A pouet discussion] exists for more background information.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Output&amp;diff=295</id>
		<title>Output</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Output&amp;diff=295"/>
				<updated>2016-08-20T01:26:53Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Simple deep sound in 8 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Outputting to the screen ==&lt;br /&gt;
&lt;br /&gt;
First, be aware of the [http://img.tfd.com/cde/MEMMAP.GIF MSDOS memory layout]&lt;br /&gt;
&lt;br /&gt;
=== Outputting in Textmode (80x25) ===&lt;br /&gt;
&lt;br /&gt;
==== Hello World / High Level function ====&lt;br /&gt;
&lt;br /&gt;
Here's an obligatory &amp;quot;Hello World&amp;quot; program in text mode, using a [http://www.ctyme.com/intr/rb-2562.htm &amp;quot;high level&amp;quot; MS-DOS function]. With a small optimization already included (using &amp;lt;code&amp;gt;XCHG BP,AX&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;MOV AH,09h&amp;lt;/code&amp;gt;), this snippet is 20 bytes in size.&lt;br /&gt;
&lt;br /&gt;
[[File:Hello world.png|thumb|Hello World!]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
org 100h			; we start at CS:100h&lt;br /&gt;
xchg 	bp,ax		; already a trick, puts 09h into AH&lt;br /&gt;
mov		dx,text		; DX expects the address of a $ terminated string&lt;br /&gt;
int 	21h			; call the DOS function (AH = 09h)&lt;br /&gt;
ret					; quit&lt;br /&gt;
text:&lt;br /&gt;
db 'Hello World!$'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Of course, this gets shorter with each byte you remove from the text itself. Now let's look into arbitrary screen access. Right after the start of your program you are in mode 3, that is 80x25 in 16 colors. See the [http://www.columbia.edu/~em36/wpdos/videomodes.txt Video Modes List]. [[File:Drawchar example.png|thumb|draw char example]] So, to show something on the screen, you need to set a segment register to 0xB800, then write values into this segment. &lt;br /&gt;
&lt;br /&gt;
==== Low level access ====&lt;br /&gt;
&lt;br /&gt;
The following three snippets showcase how to draw a red smiley in three different ways. All example snippets are meant to be standalone programs, starting with the first instruction and nothing before it. The target coordinate (40,12) is about the middle of the screen. We need a multiplier 2 since one char needs two bytes in memory (char and color is a byte each). The high byte 0x04 means red (4) on black (0) while the 0x01 is the first ASCII char - a smiley.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop ds&lt;br /&gt;
mov bx,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
mov [bx],ax&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop es&lt;br /&gt;
mov di,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
stosw&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
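&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov bx,ss				; remember the original stack segment&lt;br /&gt;
push 0xb800&lt;br /&gt;
pop ss					; stack segment now points to the screen&lt;br /&gt;
mov sp,(80*12+40)*2+2	; push decrements SP by 2 before writing&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
push ax					; write char and color through the stack&lt;br /&gt;
mov ss,bx				; restore SS (a pop here would load 0x0401 into SS)&lt;br /&gt;
int 0x20&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;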
&lt;br /&gt;
You might notice that the ''push &amp;lt;word&amp;gt;'' + ''pop seg_reg'' combination is always the same and occupies four bytes altogether. If correct alignment is not important to you and you really just want ''any'' pointer to the screen, there is another way to get a valid one:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
les bx,[si]&lt;br /&gt;
nop&lt;br /&gt;
stosb&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That's also four bytes, but it already has the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; opcode (for putting something onto the screen) integrated and even one slot free for another one-byte-instruction. It works because &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; initially points to the start of our code, and &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; has the hexadecimal representation &amp;lt;code&amp;gt;0AAh&amp;lt;/code&amp;gt;. After the first command, the segment register &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; contains the value &amp;lt;code&amp;gt;0AA90h&amp;lt;/code&amp;gt;. If you repeatedly write something to the screen with &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; you will eventually reach the &amp;lt;code&amp;gt;0B800h&amp;lt;/code&amp;gt; segment and chars will appear on the screen. With a careful selection of the free one-byte-opcode you can also reintroduce some alignment. This also works with the &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt; opcode &amp;lt;code&amp;gt;0ABh&amp;lt;/code&amp;gt;.&lt;br /&gt;
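&lt;br /&gt;
The offset formula (80*row+column)*2 from the start of this section also covers whole strings. The following sketch, which is not size-optimized and uses a hypothetical zero-terminated message under the made-up label msg, writes a short red text near the middle of the screen via &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
push 	0xb800&lt;br /&gt;
pop 	es					; ES = text mode video memory&lt;br /&gt;
mov 	di,(80*12+37)*2		; start at column 37 of row 12&lt;br /&gt;
mov 	si,msg				; pointer to the text&lt;br /&gt;
mov 	ah,0x04				; color byte : red (4) on black (0)&lt;br /&gt;
next:&lt;br /&gt;
lodsb						; AL = next char of the text&lt;br /&gt;
test 	al,al				; zero terminator reached?&lt;br /&gt;
jz 		done&lt;br /&gt;
stosw						; write char (AL) and color (AH)&lt;br /&gt;
jmp 	short next&lt;br /&gt;
done:&lt;br /&gt;
ret							; quit&lt;br /&gt;
msg:&lt;br /&gt;
db 		'Hello',0&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;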
&lt;br /&gt;
==== Alternative high level functions  ====&lt;br /&gt;
&lt;br /&gt;
Besides the direct way of accessing memory, there are also other ways of bringing characters to the screen, for example:&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-4124.htm INT 29h]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2558.htm INT 21h AH=6]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2562.htm INT 21h AH=9]&lt;br /&gt;
&lt;br /&gt;
=== Outputting in mode 13h (320x200) ===&lt;br /&gt;
&lt;br /&gt;
==== Basic pixel output ====&lt;br /&gt;
&lt;br /&gt;
The videomemory for mode 13h is located at segment 0xA000, so you need to assign this value to a segment register. Also, after the start of your program you are normally still in textmode, so you need to switch to the videomode. The following snippet does both:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13 &lt;br /&gt;
int 0x10     ; AH = 0 means : set video mode to AL = 0x13 (320 x 200 pixels in 256 colors)&lt;br /&gt;
push 0xA000  ; put value on the stack&lt;br /&gt;
pop es       ; pop the top stack value into segment register ES&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You're free to use any of the segment register / opcode combinations to write to the screen&lt;br /&gt;
* &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt;)&lt;br /&gt;
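&lt;br /&gt;
With &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; pointing to 0xA000, a pixel at column x and row y lives at offset 320*y+x. A minimal, not size-optimized sketch that sets a single pixel in the middle of the screen (the color 15, the wait for a keypress via int 0x16 and the switch back to text mode are additions for illustration):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	ax,0x0013		; AH = 0 : set video mode to AL = 0x13&lt;br /&gt;
int 	0x10&lt;br /&gt;
push 	0xA000&lt;br /&gt;
pop 	es				; ES = video memory of mode 13h&lt;br /&gt;
mov 	di,320*100+160	; offset of row 100, column 160&lt;br /&gt;
mov 	al,15			; color 15&lt;br /&gt;
stosb					; draw the pixel&lt;br /&gt;
xor 	ax,ax			; AH = 0 : wait for a keypress&lt;br /&gt;
int 	0x16&lt;br /&gt;
mov 	ax,0x0003		; back to text mode 3&lt;br /&gt;
int 	0x10&lt;br /&gt;
ret						; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;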
&lt;br /&gt;
Let's add some code that actually draws something on the screen. The following program occupies 23 bytes and draws a fullscreen XOR texture:&lt;br /&gt;
[[File:Mode13h-example-xor.png|left|bottom|thumb|mode13h-example-xor]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
push 0xa000&lt;br /&gt;
pop es&lt;br /&gt;
X: cwd			; &amp;quot;clear&amp;quot; DX (if AH &amp;lt; 0x7F)&lt;br /&gt;
mov ax,di		; get screen position into AX&lt;br /&gt;
mov bx,320		; get screen width into BX&lt;br /&gt;
div bx			; divide, to get row and column&lt;br /&gt;
xor ax,dx		; the famous XOR pattern&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
stosb			; finally, draw to the screen&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note that there is a different way of preparing the segment register, instead of :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
you can also do :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov ah,0xA0&lt;br /&gt;
mov es,ax&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
both variations occupy 4 bytes, but the latter is executable on processor architectures where ''push &amp;lt;word&amp;gt;'' is not available.&lt;br /&gt;
&lt;br /&gt;
==== Alternative way of pixel plotting and optimization ====&lt;br /&gt;
&lt;br /&gt;
Now let's optimize on the snippet. First, we can adapt the &amp;quot;LES&amp;quot; trick from the textmode section. We just exchange &lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
with:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;les bx,[bx]&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
to save two bytes. This works because BX is 0x0000 at start and thus accesses the region ''before'' our code, which is called the [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix]. The two bytes that are put into the segment register ES are bytes 2 and 3  = ''&amp;quot;Segment of the first byte beyond the memory allocated to the program&amp;quot;'' which is usually 0x9FFF. That is just off by one from our desired 0xA000. Unfortunately that means a 16 pixel offset, so if screen alignment means something to you, you can't use this optimization. Also, said two bytes are not always 0x9FFF; for example, if resident programs are above the ''&amp;quot;memory allocated to the program&amp;quot;'' (FreeDos), their content is overwritten if we take their base as our video memory base.&lt;br /&gt;
&lt;br /&gt;
Second, we can use an alternative way of putting pixels to the screen, subfunction AH = 0x0C of int 0x10. Also, instead of constructing row and column from the screen pointer, we can use some interesting properties of the screenwidth regarding logical operations. This results in the following 16 byte program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             ; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov al,0x13&lt;br /&gt;
X: int 0x10		; set video mode AND draw pixel&lt;br /&gt;
inc cx			; increment column&lt;br /&gt;
mov ax,cx		; get column in AH&lt;br /&gt;
xor al,ah		; the famous XOR pattern&lt;br /&gt;
mov ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first optimization is the double usage of the same &amp;quot;int 0x10&amp;quot; for setting the video mode and drawing the pixel. The subfunction AH = 0x0C expects row and column in DX and CX. Since the screen width is 320, which is 5 * 64, we can ignore the row and just work with the column, as long as we use logical operations and only use bits 0-5 of the result. The subfunction AH = 0x0C allows for unbounded column values in CX (up to 65535) and correctly &amp;quot;wraps&amp;quot; them internally without an error.&lt;br /&gt;
&lt;br /&gt;
The major drawback of the &amp;quot;subfunction AH = 0x0C&amp;quot; approach is speed. While DOSBox and many emulators perform just fine, real hardware draws much more slowly, depending on the video BIOS.&lt;br /&gt;
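&lt;br /&gt;
For comparison, here is a minimal sketch that plots a single pixel with explicit coordinates through the same BIOS call. The position (160,100), the color 15, the key wait via &amp;lt;code&amp;gt;int 0x16&amp;lt;/code&amp;gt; and the switch back to text mode are illustrative choices, not taken from the programs above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov ax,0x0013	; AH = 0x00 set video mode, AL = 0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
mov ax,0x0C0F	; AH = 0x0C write pixel, AL = 15 (white in the default palette)&lt;br /&gt;
xor bx,bx		; BH = display page 0&lt;br /&gt;
mov cx,160		; column&lt;br /&gt;
mov dx,100		; row&lt;br /&gt;
int 0x10		; draw the pixel&lt;br /&gt;
xor ax,ax		; AH = 0x00 wait for a key press&lt;br /&gt;
int 0x16&lt;br /&gt;
mov ax,0x0003	; back to 80x25 text mode&lt;br /&gt;
int 0x10&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This spends bytes on everything the tiny version gets for free, which is exactly why the 16 byte program reuses one &amp;lt;code&amp;gt;int 0x10&amp;lt;/code&amp;gt; and never touches DX again.&lt;br /&gt;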
&lt;br /&gt;
==== Basic Animation and user interaction ====&lt;br /&gt;
&lt;br /&gt;
Now let's add the convenient check for the ESC key and also add a simple animation. The &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; register is used as frame counter and incremented after the pixel counter &amp;lt;code&amp;gt;CX&amp;lt;/code&amp;gt; has run through all 65536 values via &amp;lt;code&amp;gt;LOOP&amp;lt;/code&amp;gt;. This frame counter is then added to the column. The resulting program is now 25 bytes in size:&lt;br /&gt;
&lt;br /&gt;
[[File:Xor anim example.gif|thumb]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; AL = low byte, AH = high byte of the pixel counter&lt;br /&gt;
add		ax,di		; offset by framecounter&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Producing sound ==&lt;br /&gt;
&lt;br /&gt;
=== MIDI notes ===&lt;br /&gt;
&lt;br /&gt;
Creating sounds with MIDI requires a bit more preparation, but once you're familiar with it, it's even simpler than PC speaker sound, because you don't have to ''create'' the sound, you just have to ''trigger'' it. To start, you need to know that there are many [https://en.wikipedia.org/wiki/General_MIDI different instruments] and a [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined way of communication]. Think of the MIDI interface as a keyboard: you tell it which key to press, which knob to twist, and sometimes how hard.&lt;br /&gt;
&lt;br /&gt;
==== Single piano note ====&lt;br /&gt;
&lt;br /&gt;
Let's start off with a simple example, playing a single note on the piano:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh		;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h	;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 		;	send !&lt;br /&gt;
dec 	dx			;	MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,	90h		;	send note on channel ZERO - command&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
mov 	al,	56h		;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al		; 	send !&lt;br /&gt;
mov 	al,	67h		;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
ret					; 	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In short: you turn your keyboard on (switching to UART mode), then press a KEY with a certain VOLUME on channel ZERO, then exit. Apart from switching to UART mode, all of this communication uses the port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt;. This example works on DOSBox but not on Windows XP NTVDM: for [http://www.pouet.net/topic.php?which=10720&amp;amp;page=1 still unclear reasons], the NTVDM emulation delays the note until it receives a second one. The simplest way of at least hearing something is to play notes repeatedly, as in the following example:&lt;br /&gt;
&lt;br /&gt;
==== Repeated piano notes ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh			;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h		;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 			;	send !&lt;br /&gt;
dec 	dx				;	MIDI Data Port ( = 330h )&lt;br /&gt;
main:&lt;br /&gt;
mov 	al,	90h			;	send note on channel 0 - command&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
mov 	al,	56h			;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al			; 	send !&lt;br /&gt;
mov 	al,	67h			;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,	[fs:0x46c]	;	read timer&lt;br /&gt;
test 	al,	3			;	skip 3 values&lt;br /&gt;
jnz 	_wait			;&lt;br /&gt;
inc 	byte [fs:0x46c]	; 	inc manually to prevent retrigger&lt;br /&gt;
in 		al,	0x60		;	check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			;	no? repeat&lt;br /&gt;
ret						;	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example, extended to synchronize against the timer and to check for the ESC key. It works on both DOSBox and Windows XP NTVDM and plays a note on the piano repeatedly.&lt;br /&gt;
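&lt;br /&gt;
The byte at &amp;lt;code&amp;gt;0x46c&amp;lt;/code&amp;gt; is the low byte of the BIOS tick counter (0040:006C), which advances about 18.2 times per second. The examples read it through &amp;lt;code&amp;gt;FS&amp;lt;/code&amp;gt; and therefore rely on &amp;lt;code&amp;gt;FS&amp;lt;/code&amp;gt; being zero at program start. The following is a minimal sketch that does not rely on this, assuming a 386 or later (older CPUs have no &amp;lt;code&amp;gt;FS&amp;lt;/code&amp;gt;); the tick count of 18, roughly one second, is an arbitrary choice:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
push 	word 0			; set FS to segment 0 explicitly ...&lt;br /&gt;
pop 	fs				; ... instead of trusting its start value&lt;br /&gt;
mov 	cx,18			; number of ticks to wait (about one second)&lt;br /&gt;
next_tick:&lt;br /&gt;
mov 	al,[fs:0x46c]	; remember current tick value&lt;br /&gt;
wait_tick:&lt;br /&gt;
cmp 	al,[fs:0x46c]	; has the BIOS advanced the counter?&lt;br /&gt;
je 		wait_tick		; no: keep waiting&lt;br /&gt;
loop 	next_tick		; yes: count it and wait for the next one&lt;br /&gt;
ret						; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The explicit setup costs extra bytes, so size-optimized code usually accepts the assumption instead.&lt;br /&gt;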
&lt;br /&gt;
==== Repeated notes of other instruments ====&lt;br /&gt;
&lt;br /&gt;
While hitting one key repeatedly is not really interesting in general, it can produce decent results when doing it with the right instrument activated, like it was done with the &amp;quot;French Horn&amp;quot; in [https://www.youtube.com/watch?v=tsbxdjP9l50 Timelord (by Baudsurfer)]. Apart from just changing the instrument, let's also optimize a little bit on the size: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
start:	&lt;br /&gt;
mov 	si,data			; init pointer for outsb&lt;br /&gt;
mov 	dx,330h			; change to data port&lt;br /&gt;
mov 	cl,5			; play our music data&lt;br /&gt;
rep 	outsb			; (see below at &amp;quot;data&amp;quot; label)&lt;br /&gt;
inc 	dx				; switch to control port&lt;br /&gt;
outsb					; change to mode &amp;quot;UART&amp;quot;&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,[fs:0x46c]	; read timer value&lt;br /&gt;
cmp	 	al,bl			; wait until...&lt;br /&gt;
jz 		_wait			; ...timer value changed&lt;br /&gt;
xchg 	bx,ax			; save old timer value&lt;br /&gt;
in 		al,0x60			; check for ...&lt;br /&gt;
dec 	al				; ... ESC key&lt;br /&gt;
jnz 	start			; otherwise : repeat &lt;br /&gt;
dec 	dx				; switch to data port again&lt;br /&gt;
outsb					; stop all ... &lt;br /&gt;
outsb					; ... notes played ...&lt;br /&gt;
outsb					; ... on channel 3&lt;br /&gt;
data: 	&lt;br /&gt;
db		0c3h			; change instrument on channel 3&lt;br /&gt;
						; (is also &amp;quot;RET&amp;quot; for program quit)&lt;br /&gt;
db		60				; to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h				; play note on channel 3&lt;br /&gt;
db		35				; deep &amp;quot;b&amp;quot; = note number 35&lt;br /&gt;
db		127				; play with volume = 127&lt;br /&gt;
db		3fh				; change mode to &amp;quot;UART&amp;quot;&lt;br /&gt;
db		0b3h			; control change on channel 3&lt;br /&gt;
db		123				; Channel Mode Message &amp;quot;All Notes Off&amp;quot;&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example with a different instrument. The MIDI data now lives in a data section, the output uses &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;out dx,al&amp;lt;/code&amp;gt;, and the program ends with a special command to turn ''All Notes Off''. This is necessary for all instruments which don't stop by themselves: in all the previous examples, we sent the &amp;quot;NOTE ON&amp;quot; command (&amp;lt;code&amp;gt;9Xh&amp;lt;/code&amp;gt;), but never the corresponding &amp;quot;NOTE OFF&amp;quot; command (&amp;lt;code&amp;gt;8Xh&amp;lt;/code&amp;gt;). Also, the note is now played on channel &amp;lt;code&amp;gt;03h&amp;lt;/code&amp;gt;, since the command byte for changing an instrument on channel 3 is &amp;lt;code&amp;gt;0C3h&amp;lt;/code&amp;gt;, which is also the opcode of &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; and can be reused. If this looks complicated at first, always remember: it's just sending [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined commands] to a single port.&lt;br /&gt;
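&lt;br /&gt;
To illustrate the &amp;quot;NOTE OFF&amp;quot; command (&amp;lt;code&amp;gt;8Xh&amp;lt;/code&amp;gt;) for a single note, here is a sketch that is not optimized for size. It starts the piano note from the first example, waits for a key press and then releases exactly that note. The key wait via &amp;lt;code&amp;gt;int 0x16&amp;lt;/code&amp;gt; and the release velocity 40h are illustrative assumptions:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	dx,331h		; MIDI Control Port&lt;br /&gt;
mov 	al,3Fh		; set UART mode - command&lt;br /&gt;
out 	dx,al&lt;br /&gt;
dec 	dx			; MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,90h		; NOTE ON, channel 0&lt;br /&gt;
out 	dx,al&lt;br /&gt;
mov 	al,56h		; data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,al&lt;br /&gt;
mov 	al,67h		; data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,al&lt;br /&gt;
xor 	ax,ax		; AH = 0 : wait for a key press&lt;br /&gt;
int 	16h&lt;br /&gt;
mov 	al,80h		; NOTE OFF, channel 0&lt;br /&gt;
out 	dx,al&lt;br /&gt;
mov 	al,56h		; data byte 1 : the same KEY as above&lt;br /&gt;
out 	dx,al&lt;br /&gt;
mov 	al,40h		; data byte 2 : release velocity (assumed value)&lt;br /&gt;
out 	dx,al&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''All Notes Off'' needs fewer bytes once several notes are sounding, which is why the size-optimized example prefers it.&lt;br /&gt;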
&lt;br /&gt;
==== The drum channel ====&lt;br /&gt;
&lt;br /&gt;
Now that you know there are different channels (16 overall) to play notes on, how would you like a channel &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt; specifically for 'Drums'? The following example plays a track of drum notes repeatedly, while further optimizing for size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
aas						; 3fh = &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
cwd						; 99h = &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
db 		42,38,42,35		; the drum notes (42 = hihat, 38 = snare, 35 = kick)&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
outsb					; send &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
mov		al,[fs:0x46c]	; read timer&lt;br /&gt;
test	al,3&lt;br /&gt;
jnz 	main			; skip 3 values&lt;br /&gt;
inc 	byte [fs:0x46c] ; inc manually to prevent retrigger&lt;br /&gt;
&lt;br /&gt;
inc 	bx				; increment note counter&lt;br /&gt;
and 	bl,3			; truncate to 4 notes&lt;br /&gt;
mov 	al,[bx+si]		; read the drumnote (see above)&lt;br /&gt;
&lt;br /&gt;
out 	dx,al			; send the drum&lt;br /&gt;
mov 	al,127			; set volume to maximum&lt;br /&gt;
out 	dx,al			; send volume&lt;br /&gt;
&lt;br /&gt;
in 		al,0x60			; check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			; no? repeat&lt;br /&gt;
ret						; otherwise quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In contrast to the previous example, the data section is now at the start. That means it's executed as code! This is dangerous of course, but it also saves the bytes for assigning the &amp;lt;code&amp;gt;DATA&amp;lt;/code&amp;gt; offset to &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;. After the two initial &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instructions have advanced &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;, it stays fixed, and further reads from the drum data use &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;. Unless you know exactly what you are doing, don't use this kind of &amp;quot;executing data&amp;quot; optimization! In this special case &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;CWD&amp;lt;/code&amp;gt; do no harm, and the drum notes &amp;lt;code&amp;gt;42,38,42,35&amp;lt;/code&amp;gt; are carefully crafted and arranged to form the instruction &amp;lt;code&amp;gt;SUB AH,[232Ah]&amp;lt;/code&amp;gt;, which does no harm either.&lt;br /&gt;
&lt;br /&gt;
==== Creating basic sound effects in 16 bytes ====&lt;br /&gt;
&lt;br /&gt;
In the [https://en.wikipedia.org/wiki/General_MIDI#Sound_Effects MIDI repertoire], there are already some sound effects available. With the &amp;quot;data execution&amp;quot; optimization above, let's fire a gunshot in 16 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas&lt;br /&gt;
les 	di,[bx-0x6C]&lt;br /&gt;
xor 	al,127&lt;br /&gt;
mov 	dx,0x331&lt;br /&gt;
outsb&lt;br /&gt;
dec 	dx&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first three instructions don't do anything useful as code (they do execute, but we don't care about the result); they are just MIDI data.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
the command for switching to &amp;quot;UART&amp;quot; mode, for sending to port &amp;lt;code&amp;gt;0x331&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;les di,[bx-0x6C]&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0xc4&amp;lt;/code&amp;gt; (change instrument on channel 4), &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; (change it to &amp;quot;Gunshot&amp;quot;), &amp;lt;code&amp;gt;0x94&amp;lt;/code&amp;gt; (play note on channel 4)&lt;br /&gt;
* &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0x34&amp;lt;/code&amp;gt; (play THIS note), &amp;lt;code&amp;gt;0x7f&amp;lt;/code&amp;gt; (play it THAT loud, 127 is also the allowed maximum)&lt;br /&gt;
&lt;br /&gt;
The rest of the code basically just sends the MIDI data to the interface and exits. You can change the kind of sound effect by modifying the ModR/M byte of the second instruction (change BX to BP or SI etc.). Changing the volume is simpler: change the byte value of &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; to any value between 0 and 127.&lt;br /&gt;
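&lt;br /&gt;
As an illustration of the register swap, the following sketch spells out the data bytes with &amp;lt;code&amp;gt;db&amp;lt;/code&amp;gt; and uses the encoding of &amp;lt;code&amp;gt;les di,[bp-0x6C]&amp;lt;/code&amp;gt;, whose ModR/M byte is &amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt; (&amp;quot;Applause&amp;quot; in General MIDI) instead of &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt;. By the same logic &amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; gives &amp;lt;code&amp;gt;0x7C&amp;lt;/code&amp;gt; (&amp;quot;Telephone Ring&amp;quot;) and &amp;lt;code&amp;gt;[di]&amp;lt;/code&amp;gt; gives &amp;lt;code&amp;gt;0x7D&amp;lt;/code&amp;gt; (&amp;quot;Helicopter&amp;quot;). Like the original, it assumes that &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; points to the start of the code and that the stray memory read of the executed data is harmless:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
db 		0x3F			; aas : set UART mode command&lt;br /&gt;
db 		0xC4,0x7E,0x94	; les di,[bp-0x6C] : change instrument, Applause, play note&lt;br /&gt;
db 		0x34,0x7F		; xor al,127 : note 0x34, volume 127&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send the UART mode command&lt;br /&gt;
dec 	dx				; switch to MIDI Data Port&lt;br /&gt;
mov 	cl,5			; five data bytes follow&lt;br /&gt;
rep 	outsb			; send them&lt;br /&gt;
ret						; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;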
&lt;br /&gt;
==== Procedural MIDI music generation in 64 bytes ====&lt;br /&gt;
&lt;br /&gt;
With all the above you should now be able to follow the next snippet [http://www.pouet.net/prod.php?which=66313 Descent OST], a small framework for procedural MIDI sound generation in 64 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; &amp;quot;Descent OST&amp;quot;, a 62 byte MIDI music player for MSDOS&lt;br /&gt;
; created by HellMood/DESiRE (C)2015&lt;br /&gt;
; this is the extracted music routine used in &amp;quot;Descent&amp;quot; &lt;br /&gt;
; it is a procedural MIDI algorithm which sticks a&lt;br /&gt;
; subroutine to the DOS timer (interrupt 0x1C)&lt;br /&gt;
; the registered routine is called ~18.2 times per second&lt;br /&gt;
&lt;br /&gt;
; developed for use with &amp;quot;NASM&amp;quot;, &lt;br /&gt;
; see http://sourceforge.net/projects/nasm/files/&lt;br /&gt;
&lt;br /&gt;
%define		 	rhythmPattern			0b11&lt;br /&gt;
; with &amp;quot;rhythmPattern&amp;quot;, you define how often a note is played&lt;br /&gt;
; generally, higher values and values containing many &amp;quot;ones&amp;quot;&lt;br /&gt;
; in binary representation, will result in faster play&lt;br /&gt;
; for example &amp;quot;0b11&amp;quot; will play every 4th note&lt;br /&gt;
%define 		baseInstrument			9&lt;br /&gt;
; defines the number of the first instrument used.&lt;br /&gt;
; see http://www.midi.org/techspecs/gm1sound.php for a full list&lt;br /&gt;
; keep in mind, that there are only a few instrument blocks&lt;br /&gt;
; whose sounds stop after a while. You won't get good results&lt;br /&gt;
; from strings etc. just a mess of overlaid sounds&lt;br /&gt;
%define 		numInstruments			7&lt;br /&gt;
; defines how many instruments are used. keep in mind, that&lt;br /&gt;
; &amp;quot;rhythmPattern&amp;quot; has influence on the picked instrument. the instruments&lt;br /&gt;
; from 9 to 9+7 are called &amp;quot;chromatic percussion&amp;quot;&lt;br /&gt;
%define 		noteStep 				5&lt;br /&gt;
; defines the basic difference from one note to the next. recommended&lt;br /&gt;
; values here are (mainly) 3,4 and 5 for music theoretic reasons&lt;br /&gt;
; but feel free to play around =)&lt;br /&gt;
%define 		noteRange				12&lt;br /&gt;
; after adding the noteStep, the note value is &amp;quot;mod&amp;quot;ded with&lt;br /&gt;
; the &amp;quot;noteRange&amp;quot;. 12 means octave, which results in very harmonic&lt;br /&gt;
; scales&lt;br /&gt;
%define 		noteSpread				3&lt;br /&gt;
; the third step spreads the notes over the tonal spectrum, you may&lt;br /&gt;
; want to keep &amp;quot;noteSpread&amp;quot; * &amp;quot;noteRange&amp;quot; round about 30-60.&lt;br /&gt;
%define 		baseNote				40&lt;br /&gt;
; the general tone height of everything. some instruments don't play&lt;br /&gt;
; arbitrary deep notes correctly, and too high notes cause ear bleeding&lt;br /&gt;
; adjust with care ;)&lt;br /&gt;
&lt;br /&gt;
; WARNING : after exiting the program, the timer interrupt is still active&lt;br /&gt;
; i strongly recommend to reboot or restart DOSBOX!&lt;br /&gt;
&lt;br /&gt;
; ADVICE : Yes, there are music- and math-related things going on here&lt;br /&gt;
; if you're not into music theory, circle of fifths, and the like, it may be&lt;br /&gt;
; better to just play around with the parameters, rather than understanding them&lt;br /&gt;
; just change stuff slowly, and eventually you will get &amp;quot;there&amp;quot;&lt;br /&gt;
; wherever that is ;)&lt;br /&gt;
&lt;br /&gt;
org 0x100&lt;br /&gt;
xchg cx,ax					; set our second counter to zero&lt;br /&gt;
mov dx,music			&lt;br /&gt;
mov ax,0x251C				; mode &amp;quot;0x25&amp;quot; , &amp;quot;0x1C&amp;quot; = change address of timer interrupt&lt;br /&gt;
int 0x21					; see http://mprolab.teipir.gr/vivlio80X86/dosints.pdf&lt;br /&gt;
S:&lt;br /&gt;
in ax,0x60					; wait for &amp;quot;ESC&amp;quot; press, then exit&lt;br /&gt;
dec al						; music plays on anyway, this is just for&lt;br /&gt;
jnz S						; keeping the music exactly as in &amp;quot;Descent&amp;quot;&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
music:&lt;br /&gt;
inc bx						; increment our first counter (starts at zero)&lt;br /&gt;
test bl,byte rhythmPattern	; play a note every 4th time tick&lt;br /&gt;
jnz nomusic					; otherwise do nothing&lt;br /&gt;
mov dx,0x331&lt;br /&gt;
mov al,0x3F&lt;br /&gt;
out dx,al&lt;br /&gt;
dec dx&lt;br /&gt;
mov al,0xC0					; change instrument on channel 0...&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx					&lt;br /&gt;
aam byte numInstruments&lt;br /&gt;
add al,byte baseInstrument	; ...to this instrument&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,0x90					; play note on channel 0 ...&lt;br /&gt;
out dx,al&lt;br /&gt;
add cl,byte noteStep			 &lt;br /&gt;
mov al,cl&lt;br /&gt;
aam byte noteRange&lt;br /&gt;
imul ax,noteSpread&lt;br /&gt;
add al,baseNote				; ... play THIS note&lt;br /&gt;
out dx,al&lt;br /&gt;
neg al						; (play deeper notes louder = add bass)&lt;br /&gt;
add al,127+39				; ... play it THAT loud&lt;br /&gt;
out dx,al&lt;br /&gt;
nomusic:&lt;br /&gt;
iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
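&lt;br /&gt;
As the warning in the listing says, the timer hook stays active after the program exits. One possible way around this, at the cost of extra bytes, is to save the old vector with DOS function 0x35 and restore it before returning. The following is a sketch of that idea and not part of &amp;quot;Descent&amp;quot;; the handler is a hypothetical stub so that the example stays self-contained:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 0x100&lt;br /&gt;
mov ax,0x351C				; get current handler of interrupt 0x1C into ES:BX&lt;br /&gt;
int 0x21&lt;br /&gt;
push es						; remember old segment ...&lt;br /&gt;
push bx						; ... and old offset&lt;br /&gt;
mov dx,music				; DS:DX = our handler (DS = CS in a COM file)&lt;br /&gt;
mov ax,0x251C				; set handler of interrupt 0x1C&lt;br /&gt;
int 0x21&lt;br /&gt;
S:&lt;br /&gt;
in al,0x60					; wait for ESC press&lt;br /&gt;
dec al&lt;br /&gt;
jnz S&lt;br /&gt;
pop dx						; old offset&lt;br /&gt;
pop ds						; old segment, DS:DX = old handler&lt;br /&gt;
mov ax,0x251C				; put the old handler back&lt;br /&gt;
int 0x21&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
music:&lt;br /&gt;
iret						; stub, a real handler would play notes here&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that function 0x35 overwrites BX, so a handler that uses BX as a counter, like the one above, would have to initialize it after this call.&lt;br /&gt;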
&lt;br /&gt;
=== PC Speaker ===&lt;br /&gt;
&lt;br /&gt;
Producing sound with the PC speaker is incredibly easy. Basically, you set a system timer to a desired frequency, then connect this timer to the speaker. [http://wiki.osdev.org/PC_Speaker The PC Speaker Article] from the OSDEV Wiki has the details. An example of a tiny intro that uses PC speaker music is [http://www.pouet.net/prod.php?which=67833 SpeaCore].&lt;br /&gt;
&lt;br /&gt;
==== Basic example with melody pattern ====&lt;br /&gt;
&lt;br /&gt;
A very optimized and dirty variant of producing sound with the speaker is this 12 byte snippet (sound routine from [http://www.pouet.net/prod.php?which=67829 the tiny intro &amp;quot;darkweb&amp;quot;]):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;hlt				; sync to timer1&lt;br /&gt;
inc bx			; increment our counter&lt;br /&gt;
mov ax,bx		; work with a copy&lt;br /&gt;
or al,0x4B      ; melody pattern + 2 LSB for speaker link&lt;br /&gt;
out 0x42,al		; set new countdown for timer2 (two passes)&lt;br /&gt;
out 0x61,al		; link timer2 to PC speaker (2 LSBs are 1)&lt;br /&gt;
jmp si			; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Instead of sending the low and high byte of our divisor directly in succession, we do it the &amp;quot;two pass&amp;quot; way. That reduces the number of possible frequencies to 255, which is still good enough for some rough sounds. Linking the timer to the PC speaker might not be obvious: normally you would read the value of port 0x61, set the two least significant bits and write the value back. You can save all of this if you just send the &amp;quot;two pass&amp;quot; value which you just used for the timer, provided that value already has the two least significant bits set (''or al,0x4B'' ensures this). Be aware that port 0x61 does many things apart from just connecting the timer to the speaker. A useful resource for ports in general is the [http://bochs.sourceforge.net/techspec/PORTS.LST Bochs Ports List]; for port 0x61 it displays:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
''0061	w	KB controller port B (ISA, EISA)   (PS/2 port A is at 0092)&lt;br /&gt;
&lt;br /&gt;
		system control port for compatibility with 8255&lt;br /&gt;
&lt;br /&gt;
		bit 7	(1= IRQ 0 reset )&lt;br /&gt;
&lt;br /&gt;
		bit 6-4    reserved&lt;br /&gt;
&lt;br /&gt;
		bit 3 = 1  channel check enable&lt;br /&gt;
&lt;br /&gt;
		bit 2 = 1  parity check enable&lt;br /&gt;
&lt;br /&gt;
		'''bit 1 = 1  speaker data enable'''&lt;br /&gt;
&lt;br /&gt;
		'''bit 0 = 1  timer 2 gate to speaker enable''' ''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So if you experience strange things with highly optimized PC speaker output, revert to the safe way. The described way works with real hardware and DOSBox. Unfortunately, both Oracle VirtualBox with MS-DOS 6.22 and Windows XP NTVDM seem not to emulate the PC speaker properly (Investigation and citation needed here!)&lt;br /&gt;
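&lt;br /&gt;
For reference, the &amp;quot;safe way&amp;quot; could look like the following sketch: program timer 2 explicitly, send low and high byte of the divisor in succession, and only touch the two lowest bits of port 0x61. The 440 Hz tone (divisor 1193182 / 440, about 2712) and the key wait via &amp;lt;code&amp;gt;int 0x16&amp;lt;/code&amp;gt; are illustrative choices:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov al,0xB6		; timer2, low byte then high byte, square wave&lt;br /&gt;
out 0x43,al		; write to the timer command port&lt;br /&gt;
mov ax,2712		; divisor for about 440 Hz&lt;br /&gt;
out 0x42,al		; send low byte of the divisor&lt;br /&gt;
mov al,ah&lt;br /&gt;
out 0x42,al		; send high byte of the divisor&lt;br /&gt;
in al,0x61		; read the current port value&lt;br /&gt;
or al,3			; set only the 2 LSBs&lt;br /&gt;
out 0x61,al		; link timer2 to PC speaker&lt;br /&gt;
xor ax,ax		; AH = 0 : wait for a key press&lt;br /&gt;
int 0x16&lt;br /&gt;
in al,0x61&lt;br /&gt;
and al,0xFC		; clear the 2 LSBs again&lt;br /&gt;
out 0x61,al		; speaker off&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;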
&lt;br /&gt;
==== Simple deep sound in 8 bytes ====&lt;br /&gt;
&lt;br /&gt;
One of the smallest possible PC speaker sound generators might be this 8 byte snippet:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h -&amp;gt; AL = 0xFF&lt;br /&gt;
out 42h,al		; change divisor of timer2 to 0xFFFF&lt;br /&gt;
out 42h,al		; resulting in a very low frequency&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Note: This may fail on actual older hardware, as there might not be time for the bus to settle between the consecutive &amp;lt;code&amp;gt;out 42h,al&amp;lt;/code&amp;gt; statements.)&lt;br /&gt;
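&lt;br /&gt;
A common idiom for giving the bus some time is a short jump between the two port writes. The following 10 byte variant is a sketch of that idiom; whether a single jump is enough on a particular machine is an assumption that has not been verified here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h, AL becomes 0xFF&lt;br /&gt;
out 42h,al		; first byte of the divisor&lt;br /&gt;
jmp short $+2	; short delay: jump to the next instruction&lt;br /&gt;
out 42h,al		; second byte of the divisor&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;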
&lt;br /&gt;
=== COVOX output (aka LPT DAC) ===&lt;br /&gt;
&lt;br /&gt;
It is possible to output to an LPT-connected DAC (&amp;quot;[https://en.wikipedia.org/wiki/Covox_Speech_Thing COVOX]&amp;quot;) in a tinyprog.  A proof-of-concept example is [http://www.pouet.net/prod.php?which=57991 Express Train 125] which uses COVOX for sound generation.  &lt;br /&gt;
&lt;br /&gt;
This method follows the [http://countercomplex.blogspot.com/2011/10/algorithmic-symphonies-from-one-line-of.html &amp;quot;audio from one line of C code&amp;quot;] style of sound generation.  [http://www.pouet.net/topic.php?which=8357&amp;amp;page=1 A pouet discussion] exists for more background information.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Output&amp;diff=294</id>
		<title>Output</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Output&amp;diff=294"/>
				<updated>2016-08-20T01:26:21Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Simple deep sound in 8 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Outputting to the screen ==&lt;br /&gt;
&lt;br /&gt;
First, be aware of the [http://img.tfd.com/cde/MEMMAP.GIF MSDOS memory layout]&lt;br /&gt;
&lt;br /&gt;
=== Outputting in Textmode (80x25) ===&lt;br /&gt;
&lt;br /&gt;
==== Hello World / High Level function ====&lt;br /&gt;
&lt;br /&gt;
Here's an obligatory &amp;quot;Hello World&amp;quot; program in text mode, using a [http://www.ctyme.com/intr/rb-2562.htm &amp;quot;high level&amp;quot; MS-DOS function]. With a small optimization already included (using &amp;lt;code&amp;gt;XCHG BP,AX&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;MOV AH,09h&amp;lt;/code&amp;gt;), this snippet is 20 bytes in size.&lt;br /&gt;
&lt;br /&gt;
[[File:Hello world.png|thumb|Hello World!]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
org 100h			; we start at CS:100h&lt;br /&gt;
xchg 	bp,ax		; already a trick, puts 09h into AH&lt;br /&gt;
mov		dx,text		; DX expects the address of a $ terminated string&lt;br /&gt;
int 	21h			; call the DOS function (AH = 09h)&lt;br /&gt;
ret					; quit&lt;br /&gt;
text:&lt;br /&gt;
db 'Hello World!$'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Of course, this gets shorter with each byte you remove from the text itself. Now let's look into arbitrary screen access. Right after the start of your program you are in mode 3, that is 80x25 in 16 colors. See the [http://www.columbia.edu/~em36/wpdos/videomodes.txt Video Modes List]. [[File:Drawchar example.png|thumb|draw char example]] So, to show something on the screen, you need to set a segment register to 0xB800, then write values into this segment. &lt;br /&gt;
&lt;br /&gt;
==== Low level access ====&lt;br /&gt;
&lt;br /&gt;
The following three snippets showcase how to draw a red smiley in three different ways. All example snippets are meant to be standalone programs, starting with the first instruction and nothing before it. The target coordinate (40,12) is about the middle of the screen. We need a multiplier of 2 since one char needs two bytes in memory (char and color are one byte each). The high byte 0x04 means red (4) on black (0), while the low byte 0x01 is the first ASCII char - a smiley.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop ds&lt;br /&gt;
mov bx,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
mov [bx],ax&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop es&lt;br /&gt;
mov di,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
stosw&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push ss&lt;br /&gt;
push 0xb800&lt;br /&gt;
pop ss&lt;br /&gt;
mov sp,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
push ax&lt;br /&gt;
pop ss&lt;br /&gt;
int 0x20&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You might notice that the ''push &amp;lt;word&amp;gt;'' + ''pop seg_reg'' combination is always the same and occupies four bytes altogether. If correct alignment is not important to you and you really just want ''any'' pointer to the screen, there is another way to get a valid one:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
les bx,[si]&lt;br /&gt;
nop&lt;br /&gt;
stosb&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That's also four bytes, but it already has the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; opcode (for putting something onto the screen) integrated and even leaves one slot free for another one-byte instruction. It works because &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; initially points to the start of our code, and &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; is encoded as &amp;lt;code&amp;gt;0AAh&amp;lt;/code&amp;gt;. After the first instruction, the segment register &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; contains the value &amp;lt;code&amp;gt;0AA90h&amp;lt;/code&amp;gt;. If you repeatedly write something to the screen with &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; you will eventually reach the &amp;lt;code&amp;gt;0B800h&amp;lt;/code&amp;gt; segment and chars will appear on the screen. With a careful selection of the free one-byte opcode you can also reintroduce some alignment. This also works with the &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt; opcode &amp;lt;code&amp;gt;0ABh&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Alternative high level functions  ====&lt;br /&gt;
&lt;br /&gt;
Besides the direct way of accessing memory, there are also other ways of putting characters on the screen, for example:&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-4124.htm INT 29h]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2558.htm INT 21h AH=6]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2562.htm INT 21h AH=9]&lt;br /&gt;
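&lt;br /&gt;
As a minimal sketch of the first entry, INT 29h prints the character in &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt; at the cursor position. The two letters are an arbitrary example; the program is 9 bytes in size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	al,'H'		; character to print&lt;br /&gt;
int 	29h			; fast console output&lt;br /&gt;
mov 	al,'i'		; next character&lt;br /&gt;
int 	29h&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Since no string terminator and no pointer are needed, this tends to pay off for very short texts or for characters that are computed at runtime.&lt;br /&gt;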
&lt;br /&gt;
=== Outputting in mode 13h (320x200) ===&lt;br /&gt;
&lt;br /&gt;
==== Basic pixel output ====&lt;br /&gt;
&lt;br /&gt;
The videomemory for mode 13h is located at segment 0xA000, so you need to assign this value to a segment register. Also, after the start of your program you are normally still in textmode, so you need to switch to the videomode. The following snippet does both:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13 &lt;br /&gt;
int 0x10     ; AH = 0 means : set video mode to AL = 0x13 (320 x 200 pixels in 256 colors)&lt;br /&gt;
push 0xA000  ; put value on the stack&lt;br /&gt;
pop es       ; pop the top stack value into segment register ES&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You're free to use any of the segment register / opcode combinations to write to the screen:&lt;br /&gt;
* &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt;)&lt;br /&gt;
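&lt;br /&gt;
As a small sketch of the &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; / &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; combination, the following program plots a single pixel. The offset of a pixel is row * 320 + column; the position (160,100), the color 15, the key wait via &amp;lt;code&amp;gt;int 0x16&amp;lt;/code&amp;gt; and the switch back to text mode are illustrative choices:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov ax,0x0013		; set video mode 0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
push 0xA000			; video memory segment ...&lt;br /&gt;
pop es				; ... into ES&lt;br /&gt;
mov di,100*320+160	; offset = row * 320 + column&lt;br /&gt;
mov al,15			; color 15 (white in the default palette)&lt;br /&gt;
stosb				; write the pixel to ES:DI&lt;br /&gt;
xor ax,ax			; AH = 0 : wait for a key press&lt;br /&gt;
int 0x16&lt;br /&gt;
mov ax,0x0003		; back to 80x25 text mode&lt;br /&gt;
int 0x10&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;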
&lt;br /&gt;
Let's add some code that actually draws something on the screen. The following program occupies 23 bytes and draws a fullscreen XOR texture:&lt;br /&gt;
[[File:Mode13h-example-xor.png|left|bottom|thumb|mode13h-example-xor]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
push 0xa000&lt;br /&gt;
pop es&lt;br /&gt;
X: cwd			; &amp;quot;clear&amp;quot; DX (if AH &amp;lt; 0x80)&lt;br /&gt;
mov ax,di		; get screen position into AX&lt;br /&gt;
mov bx,320		; get screen width into BX&lt;br /&gt;
div bx			; divide, to get row and column&lt;br /&gt;
xor ax,dx		; the famous XOR pattern&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
stosb			; finally, draw to the screen&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note that there is a different way of preparing the segment register. Instead of:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
you can also do:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov ah,0xA0&lt;br /&gt;
mov es,ax&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Both variations occupy 4 bytes, but the latter also runs on processors where ''push &amp;lt;word&amp;gt;'' is not available (the 8086 and 8088).&lt;br /&gt;
&lt;br /&gt;
==== Alternative way of pixel plotting and optimization ====&lt;br /&gt;
&lt;br /&gt;
Now let's optimize the snippet. First, we can adapt the &amp;quot;LES&amp;quot; trick from the textmode section. We just replace &lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
with:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;les bx,[bx]&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
to save two bytes. This works because BX is 0x0000 at start, so the instruction reads from the region ''before'' our code, which is called the [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix]. The two bytes that are put into the segment register ES are bytes 2 and 3  = ''&amp;quot;Segment of the first byte beyond the memory allocated to the program&amp;quot;'', which is usually 0x9FFF. That is just one short of our desired 0xA000. Unfortunately, one segment step is 16 bytes, which means a 16 pixel offset, so if screen alignment matters to you, you can't use this optimization. Also, these two bytes are not always 0x9FFF; for example, if resident programs sit above the ''&amp;quot;memory allocated to the program&amp;quot;'' (FreeDOS), their content is overwritten when we take their base as our video memory base.&lt;br /&gt;
&lt;br /&gt;
Second, we can use an alternative way of putting pixels on the screen: subfunction AH = 0x0C of int 0x10. Also, instead of constructing row and column from the screen pointer, we can exploit some interesting properties of the screen width with respect to logical operations. This results in the following 16 byte program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             ; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov al,0x13&lt;br /&gt;
X: int 0x10		; set video mode AND draw pixel&lt;br /&gt;
inc cx			; increment column&lt;br /&gt;
mov ax,cx		; get column in AX&lt;br /&gt;
xor al,ah		; the famous XOR pattern&lt;br /&gt;
mov ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first optimization is using the same &amp;quot;int 0x10&amp;quot; both for setting the video mode and for drawing the pixel. The subfunction AH = 0x0C expects the row in DX and the column in CX. Since the screen width is 320, which is 5 * 64, adding whole rows never changes the lowest six bits of a screen position, so we can ignore the row and just work with the column, as long as we use logical operations and only rely on bits 0-5 of the result. The subfunction AH = 0x0C accepts unbounded column values in CX (up to 65535) and correctly &amp;quot;wraps&amp;quot; them internally without an error.&lt;br /&gt;
&lt;br /&gt;
The major drawback of the &amp;quot;subfunction AH = 0x0C&amp;quot; approach is the performance loss. While DosBox and many emulators perform just fine, real hardware will draw much more slowly, depending on the video BIOS.&lt;br /&gt;
&lt;br /&gt;
==== Basic Animation and user interaction ====&lt;br /&gt;
&lt;br /&gt;
Now let's add the convenient check for the ESC key and also add a simple animation. The &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; register is used as frame counter and incremented after the pixel counter &amp;lt;code&amp;gt;CX&amp;lt;/code&amp;gt; ran through all 65536 values via &amp;lt;code&amp;gt;LOOP&amp;lt;/code&amp;gt;. This frame counter is then added to the column. The resulting program is now 25 bytes in size :&lt;br /&gt;
&lt;br /&gt;
[[File:Xor anim example.gif|thumb]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AX&lt;br /&gt;
add		ax,di		; offset by framecounter&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Producing sound ==&lt;br /&gt;
&lt;br /&gt;
=== MIDI notes ===&lt;br /&gt;
&lt;br /&gt;
Creating sounds with MIDI requires a bit more preparation, but once you're familiar with it, it's even simpler than PC speaker sound, because you basically don't have to ''create'' the sound, you just have to ''trigger'' it. To start, you have to know that there are a lot of [https://en.wikipedia.org/wiki/General_MIDI different instruments] and a [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined way of communication]. Think of the MIDI interface as a keyboard: you tell it which button or key you want to press, which knob to twist, and sometimes how hard.&lt;br /&gt;
&lt;br /&gt;
==== Single piano note ====&lt;br /&gt;
&lt;br /&gt;
Let's start off with a simple example, playing a single note on the piano:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh		;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h	;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 		;	send !&lt;br /&gt;
dec 	dx			;	MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,	90h		;	send note on channel ZERO - command&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
mov 	al,	56h		;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al		; 	send !&lt;br /&gt;
mov 	al,	67h		;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
ret					; 	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In short: you turn your keyboard on (switching to UART mode), then press a KEY with a certain VOLUME on channel ZERO, then exit. Besides switching to UART mode, all this communication uses the port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt;. This example will work on DosBox but not on Windows XP NTVDM: for [http://www.pouet.net/topic.php?which=10720&amp;amp;page=1 still unclear reasons], the NTVDM emulation delays the note until it receives a second one. The simplest way of at least hearing something is to repeatedly play notes, like in the following example :&lt;br /&gt;
&lt;br /&gt;
==== Repeated piano notes ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh			;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h		;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 			;	send !&lt;br /&gt;
dec 	dx				;	MIDI Data Port ( = 330h )&lt;br /&gt;
main:&lt;br /&gt;
mov 	al,	90h			;	send note on channel 0 - command&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
mov 	al,	56h			;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al			; 	send !&lt;br /&gt;
mov 	al,	67h			;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,	[fs:0x46c]	;	read timer&lt;br /&gt;
test 	al,	3			;	skip 3 values&lt;br /&gt;
jnz 	_wait			;&lt;br /&gt;
inc 	byte [fs:0x46c]	; 	inc manually to prevent retrigger&lt;br /&gt;
in 		al,	0x60		;	check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			;	no? repeat&lt;br /&gt;
ret						;	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example, enriched with synchronizing against the timer and checking for the ESC key. It works on both DosBox and Windows XP NTVDM and plays a note on the Piano repeatedly.&lt;br /&gt;
&lt;br /&gt;
==== Repeated notes of other instruments ====&lt;br /&gt;
&lt;br /&gt;
While hitting one key repeatedly is not really interesting in general, it can produce decent results when doing it with the right instrument activated, like it was done with the &amp;quot;French Horn&amp;quot; in [https://www.youtube.com/watch?v=tsbxdjP9l50 Timelord (by Baudsurfer)]. Apart from just changing the instrument, let's also optimize a little bit on the size: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
start:	&lt;br /&gt;
mov 	si,data			; init pointer for outsb&lt;br /&gt;
mov 	dx,330h			; change to data port&lt;br /&gt;
mov 	cl,5			; play our music data&lt;br /&gt;
rep 	outsb			; (see below at &amp;quot;data&amp;quot; label)&lt;br /&gt;
inc 	dx				; switch to control port&lt;br /&gt;
outsb					; change to mode &amp;quot;UART&amp;quot;&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,[fs:0x46c]	; read timer value&lt;br /&gt;
cmp	 	al,bl			; wait until...&lt;br /&gt;
jz 		_wait			; ...timer value changed&lt;br /&gt;
xchg 	bx,ax			; save old timer value&lt;br /&gt;
in 		al,0x60			; check for ...&lt;br /&gt;
dec 	al				; ... ESC key&lt;br /&gt;
jnz 	start			; otherwise : repeat &lt;br /&gt;
dec 	dx				; switch to data port again&lt;br /&gt;
outsb					; stop all ... &lt;br /&gt;
outsb					; ... notes played ...&lt;br /&gt;
outsb					; ... on channel 3&lt;br /&gt;
data: 	&lt;br /&gt;
db		0c3h			; change instrument on channel 3&lt;br /&gt;
						; (is also &amp;quot;RET&amp;quot; for program quit)&lt;br /&gt;
db		60				; to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h				; play note on channel 3&lt;br /&gt;
db		35				; deep &amp;quot;b&amp;quot; = note number 35&lt;br /&gt;
db		127				; play with volume = 127&lt;br /&gt;
db		3fh				; change mode to &amp;quot;UART&amp;quot;&lt;br /&gt;
db		0b3h			; control change on channel 3&lt;br /&gt;
db		123				; Channel Mode Message &amp;quot;All Notes Off&amp;quot;&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example with a different instrument. The MIDI data now lives in a data section, the output is optimized by using &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;out dx,al&amp;lt;/code&amp;gt;, and the program finishes with a special command to turn ''All Notes Off''. This is necessary for all instruments which don't stop by themselves: in all the previous examples, we sent the &amp;quot;NOTE ON&amp;quot; command (&amp;lt;code&amp;gt;9Xh&amp;lt;/code&amp;gt;), but never the corresponding &amp;quot;NOTE OFF&amp;quot; command (&amp;lt;code&amp;gt;8Xh&amp;lt;/code&amp;gt;). Also, the note is now played on channel &amp;lt;code&amp;gt;03h&amp;lt;/code&amp;gt;, since the command byte for changing an instrument on channel 3 is &amp;lt;code&amp;gt;0C3h&amp;lt;/code&amp;gt;, which is also &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; and can be reused. If this looks complicated at first, always remember that it's just sending [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined commands] to a single port.&lt;br /&gt;
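&lt;br /&gt;
For an instrument that keeps sounding, the matching NOTE OFF can also be sent explicitly. The following is a minimal sketch rather than a size-optimized routine: it holds the French Horn note from the example above until a key is pressed and then releases it. The labels &amp;lt;code&amp;gt;note_on&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;note_off&amp;lt;/code&amp;gt; are names made up for this illustration, the release velocity of 0 is an arbitrary choice, and waiting via the BIOS keyboard service is an assumption of this sketch:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	dx,331h			; MIDI Control Port&lt;br /&gt;
mov 	al,3Fh			; set UART mode - command&lt;br /&gt;
out 	dx,al			; send !&lt;br /&gt;
dec 	dx				; MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	si,note_on		; init pointer for outsb&lt;br /&gt;
mov 	cx,5			; instrument change (2 bytes) + note on (3 bytes)&lt;br /&gt;
rep 	outsb			; send them, SI now points to note_off&lt;br /&gt;
xor 	ax,ax			; AH = 0 : BIOS &amp;quot;wait for keystroke&amp;quot;&lt;br /&gt;
int 	16h				; the note sounds until a key is pressed&lt;br /&gt;
mov 	cl,3			; CX is 0 after REP, so this makes CX = 3&lt;br /&gt;
rep 	outsb			; send the matching note off&lt;br /&gt;
ret						; quit&lt;br /&gt;
note_on:&lt;br /&gt;
db		0c3h,60			; change instrument on channel 3 to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h,35,127		; note on, channel 3, key 35, volume 127&lt;br /&gt;
note_off:&lt;br /&gt;
db		83h,35,0		; note off, channel 3, key 35, release velocity 0&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;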
&lt;br /&gt;
==== The drum channel ====&lt;br /&gt;
&lt;br /&gt;
Now that you're aware that there are different channels (16 overall) to play notes on, how would you like a channel &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt; specifically for 'Drums'? The following example plays a track of drum notes repeatedly, while further optimizing for size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
aas						; 3fh = &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
cwd						; 99h = &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
db 		42,38,42,35		; the drum notes (kick, snare, hihat)&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
outsb					; send &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
mov		al,[fs:0x46c]	; read timer&lt;br /&gt;
test	al,3&lt;br /&gt;
jnz 	main			; skip 3 values&lt;br /&gt;
inc 	byte [fs:0x46c] ; inc manually to prevent retrigger&lt;br /&gt;
&lt;br /&gt;
inc 	bx				; increment note counter&lt;br /&gt;
and 	bl,3			; truncate to 4 notes&lt;br /&gt;
mov 	al,[bx+si]		; read the drumnote (see above)&lt;br /&gt;
&lt;br /&gt;
out 	dx,al			; send the drum&lt;br /&gt;
mov 	al,127			; set volume to maximum&lt;br /&gt;
out 	dx,al			; send volume&lt;br /&gt;
&lt;br /&gt;
in 		al,0x60			; check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			; no? repeat&lt;br /&gt;
ret						; otherwise quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In contrast to the previous example, the data section is now at the start. That means it's executed as code! This is dangerous of course, but it also saves the bytes for assigning the &amp;lt;code&amp;gt;DATA&amp;lt;/code&amp;gt; offset to &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;. Once the two initial &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instructions have incremented &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; twice, it stays fixed, and all further reads from the drum data are done with &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;. The &amp;quot;play note&amp;quot; command byte &amp;lt;code&amp;gt;99h&amp;lt;/code&amp;gt; is sent only once; after that, only key and volume pairs follow, which MIDI accepts under its &amp;quot;running status&amp;quot; rule. Unless you know exactly what you are doing, don't use that kind of &amp;quot;executing data&amp;quot; optimization! In this special case &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;CWD&amp;lt;/code&amp;gt; do no harm, and the drum notes &amp;lt;code&amp;gt;42,38,42,35&amp;lt;/code&amp;gt; are carefully chosen and arranged to form the instruction &amp;lt;code&amp;gt;SUB AH,[232Ah]&amp;lt;/code&amp;gt;, which does no harm either.&lt;br /&gt;
&lt;br /&gt;
==== Creating basic sound effects in 16 bytes ====&lt;br /&gt;
&lt;br /&gt;
In the [https://en.wikipedia.org/wiki/General_MIDI#Sound_Effects MIDI repertoire], there are already some sound effects available. With the &amp;quot;data execution&amp;quot; optimization above, let's fire a gunshot in 16 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas&lt;br /&gt;
les 	di,[bx-0x6C]&lt;br /&gt;
xor 	al,127&lt;br /&gt;
mov 	dx,0x331&lt;br /&gt;
outsb&lt;br /&gt;
dec 	dx&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first three instructions don't do anything useful as code (they do execute, but we don't care about the result); they are just MIDI data:&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt; assembles to &amp;lt;code&amp;gt;0x3F&amp;lt;/code&amp;gt;, the command for switching to &amp;quot;UART&amp;quot; mode, which is sent to port &amp;lt;code&amp;gt;0x331&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;les di,[bx-0x6C]&amp;lt;/code&amp;gt; assembles to &amp;lt;code&amp;gt;0xc4&amp;lt;/code&amp;gt; (change instrument on channel 4), &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; (change it to &amp;quot;Gunshot&amp;quot;), &amp;lt;code&amp;gt;0x94&amp;lt;/code&amp;gt; (play note on channel 4)&lt;br /&gt;
* &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; assembles to &amp;lt;code&amp;gt;0x34&amp;lt;/code&amp;gt; (play THIS note), &amp;lt;code&amp;gt;0x7f&amp;lt;/code&amp;gt; (play it THAT loud, 127 is also the allowed maximum)&lt;br /&gt;
&lt;br /&gt;
The rest of the code basically just sends the MIDI data to the interface and exits. You can change the kind of sound effect by modifying the ModR/M byte of the second instruction (change BX to BP or SI etc.). Changing the volume is simpler: change the byte value of &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; to any value between 0 and 127.&lt;br /&gt;
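&lt;br /&gt;
As a hedged illustration of the ModR/M variation: by our reading of the instruction encoding, replacing &amp;lt;code&amp;gt;bx&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;bp&amp;lt;/code&amp;gt; turns the second byte from &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; into &amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;, which in the General MIDI instrument list is the entry just before &amp;quot;Gunshot&amp;quot;, namely &amp;quot;Applause&amp;quot;. The sketch relies on the same start values as the snippet above (&amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; = 0x100 and &amp;lt;code&amp;gt;CH&amp;lt;/code&amp;gt; = 0) and should be treated as an untested variation:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas						; 0x3F : &amp;quot;set UART mode&amp;quot; command&lt;br /&gt;
les 	di,[bp-0x6C]	; 0xC4 0x7E 0x94 : instrument 0x7E on channel 4, then &amp;quot;play note&amp;quot; on channel 4&lt;br /&gt;
xor 	al,127			; 0x34 0x7F : the note and its volume&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
mov 	cl,5			; five data bytes follow&lt;br /&gt;
rep 	outsb			; send them&lt;br /&gt;
ret						; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; as destination and a one byte displacement, the eight possible addressing forms give the values &amp;lt;code&amp;gt;0x78&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt;, which is exactly the &amp;quot;Sound Effects&amp;quot; block of the General MIDI list.&lt;br /&gt;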
&lt;br /&gt;
==== Procedural MIDI music generation in 64 bytes ====&lt;br /&gt;
&lt;br /&gt;
With all the above you should now be able to follow the next snippet [http://www.pouet.net/prod.php?which=66313 Descent OST], a small framework for procedural MIDI sound generation in 64 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; &amp;quot;Descent OST&amp;quot;, a 62 byte MIDI music player for MSDOS&lt;br /&gt;
; created by HellMood/DESiRE (C)2015&lt;br /&gt;
; this is the extracted music routine used in &amp;quot;Descent&amp;quot; &lt;br /&gt;
; it is a procedural MIDI algorithm which sticks a&lt;br /&gt;
; subroutine to the DOS timer (interrupt 0x1C)&lt;br /&gt;
; the registered routine is called ~18.2 times per second&lt;br /&gt;
&lt;br /&gt;
; developed for use with &amp;quot;NASM&amp;quot;, &lt;br /&gt;
; see http://sourceforge.net/projects/nasm/files/&lt;br /&gt;
&lt;br /&gt;
%define		 	rhythmPattern			0b11&lt;br /&gt;
; with &amp;quot;rhythmPattern&amp;quot;, you define how often a note is played&lt;br /&gt;
; generally, higher values and values containing many &amp;quot;ones&amp;quot;&lt;br /&gt;
; in binary representation, will result in faster play&lt;br /&gt;
; for example &amp;quot;0b11&amp;quot; will play every 4th note&lt;br /&gt;
%define 		baseInstrument			9&lt;br /&gt;
; defines the number of the first instrument used.&lt;br /&gt;
; see http://www.midi.org/techspecs/gm1sound.php for a full list&lt;br /&gt;
; keep in mind, that there are only a few instrument blocks&lt;br /&gt;
; whose sounds stop after a while. You won't get good results&lt;br /&gt;
; from strings etc. just a mess of overlayed sounds&lt;br /&gt;
%define 		numInstruments			7&lt;br /&gt;
; defines how many instruments are used. keep in mind, that&lt;br /&gt;
; &amp;quot;rhythmPattern&amp;quot; has influence on the picked instrument. the&lt;br /&gt;
; instruments from 9 to 9+7 are called &amp;quot;chromatic percussion&amp;quot;&lt;br /&gt;
%define 		noteStep 				5&lt;br /&gt;
; defines the basic difference from one note to the next. recommended&lt;br /&gt;
; values here are (mainly) 3,4 and 5 for music theoretic reasons&lt;br /&gt;
; but feel free to play around =)&lt;br /&gt;
%define 		noteRange				12&lt;br /&gt;
; after adding the noteStep, the note value is &amp;quot;mod&amp;quot;ded with&lt;br /&gt;
; the &amp;quot;noteRange&amp;quot;. 12 means octave, which results in very harmonic&lt;br /&gt;
; scales&lt;br /&gt;
%define 		noteSpread				3&lt;br /&gt;
; the third step spreads the notes over the tonal spectrum, you may&lt;br /&gt;
; want to keep &amp;quot;noteSpread&amp;quot; * &amp;quot;noteRange&amp;quot; round about 30-60.&lt;br /&gt;
%define 		baseNote				40&lt;br /&gt;
; the general tone height of everything. some instruments don't play&lt;br /&gt;
; arbitrary deep notes correctly, and too high notes cause ear bleeding&lt;br /&gt;
; adjust with care ;)&lt;br /&gt;
&lt;br /&gt;
; WARNING : after exiting the program, the timer interrupt is still active&lt;br /&gt;
; i strongly recommend to reboot or restart DOSBOX!&lt;br /&gt;
&lt;br /&gt;
; ADVICE : Yes, there are music- and math-related things going on here&lt;br /&gt;
; if you're not into music theory, circle of fifths, and the like, it may be&lt;br /&gt;
; better to just play around with the parameters, rather than understanding them&lt;br /&gt;
; just change stuff slowly, and eventually you will get &amp;quot;there&amp;quot;&lt;br /&gt;
; wherever that is ;)&lt;br /&gt;
&lt;br /&gt;
org 0x100&lt;br /&gt;
xchg cx,ax					; set our second counter to zero&lt;br /&gt;
mov dx,music				; DS:DX = address of our new handler&lt;br /&gt;
mov ax,0x251C				; AH = 0x25 : set interrupt vector, AL = 0x1C : the timer tick interrupt&lt;br /&gt;
int 0x21					; see http://mprolab.teipir.gr/vivlio80X86/dosints.pdf&lt;br /&gt;
S:&lt;br /&gt;
in ax,0x60					; wait for &amp;quot;ESC&amp;quot; press, then exit&lt;br /&gt;
dec al						; music plays on anyway, this is just for&lt;br /&gt;
jnz S						; keeping the music exactly as in &amp;quot;Descent&amp;quot;&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
music:&lt;br /&gt;
inc bx						; increment our first counter (starts at zero)&lt;br /&gt;
test bl,byte rhythmPattern	; play a note every 4th time tick&lt;br /&gt;
jnz nomusic					; otherwise do nothing&lt;br /&gt;
mov dx,0x331&lt;br /&gt;
mov al,0x3F&lt;br /&gt;
out dx,al&lt;br /&gt;
dec dx&lt;br /&gt;
mov al,0xC0					; change instrument on channel 0...&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx					&lt;br /&gt;
aam byte numInstruments&lt;br /&gt;
add al,byte baseInstrument	; ...to this instrument&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,0x90					; play note on channel 0 ...&lt;br /&gt;
out dx,al&lt;br /&gt;
add cl,byte noteStep			 &lt;br /&gt;
mov al,cl&lt;br /&gt;
aam byte noteRange&lt;br /&gt;
imul ax,noteSpread&lt;br /&gt;
add al,baseNote				; ... play THIS note&lt;br /&gt;
out dx,al&lt;br /&gt;
neg al						; (play deeper notes louder = add bass)&lt;br /&gt;
add al,127+39				; ... play it THAT loud&lt;br /&gt;
out dx,al&lt;br /&gt;
nomusic:&lt;br /&gt;
iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== PC Speaker ===&lt;br /&gt;
&lt;br /&gt;
Producing sound with the PC speaker is incredibly easy. Basically, you set a system timer to a desired frequency, then connect this timer to the speaker. [http://wiki.osdev.org/PC_Speaker The PC Speaker article] on the OSDev Wiki has the details. An example of a tiny intro that uses PC speaker music is [http://www.pouet.net/prod.php?which=67833 SpeaCore].&lt;br /&gt;
&lt;br /&gt;
==== Basic example with melody pattern ====&lt;br /&gt;
&lt;br /&gt;
A very optimized and dirty variant of producing sound with the speaker is this 12 byte snippet (sound routine from [http://www.pouet.net/prod.php?which=67829 the tiny intro &amp;quot;darkweb&amp;quot;]):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;hlt				; sync to timer1&lt;br /&gt;
inc bx			; increment our counter&lt;br /&gt;
mov ax,bx		; work with a copy&lt;br /&gt;
or al,0x4B      ; melody pattern + 2 LSB for speaker link&lt;br /&gt;
out 0x42,al		; set new countdown for timer2 (two passes)&lt;br /&gt;
out 0x61,al		; link timer2 to PC speaker (2 LSBs are 1)&lt;br /&gt;
jmp si			; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Instead of sending the low and high byte of our divisor directly in succession, we do it the &amp;quot;two pass&amp;quot; way: each pass through the loop sends only one byte. That reduces the number of possible frequencies to 255, which is still good enough for some rough sounds. Linking the timer to the PC speaker might not be obvious: normally you would read the value of port 0x61, set the two least significant bits and write the value back. You can save all of this if you just send the &amp;quot;two pass&amp;quot; value you just used for the timer, provided that value already has the two least significant bits set (''or al,0x4B'' does this). Be aware that port 0x61 does many things apart from just connecting the timer to the speaker. A useful resource for ports in general is the [http://bochs.sourceforge.net/techspec/PORTS.LST Bochs Ports List]; for port 0x61 it displays:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
''0061	w	KB controller port B (ISA, EISA)   (PS/2 port A is at 0092)&lt;br /&gt;
&lt;br /&gt;
		system control port for compatibility with 8255&lt;br /&gt;
&lt;br /&gt;
		bit 7	(1= IRQ 0 reset )&lt;br /&gt;
&lt;br /&gt;
		bit 6-4    reserved&lt;br /&gt;
&lt;br /&gt;
		bit 3 = 1  channel check enable&lt;br /&gt;
&lt;br /&gt;
		bit 2 = 1  parity check enable&lt;br /&gt;
&lt;br /&gt;
		'''bit 1 = 1  speaker data enable'''&lt;br /&gt;
&lt;br /&gt;
		'''bit 0 = 1  timer 2 gate to speaker enable''' ''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So if you experience strange things with highly optimized PC speaker output, revert to the safe way. The described way works with real hardware and DosBox. Unfortunately, both Oracle VirtualBox with MS-DOS 6.22 and Windows XP NTVDM seem not to properly emulate the PC speaker (investigation and citation needed here!).&lt;br /&gt;
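&lt;br /&gt;
For reference, the &amp;quot;safe way&amp;quot; could look like the following sketch. It is not size-optimized and makes a few assumptions of its own: the command byte 0xB6 (timer 2, low byte then high byte, square wave), a tone of roughly 440 Hz derived from the timer's input clock of about 1193182 Hz, and the BIOS keyboard service to wait for a key. The other bits of port 0x61 are left untouched:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0xB6			; timer2, access low then high byte, square wave&lt;br /&gt;
out 0x43,al			; timer command port&lt;br /&gt;
mov ax,1193182/440	; divisor for roughly 440 Hz&lt;br /&gt;
out 0x42,al			; low byte of the divisor&lt;br /&gt;
mov al,ah&lt;br /&gt;
out 0x42,al			; high byte of the divisor&lt;br /&gt;
in al,0x61			; read the current state of port 0x61&lt;br /&gt;
or al,3				; set bit 0 (timer 2 gate) and bit 1 (speaker data)&lt;br /&gt;
out 0x61,al			; speaker on&lt;br /&gt;
xor ax,ax			; AH = 0 : wait for a keystroke&lt;br /&gt;
int 0x16&lt;br /&gt;
in al,0x61&lt;br /&gt;
and al,0xFC			; clear bits 0 and 1 again&lt;br /&gt;
out 0x61,al			; speaker off&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;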
&lt;br /&gt;
==== Simple deep sound in 8 bytes ====&lt;br /&gt;
&lt;br /&gt;
One of the smallest possible PC speaker sound generators might be this 8 byte snippet:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h -&amp;gt; AL = 0xFF&lt;br /&gt;
out 42h,al		; change divisor of timer2 to 0xFFFF&lt;br /&gt;
out 42h,al		; resulting in a very low frequency&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Note: This may fail on older real hardware, as there might not be time for the bus to settle between the consecutive &amp;lt;code&amp;gt;out 42h,al&amp;lt;/code&amp;gt; statements.)&lt;br /&gt;
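&lt;br /&gt;
A common workaround on such machines is to put a short jump between back-to-back accesses to the same port. The following variant is a sketch of that idea applied to the snippet above; whether a given machine needs the delay (or a longer one) is an assumption to be checked on the actual hardware, and it costs two extra bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h -&amp;gt; AL = 0xFF&lt;br /&gt;
out 42h,al		; low byte of the divisor&lt;br /&gt;
jmp short $+2	; jump to the next instruction: a tiny I/O delay&lt;br /&gt;
out 42h,al		; high byte of the divisor&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;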
&lt;br /&gt;
=== COVOX output (aka LPT DAC) ===&lt;br /&gt;
&lt;br /&gt;
It is possible to output to an LPT-connected DAC (&amp;quot;[https://en.wikipedia.org/wiki/Covox_Speech_Thing COVOX]&amp;quot;) in a tinyprog.  A proof-of-concept example is [http://www.pouet.net/prod.php?which=57991 Express Train 125] which uses COVOX for sound generation.  &lt;br /&gt;
&lt;br /&gt;
This method follows the [http://countercomplex.blogspot.com/2011/10/algorithmic-symphonies-from-one-line-of.html &amp;quot;audio from one line of C code&amp;quot;] style of sound generation.  [http://www.pouet.net/topic.php?which=8357&amp;amp;page=1 A pouet discussion] exists for more background information.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Output&amp;diff=293</id>
		<title>Output</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Output&amp;diff=293"/>
				<updated>2016-08-20T01:25:49Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Simple deep sound in 8 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Outputting to the screen ==&lt;br /&gt;
&lt;br /&gt;
First, be aware of the [http://img.tfd.com/cde/MEMMAP.GIF MSDOS memory layout]&lt;br /&gt;
&lt;br /&gt;
=== Outputting in Textmode (80x25) ===&lt;br /&gt;
&lt;br /&gt;
==== Hello World / High Level function ====&lt;br /&gt;
&lt;br /&gt;
Here's an obligatory &amp;quot;Hello World&amp;quot; program in text mode, using a [http://www.ctyme.com/intr/rb-2562.htm &amp;quot;high level&amp;quot; MS-DOS function]. With a small optimization already included (using &amp;lt;code&amp;gt;XCHG BP,AX&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;MOV AH,09h&amp;lt;/code&amp;gt;), this snippet is 20 bytes in size.&lt;br /&gt;
&lt;br /&gt;
[[File:Hello world.png|thumb|Hello World!]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
org 100h			; we start at CS:100h&lt;br /&gt;
xchg 	bp,ax		; already a trick: BP usually starts as 09xxh, so this puts 09h into AH&lt;br /&gt;
mov		dx,text		; DX expects the address of a $ terminated string&lt;br /&gt;
int 	21h			; call the DOS function (AH = 09h)&lt;br /&gt;
ret					; quit&lt;br /&gt;
text:&lt;br /&gt;
db 'Hello World!$'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Of course, this gets shorter with each byte you remove from the text itself. Now let's look into arbitrary screen access. Right after the start of your program you are in mode 3, that is 80x25 characters in 16 colors (see the [http://www.columbia.edu/~em36/wpdos/videomodes.txt Video Modes List]). [[File:Drawchar example.png|thumb|draw char example]] So, to show something on the screen, you need to set a segment register to 0xB800, then write values into this segment.&lt;br /&gt;
&lt;br /&gt;
==== Low level access ====&lt;br /&gt;
&lt;br /&gt;
The following three snippets showcase how to draw a red smiley in three different ways. All example snippets are meant to be standalone programs, starting with the first instruction and nothing before it. The target coordinate (40,12) is about the middle of the screen. We need a multiplier of 2 since one char needs two bytes in memory (char and color are one byte each). The high byte 0x04 means red (4) on black (0), while 0x01 is the first ASCII char, a smiley.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop ds&lt;br /&gt;
mov bx,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
mov [bx],ax&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop es&lt;br /&gt;
mov di,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
stosw&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
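&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov bx,ss				; remember the original stack segment ...&lt;br /&gt;
mov bp,sp				; ... and stack pointer (a POP SS below would read from video memory)&lt;br /&gt;
cli						; no interrupts while the stack lives in video memory&lt;br /&gt;
push 0xb800&lt;br /&gt;
pop ss&lt;br /&gt;
mov sp,(80*12+40)*2+2	; PUSH writes at SP-2, so aim one word past the target&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
push ax					; draw the smiley through the stack&lt;br /&gt;
mov ss,bx				; restore the original stack&lt;br /&gt;
mov sp,bp&lt;br /&gt;
sti&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;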
&lt;br /&gt;
You might notice that the ''push &amp;lt;word&amp;gt;'' + ''pop seg_reg'' combination is always the same and occupies four bytes altogether. If correct alignment is not important to you and you really just want ''any'' pointer to the screen, there is another way to get a valid one:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
les bx,[si]&lt;br /&gt;
nop&lt;br /&gt;
stosb&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That's also four bytes, but it already has the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; opcode (for putting something onto the screen) integrated and even has one slot free for another one-byte instruction. It works because &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; initially points to the start of our code, so &amp;lt;code&amp;gt;les&amp;lt;/code&amp;gt; loads &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; from the third and fourth code byte: &amp;lt;code&amp;gt;nop&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;90h&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;0AAh&amp;lt;/code&amp;gt;. After the first instruction, the segment register &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; therefore contains the value &amp;lt;code&amp;gt;0AA90h&amp;lt;/code&amp;gt;. If you repeatedly write something with &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;, you will eventually reach the &amp;lt;code&amp;gt;0B800h&amp;lt;/code&amp;gt; segment and chars will appear on the screen. With a careful selection of the free one-byte opcode you can also reintroduce some alignment. This also works with the &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt; opcode &amp;lt;code&amp;gt;0ABh&amp;lt;/code&amp;gt;.&lt;br /&gt;
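&lt;br /&gt;
A minimal sketch of this trick in a complete program, assuming the usual DOS start values (&amp;lt;code&amp;gt;AX&amp;lt;/code&amp;gt; = 0, &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; = 0x100). The free slot is filled with &amp;lt;code&amp;gt;inc ax&amp;lt;/code&amp;gt; (opcode &amp;lt;code&amp;gt;40h&amp;lt;/code&amp;gt;) here, which is a choice made for this illustration: it makes &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; = &amp;lt;code&amp;gt;0AA40h&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt; = 1 at the same time, so the endless &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; loop should fill the text screen with character 1 in color 1 (blue smileys on black):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;les bx,[si]		; bytes C4 1C : ES = the next two code bytes = 0xAA40&lt;br /&gt;
inc ax			; byte 40 : AL = 1 (used as char and as color)&lt;br /&gt;
X: stosb		; byte AA : write AL to ES:DI and advance DI&lt;br /&gt;
jmp short X		; DI wraps around and passes segment 0xB800 on the way&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Like the other endless loops on this page, this program never returns to DOS.&lt;br /&gt;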
&lt;br /&gt;
==== Alternative high level functions  ====&lt;br /&gt;
&lt;br /&gt;
Besides directly accessing memory, there are also other ways of bringing characters to the screen, for example:&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-4124.htm INT 29h]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2558.htm INT 21h AH=6]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2562.htm INT 21h AH=9]&lt;br /&gt;
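&lt;br /&gt;
A minimal sketch using two of these functions, with the register usage as described on the linked pages (INT 29h takes the character in &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt;; INT 21h AH=6 takes it in &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt;, where the value 0FFh is reserved for input). The two characters printed are arbitrary:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,'H'		; character for INT 29h&lt;br /&gt;
int 29h			; fast console output&lt;br /&gt;
mov ah,6		; INT 21h subfunction 06h : direct console output&lt;br /&gt;
mov dl,'i'		; character to print (must not be 0FFh)&lt;br /&gt;
int 21h&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;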
&lt;br /&gt;
=== Outputting in mode 13h (320x200) ===&lt;br /&gt;
&lt;br /&gt;
==== Basic pixel output ====&lt;br /&gt;
&lt;br /&gt;
The videomemory for mode 13h is located at segment 0xA000, so you need to assign this value to a segment register. Also, after the start of your program you are normally still in textmode, so you need to switch to the videomode. The following snippet does both:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13 &lt;br /&gt;
int 0x10     ; AH = 0 means : set video mode to AL = 0x13 (320 x 200 pixels in 256 colors)&lt;br /&gt;
push 0xA000  ; put value on the stack&lt;br /&gt;
pop es       ; pop the top stack value into segment register ES&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You're free to use any of the segment register / opcode combinations to write to the screen:&lt;br /&gt;
* &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
Let's add some code that actually draws something on the screen. The following program occupies 23 bytes and draws a fullscreen XOR texture:&lt;br /&gt;
[[File:Mode13h-example-xor.png|left|bottom|thumb|mode13h-example-xor]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
push 0xa000&lt;br /&gt;
pop es&lt;br /&gt;
X: cwd			; &amp;quot;clear&amp;quot; DX (if AH &amp;lt;= 0x7F)&lt;br /&gt;
mov ax,di		; get screen position into AX&lt;br /&gt;
mov bx,320		; get screen width into BX&lt;br /&gt;
div bx			; divide, to get row and column&lt;br /&gt;
xor ax,dx		; the famous XOR pattern&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
stosb			; finally, draw to the screen&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note that there is a different way of preparing the segment register, instead of :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
you can also do :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov ah,0xA0&lt;br /&gt;
mov es,ax&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Both variations occupy 4 bytes, but the latter also runs on processors where ''push &amp;lt;word&amp;gt;'' is not available (the 8086 and 8088).&lt;br /&gt;
&lt;br /&gt;
==== Alternative way of pixel plotting and optimization ====&lt;br /&gt;
&lt;br /&gt;
Now let's optimize on the snippet. First, we can adapt the &amp;quot;LES&amp;quot; trick from the textmode section. We just exchange &lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
with:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;les bx,[bx]&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
to save two bytes. This works because BX is 0x0000 at start, so the instruction reads from the region ''before'' our code, which is called the [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix]. The two bytes that end up in the segment register ES are bytes 2 and 3 of that region = ''&amp;quot;Segment of the first byte beyond the memory allocated to the program&amp;quot;'', which is usually 0x9FFF. That is just one short of our desired 0xA000. Unfortunately that means a 16 pixel offset, so if screen alignment means something to you, you can't use this optimization. Also, said two bytes are not always 0x9FFF; for example, if resident programs sit above the ''&amp;quot;memory allocated to the program&amp;quot;'' (FreeDos), their content is overwritten if we take their base as our video memory base.&lt;br /&gt;
&lt;br /&gt;
Second, we can use an alternative way of putting pixels to the screen, subfunction AH = 0x0C of int 0x10. Also, instead of constructing row and column from the screen pointer, we can use some interesting properties of the screenwidth regarding logical operations. This results in the following 16 byte program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             ; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov al,0x13&lt;br /&gt;
X: int 0x10		; set video mode AND draw pixel&lt;br /&gt;
inc cx			; increment column&lt;br /&gt;
mov ax,cx		; get column in AH&lt;br /&gt;
xor al,ah		; the famous XOR pattern&lt;br /&gt;
mov ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first optimization is the double usage of the same &amp;quot;int 0x10&amp;quot; for setting the video mode and for drawing the pixel. The subfunction AH = 0x0C expects row and column in DX and CX. Since the screen width is 320, which is 5 * 64, we can ignore the row and just work with the column, as long as we use logical operations and only rely on bits 0-5 of the result (the part that is unaffected by wrapping at a multiple of 64). The subfunction AH = 0x0C allows for unbounded column values in CX (up to 65535) and correctly &amp;quot;wraps&amp;quot; them internally without an error.&lt;br /&gt;
&lt;br /&gt;
The major drawback of the &amp;quot;subfunction AH = 0x0C&amp;quot; approach is the loss of performance. While DosBox and many emulators perform just fine, real hardware will draw much more slowly, depending on the Video BIOS.&lt;br /&gt;
&lt;br /&gt;
==== Basic Animation and user interaction ====&lt;br /&gt;
&lt;br /&gt;
Now let's add the convenient check for the ESC key and also add a simple animation. The &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; register is used as frame counter and incremented after the pixel counter &amp;lt;code&amp;gt;CX&amp;lt;/code&amp;gt; ran through all 65536 values via &amp;lt;code&amp;gt;LOOP&amp;lt;/code&amp;gt;. This frame counter is then added to the column. The resulting program is now 25 bytes in size :&lt;br /&gt;
&lt;br /&gt;
[[File:Xor anim example.gif|thumb]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Producing sound ==&lt;br /&gt;
&lt;br /&gt;
=== MIDI notes ===&lt;br /&gt;
&lt;br /&gt;
Creating sounds with MIDI requires a bit more preparation, but once you're familiar with it, it's even simpler than PC Speaker sound, because you basically don't have to ''create'' the sound, you just have to ''trigger'' it. To start, you have to know that there are a lot of [https://en.wikipedia.org/wiki/General_MIDI different instruments] and a [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined way of communication]. Think of the MIDI interface as a keyboard: you tell it which button/key you want to press, which knob to twist, and sometimes, how hard.&lt;br /&gt;
&lt;br /&gt;
==== Single piano note ====&lt;br /&gt;
&lt;br /&gt;
Let's start off with a simple example, playing a single note on the piano :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh		;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h	;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 		;	send !&lt;br /&gt;
dec 	dx			;	MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,	90h		;	send note on channel ZERO - command&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
mov 	al,	56h		;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al		; 	send !&lt;br /&gt;
mov 	al,	67h		;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
ret					; 	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In short: you turn your keyboard on (switching to UART mode), then press a KEY with a certain VOLUME on channel ZERO, then exit. Besides switching to UART mode, all this communication uses the port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt;. This example will work on DosBox but not on Windows XP NTVDM: for [http://www.pouet.net/topic.php?which=10720&amp;amp;page=1 still unclear reasons], the NTVDM emulation delays the note until it receives a second one. The simplest way of at least hearing something is to repeatedly play notes, like in the following example :&lt;br /&gt;
&lt;br /&gt;
==== Repeated piano notes ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh			;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h		;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 			;	send !&lt;br /&gt;
dec 	dx				;	MIDI Data Port ( = 330h )&lt;br /&gt;
main:&lt;br /&gt;
mov 	al,	90h			;	send note on channel 0 - command&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
mov 	al,	56h			;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al			; 	send !&lt;br /&gt;
mov 	al,	67h			;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,	[fs:0x46c]	;	read timer&lt;br /&gt;
test 	al,	3			;	skip 3 values&lt;br /&gt;
jnz 	_wait			;&lt;br /&gt;
inc 	byte [fs:0x46c]	; 	inc manually to prevent retrigger&lt;br /&gt;
in 		al,	0x60		;	check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			;	no? repeat&lt;br /&gt;
ret						;	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example, enriched with synchronizing against the timer and checking for the ESC key. It works on both DosBox and Windows XP NTVDM and plays a note on the Piano repeatedly.&lt;br /&gt;
&lt;br /&gt;
==== Repeated notes of other instruments ====&lt;br /&gt;
&lt;br /&gt;
While hitting one key repeatedly is not really interesting in general, it can produce decent results when doing it with the right instrument activated, like it was done with the &amp;quot;French Horn&amp;quot; in [https://www.youtube.com/watch?v=tsbxdjP9l50 Timelord (by Baudsurfer)]. Apart from just changing the instrument, let's also optimize a little bit on the size: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
start:	&lt;br /&gt;
mov 	si,data			; init pointer for outsb&lt;br /&gt;
mov 	dx,330h			; change to data port&lt;br /&gt;
mov 	cl,5			; play our music data&lt;br /&gt;
rep 	outsb			; (see below at &amp;quot;data&amp;quot; label)&lt;br /&gt;
inc 	dx				; switch to control port&lt;br /&gt;
outsb					; change to mode &amp;quot;UART&amp;quot;&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,[fs:0x46c]	; read timer value&lt;br /&gt;
cmp	 	al,bl			; wait until...&lt;br /&gt;
jz 		_wait			; ...timer value changed&lt;br /&gt;
xchg 	bx,ax			; save old timer value&lt;br /&gt;
in 		al,0x60			; check for ...&lt;br /&gt;
dec 	al				; ... ESC key&lt;br /&gt;
jnz 	start			; otherwise : repeat &lt;br /&gt;
dec 	dx				; switch to data port again&lt;br /&gt;
outsb					; stop all ... &lt;br /&gt;
outsb					; ... notes played ...&lt;br /&gt;
outsb					; ... on channel 3&lt;br /&gt;
data: 	&lt;br /&gt;
db		0c3h			; change instrument on channel 3&lt;br /&gt;
						; (is also &amp;quot;RET&amp;quot; for program quit)&lt;br /&gt;
db		60				; to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h				; play note on channel 3&lt;br /&gt;
db		35				; deep &amp;quot;b&amp;quot; = note number 35&lt;br /&gt;
db		127				; play with volume = 127&lt;br /&gt;
db		3fh				; change mode to &amp;quot;UART&amp;quot;&lt;br /&gt;
db		0b3h			; control change on channel 3&lt;br /&gt;
db		123				; Channel Mode Message &amp;quot;All Notes Off&amp;quot;&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example with a changed instrument, the MIDI data structured into a data section, the output optimized by using &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;out dx,al&amp;lt;/code&amp;gt;, and the program finalized with a special command to turn ''All Notes Off''. This is necessary for all instruments which don't stop by themselves. In all the previous examples, we sent the &amp;quot;NOTE ON&amp;quot; command (&amp;lt;code&amp;gt;9Xh&amp;lt;/code&amp;gt;), but not the corresponding &amp;quot;NOTE OFF&amp;quot; command (&amp;lt;code&amp;gt;8Xh&amp;lt;/code&amp;gt;). Also, the note is now played on channel &amp;lt;code&amp;gt;03h&amp;lt;/code&amp;gt;, since the command byte for changing an instrument on channel 3 is &amp;lt;code&amp;gt;0C3h&amp;lt;/code&amp;gt;, which is also the opcode of &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; and can be reused. If this looks complicated at first, always remember: it's just sending [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined commands] to a single port.&lt;br /&gt;
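&lt;br /&gt;
As a hedged sketch of the &amp;quot;NOTE OFF&amp;quot; command mentioned above, the following variant of the single piano note example releases the key explicitly instead of relying on ''All Notes Off''. The labels &amp;lt;code&amp;gt;note_on&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;note_off&amp;lt;/code&amp;gt; and the keypress wait via int 16h are choices made for this illustration, and it assumes the direction flag is clear at start. It is not size optimized :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov 	dx,331h			; MIDI Control Port&lt;br /&gt;
mov 	al,3Fh			; set UART mode - command&lt;br /&gt;
out 	dx,al			; send !&lt;br /&gt;
dec 	dx				; MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	si,note_on		; pointer for outsb (illustrative label)&lt;br /&gt;
mov 	cx,3			; three bytes per message&lt;br /&gt;
rep 	outsb			; send 90h, 56h, 67h (key pressed)&lt;br /&gt;
xor 	ax,ax			; AH = 0 : BIOS &amp;quot;wait for key&amp;quot;&lt;br /&gt;
int 	16h				; the note sounds until a key is hit&lt;br /&gt;
mov 	cx,3			; three bytes again&lt;br /&gt;
rep 	outsb			; SI now points to note_off : send 80h, 56h, 00h&lt;br /&gt;
ret						; quit&lt;br /&gt;
note_on:&lt;br /&gt;
db		90h,56h,67h		; NOTE ON, channel 0, key 56h, volume 67h&lt;br /&gt;
note_off:&lt;br /&gt;
db		80h,56h,0		; NOTE OFF, channel 0, same key, release velocity 0&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;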
&lt;br /&gt;
==== The drum channel ====&lt;br /&gt;
&lt;br /&gt;
Now that you're aware that there are different channels (16 overall) to play notes on, how would you like a channel &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt; specifically for 'Drums'? The following example plays a track of drum notes repeatedly, while further optimizing for size :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
aas						; 3fh = &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
cwd						; 99h = &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
db 		42,38,42,35		; the drum notes (kick, snare, hihat)&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
outsb					; send &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
mov		al,[fs:0x46c]	; read timer&lt;br /&gt;
test	al,3&lt;br /&gt;
jnz 	main			; skip 3 values&lt;br /&gt;
inc 	byte [fs:0x46c] ; inc manually to prevent retrigger&lt;br /&gt;
&lt;br /&gt;
inc 	bx				; increment note counter&lt;br /&gt;
and 	bl,3			; truncate to 4 notes&lt;br /&gt;
mov 	al,[bx+si]		; read the drumnote (see above)&lt;br /&gt;
&lt;br /&gt;
out 	dx,al			; send the drum&lt;br /&gt;
mov 	al,127			; set volume to maximum&lt;br /&gt;
out 	dx,al			; send volume&lt;br /&gt;
&lt;br /&gt;
in 		al,0x60			; check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			; no? repeat&lt;br /&gt;
ret						; otherwise quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In contrast to the previous example, the data section is now at the start. That means it's executed as code! This is dangerous of course, but it also saves the bytes for assigning the &amp;lt;code&amp;gt;DATA&amp;lt;/code&amp;gt; offset to &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;. Once the two initial &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instructions have incremented &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; twice, it stays fixed, and further reading from the drum data is done with &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;. Unless you know exactly what you are doing, don't use that kind of &amp;quot;executing data&amp;quot; optimization! In this special case &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;CWD&amp;lt;/code&amp;gt; do no harm, and the drum notes &amp;lt;code&amp;gt;42,38,42,35&amp;lt;/code&amp;gt; are carefully crafted and arranged to form the instruction &amp;lt;code&amp;gt;SUB AH,[232Ah]&amp;lt;/code&amp;gt;, which does no harm either.&lt;br /&gt;
&lt;br /&gt;
==== Creating basic sound effects in 16 bytes ====&lt;br /&gt;
&lt;br /&gt;
In the [https://en.wikipedia.org/wiki/General_MIDI#Sound_Effects MIDI repertoire], there are already some sound effects available. With the &amp;quot;data execution&amp;quot; optimization above, let's fire a gunshot in 16 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas&lt;br /&gt;
les 	di,[bx-0x6C]&lt;br /&gt;
xor 	al,127&lt;br /&gt;
mov 	dx,0x331&lt;br /&gt;
outsb&lt;br /&gt;
dec 	dx&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first three instructions are only there for their encoding (they do execute, but we don't care about the result): their bytes are the MIDI data.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
the command for switching to &amp;quot;UART&amp;quot; mode, for sending to port &amp;lt;code&amp;gt;0x331&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;les di,[bx-0x6C]&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0xc4&amp;lt;/code&amp;gt; (change instrument on channel 4), &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; (change it to &amp;quot;Gunshot&amp;quot;), &amp;lt;code&amp;gt;0x94&amp;lt;/code&amp;gt; (play note on channel 4)&lt;br /&gt;
* &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0x34&amp;lt;/code&amp;gt; (play THIS note), &amp;lt;code&amp;gt;0x7f&amp;lt;/code&amp;gt; (play it THAT loud, 127 is also the allowed maximum)&lt;br /&gt;
&lt;br /&gt;
The rest of the code basically just sends the MIDI data to the interface and exits. You can change the kind of sound effect by modifying the ModRM byte of the second instruction (change BX to BP or SI etc.). Changing the volume is simpler: change the byte value of &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; to any value between 0 and 127.&lt;br /&gt;
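&lt;br /&gt;
As a sketch of that modification (assuming the General MIDI sound effect numbering linked above, where the zero-based program numbers 78h to 7Fh hold the sound effects), exchanging BX for BP turns the ModRM byte &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; into &amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;, which should select &amp;quot;Applause&amp;quot; instead of &amp;quot;Gunshot&amp;quot; :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas						; 0x3F = &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
les 	di,[bp-0x6C]	; 0xC4 0x7E 0x94 = change instrument on channel 4 to 0x7E, play note on channel 4&lt;br /&gt;
xor 	al,127			; 0x34 0x7F = note number 0x34, volume 127&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
mov 	cl,5			; five bytes of MIDI data follow&lt;br /&gt;
rep 	outsb			; send them&lt;br /&gt;
ret						; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The program stays at 16 bytes, since only the ModRM byte changes.&lt;br /&gt;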
&lt;br /&gt;
==== Procedural MIDI music generation in 64 bytes ====&lt;br /&gt;
&lt;br /&gt;
With all the above you should now be able to follow the next snippet [http://www.pouet.net/prod.php?which=66313 Descent OST], a small framework for procedural MIDI sound generation in 64 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; &amp;quot;Descent OST&amp;quot;, a 62 byte MIDI music player for MSDOS&lt;br /&gt;
; created by HellMood/DESiRE (C)2015&lt;br /&gt;
; this is the extracted music routine used in &amp;quot;Descent&amp;quot; &lt;br /&gt;
; it is a procedural MIDI algorithm which sticks a&lt;br /&gt;
; subroutine to the DOS timer (interrupt 0x1C)&lt;br /&gt;
; the registered routine is called ~18.2 times per second&lt;br /&gt;
&lt;br /&gt;
; developed for use with &amp;quot;NASM&amp;quot;, &lt;br /&gt;
; see http://sourceforge.net/projects/nasm/files/&lt;br /&gt;
&lt;br /&gt;
%define		 	rhythmPattern			0b11&lt;br /&gt;
; with &amp;quot;rhythmPattern&amp;quot;, you define how often a note is played&lt;br /&gt;
; generally, higher values and values containing many &amp;quot;ones&amp;quot;&lt;br /&gt;
; in binary representation, will result in faster play&lt;br /&gt;
; for example &amp;quot;0b11&amp;quot; will play every 4th note&lt;br /&gt;
%define 		baseInstrument			9&lt;br /&gt;
; defines the number of the first instrument used.&lt;br /&gt;
; see http://www.midi.org/techspecs/gm1sound.php for a full list&lt;br /&gt;
; keep in mind, that there are only a few instrument blocks&lt;br /&gt;
; whose sounds stop after a while. You won't get good results&lt;br /&gt;
; from strings etc. just a mess of overlayed sounds&lt;br /&gt;
%define 		numInstruments			7&lt;br /&gt;
; defines how many instruments are used. keep in mind, that&lt;br /&gt;
; &amp;quot;rhythmPattern&amp;quot; has influence on the picked instrument. the instruments&lt;br /&gt;
; from 9 to 9+7 are called &amp;quot;chromatic percussion&amp;quot;&lt;br /&gt;
%define 		noteStep 				5&lt;br /&gt;
; defines the basic difference from one note to the next. recommended&lt;br /&gt;
; values here are (mainly) 3,4 and 5 for music theoretic reasons&lt;br /&gt;
; but feel free to play around =)&lt;br /&gt;
%define 		noteRange				12&lt;br /&gt;
; after adding the noteStep, the note value is &amp;quot;mod&amp;quot;ded with&lt;br /&gt;
; the &amp;quot;noteRange&amp;quot;. 12 means octave, which results in very harmonic&lt;br /&gt;
; scales&lt;br /&gt;
%define 		noteSpread				3&lt;br /&gt;
; the third step spreads the notes over the tonal spectrum, you may&lt;br /&gt;
; want to keep &amp;quot;noteSpread&amp;quot; * &amp;quot;noteRange&amp;quot; round about 30-60.&lt;br /&gt;
%define 		baseNote				40&lt;br /&gt;
; the general tone height of everything. some instruments don't play&lt;br /&gt;
; arbitrary deep notes correctly, and too high notes cause ear bleeding&lt;br /&gt;
; adjust with care ;)&lt;br /&gt;
&lt;br /&gt;
; WARNING : after exiting the program, the timer interrupt is still active&lt;br /&gt;
; i strongly recommend to reboot or restart DOSBOX!&lt;br /&gt;
&lt;br /&gt;
; ADVICE : Yes, there are music- and math-related things going on here&lt;br /&gt;
; if you're not into music theory, circle of fifths, and the like, it may be&lt;br /&gt;
; better to just play around with the parameters, rather than understanding them&lt;br /&gt;
; just change stuff slowly, and eventually you will get &amp;quot;there&amp;quot;&lt;br /&gt;
; wherever that is ;)&lt;br /&gt;
&lt;br /&gt;
org 0x100&lt;br /&gt;
xchg cx,ax					; set our second counter to zero&lt;br /&gt;
mov dx,music			&lt;br /&gt;
mov ax,0x251C				; mode &amp;quot;0x25&amp;quot; , &amp;quot;0x1C&amp;quot; = change address of timer interrupt&lt;br /&gt;
int 0x21					; see http://mprolab.teipir.gr/vivlio80X86/dosints.pdf&lt;br /&gt;
S:&lt;br /&gt;
in ax,0x60					; wait for &amp;quot;ESC&amp;quot; press, then exit&lt;br /&gt;
dec al						; music plays on anyway, this is just for&lt;br /&gt;
jnz S						; keeping the music exactly as in &amp;quot;Descent&amp;quot;&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
music:&lt;br /&gt;
inc bx						; increment our first counter (starts at zero)&lt;br /&gt;
test bl,byte rhythmPattern	; play a note every 4th time tick&lt;br /&gt;
jnz nomusic					; otherwise do nothing&lt;br /&gt;
mov dx,0x331&lt;br /&gt;
mov al,0x3F&lt;br /&gt;
out dx,al&lt;br /&gt;
dec dx&lt;br /&gt;
mov al,0xC0					; change instrument on channel 0...&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx					&lt;br /&gt;
aam byte numInstruments&lt;br /&gt;
add al,byte baseInstrument	; ...to this instrument&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,0x90					; play note on channel 0 ...&lt;br /&gt;
out dx,al&lt;br /&gt;
add cl,byte noteStep			 &lt;br /&gt;
mov al,cl&lt;br /&gt;
aam byte noteRange&lt;br /&gt;
imul ax,noteSpread&lt;br /&gt;
add al,baseNote				; ... play THIS note&lt;br /&gt;
out dx,al&lt;br /&gt;
neg al						; (play deeper notes louder = add bass)&lt;br /&gt;
add al,127+39				; ... play it THAT loud&lt;br /&gt;
out dx,al&lt;br /&gt;
nomusic:&lt;br /&gt;
iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== PC Speaker ===&lt;br /&gt;
&lt;br /&gt;
Producing sound with the PC speaker is incredibly easy. Basically, you set a system timer to a desired frequency, then connect this timer to the speaker. [http://wiki.osdev.org/PC_Speaker The PC Speaker Article] from the OSDEV Wiki has the details. An example of a tiny intro that uses PC speaker music is [http://www.pouet.net/prod.php?which=67833 SpeaCore].&lt;br /&gt;
&lt;br /&gt;
==== Basic example with melody pattern ====&lt;br /&gt;
&lt;br /&gt;
A very optimized and dirty variant of producing sound with the speaker is this 12 byte snippet (sound routine from [http://www.pouet.net/prod.php?which=67829 the tiny intro &amp;quot;darkweb&amp;quot;]):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;hlt				; wait for the next interrupt (normally the timer 0 tick, IRQ 0)&lt;br /&gt;
inc bx			; increment our counter&lt;br /&gt;
mov ax,bx		; work with a copy&lt;br /&gt;
or al,0x4B      ; melody pattern + 2 LSB for speaker link&lt;br /&gt;
out 0x42,al		; set new countdown for timer2 (two passes)&lt;br /&gt;
out 0x61,al		; link timer2 to PC speaker (2 LSBs are 1)&lt;br /&gt;
jmp si			; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Instead of sending the low and high byte of our divisor directly in succession, we do it the &amp;quot;two pass&amp;quot; way: each pass through the loop sends just one byte. That reduces the number of possible frequencies to 255, which is still good enough for some rough sounds. Linking the timer to the PC speaker might not be obvious: normally you would read the value of port 0x61, set the two least significant bits to 1 and write the value back. You can save all of this if you just send the &amp;quot;two pass&amp;quot; value which you just used for the timer, provided that value already has the two least significant bits set (''or al,0x4B'' does this). Be aware that port 0x61 does many things apart from just connecting the timer to the speaker. A useful resource for ports in general is the [http://bochs.sourceforge.net/techspec/PORTS.LST Bochs Ports List]; for port 0x61 it displays:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
''0061	w	KB controller port B (ISA, EISA)   (PS/2 port A is at 0092)&lt;br /&gt;
&lt;br /&gt;
		system control port for compatibility with 8255&lt;br /&gt;
&lt;br /&gt;
		bit 7	(1= IRQ 0 reset )&lt;br /&gt;
&lt;br /&gt;
		bit 6-4    reserved&lt;br /&gt;
&lt;br /&gt;
		bit 3 = 1  channel check enable&lt;br /&gt;
&lt;br /&gt;
		bit 2 = 1  parity check enable&lt;br /&gt;
&lt;br /&gt;
		'''bit 1 = 1  speaker data enable'''&lt;br /&gt;
&lt;br /&gt;
		'''bit 0 = 1  timer 2 gate to speaker enable''' ''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So if you experience strange things with highly optimized PC speaker output, revert to the safe way. The described way works with real hardware and DosBox. Unfortunately, both Oracle VirtualBox with MS-DOS 6.22 and Windows XP NTVDM seem not to emulate the PC speaker properly (investigation and citation needed here!).&lt;br /&gt;
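&lt;br /&gt;
For reference, a minimal sketch of that &amp;quot;safe way&amp;quot;, following the usual timer programming sequence described in the OSDEV article linked above. The 440 Hz tone, the keypress wait and the clean-up at the end are choices made for this illustration, and the snippet is deliberately not size optimized :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov al,0xB6			; timer2, low byte then high byte, square wave mode&lt;br /&gt;
out 0x43,al			; timer command port&lt;br /&gt;
mov ax,1193182/440	; divisor = timer clock (about 1.19 MHz) / frequency&lt;br /&gt;
out 0x42,al			; low byte of the divisor&lt;br /&gt;
mov al,ah&lt;br /&gt;
out 0x42,al			; high byte of the divisor&lt;br /&gt;
in al,0x61			; read the current state of port 0x61&lt;br /&gt;
or al,3				; set the 2 LSBs, leave all other bits alone&lt;br /&gt;
out 0x61,al			; link timer2 to PC speaker&lt;br /&gt;
xor ax,ax			; AH = 0 : BIOS wait for key&lt;br /&gt;
int 0x16			; the tone plays until a key is hit&lt;br /&gt;
in al,0x61&lt;br /&gt;
and al,0xFC			; clear the 2 LSBs again&lt;br /&gt;
out 0x61,al			; speaker off&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Unlike the optimized variants, this version touches only the two speaker bits of port 0x61 and leaves the other bits as they were.&lt;br /&gt;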
&lt;br /&gt;
==== Simple deep sound in 8 bytes ====&lt;br /&gt;
&lt;br /&gt;
One of the smallest possible PC speaker sound generators might be this 8 byte snippet :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h -&amp;gt; AL = 0xFF&lt;br /&gt;
out 42h,al		; change divisor of timer2 to 0xFFFF&lt;br /&gt;
out 42h,al		; resulting in a very low frequency&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This may fail on real hardware, as there might not be enough time for the bus to settle between the consecutive &amp;lt;code&amp;gt;out 42h,al&amp;lt;/code&amp;gt; statements.&lt;br /&gt;
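&lt;br /&gt;
If that turns out to be a problem, one possible remedy (an assumption, not verified on real hardware here) is the classic short I/O delay between the two writes, at the cost of two bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h, so AL = 0xFF&lt;br /&gt;
out 42h,al		; first byte of the divisor&lt;br /&gt;
jmp short $+2	; jump to the next instruction : tiny delay&lt;br /&gt;
out 42h,al		; second byte of the divisor&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;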
&lt;br /&gt;
=== COVOX output (aka LPT DAC) ===&lt;br /&gt;
&lt;br /&gt;
It is possible to output to an LPT-connected DAC (&amp;quot;[https://en.wikipedia.org/wiki/Covox_Speech_Thing COVOX]&amp;quot;) in a tinyprog.  A proof-of-concept example is [http://www.pouet.net/prod.php?which=57991 Express Train 125] which uses COVOX for sound generation.  &lt;br /&gt;
&lt;br /&gt;
This method follows the [http://countercomplex.blogspot.com/2011/10/algorithmic-symphonies-from-one-line-of.html &amp;quot;audio from one line of C code&amp;quot;] style of sound generation.  [http://www.pouet.net/topic.php?which=8357&amp;amp;page=1 A pouet discussion] exists for more background information.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=General_Coding_Tricks&amp;diff=267</id>
		<title>General Coding Tricks</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=General_Coding_Tricks&amp;diff=267"/>
				<updated>2016-08-18T16:39:33Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: Added Rrrola trick&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Data is code, code is data ==&lt;br /&gt;
&lt;br /&gt;
Code is nothing more than data that the CPU interprets.  For example, consider this multi-byte instruction:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        mov ah,37h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This assembles to &amp;lt;code&amp;gt;B4 37&amp;lt;/code&amp;gt;.  &amp;lt;code&amp;gt;B4&amp;lt;/code&amp;gt; by itself isn't interesting, but &amp;lt;code&amp;gt;37&amp;lt;/code&amp;gt; happens to be the opcode for &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt;.  Let's say you had this code before a loop, and you needed to perform &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; at the top of the loop.  Rather than put &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; at the top of the loop, you can reuse the opcode that will already be there as part of the &amp;lt;code&amp;gt;mov ah,37h&amp;lt;/code&amp;gt; that comes before it.  Just jump directly into the middle of the &amp;lt;code&amp;gt;mov ah,37h&amp;lt;/code&amp;gt;, which will get interpreted and executed as &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
&lt;br /&gt;
label:&lt;br /&gt;
        mov ah,37h&lt;br /&gt;
        ;misc. stuff&lt;br /&gt;
        loop label+1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;+1&amp;lt;/code&amp;gt; specifies the jump should go to 1 byte past the actual location.&lt;br /&gt;
&lt;br /&gt;
=== Reuse ===&lt;br /&gt;
&lt;br /&gt;
You can use opcodes hidden in your existing data.  For example, .COM files can end with &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;, which is opcode &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt;.  If you already have a &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; somewhere else in your code, even as part of data, just &amp;lt;code&amp;gt;JMP&amp;lt;/code&amp;gt; to that pre-existing &amp;lt;code&amp;gt;C3&amp;lt;/code&amp;gt; instead of adding a &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== If your environment holds you back, change it ==&lt;br /&gt;
&lt;br /&gt;
The [https://en.wikipedia.org/wiki/Video_Graphics_Array#/media/File:VGA_palette_with_black_borders.svg default MCGA palette] is fairly horrible, but there can be size advantages to changing it:  While setting a new palette costs bytes, the new palette arrangement could save you headaches down the road.  For example, if your code is calculating pixel colors that fall into goofy ranges, rather than constantly adjusting the colors to sane ranges (i.e. aligned to powers of 2), just set the palette so that values falling into those ranges look the way you want.  (This assumes you have very small ways of redefining the palette, of course.)&lt;br /&gt;
&lt;br /&gt;
The above is maybe not the best example.  Rewrites to this section are welcome.&lt;br /&gt;
&lt;br /&gt;
== Need a constant? ==&lt;br /&gt;
&lt;br /&gt;
If you need a constant value but you're out of space, search your assembled code for a byte value you can use.&lt;br /&gt;
&lt;br /&gt;
== A smaller way to point to Mode 13's screen segment ==&lt;br /&gt;
&lt;br /&gt;
Rather than &amp;lt;code&amp;gt;mov ah,0a0h; mov es,ax&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;push word 0a000h; pop es&amp;lt;/code&amp;gt;, try this 2-byte wonder:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
les bp,[bx]&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This sets ES=9FFF, only one away from A000.  You can write to the screen with ES: this way as long as you are aware the segment is one paragraph (16 bytes) behind, so just increase your offset by 16 if you need exact placement.&lt;br /&gt;
&lt;br /&gt;
How does this work?   At start of execution of a .COM file, BX=0, and DS=CS. The contents of the COM file get loaded to offset 0x100 in that segment, but loaded before that is the PSP (program segment prefix), which DOS populates with information about the loaded program and other info. The PSP starts with &amp;lt;code&amp;gt;CD 20&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;INT 20h&amp;lt;/code&amp;gt;, which exits the program), so that's what gets loaded into BP.  The next word is the number of the last free conventional memory segment, typically 0x9fff (but it can be something different if parts of the upper memory range are either not installed or allocated).&lt;br /&gt;
&lt;br /&gt;
'''Warning:''' This trick doesn't always work.  On FreeDos, this can set ES=9FE0 and there is something resident at that location that can screw up the system after normal program exit if you overwrite it.&lt;br /&gt;
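&lt;br /&gt;
A small sketch of the offset compensation described above. It assumes that ES really ends up as 9FFFh (see the warning) and that AH=0 and BX=0 at start; the colour value and the keypress wait are arbitrary choices for this illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        org 100h&lt;br /&gt;
        les bp,[bx]     ;ES = 9FFFh (usually), BP = 20CDh&lt;br /&gt;
        mov al,13h&lt;br /&gt;
        int 10h         ;switch to mode 13h&lt;br /&gt;
        mov di,16       ;9FFF:0010 is the same address as A000:0000&lt;br /&gt;
        mov al,15       ;colour 15 is white in the default palette&lt;br /&gt;
        stosb           ;plot the top-left pixel&lt;br /&gt;
        xor ax,ax&lt;br /&gt;
        int 16h         ;wait for a keypress&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;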
&lt;br /&gt;
== Accessing the timer tick for free ==&lt;br /&gt;
&lt;br /&gt;
If using a 386+, FS=0 at .COM start.  So, &amp;lt;code&amp;gt;FS:[046C]&amp;lt;/code&amp;gt; gets you the BIOS timer tick variable, which you could use for timing/pacing, or as a random seed. Some environments, especially EMS/XMS programs, can modify the FS register, so it can't always be assumed to be 0000h. In that case, &amp;lt;code&amp;gt;POP DS&amp;lt;/code&amp;gt; right after the start and accessing &amp;lt;code&amp;gt;[046C]&amp;lt;/code&amp;gt; does the trick at equal size.&lt;br /&gt;
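&lt;br /&gt;
A sketch of the &amp;lt;code&amp;gt;POP DS&amp;lt;/code&amp;gt; variant, waiting for the next timer tick. It assumes the usual 0000h word on top of the stack at .COM start. Note that a final &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; no longer works once that word has been popped, so this illustration exits with &amp;lt;code&amp;gt;INT 20h&amp;lt;/code&amp;gt; (the label name is arbitrary):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
        org 100h&lt;br /&gt;
        pop ds          ;DS = 0000h (word on top of the stack)&lt;br /&gt;
        mov al,[046Ch]  ;low byte of the timer tick counter&lt;br /&gt;
wait_tick:&lt;br /&gt;
        cmp al,[046Ch]  ;has the counter changed yet?&lt;br /&gt;
        je wait_tick    ;no: keep waiting&lt;br /&gt;
        int 20h         ;exit (CS still points to the PSP)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;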
&lt;br /&gt;
== Looping twice ==&lt;br /&gt;
&lt;br /&gt;
If you need to repeat a section of code that doesn't modify the carry flag, and you know the carry flag is clear, you can loop back once (running the section twice) in only 3 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
looping:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        cmc&lt;br /&gt;
        jc      looping&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Looping three times ==&lt;br /&gt;
&lt;br /&gt;
If you need to repeat a section of code and you have a register whose value is zero and can be incremented freely, or whose value is -1 and can be decremented freely, you can loop back twice (running the section three times) in only 3 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
looping:&lt;br /&gt;
        ;do stuff here&lt;br /&gt;
        inc     bx ;if decrementing from -1 instead, the same check works: FEh and FDh have odd parity, FCh has even parity&lt;br /&gt;
        jpo     looping ;1 (01b) and 2 (10b) have odd parity, 3 (11b) has even parity&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== The Rrrola Trick (Obtaining X and Y without DIV) ==&lt;br /&gt;
&lt;br /&gt;
In [[Output#Outputting_in_mode_13h_.28320x200.29|320x200 mode]], instead of constructing X and Y from the screen pointer DI with DIV, you can get a decent estimation by multiplying the screen pointer with 0xCCCD and reading Y and X from the 8bit registers DH (+DL as 16bit value) and DL (+AH as 16bit value). The idea is to interpret DI as a kind of 16 bit fixed-point fraction in the range [0,1], from start to end. Multiplying this number in [0,1] with 65536 / 320 = 204.8 results in the row before the decimal point and, again as a kind of fraction, the column after the decimal point. The constant 0xCCCD is the nearest rounding of 204.8 * 256 ( = 52428.8 ~ 52429 = 0xCCCD). As long as the 16 bit representations are used, there is no precision loss.&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=266</id>
		<title>Floating-point Opcodes</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=266"/>
				<updated>2016-08-18T16:35:11Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The FPU offers a lot of complex floating-point operations not available to classic x86 CPUs, like &amp;lt;code&amp;gt;SIN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;COS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ATAN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;SQRT&amp;lt;/code&amp;gt;, etc. ([http://www.website.masmforum.com/tutorials/fptute/appen1.htm SIMPLY FPU] by Raymond Filiatreault has a compact overview of all FPU commands for the curious.)  &lt;br /&gt;
&lt;br /&gt;
== FPU Basics ==&lt;br /&gt;
&lt;br /&gt;
Using and communicating with the FPU is somewhat unusual and takes a bit of getting used to, so we'll start with a simple example.  This is what [[Output#Outputting_in_mode_13h_.28320x200.29|our code example]] looks like originally:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter	          &amp;lt;-- REPLACE THIS WITH FPU CODE&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and this is how the code looks if we replace the instruction with FPU code :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get &amp;quot;column&amp;quot; in AX&lt;br /&gt;
&lt;br /&gt;
fninit				; init FPU first&lt;br /&gt;
mov		[si],ax		; write first addend to a memory location&lt;br /&gt;
fild	word [si]	; F(pu) I(nteger) L(oa)D a WORD from the memory location onto the FPU stack&lt;br /&gt;
mov		[si],di		; write second addend to a memory location&lt;br /&gt;
fiadd	word [si]	; directly add the word in the memory location to the top of the FPU stack&lt;br /&gt;
fist	word [si]	; F(pu) I(nteger) ST(ore) the result into a memory location&lt;br /&gt;
mov		ax,[si]		; Get the word from the memory location into AX&lt;br /&gt;
&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Obviously that made the program bigger instead of smaller, but the point of this exercise is to illustrate how to work with the FPU.)  Looking at our changes, you get a sense of what the usual interaction with the FPU looks like:&lt;br /&gt;
# &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt;: Initialize the FPU&lt;br /&gt;
# transfer values from CPU registers to memory location(s)&lt;br /&gt;
# transfer values from memory location(s) onto FPU stack&lt;br /&gt;
# do the actual calculations using the FPU (more on this soon)&lt;br /&gt;
# transfer result from the FPU stack into memory location(s)&lt;br /&gt;
# transfer result from memory location(s) back into registers&lt;br /&gt;
&lt;br /&gt;
That is a lot of extra code for a single integer addition, but that's a simple example; ''once more complex floating point operations are involved, it starts to pay off.'' &lt;br /&gt;
&lt;br /&gt;
== Optimizing with the FPU ==&lt;br /&gt;
&lt;br /&gt;
For more advanced FPU operation, let's start from scratch with an unoptimized program which plots the distance of each pixel to the screen center as color, in 49 bytes.&lt;br /&gt;
&lt;br /&gt;
[[File:Distance to center example.png|thumb|Distance to center]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h			&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
fninit					; -	&lt;br /&gt;
						; it's useful to comment what's on the&lt;br /&gt;
						; stack after each FPU operation&lt;br /&gt;
						; to not get lost ;) start is : empty (-)&lt;br /&gt;
X:&lt;br /&gt;
xor 	dx,dx			; reset the high word before division&lt;br /&gt;
mov 	bx,320			; 320 columns&lt;br /&gt;
mov 	ax,di			; get screen pointer in AX&lt;br /&gt;
div 	bx				; construct Y,X (row, column) from screen pointer into AX,DX&lt;br /&gt;
sub 	ax,100			; subtract the origin&lt;br /&gt;
sub 	dx,160			; = (160,100) ... center of 320x200 screen	&lt;br /&gt;
mov 	[si],ax			; move Y into a memory location&lt;br /&gt;
fild 	word [si]		; Y&lt;br /&gt;
fmul 	st0				; Y²&lt;br /&gt;
mov 	[si],dx			; move X into a memory location&lt;br /&gt;
fild 	word [si]		; X Y²&lt;br /&gt;
fmul 	st0				; X² Y²&lt;br /&gt;
faddp 	st1,st0			; X²+Y² (add and pop: same size as fadd, but leaves nothing behind on the FPU stack)&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	word [si]		; -&lt;br /&gt;
mov 	ax,[si]			; get the result from memory&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A few words on this:&lt;br /&gt;
* The FPU registers (st0, st1, ...) are organized as a stack. When you load something onto the FPU stack, everything else is implicitly moved one location further away from the top. Some FPU instructions work only on the top of the stack, others allow explicit parametrization with arbitrary FPU registers.&lt;br /&gt;
* Depending on what you do, &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt; can sometimes be omitted. Real hardware will refuse to work more often than emulators, but it's always worth a try.&lt;br /&gt;
* Accessing memory in a size-efficient way can be a real pain. The safest way is to reference absolute memory locations (e.g. &amp;lt;code&amp;gt;[1234]&amp;lt;/code&amp;gt;), but that costs two bytes more per instruction than referencing memory with &amp;lt;code&amp;gt;[BX]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+DI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[DI]&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;[BX+DI]&amp;lt;/code&amp;gt;. When working with the FPU and this ''classic'' approach of FPU communication, you have to design your code flow so that one or more of these locations are available.&lt;br /&gt;
* Memory accesses are relative to the segment register &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; unless you use a segment override. When accessing memory with &amp;lt;code&amp;gt;[BP+??]&amp;lt;/code&amp;gt;, be aware that the access is relative to the segment register &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (see [http://www.oopweb.com/Assembly/Documents/ArtOfAssembly/Volume/Chapter_4/CH04-2.html here, at 4.6.2.2 The Register Indirect Addressing Modes]).&lt;br /&gt;
* There are a few naming conventions which help you identify FPU commands. &amp;quot;i&amp;quot; stands for integer (WORD or DWORD) and &amp;quot;p&amp;quot; means &amp;quot;pop the stack afterwards&amp;quot;, so &amp;lt;code&amp;gt;FST&amp;lt;/code&amp;gt; means just &amp;quot;store&amp;quot; while &amp;lt;code&amp;gt;FISTP&amp;lt;/code&amp;gt; means &amp;quot;store as integer, then pop the stack&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
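The stack behaviour and the &amp;quot;i&amp;quot;/&amp;quot;p&amp;quot; naming conventions described above can be followed in a toy model. The Python sketch below is only an illustration and not part of the original page: the class &amp;lt;code&amp;gt;FpuStack&amp;lt;/code&amp;gt; and its methods are hypothetical, it ignores the 8-register limit and all exception handling, and it uses the popping add &amp;lt;code&amp;gt;FADDP&amp;lt;/code&amp;gt; so that the stack is empty again after each pixel.&lt;br /&gt;

```python
import math


class FpuStack:
    """Toy model of the x87 register stack; index 0 plays the role of st0."""

    def __init__(self):
        self.regs = []                     # fninit: start with an empty stack

    def fild(self, value):
        self.regs.insert(0, float(value))  # push; the old st0 becomes st1

    def fmul_st0(self):
        self.regs[0] *= self.regs[0]       # fmul st0: square the top

    def faddp(self):
        top = self.regs.pop(0)             # faddp: st1 += st0, then pop
        self.regs[0] += top

    def fsqrt(self):
        self.regs[0] = math.sqrt(self.regs[0])

    def fistp(self):
        return round(self.regs.pop(0))     # store as integer, then pop


def distance(dx, dy):
    """Return (rounded distance, stack depth left over afterwards)."""
    fpu = FpuStack()        # stack comments as in the listings: top first
    fpu.fild(dx)            # dx
    fpu.fmul_st0()          # dx²
    fpu.fild(dy)            # dy dx²
    fpu.fmul_st0()          # dy² dx²
    fpu.faddp()             # dy²+dx²
    fpu.fsqrt()             # r
    result = fpu.fistp()    # -
    return result, len(fpu.regs)


radius, depth = distance(3, 4)
print(radius, depth)
```

For the offsets (3, 4) the model returns a radius of 5 and a stack depth of 0, matching the &amp;quot;-&amp;quot; stack comment at the end of the sequence.&lt;br /&gt;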
&lt;br /&gt;
Now let's unleash the state-of-the-art sizecoding arsenal onto this, to bring it down to 37 bytes (40 bytes with aspect correction, which is the version shown here):&lt;br /&gt;
&lt;br /&gt;
[[File:Distant to center optimized.png|thumb|Distance to center : optimized]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h - 70		; modified to center to 160,100&lt;br /&gt;
aas						; aspect ratio constant part&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
X:&lt;br /&gt;
mov 	ax,0xCCCD		; perform the famous...&lt;br /&gt;
mul		di				; ... Rrrola trick =)&lt;br /&gt;
sub 	dh,[si]			; align vertically&lt;br /&gt;
pusha 					; push all registers on stack&lt;br /&gt;
fild 	word [bx-8]		; Y (DX: the row is in the high byte)&lt;br /&gt;
fmul 	st0				; Y²&lt;br /&gt;
fild 	word [bx-9]		; X Y² (DL, the column, is in the high byte)&lt;br /&gt;
fmul	dword [bx+si]	; aspect ratio correction&lt;br /&gt;
fmul 	st0				; X² Y²&lt;br /&gt;
faddp 	st1,st0			; X²+Y² (add and pop: same size as fadd, but leaves nothing behind on the FPU stack)&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	dword [bx-5]	; -&lt;br /&gt;
popa					; pop all registers from stack&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The resulting image is almost identical to the former. Let's go through this step by step:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;push 	0a000h - 70&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of aligning horizontally with &amp;lt;code&amp;gt;sub	dx,160&amp;lt;/code&amp;gt;, we can encode this implicitly by moving our segment register ten units (10 * 16 = 160 pixels) to the left (see [https://en.wikipedia.org/wiki/X86_memory_segmentation#Real_mode Real Mode Addressing]). By subtracting further multiples of 20 units (320 pixels, i.e. one screen line each), we can shift the visible screen towards the top to fine-tune the vertical alignment; here 70 = 10 + 3 * 20. As long as this shift is no more than 4 lines ''( 65536 / 320 - 200 = 4.8 )'' there is no further visual impact.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the high byte of a constant, placed so that the four bytes at &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; (or &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;) resolve to ~1.24 when read as a 32-bit float. The last byte of the segment value pushed for &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; is also of importance, since it forms the second-highest byte of that float. Check for yourself with the [http://www.h-schmidt.net/FloatConverter/IEEE754.html IEEE 754 Converter].&lt;br /&gt;
* &amp;lt;code&amp;gt;mov ax,0xCCCD&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; (The &amp;quot;Rrrola trick&amp;quot;)&lt;br /&gt;
Instead of constructing X and Y from the screen pointer &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;DIV&amp;lt;/code&amp;gt;, you can get a decent approximation by multiplying the screen pointer by &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; and reading the row (Y) from the 8-bit register &amp;lt;code&amp;gt;DH&amp;lt;/code&amp;gt; (DH:DL as a 16-bit value) and the column (X) from &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; (DL:AH as a 16-bit value). The idea is to interpret &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; as a 16-bit fixed-point fraction in the range &amp;lt;code&amp;gt;[0,1)&amp;lt;/code&amp;gt;, from the start of the screen to its end. Multiplying this fraction by 65536 / 320 = 204.8 yields the row in the integer part and, again as a fraction (of the line width), the column in the fractional part. The constant &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; is 204.8 * 256 = 52428.8 rounded to the nearest integer (52429 = 0xCCCD). As long as the 16-bit representations are used, there is no precision loss.&lt;br /&gt;
* &amp;lt;code&amp;gt;sub dh,[si]&amp;lt;/code&amp;gt;&lt;br /&gt;
The instruction at &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;push &amp;lt;word&amp;gt;&amp;lt;/code&amp;gt; and has the opcode &amp;lt;code&amp;gt;0x68&amp;lt;/code&amp;gt; which is 104 in decimal. Combined with the fine tuned vertical alignment above ( ~4 lines) this results in (virtually) subtracting 100 for perfect vertical alignment. This is one byte shorter than &amp;lt;code&amp;gt;sub dh,100&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &amp;lt;code&amp;gt;pusha / popa&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of going the ''classical'' way of communicating with the FPU, we push all the registers, read/write values to/from the FPU with memory addressing, then pop all registers again. This works when &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; = &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP&amp;lt;/code&amp;gt; is &amp;quot;close enough&amp;quot; to &amp;lt;code&amp;gt;BX&amp;lt;/code&amp;gt; (initially zero and kept that way) to allow &amp;lt;code&amp;gt;[BX+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; addressing. It comes with the special benefit of implicit 8-bit shifts. One serious drawback is loss of precision, since the registers &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;AH&amp;lt;/code&amp;gt; &amp;quot;lose connection&amp;quot; when using &amp;lt;code&amp;gt;PUSHA&amp;lt;/code&amp;gt; (see the order of registers: [http://x86.renejeschke.de/html/file_module_x86_id_270.html PUSHA/PUSHAD documentation]).&lt;br /&gt;
* &amp;lt;code&amp;gt;fild word [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;fistp dword [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the so-called &amp;quot;stack addressing&amp;quot;. We assume that &amp;lt;code&amp;gt;BX=0&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP=0xFFFE&amp;lt;/code&amp;gt; at start, so we know where the registers are in memory after &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt; (AX at [BX-4], CX at [BX-6], DX at [BX-8] etc.). It's important to realize that we work with signed 16-bit values now, in the full range of [-32768,32767]. That is also why we need &amp;lt;code&amp;gt;DWORD&amp;lt;/code&amp;gt; when storing the result: &amp;lt;code&amp;gt;sqrt(x²+y²)&amp;lt;/code&amp;gt; exceeds the signed 16-bit range for quite a few value pairs. Note that the odd offsets already give implicit 8-bit shifts: &amp;lt;code&amp;gt;[bx-9]&amp;lt;/code&amp;gt; reads DL as the high byte of a word, and storing at &amp;lt;code&amp;gt;[bx-5]&amp;lt;/code&amp;gt; places the second byte of the result in AL.&lt;br /&gt;
*&amp;lt;code&amp;gt;fmul dword [bx+si]&amp;lt;/code&amp;gt;&lt;br /&gt;
With the &amp;quot;Rrrola&amp;quot; trick above, the row number is 204 at most, and the column never exceeds 255. This results in a wrong aspect ratio, but it can be fixed almost completely with this two-byte instruction (plus one byte for the &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; instruction): 256 * 1.24 = 317.44, which is quite close to 320. If the aspect ratio does not matter for the desired effect, these three bytes can be shaved off. &lt;br /&gt;
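&lt;br /&gt;
The row/column claim of the Rrrola trick is easy to check exhaustively. The following Python sketch is not from the original page; the function name &amp;lt;code&amp;gt;rrrola&amp;lt;/code&amp;gt; is made up, and it models only the 16 x 16 bit multiplication, comparing DH with the exact row and DL with the column rescaled from 0..319 to 0..255.&lt;br /&gt;

```python
RRROLA = 0xCCCD  # 52429, the rounded value of 65536 * 256 / 320


def rrrola(di):
    """Model MOV AX,0xCCCD / MUL DI and return (DH, DL)."""
    product = di * RRROLA          # 32-bit result in DX:AX
    dx = product // 65536          # high word of the product
    return dx // 256, dx % 256     # DH, DL


mismatches = 0
for di in range(65536):
    row, column = divmod(di, 320)          # what DIV would deliver
    dh, dl = rrrola(di)
    if dh != row or dl != column * 256 // 320:
        mismatches += 1

print(rrrola(320 * 100 + 160), mismatches)
```

For the screen centre (row 100, column 160) this gives DH = 100 and DL = 128, and the loop finds no mismatch for any 16-bit value of DI.&lt;br /&gt;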
&lt;br /&gt;
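The memory tricks in the walkthrough above (code bytes reused as constants, and reading registers back from the stack after &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt;) can be reproduced on a byte array. This Python sketch is an illustration under stated assumptions, not part of the original page: it assumes a .COM start with BX=0, SI=0x100 and SP=0xFFFE as described in the text, and the register values in &amp;lt;code&amp;gt;regs&amp;lt;/code&amp;gt; (including BP) are arbitrary demo values.&lt;br /&gt;

```python
import struct

memory = bytearray(65536)            # one 64 KiB segment, DS = SS = CS

# First four code bytes at 0x100: push 0xA000-70 (68 BA 9F), aas (3F)
memory[0x100:0x104] = bytes([0x68, 0xBA, 0x9F, 0x3F])

# Register values just before PUSHA (DX = 0x2A80 means DH = 42, DL = 128)
bx, si, sp = 0, 0x100, 0xFFFE
regs = {"ax": 0x1234, "cx": 0x00FF, "dx": 0x2A80, "bx": bx,
        "sp": sp, "bp": 0x091C, "si": si, "di": 0x5000}

# PUSHA stores AX, CX, DX, BX, SP, BP, SI, DI, each two bytes lower
for name in ("ax", "cx", "dx", "bx", "sp", "bp", "si", "di"):
    sp -= 2
    memory[sp:sp + 2] = regs[name].to_bytes(2, "little")


def word(offset):
    """Signed 16-bit read with address wrap-around, as FILD WORD does."""
    raw = bytes(memory[(offset + i) % 65536] for i in range(2))
    return int.from_bytes(raw, "little", signed=True)


y_word = word(bx - 8)        # DX: DH in the high byte, DL in the low byte
x_word = word(bx - 9)        # DL in the high byte, BH (zero) in the low byte
vertical = memory[si]        # operand of SUB DH,[SI]: the PUSH opcode byte
# The float is little-endian in memory; reverse it for the "!" format
aspect = struct.unpack("!f", bytes(memory[si:si + 4][::-1]))[0]
print(y_word, x_word, vertical, round(aspect, 4))
```

With DL = 128 the word at &amp;lt;code&amp;gt;[bx-9]&amp;lt;/code&amp;gt; comes out as -32768, which shows how the column is reinterpreted as a signed value shifted left by 8 bits; the byte at &amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; is 104 and the float is roughly 1.248.&lt;br /&gt;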
Now let's add some features:&lt;br /&gt;
* extract angle as opposed to the distance and combine both&lt;br /&gt;
* reverse divide the distance to create the &amp;quot;tunnel&amp;quot; effect&lt;br /&gt;
* animate with smooth steps along the distance&lt;br /&gt;
* improve on the colors with subselecting from the standard palette&lt;br /&gt;
* quit the program on ESC&lt;br /&gt;
&lt;br /&gt;
This results in the following program with a size of 63 bytes :&lt;br /&gt;
&lt;br /&gt;
[[File:Tunnel effect neontube.png|thumb|tunnel effect neontube]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0xa000 - 10 - 3 * 20	; video base - 3.5 lines&lt;br /&gt;
or 		al, 0x13				; mode 13h = 320 x 200 in 256 colors&lt;br /&gt;
pop 	es						; get aligned video memory base&lt;br /&gt;
int 	0x10					; switch videomode&lt;br /&gt;
X: &lt;br /&gt;
sub		dh, [si]				; vertical alignment&lt;br /&gt;
pusha							; push all registers on stack&lt;br /&gt;
fild 	word	[bx-9]			; fpustack :  x&lt;br /&gt;
fild 	word	[bx-8]			; fpustack :  y  x&lt;br /&gt;
fpatan							; fpustack :  arc&lt;br /&gt;
fst 	st1						; fpustack :  arc  arc&lt;br /&gt;
fcos							; fpustack :  cos(arc)  arc&lt;br /&gt;
fimul	dword	[si]			; fpustack :  l*cos(arc)  arc&lt;br /&gt;
fidiv	word	[bx-8]			; fpustack :  l*cos(arc)/y  arc    (= l/r, since cos(arc) = y/r)&lt;br /&gt;
fiadd	word	[bp+si]			; fpustack :  l*cos(arc)/y+offset  arc&lt;br /&gt;
fistp	dword	[bx-7]			; fpustack :  arc&lt;br /&gt;
fimul	word	[byte si+val]	; fpustack :  scaled_arc&lt;br /&gt;
fistp	word	[bx-5]			; fpustack :  -&lt;br /&gt;
popa							; pop all registers from stack&lt;br /&gt;
xor 	al, cl					; XOR scaled_arc with distance&lt;br /&gt;
and 	al, 16 + 8 + 2			; sub selecting palette part&lt;br /&gt;
stosb							; writing to screen&lt;br /&gt;
mov 	ax, 0xCCCD				; Performing the famous&lt;br /&gt;
mul 	di						; Rrrola trick&lt;br /&gt;
jo 		X						; next frame check&lt;br /&gt;
add 	word [bp+si], byte 23	; change offset smoothly&lt;br /&gt;
in 		al, 0x60				; check for ...&lt;br /&gt;
dec 	ax						; ...ESC key&lt;br /&gt;
jnz 	X						; otherwise continue&lt;br /&gt;
ret								; quit program&lt;br /&gt;
val:	dw 6519 				; n = 160 * 256 / pi / 2 ; 0x1977&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Many other tiny tunnel effects have been coded, so it is highly recommended to check out the documented source code of [http://web.archive.org/web/20050216133244/http://www.farb-rausch.de/ryg/tunnel.asm &amp;quot;Constant Evolution&amp;quot; by ryg/Farbrausch] and the [http://www.pouet.net/prod.php?which=29412 &amp;quot;Heart shaped tunnel&amp;quot; from Lord Kelvin], both with a size of 64 bytes. While &amp;quot;Constant Evolution&amp;quot; takes a slightly different route than the example here (''classic'' FPU communication, ''classic'' X Y construction, sqrt(x²+y²) instead of using &amp;lt;code&amp;gt;fcos&amp;lt;/code&amp;gt;+&amp;lt;code&amp;gt;fimul&amp;lt;/code&amp;gt;), &amp;quot;Heart shaped tunnel&amp;quot; uses no FPU at all. &lt;br /&gt;
&lt;br /&gt;
The takeaways from this example are:&lt;br /&gt;
&lt;br /&gt;
* Loading a constant from the code with some degrees of freedom&lt;br /&gt;
&amp;lt;code&amp;gt;fimul dword [si]&amp;lt;/code&amp;gt; multiplies by a 32-bit integer that serves as the dividend for the tunnel effect. The highest byte of this constant lies in our code: it is the opcode of &amp;lt;code&amp;gt;or al, 0x13&amp;lt;/code&amp;gt;. This instruction puts &amp;lt;code&amp;gt;0x13&amp;lt;/code&amp;gt; into &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; (assuming &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is zero at this point), and since there are a lot of ways to achieve this, the appearance of the tunnel can be changed directly by replacing this instruction with one of the following: &amp;lt;code&amp;gt;mov al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;add al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;xor al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;adc al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;sbb al,0xED&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sub al,0xED&amp;lt;/code&amp;gt;. In this special case, the instruction can also be swapped with &amp;lt;code&amp;gt;pop es&amp;lt;/code&amp;gt; to gain a further degree of freedom.&lt;br /&gt;
&lt;br /&gt;
* normal loading of a constant which can't be reused as opcode&lt;br /&gt;
Although it's the ultimate goal to not even use a single extra byte for constants, sometimes the required sequence simply does not appear in the code. In this case, a constant is needed to convert the angle from the range [-pi,pi] to the color space in such a way that no gaps appear when stepping from 359° to 0°. In the last line at &amp;lt;code&amp;gt;val:&amp;lt;/code&amp;gt;, the value &amp;lt;code&amp;gt;160&amp;lt;/code&amp;gt; in the comment is 32 * 5, where 5 is the number of &amp;quot;spiral arms&amp;quot; the tunnel effect has. The 8-bit shift (* 256) increases precision. It turns out that 16-bit precision is enough to get a decent &amp;quot;gap closer&amp;quot; for values obtained by &amp;lt;code&amp;gt;fpatan&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* operating directly on an indirect memory location without offset&lt;br /&gt;
The location &amp;lt;code&amp;gt;[bp+si]&amp;lt;/code&amp;gt; is used as the animation variable, while both participating registers are kept fixed (the address is &amp;lt;code&amp;gt;0x0A??&amp;lt;/code&amp;gt; and therefore way above our code). Since we work with 16-bit values and the top 8 bits are the measurement in pixels, the instruction &amp;lt;code&amp;gt;add word [bp+si], byte 23&amp;lt;/code&amp;gt; allows for sub-pixel precision in the animation while occupying only 3 bytes. Depending on the target hardware, the value 23 can be increased/decreased to achieve faster/smoother animation.&lt;br /&gt;
&lt;br /&gt;
* Optimizing the check for the next frame&lt;br /&gt;
Normally, a check like &amp;lt;code&amp;gt;test di,di&amp;lt;/code&amp;gt; followed directly by a conditional branch is necessary. The approach used here allows branching directly after &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;jo&amp;lt;/code&amp;gt;, since &amp;lt;code&amp;gt;MUL&amp;lt;/code&amp;gt; sets the overflow flag whenever the high word of the product is non-zero, which is the case for all but two values of &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; per frame. This saves two bytes, but also requires adjusting the animation constant, because it is now added twice per frame. A further benefit is that in one of these two cases &amp;lt;code&amp;gt;AX&amp;lt;/code&amp;gt; is zero, which saves a further byte on the following ESC check (&amp;lt;code&amp;gt;dec ax&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;dec al&amp;lt;/code&amp;gt;).&lt;br /&gt;
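&lt;br /&gt;
Two of the takeaways above can be verified numerically. This Python sketch is not from the original page and the variable names are made up; it lists the values of DI for which &amp;lt;code&amp;gt;jo&amp;lt;/code&amp;gt; falls through after &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt;, together with the resulting AX, and recomputes the constant stored at &amp;lt;code&amp;gt;val:&amp;lt;/code&amp;gt; from the formula given in its comment.&lt;br /&gt;

```python
import math

RRROLA = 0xCCCD

# MUL sets OF (and CF) whenever the high word DX of the product is non-zero,
# so JO falls through only for the DI values collected here.
fall_through = [di for di in range(65536) if di * RRROLA // 65536 == 0]
ax_values = [di * RRROLA % 65536 for di in fall_through]

# Scale factor that maps an angle in [-pi, pi] onto 160 * 256 steps
val = round(160 * 256 / (2 * math.pi))

print(fall_through, [hex(ax) for ax in ax_values], val, hex(val))
```

Only DI = 0 and DI = 1 leave the high word empty, AX is zero in the first of these two cases only, and the scale factor rounds to 6519 (0x1977).&lt;br /&gt;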
&lt;br /&gt;
&lt;br /&gt;
Now if we abandon all the comfort, alignment, smoothness and convenience, and optimize this straight for size, we end up with a 52 byte version. This does not include the possible exclusion of color tuning (2 bytes), after all the effect is supposed to look at least somewhat appealing ;)&lt;br /&gt;
&lt;br /&gt;
[[File:Neontube 52b optimized.png|thumb|Neontube 52b optimized]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov		al,0x13&lt;br /&gt;
int		0x10&lt;br /&gt;
X: &lt;br /&gt;
or		al, [bp+si]&lt;br /&gt;
xor		al, 0x68&lt;br /&gt;
mov		dx, 0x79F&lt;br /&gt;
pusha&lt;br /&gt;
fild 	word	[bx-9]		; x&lt;br /&gt;
fild 	word	[bx-8]		; y x&lt;br /&gt;
fpatan						; arc&lt;br /&gt;
fst 	st1					; arc arc&lt;br /&gt;
fcos						; cos(arc) arc&lt;br /&gt;
fimul	dword	[si]		; l*cos(arc) arc&lt;br /&gt;
fidiv	word	[bx-8]		; l*cos(arc)/y arc&lt;br /&gt;
fistp	dword	[bx-4]		; arc&lt;br /&gt;
fimul	word	[bx]		; scarc&lt;br /&gt;
fistp	word	[bx-5]		; -&lt;br /&gt;
popa&lt;br /&gt;
sub		ah, [bp+si]&lt;br /&gt;
xor		al, ah&lt;br /&gt;
and		al, 16 + 8 + 4&lt;br /&gt;
stosb&lt;br /&gt;
mov		ax, 0xCCCD&lt;br /&gt;
mul		di&lt;br /&gt;
jmp 	short X-1&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''to be continued''&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=265</id>
		<title>Floating-point Opcodes</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=265"/>
				<updated>2016-08-18T16:34:45Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: /* Optimizing with the FPU */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The FPU offers a lot of complex floating-point operations not available to classic x86 CPUs, like &amp;lt;code&amp;gt;SIN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;COS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ATAN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;SQRT&amp;lt;/code&amp;gt;, etc. ([http://www.website.masmforum.com/tutorials/fptute/appen1.htm SIMPLY FPU] by Raymond Filiatreault has a compact overview of all FPU commands for the curious.)  &lt;br /&gt;
&lt;br /&gt;
== FPU Basics ==&lt;br /&gt;
&lt;br /&gt;
Using and communicating with the FPU is somewhat unusual and takes a bit of getting used to, so we'll start with a simple example.  This is what [[Output#Outputting_in_mode_13h_.28320x200.29|our code example]] looks like originally:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter	          &amp;lt;-- REPLACE THIS WITH FPU CODE&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and this is how the code looks if we replace the instruction with FPU code :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get &amp;quot;column&amp;quot; in AX&lt;br /&gt;
&lt;br /&gt;
fninit				; init FPU first&lt;br /&gt;
mov		[si],ax		; write first addend to a memory location&lt;br /&gt;
fild	word [si]	; F(pu) I(nteger) L(oa)D a WORD from the memory location onto the FPU stack&lt;br /&gt;
mov		[si],di		; write second addend to a memory location&lt;br /&gt;
fiadd	word [si]	; directly add the word in the memory location to the top of the FPU stack&lt;br /&gt;
fist	word [si]	; F(pu) I(nteger) ST(ore) the result into a memory location&lt;br /&gt;
mov		ax,[si]		; Get the word from the memory location into AX&lt;br /&gt;
&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Obviously that made the program bigger instead of smaller, but the point of this exercise is to illustrate how to work with the FPU.)  Looking at our changes, you get a sense of what the usual interaction with the FPU looks like:&lt;br /&gt;
# &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt;: Initialize the FPU&lt;br /&gt;
# transfer values from CPU registers to memory location(s)&lt;br /&gt;
# transfer values from memory location(s) onto FPU stack&lt;br /&gt;
# do the actual calculations using the FPU (more on this soon)&lt;br /&gt;
# transfer result from the FPU stack into memory location(s)&lt;br /&gt;
# transfer result from memory location(s) back into registers&lt;br /&gt;
&lt;br /&gt;
That is a lot of extra code for a single integer addition, but that's a simple example; ''once more complex floating point operations are involved, it starts to pay off.'' &lt;br /&gt;
&lt;br /&gt;
== Optimizing with the FPU ==&lt;br /&gt;
&lt;br /&gt;
For more advanced FPU operation, let's start from scratch with an unoptimized program which plots the distance of each pixel to the screen center as color, in 49 bytes.&lt;br /&gt;
&lt;br /&gt;
[[File:Distance to center example.png|thumb|Distance to center]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h			&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
fninit					; -	&lt;br /&gt;
						; it's useful to comment what's on the&lt;br /&gt;
						; stack after each FPU operation&lt;br /&gt;
						; to not get lost ;) start is : empty (-)&lt;br /&gt;
X:&lt;br /&gt;
xor 	dx,dx			; reset the high word before division&lt;br /&gt;
mov 	bx,320			; 320 columns&lt;br /&gt;
mov 	ax,di			; get screen pointer in AX&lt;br /&gt;
div 	bx				; construct Y,X (row, column) from screen pointer into AX,DX&lt;br /&gt;
sub 	ax,100			; subtract the origin&lt;br /&gt;
sub 	dx,160			; = (160,100) ... center of 320x200 screen	&lt;br /&gt;
mov 	[si],ax			; move Y into a memory location&lt;br /&gt;
fild 	word [si]		; Y&lt;br /&gt;
fmul 	st0				; Y²&lt;br /&gt;
mov 	[si],dx			; move X into a memory location&lt;br /&gt;
fild 	word [si]		; X Y²&lt;br /&gt;
fmul 	st0				; X² Y²&lt;br /&gt;
faddp 	st1,st0			; X²+Y² (add and pop: same size as fadd, but leaves nothing behind on the FPU stack)&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	word [si]		; -&lt;br /&gt;
mov 	ax,[si]			; get the result from memory&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A few words on this:&lt;br /&gt;
* The FPU registers (st0, st1, ...) are organized as a stack. When you load something onto the FPU stack, everything else is implicitly moved one location further away from the top. Some FPU instructions work only on the top of the stack, others allow explicit parametrization with arbitrary FPU registers.&lt;br /&gt;
* Depending on what you do, &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt; can sometimes be omitted. Real hardware will refuse to work more often than emulators, but it's always worth a try.&lt;br /&gt;
* Accessing memory in a size-efficient way can be a real pain. The safest way is to reference absolute memory locations (e.g. &amp;lt;code&amp;gt;[1234]&amp;lt;/code&amp;gt;), but that costs two bytes more per instruction than referencing memory with &amp;lt;code&amp;gt;[BX]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+DI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[DI]&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;[BX+DI]&amp;lt;/code&amp;gt;. When working with the FPU and this ''classic'' approach of FPU communication, you have to design your code flow so that one or more of these locations are available.&lt;br /&gt;
* Memory accesses are relative to the segment register &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; unless you use a segment override. When accessing memory with &amp;lt;code&amp;gt;[BP+??]&amp;lt;/code&amp;gt;, be aware that the access is relative to the segment register &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (see [http://www.oopweb.com/Assembly/Documents/ArtOfAssembly/Volume/Chapter_4/CH04-2.html here, at 4.6.2.2 The Register Indirect Addressing Modes]).&lt;br /&gt;
* There are a few naming conventions which help you identify FPU commands. &amp;quot;i&amp;quot; stands for integer (WORD or DWORD) and &amp;quot;p&amp;quot; means &amp;quot;pop the stack afterwards&amp;quot;, so &amp;lt;code&amp;gt;FST&amp;lt;/code&amp;gt; means just &amp;quot;store&amp;quot; while &amp;lt;code&amp;gt;FISTP&amp;lt;/code&amp;gt; means &amp;quot;store as integer, then pop the stack&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Now let's unleash the state-of-the-art sizecoding arsenal onto this, to bring it down to 37 bytes (40 bytes with aspect correction, which is the version shown here):&lt;br /&gt;
&lt;br /&gt;
[[File:Distant to center optimized.png|thumb|Distance to center : optimized]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h - 70		; modified to center to 160,100&lt;br /&gt;
aas						; aspect ratio constant part&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
X:&lt;br /&gt;
mov 	ax,0xCCCD		; perform the famous...&lt;br /&gt;
mul		di				; ... Rrrola trick =)&lt;br /&gt;
sub 	dh,[si]			; align vertically&lt;br /&gt;
pusha 					; push all registers on stack&lt;br /&gt;
fild 	word [bx-8]		; X&lt;br /&gt;
fmul 	st0				; X²&lt;br /&gt;
fild 	word [bx-9]		; Y X²&lt;br /&gt;
fmul	dword [bx+si]	; aspect ratio correction&lt;br /&gt;
fmul 	st0				; Y² X²&lt;br /&gt;
fadd 	st0,st1			; Y²+X²&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	dword [bx-5]	; -&lt;br /&gt;
popa					; pop all registers from stack&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The resulting image is almost identical to the former. Let's go through this step by step:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;push 	0a000h - 70&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of aligning horizontally with &amp;lt;code&amp;gt;sub	dx,160&amp;lt;/code&amp;gt; we can code this implicitly by moving our segment register ten units - that is 10 * 16 = 160 pixels - to the left (see [https://en.wikipedia.org/wiki/X86_memory_segmentation#Real_mode Real Mode Addressing]). By subtracting further multiples of 20 units - that is 320 pixels or one line each - we can shift the visible screen towards the top to fine-tune the vertical alignment. As long as this shift is no more than 4 lines ''( 65536 / 320 - 200 = 4.8 )'' there is no further visual impact.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the high byte of a constant, placed in a way that &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt; resolves to ~1.24 when read as a 32bit float. The high byte of the segment value pushed for &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; also matters, since it is part of the same constant. Check for yourself with the [http://www.h-schmidt.net/FloatConverter/IEEE754.html IEEE 754 Converter].&lt;br /&gt;
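&lt;br /&gt;
As a rough sketch of how this constant comes together, the four bytes at &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; can be decoded by hand. This assumes the usual DOS .COM start values &amp;lt;code&amp;gt;SI = 0x100&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;BX = 0&amp;lt;/code&amp;gt; and the standard encodings of the first two instructions:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; offset  bytes     instruction&lt;br /&gt;
; 0x100   68 BA 9F  push 0a000h - 70   (immediate 0x9FBA, low byte first)&lt;br /&gt;
; 0x103   3F        aas&lt;br /&gt;
;&lt;br /&gt;
; dword at [0x100], little endian : 0x3F9FBA68&lt;br /&gt;
;   sign     = 0                   (positive)&lt;br /&gt;
;   exponent = 0x7F = 127          (bias 127, so 2^0)&lt;br /&gt;
;   mantissa = 0x1FBA68 / 2^23     (~0.2479)&lt;br /&gt;
;   value    = 1.2479 * 2^0        (~1.24)&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Changing the segment offset therefore also nudges the low mantissa bits of the aspect constant slightly.&lt;br /&gt;
&lt;br /&gt;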
* &amp;lt;code&amp;gt;mov ax,0xCCCD&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; (The &amp;quot;Rrrola trick&amp;quot;)&lt;br /&gt;
Instead of constructing X and Y from the screen pointer &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;DIV&amp;lt;/code&amp;gt;, you can get a decent estimate by multiplying the screen pointer by &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; and reading X and Y from the 8bit registers &amp;lt;code&amp;gt;DH&amp;lt;/code&amp;gt; (+DL as 16bit value) and &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; (+AH as 16bit value). The idea is to interpret &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; as a kind of 16 bit fixed-point fraction in the range &amp;lt;code&amp;gt;[0,1]&amp;lt;/code&amp;gt;, from start to end. Multiplying this number in [0,1] by 65536 / 320 = 204.8 yields the row in the integer part and, again as a kind of fraction, the column in the fractional part. The constant &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; is 204.8 * 256 rounded to the nearest integer ( = 52428.8 ~ 52429 = 0xCCCD). As long as the 16 bit representations are used, there is no precision loss.&lt;br /&gt;
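&lt;br /&gt;
To see the trick at work, here is a hand-computed sketch for the centre pixel. It assumes the unshifted segment &amp;lt;code&amp;gt;ES = 0xA000&amp;lt;/code&amp;gt;, so that &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is simply row * 320 + column; this is not the setup of the program above, and the value loaded into &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is chosen purely for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	di,32160		; 100 * 320 + 160 = centre pixel&lt;br /&gt;
mov 	ax,0xCCCD		; 52429&lt;br /&gt;
mul		di				; DX:AX = 32160 * 52429 = 0x64801920&lt;br /&gt;
						; DH = 0x64 = 100 : the row&lt;br /&gt;
						; DL = 0x80 = 128 : column 160 scaled to 0..255 (160 * 256 / 320)&lt;br /&gt;
						; AH = 0x19       : further fraction bits of the column&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;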
* &amp;lt;code&amp;gt;sub dh,[si]&amp;lt;/code&amp;gt;&lt;br /&gt;
The instruction at &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;push &amp;lt;word&amp;gt;&amp;lt;/code&amp;gt; and has the opcode &amp;lt;code&amp;gt;0x68&amp;lt;/code&amp;gt; which is 104 in decimal. Combined with the fine tuned vertical alignment above ( ~4 lines) this results in (virtually) subtracting 100 for perfect vertical alignment. This is one byte shorter than &amp;lt;code&amp;gt;sub dh,100&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &amp;lt;code&amp;gt;pusha / popa&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of going the ''classical'' way of communicating with the FPU, we push all the registers, read/write values with memory addressing to/from the FPU, then pop all registers again. This works when &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; = &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP&amp;lt;/code&amp;gt; is &amp;quot;close enough&amp;quot; to &amp;lt;code&amp;gt;BX&amp;lt;/code&amp;gt; (initially zero and kept that way) to allow &amp;lt;code&amp;gt;[BX+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; addressing. It comes with the special benefit of implicit 8bit shifts. One serious drawback is loss of precision, since the registers &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;AH&amp;lt;/code&amp;gt; &amp;quot;lose connection&amp;quot; when using &amp;lt;code&amp;gt;PUSHA&amp;lt;/code&amp;gt; (see the order of registers : [http://x86.renejeschke.de/html/file_module_x86_id_270.html PUSHA/PUSHAD documentation]).&lt;br /&gt;
* &amp;lt;code&amp;gt;fild word [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;fistp dword [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the so-called &amp;quot;stack addressing&amp;quot;. We assume that &amp;lt;code&amp;gt;BX=0&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP=0xFFFE&amp;lt;/code&amp;gt; at start, so we know where the registers are in memory after &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt; (AX at [BX-4], CX at [BX-6] etc.). It's important to realize that we work with signed 16 bit values now, in the full range of [-32768,32767]. That is also why we need &amp;lt;code&amp;gt;DWORD&amp;lt;/code&amp;gt; when storing the result : &amp;lt;code&amp;gt;sqrt(x²+y²)&amp;lt;/code&amp;gt; exceeds the signed 16bit range for quite a few value pairs. Note that there are already implicit 8bit shifts (bx-9, bx-5).&lt;br /&gt;
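&lt;br /&gt;
The resulting memory layout can be sketched as follows, assuming &amp;lt;code&amp;gt;BX=0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;SP=0xFFFE&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; = &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; as stated above. The byte-level view follows from the documented push order of &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt; and the little-endian storage of each word:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; after pusha (SP was 0xFFFE, BX = 0)&lt;br /&gt;
; [bx-4]  AX : AL at bx-4,  AH at bx-3&lt;br /&gt;
; [bx-6]  CX : CL at bx-6,  CH at bx-5&lt;br /&gt;
; [bx-8]  DX : DL at bx-8,  DH at bx-7&lt;br /&gt;
; [bx-10] BX : BL at bx-10, BH at bx-9&lt;br /&gt;
; [bx-12] SP (value before pusha)&lt;br /&gt;
; [bx-14] BP&lt;br /&gt;
; [bx-16] SI&lt;br /&gt;
; [bx-18] DI&lt;br /&gt;
;&lt;br /&gt;
; fild  word  [bx-8] loads DX&lt;br /&gt;
; fild  word  [bx-9] loads BH + 256 * DL = 256 * DL, because BH = 0&lt;br /&gt;
; fistp dword [bx-5] writes bytes 0..3 of the result to CH, AL, AH and the byte at bx-2,&lt;br /&gt;
;                    so after popa AL holds bits 8..15 of the result&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;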
* &amp;lt;code&amp;gt;fmul dword [bx+si]&amp;lt;/code&amp;gt;&lt;br /&gt;
With the &amp;quot;Rrrola&amp;quot; trick above, the row number is at most 204, while the column only spans 256 values. This results in a wrong aspect ratio, but it can almost completely be fixed with this two byte instruction (+ one byte for the &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; instruction) : 256 * 1.24 = 317.44 which is quite close to 320. If the aspect ratio does not matter for the desired effect, these three bytes can be shaved off.&lt;br /&gt;
&lt;br /&gt;
=== Adding some features ===&lt;br /&gt;
&lt;br /&gt;
Now let's add some features:&lt;br /&gt;
* extract angle as opposed to the distance and combine both&lt;br /&gt;
* reverse divide the distance to create the &amp;quot;tunnel&amp;quot; effect&lt;br /&gt;
* animate with smooth steps along the distance&lt;br /&gt;
* improve on the colors with subselecting from the standard palette&lt;br /&gt;
* quit the program on ESC&lt;br /&gt;
&lt;br /&gt;
This results in the following program with a size of 63 bytes :&lt;br /&gt;
&lt;br /&gt;
[[File:Tunnel effect neontube.png|thumb|tunnel effect neontube]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0xa000 - 10 - 3 * 20	; video base - 3.5 lines&lt;br /&gt;
or 		al, 0x13				; mode 13h = 320 x 200 in 256 colors&lt;br /&gt;
pop 	es						; get aligned video memory base&lt;br /&gt;
int 	0x10					; switch videomode&lt;br /&gt;
X: &lt;br /&gt;
sub		dh, [si]				; vertical alignment&lt;br /&gt;
pusha							; push all registers on stack&lt;br /&gt;
fild 	word	[bx-9]			; fpustack :  x&lt;br /&gt;
fild 	word	[bx-8]			; fpustack :  y  x&lt;br /&gt;
fpatan							; fpustack :  arc&lt;br /&gt;
fst 	st1						; fpustack :  arc  arc&lt;br /&gt;
fcos							; fpustack :  cos(arc)  arc&lt;br /&gt;
fimul	dword	[si]			; fpustack :  l*cos(arc)  arc&lt;br /&gt;
fidiv	word	[bx-8]			; fpustack :  l*cos(arc)/x  arc&lt;br /&gt;
fiadd	word	[bp+si]			; fpustack :  l*cos(arc)/x+offset  arc&lt;br /&gt;
fistp	dword	[bx-7]			; fpustack :  arc&lt;br /&gt;
fimul	word	[byte si+val]	; fpustack :  scaled_arc&lt;br /&gt;
fistp	word	[bx-5]			; fpustack :  -&lt;br /&gt;
popa							; pop all registers from stack&lt;br /&gt;
xor 	al, cl					; XOR scaled_arc with distance&lt;br /&gt;
and 	al, 16 + 8 + 2			; sub selecting palette part&lt;br /&gt;
stosb							; writing to screen&lt;br /&gt;
mov 	ax, 0xCCCD				; Performing the famous&lt;br /&gt;
mul 	di						; Rrrola trick&lt;br /&gt;
jo 		X						; next frame check&lt;br /&gt;
add 	word [bp+si], byte 23	; change offset smoothly&lt;br /&gt;
in 		al, 0x60				; check for ...&lt;br /&gt;
dec 	ax						; ...ESC key&lt;br /&gt;
jnz 	X						; otherwise continue&lt;br /&gt;
ret								; quit program&lt;br /&gt;
val:	dw 6519 				; n = 160 * 256 / pi / 2 ; 0x1977&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Many other tiny tunnel effects have been coded, so it is highly recommended to check out the documented source code of [http://web.archive.org/web/20050216133244/http://www.farb-rausch.de/ryg/tunnel.asm &amp;quot;Constant Evolution&amp;quot; by ryg/Farbrausch] and the [http://www.pouet.net/prod.php?which=29412 &amp;quot;Heart shaped tunnel&amp;quot; from Lord Kelvin], both with a size of 64 bytes. While &amp;quot;Constant Evolution&amp;quot; takes a slightly different route than the example here (''classic'' FPU communication, ''classic'' X Y construction, sqrt(x²+y²) instead of using &amp;lt;code&amp;gt;fcos&amp;lt;/code&amp;gt;+&amp;lt;code&amp;gt;fimul&amp;lt;/code&amp;gt;), &amp;quot;Heart shaped tunnel&amp;quot; uses no FPU at all. &lt;br /&gt;
&lt;br /&gt;
The takeaways from this example are:&lt;br /&gt;
&lt;br /&gt;
* Loading a constant from the code with some degrees of freedom&lt;br /&gt;
&amp;lt;code&amp;gt;fimul dword [si]&amp;lt;/code&amp;gt; multiplies by a 32 bit integer that acts as the dividend for the tunnel effect. The highest byte of this constant lies in our code: it is the opcode of &amp;lt;code&amp;gt;or al, 0x13&amp;lt;/code&amp;gt;. This instruction puts &amp;lt;code&amp;gt;0x13&amp;lt;/code&amp;gt; into &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;, and since there are a lot of ways to achieve this, the appearance of the tunnel can be changed directly by replacing this instruction with one of the following : &amp;lt;code&amp;gt;mov al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;add al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;xor al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;adc al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;sbb al,0xED&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sub al,0xED&amp;lt;/code&amp;gt;. In this special case, the instruction can also be swapped with &amp;lt;code&amp;gt;pop es&amp;lt;/code&amp;gt; to gain a further degree of freedom.&lt;br /&gt;
&lt;br /&gt;
* normal loading of a constant which can't be reused as opcode&lt;br /&gt;
Although it's the ultimate goal to not even use a single extra byte for constants, sometimes the required sequence simply does not appear in the code. In this case, a constant is needed to convert the angle from the range [-pi,pi] to the color space in a way that no gaps appear while stepping from 359° to 0°. In the last line at &amp;lt;code&amp;gt;val:&amp;lt;/code&amp;gt; the value &amp;lt;code&amp;gt;160&amp;lt;/code&amp;gt; in the comment is 32 * 5 where 5 is the number of &amp;quot;spiral arms&amp;quot; the tunnel effect has. The 8bit shift (*256) is to increase precision. It turns out that 16bit precision is enough to get a decent &amp;quot;gap closer&amp;quot; for values obtained by &amp;lt;code&amp;gt;fpatan&amp;lt;/code&amp;gt;.&lt;br /&gt;
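&lt;br /&gt;
A quick hand calculation suggests why the seam closes. It assumes the stack layout of the program above, where &amp;lt;code&amp;gt;fistp word [bx-5]&amp;lt;/code&amp;gt; places the high byte of the scaled angle in &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt;, and it ignores the XOR with &amp;lt;code&amp;gt;CL&amp;lt;/code&amp;gt;, which does not move bits between positions:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; val = 160 * 256 / (2 * pi) = ~6518.99, stored as 6519&lt;br /&gt;
;&lt;br /&gt;
; angle = +pi : +pi * 6519 = +20480 (rounded) = 0x5000, so AL = 0x50&lt;br /&gt;
; angle = -pi : -pi * 6519 = -20480 (rounded) = 0xB000, so AL = 0xB0&lt;br /&gt;
;&lt;br /&gt;
; 0x50 - 0xB0 = 160 (mod 256) = 5 * 32&lt;br /&gt;
; 0x50 and 0xB0 share the same low five bits, so the mask 16 + 8 + 2&lt;br /&gt;
; selects the same colour on both sides of the seam&lt;br /&gt;
; one full turn covers 160 / 32 = 5 periods of the masked value : 5 spiral arms&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;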
&lt;br /&gt;
* operating directly on an indirect memory location without offset&lt;br /&gt;
The location &amp;lt;code&amp;gt;[bp+si]&amp;lt;/code&amp;gt; is used as the animation variable, while both registers involved are kept fixed (their sum is &amp;lt;code&amp;gt;0x0A??&amp;lt;/code&amp;gt; and therefore way above our code). Since we work with 16bit values and the top 8 bits are the measurement in pixels, the instruction &amp;lt;code&amp;gt;add word [bp+si], byte 23&amp;lt;/code&amp;gt; allows for sub-pixel precision in animation, while occupying 3 bytes of space. Depending on the target hardware, the value 23 can be increased/decreased to achieve faster/smoother animation.&lt;br /&gt;
&lt;br /&gt;
* Optimizing the check for the next frame&lt;br /&gt;
Normally, a check like &amp;lt;code&amp;gt;test di,di&amp;lt;/code&amp;gt; followed directly by a branch is necessary. The approach used here branches directly after &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;jo&amp;lt;/code&amp;gt;, since the overflow flag is set for all but two pixels of a frame. This saves two bytes, but also requires adjusting the animation constant, because the animation constant is then added twice per frame. A further benefit is that in one of these two cases &amp;lt;code&amp;gt;AX&amp;lt;/code&amp;gt; is zero, which saves a further byte on the following ESC check (&amp;lt;code&amp;gt;dec ax&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;dec al&amp;lt;/code&amp;gt;).&lt;br /&gt;
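&lt;br /&gt;
The two pixels where the branch falls through can be traced by hand. The sketch below relies on the documented behaviour of &amp;lt;code&amp;gt;mul&amp;lt;/code&amp;gt; (CF and OF are cleared only when the upper half of the result, here &amp;lt;code&amp;gt;DX&amp;lt;/code&amp;gt;, is zero) and assumes that ESC delivers the scancode 0x01 on port 0x60:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; mul di with AX = 0xCCCD&lt;br /&gt;
;   DI = 0 : DX:AX = 0x00000000, OF = 0, jo falls through, AX = 0x0000&lt;br /&gt;
;   DI = 1 : DX:AX = 0x0000CCCD, OF = 0, jo falls through, AX = 0xCCCD&lt;br /&gt;
;   DI = 2 : DX:AX = 0x0001999A, OF = 1, jo jumps (as for every larger DI)&lt;br /&gt;
;&lt;br /&gt;
; ESC check in the DI = 0 case (AH = 0)&lt;br /&gt;
;   in  al,0x60     ; AX = scancode&lt;br /&gt;
;   dec ax          ; 1 byte (0x48), zero only for scancode 0x01&lt;br /&gt;
; in the DI = 1 case AH = 0xCC, so dec ax cannot reach zero and the loop continues&lt;br /&gt;
; dec al (2 bytes, 0xFE 0xC8) would work regardless of AH, at the cost of one byte&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;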
&lt;br /&gt;
&lt;br /&gt;
Now if we abandon all the comfort, alignment, smoothness and convenience, and optimize this straight for size, we end up with a 52 byte version. This does not include the possible exclusion of color tuning (2 bytes), after all the effect is supposed to look at least somewhat appealing ;)&lt;br /&gt;
&lt;br /&gt;
[[File:Neontube 52b optimized.png|thumb|Neontube 52b optimized]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov		al,0x13&lt;br /&gt;
int		0x10&lt;br /&gt;
X: &lt;br /&gt;
or		al, [bp+si]&lt;br /&gt;
xor		al, 0x68&lt;br /&gt;
mov		dx, 0x79F&lt;br /&gt;
pusha&lt;br /&gt;
fild 	word	[bx-9]		; x&lt;br /&gt;
fild 	word	[bx-8]		; y x&lt;br /&gt;
fpatan						; arc&lt;br /&gt;
fst 	st1					; arc arc&lt;br /&gt;
fcos						; cos(arc) arc&lt;br /&gt;
fimul	dword	[si]		; l*cos(arc) arc&lt;br /&gt;
fidiv	word	[bx-8]		; l*cos(arc)/x arc&lt;br /&gt;
fistp	dword	[bx-4]		; arc&lt;br /&gt;
fimul	word	[bx]		; scarc&lt;br /&gt;
fistp	word	[bx-5]		; -&lt;br /&gt;
popa&lt;br /&gt;
sub		ah, [bp+si]&lt;br /&gt;
xor		al, ah&lt;br /&gt;
and		al, 16 + 8 + 4&lt;br /&gt;
stosb&lt;br /&gt;
mov		ax, 0xCCCD&lt;br /&gt;
mul		di&lt;br /&gt;
jmp 	short X-1&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''to be continued''&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=264</id>
		<title>Floating-point Opcodes</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Floating-point_Opcodes&amp;diff=264"/>
				<updated>2016-08-18T16:33:33Z</updated>
		
		<summary type="html">&lt;p&gt;Trixter: Clarified sections and wording&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The FPU offers a lot of complex floating-point operations not available to classic x86 CPUs, like &amp;lt;code&amp;gt;SIN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;COS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ATAN&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;SQRT&amp;lt;/code&amp;gt;, etc. ([http://www.website.masmforum.com/tutorials/fptute/appen1.htm SIMPLY FPU] by Raymond Filiatreault has a compact overview of all FPU commands for the curious.)  &lt;br /&gt;
&lt;br /&gt;
== FPU Basics ==&lt;br /&gt;
&lt;br /&gt;
Usage and communication with the FPU is quite uncommon and takes a bit to get used to, so we'll start with a simple example.  This is what [[Output#Outputting_in_mode_13h_.28320x200.29|our code example]] looks like originally:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AH&lt;br /&gt;
add		ax,di		; offset by framecounter	          &amp;lt;-- REPLACE THIS WITH FPU CODE&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and this is how the code looks if we replace the instruction with FPU code :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get &amp;quot;column&amp;quot; in AX&lt;br /&gt;
&lt;br /&gt;
fninit				; init FPU first&lt;br /&gt;
mov		[si],ax		; write first addend to a memory location&lt;br /&gt;
fild	word [si]	; F(pu) I(nteger) L(oa)D a WORD from memory location to the FPU stack&lt;br /&gt;
mov		[si],di		; write second addend to a memory location&lt;br /&gt;
fiadd	word [si]	; Directly add the word in the memory location to the top of the FPU stack&lt;br /&gt;
fist	word [si]	; F(pu) I(nteger) ST(ore) the result into a memory location&lt;br /&gt;
mov		ax,[si]		; Get the word from the memory location into AX&lt;br /&gt;
&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Obviously that made the program bigger instead of smaller, but the point of this exercise is to illustrate how to work with the FPU.)  Looking at our changes, you get a sense of what usual interaction with the FPU is:&lt;br /&gt;
# &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt;: Initialize the FPU&lt;br /&gt;
# transfer values from CPU registers to memory location(s)&lt;br /&gt;
# transfer values from memory location(s) onto FPU stack&lt;br /&gt;
# do the actual calculations using the FPU (more on this soon)&lt;br /&gt;
# transfer result from the FPU stack into memory location(s)&lt;br /&gt;
# transfer result from memory location(s) back into registers&lt;br /&gt;
&lt;br /&gt;
That is a lot of extra code for a single integer addition, but that's a simple example; ''once more complex floating point operations are involved, it starts to pay off.'' &lt;br /&gt;
&lt;br /&gt;
== Optimizing with the FPU ==&lt;br /&gt;
&lt;br /&gt;
For more advanced FPU operation, let's start from scratch with an unoptimized program which plots the distance of each pixel to the screen center as color, in 49 bytes.&lt;br /&gt;
&lt;br /&gt;
[[File:Distance to center example.png|thumb|Distance to center]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h			&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
fninit					; -	&lt;br /&gt;
						; it's useful to comment what's on the&lt;br /&gt;
						; stack after each FPU operation&lt;br /&gt;
						; to not get lost ;) start is : empty (-)&lt;br /&gt;
X:&lt;br /&gt;
xor 	dx,dx			; reset the high word before division&lt;br /&gt;
mov 	bx,320			; 320 columns&lt;br /&gt;
mov 	ax,di			; get screen pointer in AX&lt;br /&gt;
div 	bx				; construct X,Y from screen pointer into AX,DX&lt;br /&gt;
sub 	ax,100			; subtract the origin&lt;br /&gt;
sub 	dx,160			; = (160,100) ... center of 320x200 screen	&lt;br /&gt;
mov 	[si],ax			; move X into a memory location&lt;br /&gt;
fild 	word [si]		; X&lt;br /&gt;
fmul 	st0				; X²&lt;br /&gt;
mov 	[si],dx			; move Y into a memory location&lt;br /&gt;
fild 	word [si]		; Y X²&lt;br /&gt;
fmul 	st0				; Y² X²&lt;br /&gt;
fadd 	st0,st1			; Y²+X²&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	word [si]		; -&lt;br /&gt;
mov 	ax,[si]			; get the result from memory&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A few words on this : &lt;br /&gt;
* The FPU registers (st0, st1, ...) are organized as a stack. When you load something onto the FPU, everything else is implicitly moved one location further away from the top. Some FPU instructions work only on the top, others can be given arbitrary FPU registers as explicit operands.&lt;br /&gt;
* Depending on what you do, sometimes &amp;lt;code&amp;gt;F(N)INIT&amp;lt;/code&amp;gt; can be omitted. Without it, real hardware will refuse to work more often than emulators, but it's always worth a try.&lt;br /&gt;
* Accessing memory in a size-efficient way can be a real pain. The safest way is to reference absolute memory locations (e.g. &amp;lt;code&amp;gt;[1234]&amp;lt;/code&amp;gt;), but that costs two bytes more per instruction than referencing memory with &amp;lt;code&amp;gt;[BX]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+DI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[BP+SI]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;[DI]&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;[BX+DI]&amp;lt;/code&amp;gt;. When working with the FPU and this ''classic'' approach of FPU communication, you have to design your code flow so that at least one of these locations is available.&lt;br /&gt;
* Memory accesses are relative to the segment register &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; unless you use a segment override. The exception is addressing with &amp;lt;code&amp;gt;[BP+??]&amp;lt;/code&amp;gt;: here the memory is accessed relative to the segment register &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (see [http://www.oopweb.com/Assembly/Documents/ArtOfAssembly/Volume/Chapter_4/CH04-2.html here, at 4.6.2.2 The Register Indirect Addressing Modes]).&lt;br /&gt;
* There are a few conventions which help you identify FPU commands. &amp;quot;i&amp;quot; stands for integer (WORD or DWORD), &amp;quot;p&amp;quot; means &amp;quot;pop stack afterwards&amp;quot;, so &amp;lt;code&amp;gt;FST&amp;lt;/code&amp;gt; means just &amp;quot;store&amp;quot; while &amp;lt;code&amp;gt;FISTP&amp;lt;/code&amp;gt; means &amp;quot;store as integer, then pop the stack&amp;quot;&lt;br /&gt;
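&lt;br /&gt;
A short sketch of how these naming conventions read in practice. The operands and memory locations are arbitrary and only chosen for illustration; the last part of each comment shows the FPU stack in the notation used above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;fild 	word [si]		; &amp;quot;i&amp;quot; : load a 16 bit integer, stack : a&lt;br /&gt;
fld 	dword [si]		; no &amp;quot;i&amp;quot; : load a 32 bit float, stack : b a&lt;br /&gt;
fist 	word [bx]		; &amp;quot;i&amp;quot;, no &amp;quot;p&amp;quot; : store b as integer, stack : b a&lt;br /&gt;
fistp 	word [bx]		; &amp;quot;i&amp;quot; and &amp;quot;p&amp;quot; : store b as integer, then pop, stack : a&lt;br /&gt;
fstp 	st0				; &amp;quot;p&amp;quot; only : store st0 to itself and pop, stack : -&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;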
&lt;br /&gt;
&lt;br /&gt;
Now let's unleash the state of the art sizecoding arsenal onto this, to bring it down to 37 bytes (40 bytes with aspect correction)&lt;br /&gt;
&lt;br /&gt;
[[File:Distant to center optimized.png|thumb|Distance to center : optimized]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0a000h - 70		; modified to center to 160,100&lt;br /&gt;
aas						; aspect ratio constant part&lt;br /&gt;
pop 	es				; get start of video memory in ES&lt;br /&gt;
mov 	al,0x13			; switch to video mode 13h&lt;br /&gt;
int 	0x10			; 320 * 200 in 256 colors&lt;br /&gt;
X:&lt;br /&gt;
mov 	ax,0xCCCD		; perform the famous...&lt;br /&gt;
mul		di				; ... Rrrola trick =)&lt;br /&gt;
sub 	dh,[si]			; align vertically&lt;br /&gt;
pusha 					; push all registers on stack&lt;br /&gt;
fild 	word [bx-8]		; X&lt;br /&gt;
fmul 	st0				; X²&lt;br /&gt;
fild 	word [bx-9]		; Y X²&lt;br /&gt;
fmul	dword [bx+si]	; aspect ratio correction&lt;br /&gt;
fmul 	st0				; Y² X²&lt;br /&gt;
fadd 	st0,st1			; Y²+X²&lt;br /&gt;
fsqrt					; R&lt;br /&gt;
fistp 	dword [bx-5]	; -&lt;br /&gt;
popa					; pop all registers from stack&lt;br /&gt;
stosb					; write to screen (DI) and increment DI&lt;br /&gt;
jmp short X				; next pixel&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The resulting image is almost identical to the former. Let's go through this step by step:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;push 	0a000h - 70&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of aligning horizontally with &amp;lt;code&amp;gt;sub	dx,160&amp;lt;/code&amp;gt; we can code this implicitly by moving our segment register ten units - that is 10 * 16 = 160 pixels - to the left (see [https://en.wikipedia.org/wiki/X86_memory_segmentation#Real_mode Real Mode Addressing]). By subtracting further multiples of 20 units - that is 320 pixels or one line each - we can shift the visible screen towards the top to fine-tune the vertical alignment. As long as this shift is no more than 4 lines ''( 65536 / 320 - 200 = 4.8 )'' there is no further visual impact.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the high byte of a constant, placed in a way that &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt; resolves to ~1.24 when read as a 32bit float. The high byte of the segment value pushed for &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; also matters, since it is part of the same constant. Check for yourself with the [http://www.h-schmidt.net/FloatConverter/IEEE754.html IEEE 754 Converter].&lt;br /&gt;
* &amp;lt;code&amp;gt;mov ax,0xCCCD&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; (The &amp;quot;Rrrola trick&amp;quot;)&lt;br /&gt;
Instead of constructing X and Y from the screen pointer &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;DIV&amp;lt;/code&amp;gt;, you can get a decent estimate by multiplying the screen pointer by &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; and reading X and Y from the 8bit registers &amp;lt;code&amp;gt;DH&amp;lt;/code&amp;gt; (+DL as 16bit value) and &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; (+AH as 16bit value). The idea is to interpret &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; as a kind of 16 bit fixed-point fraction in the range &amp;lt;code&amp;gt;[0,1]&amp;lt;/code&amp;gt;, from start to end. Multiplying this number in [0,1] by 65536 / 320 = 204.8 yields the row in the integer part and, again as a kind of fraction, the column in the fractional part. The constant &amp;lt;code&amp;gt;0xCCCD&amp;lt;/code&amp;gt; is 204.8 * 256 rounded to the nearest integer ( = 52428.8 ~ 52429 = 0xCCCD). As long as the 16 bit representations are used, there is no precision loss.&lt;br /&gt;
* &amp;lt;code&amp;gt;sub dh,[si]&amp;lt;/code&amp;gt;&lt;br /&gt;
The instruction at &amp;lt;code&amp;gt;[SI]&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;push &amp;lt;word&amp;gt;&amp;lt;/code&amp;gt; and has the opcode &amp;lt;code&amp;gt;0x68&amp;lt;/code&amp;gt; which is 104 in decimal. Combined with the fine tuned vertical alignment above ( ~4 lines) this results in (virtually) subtracting 100 for perfect vertical alignment. This is one byte shorter than &amp;lt;code&amp;gt;sub dh,100&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &amp;lt;code&amp;gt;pusha / popa&amp;lt;/code&amp;gt;&lt;br /&gt;
Instead of going the ''classical'' way of communicating with the FPU, we push all the registers, read/write values with memory addressing to/from the FPU, then pop all registers again. This works when &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; = &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP&amp;lt;/code&amp;gt; is &amp;quot;close enough&amp;quot; to &amp;lt;code&amp;gt;BX&amp;lt;/code&amp;gt; (initially zero and kept that way) to allow &amp;lt;code&amp;gt;[BX+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; addressing. It comes with the special benefit of implicit 8bit shifts. One serious drawback is loss of precision, since the registers &amp;lt;code&amp;gt;DL&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;AH&amp;lt;/code&amp;gt; &amp;quot;lose connection&amp;quot; when using &amp;lt;code&amp;gt;PUSHA&amp;lt;/code&amp;gt; (see the order of registers : [http://x86.renejeschke.de/html/file_module_x86_id_270.html PUSHA/PUSHAD documentation]).&lt;br /&gt;
* &amp;lt;code&amp;gt;fild word [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt; &amp;amp; &amp;lt;code&amp;gt;fistp dword [bx+&amp;lt;signed byte&amp;gt;]&amp;lt;/code&amp;gt;&lt;br /&gt;
This is the so-called &amp;quot;stack addressing&amp;quot;. We assume that &amp;lt;code&amp;gt;BX=0&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;SP=0xFFFE&amp;lt;/code&amp;gt; at start, so we know where the registers are in memory after &amp;lt;code&amp;gt;pusha&amp;lt;/code&amp;gt; (AX at [BX-4], CX at [BX-6] etc.). It's important to realize that we work with signed 16 bit values now, in the full range of [-32768,32767]. That is also why we need &amp;lt;code&amp;gt;DWORD&amp;lt;/code&amp;gt; when storing the result : &amp;lt;code&amp;gt;sqrt(x²+y²)&amp;lt;/code&amp;gt; exceeds the signed 16bit range for quite a few value pairs. Note that there are already implicit 8bit shifts (bx-9, bx-5).&lt;br /&gt;
* &amp;lt;code&amp;gt;fmul dword [bx+si]&amp;lt;/code&amp;gt;&lt;br /&gt;
With the &amp;quot;Rrrola&amp;quot; trick above, the row number is at most 204, while the column only spans 256 values. This results in a wrong aspect ratio, but it can almost completely be fixed with this two byte instruction (+ one byte for the &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; instruction) : 256 * 1.24 = 317.44 which is quite close to 320. If the aspect ratio does not matter for the desired effect, these three bytes can be shaved off.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Now let's add some features:&lt;br /&gt;
* extract angle as opposed to the distance and combine both&lt;br /&gt;
* reverse divide the distance to create the &amp;quot;tunnel&amp;quot; effect&lt;br /&gt;
* animate with smooth steps along the distance&lt;br /&gt;
* improve on the colors with subselecting from the standard palette&lt;br /&gt;
* quit the program on ESC&lt;br /&gt;
&lt;br /&gt;
This results in the following program with a size of 63 bytes :&lt;br /&gt;
&lt;br /&gt;
[[File:Tunnel effect neontube.png|thumb|tunnel effect neontube]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 	0xa000 - 10 - 3 * 20	; video base - 3.5 lines&lt;br /&gt;
or 		al, 0x13				; mode 13h = 320 x 200 in 256 colors&lt;br /&gt;
pop 	es						; get aligned video memory base&lt;br /&gt;
int 	0x10					; switch videomode&lt;br /&gt;
X: &lt;br /&gt;
sub		dh, [si]				; vertical alignment&lt;br /&gt;
pusha							; push all registers on stack&lt;br /&gt;
fild 	word	[bx-9]			; fpustack :  x&lt;br /&gt;
fild 	word	[bx-8]			; fpustack :  y  x&lt;br /&gt;
fpatan							; fpustack :  arc&lt;br /&gt;
fst 	st1						; fpustack :  arc  arc&lt;br /&gt;
fcos							; fpustack :  cos(arc)  arc&lt;br /&gt;
fimul	dword	[si]			; fpustack :  l*cos(arc)  arc&lt;br /&gt;
fidiv	word	[bx-8]			; fpustack :  l*cos(arc)/x  arc&lt;br /&gt;
fiadd	word	[bp+si]			; fpustack :  l*cos(arc)/x+offset  arc&lt;br /&gt;
fistp	dword	[bx-7]			; fpustack :  arc&lt;br /&gt;
fimul	word	[byte si+val]	; fpustack :  scaled_arc&lt;br /&gt;
fistp	word	[bx-5]			; fpustack :  -&lt;br /&gt;
popa							; pop all registers from stack&lt;br /&gt;
xor 	al, cl					; XOR scaled_arc with distance&lt;br /&gt;
and 	al, 16 + 8 + 2			; sub selecting palette part&lt;br /&gt;
stosb							; writing to screen&lt;br /&gt;
mov 	ax, 0xCCCD				; Performing the famous&lt;br /&gt;
mul 	di						; Rrrola trick&lt;br /&gt;
jo 		X						; next frame check&lt;br /&gt;
add 	word [bp+si], byte 23	; change offset smoothly&lt;br /&gt;
in 		al, 0x60				; check for ...&lt;br /&gt;
dec 	ax						; ...ESC key&lt;br /&gt;
jnz 	X						; otherwise continue&lt;br /&gt;
ret								; quit program&lt;br /&gt;
val:	dw 6519 				; n = 160 * 256 / pi / 2 ; 0x1977&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Many other tiny tunnel effects have been coded, so it is highly recommended to check out the documented source code of [http://web.archive.org/web/20050216133244/http://www.farb-rausch.de/ryg/tunnel.asm &amp;quot;Constant Evolution&amp;quot; by ryg/Farbrausch] and the [http://www.pouet.net/prod.php?which=29412 &amp;quot;Heart shaped tunnel&amp;quot; from Lord Kelvin], both with a size of 64 bytes. While &amp;quot;Constant Evolution&amp;quot; takes a slightly different route than the example here (''classic'' FPU communication, ''classic'' X Y construction, sqrt(x²+y²) instead of using &amp;lt;code&amp;gt;fcos&amp;lt;/code&amp;gt;+&amp;lt;code&amp;gt;fimul&amp;lt;/code&amp;gt;), &amp;quot;Heart shaped tunnel&amp;quot; uses no FPU at all. The takeaways from this example are :&lt;br /&gt;
&lt;br /&gt;
* Loading a constant from the code with some degrees of freedom&lt;br /&gt;
&amp;lt;code&amp;gt;fimul dword [si]&amp;lt;/code&amp;gt; multiplies by a 32 bit integer that serves as the dividend for the tunnel effect. The highest byte of this constant lies inside our code: it is the opcode of &amp;lt;code&amp;gt;or al, 0x13&amp;lt;/code&amp;gt;. This instruction puts &amp;lt;code&amp;gt;0x13&amp;lt;/code&amp;gt; into &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;, and since there are many ways to achieve this, the appearance of the tunnel can be changed directly by replacing the instruction with one of the following: &amp;lt;code&amp;gt;mov al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;add al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;xor al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;adc al,0x13&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;sbb al,0xED&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sub al,0xED&amp;lt;/code&amp;gt;. In this special case, the instruction can also be swapped with &amp;lt;code&amp;gt;pop es&amp;lt;/code&amp;gt; to gain a further degree of freedom.&lt;br /&gt;
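&lt;br /&gt;
As a rough illustration of this degree of freedom, the following Python sketch lists the one-byte opcodes of the instructions named above and shows how each one would change the signed 32 bit value read by &amp;lt;code&amp;gt;fimul dword [si]&amp;lt;/code&amp;gt;. The three low bytes of the constant are not reproduced here, so &amp;lt;code&amp;gt;low_bytes&amp;lt;/code&amp;gt; is a hypothetical placeholder, and the helper name &amp;lt;code&amp;gt;fimul_operand&amp;lt;/code&amp;gt; is invented for this sketch.&lt;br /&gt;

```python
# Opcode bytes of the candidate instructions (standard x86 encodings).
OPCODES = {
    'add al, 0x13': 0x04,
    'pop es': 0x07,
    'or al, 0x13': 0x0C,
    'adc al, 0x13': 0x14,
    'sbb al, 0xED': 0x1C,
    'sub al, 0xED': 0x2C,
    'xor al, 0x13': 0x34,
    'mov al, 0x13': 0xB0,
}


def fimul_operand(low_bytes, opcode):
    """Signed little-endian dword whose highest byte is the opcode."""
    return int.from_bytes(bytes(low_bytes) + bytes([opcode]), 'little', signed=True)


if __name__ == '__main__':
    # Hypothetical placeholder: NOT the real three bytes preceding the opcode.
    low_bytes = [0x00, 0x00, 0x00]
    for name, opcode in sorted(OPCODES.items(), key=lambda item: item[1]):
        print(f'{name:14} {fimul_operand(low_bytes, opcode):12d}')
```

Because &amp;lt;code&amp;gt;fimul&amp;lt;/code&amp;gt; treats its operand as a signed integer, an opcode with the top bit set, such as &amp;lt;code&amp;gt;0xB0&amp;lt;/code&amp;gt;, yields a negative factor in this model.&lt;br /&gt;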
&lt;br /&gt;
* Normal loading of a constant which can't be reused as opcode&lt;br /&gt;
Although the ultimate goal is to not spend even a single extra byte on constants, sometimes the required sequence simply does not appear in the code. In this case, a constant is needed to convert the angle from the range [-pi,pi] to the color space in such a way that no gaps appear while stepping from 359° to 0°. In the last line at &amp;lt;code&amp;gt;val:&amp;lt;/code&amp;gt; the value &amp;lt;code&amp;gt;160&amp;lt;/code&amp;gt; in the comment is 32 * 5, where 5 is the number of &amp;quot;spiral arms&amp;quot; the tunnel effect has. The 8bit shift (*256) is there to increase precision. It turns out that 16bit precision is enough to get a decent &amp;quot;gap closer&amp;quot; for values obtained by &amp;lt;code&amp;gt;fpatan&amp;lt;/code&amp;gt;.&lt;br /&gt;
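&lt;br /&gt;
The arithmetic behind the constant can be checked with a small Python sketch. It recomputes the value from the formula in the comment at &amp;lt;code&amp;gt;val:&amp;lt;/code&amp;gt; and then compares both ends of the angle range. It assumes that only the top byte of the scaled 16 bit angle reaches the color calculation (this is how the *256 is read here) and that the color pattern repeats every 32 steps. The function names are made up for illustration.&lt;br /&gt;

```python
import math

ARMS = 5      # number of spiral arms, from the text
PERIOD = 32   # color period per arm (160 = 32 * 5)
SCALE = 256   # 8 bit shift for extra precision


def angle_constant(arms=ARMS, period=PERIOD, scale=SCALE):
    """n = arms * period * scale / pi / 2, rounded to an integer."""
    return round(arms * period * scale / math.pi / 2)


def high_byte(angle, val):
    """Top byte of the 16 bit word round(angle * val), as an unsigned value."""
    word = round(angle * val) % 0x10000
    return word // 256


if __name__ == '__main__':
    val = angle_constant()
    print(val, hex(val))
    # Assumption: only the value modulo PERIOD matters for the color.
    print(high_byte(-math.pi, val) % PERIOD, high_byte(math.pi, val) % PERIOD)
```

In this model the two ends of the range differ by 160 steps, a multiple of 32, which is why the pattern closes without a visible seam.&lt;br /&gt;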
&lt;br /&gt;
* Operating directly on an indirect memory location without offset&lt;br /&gt;
The location &amp;lt;code&amp;gt;[bp+si]&amp;lt;/code&amp;gt; is used as the animation variable, while both registers involved are kept fixed (the resulting address is &amp;lt;code&amp;gt;0x0A??&amp;lt;/code&amp;gt; and therefore way above our code). Since we work with 16bit values whose top 8 bits are the measurement in pixels, the instruction &amp;lt;code&amp;gt;add word [bp+si], byte 23&amp;lt;/code&amp;gt; allows for sub pixel precision in animation, while occupying 3 bytes of space. Depending on the target hardware, the value 23 can be increased or decreased to achieve faster or smoother animation.&lt;br /&gt;
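&lt;br /&gt;
The sub pixel behaviour can be sketched in Python as a plain 8.8 fixed point accumulator. This is only a model of the 16 bit word at &amp;lt;code&amp;gt;[bp+si]&amp;lt;/code&amp;gt;: the start value of 0 and the names &amp;lt;code&amp;gt;advance&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;whole_pixels&amp;lt;/code&amp;gt; are assumptions made for illustration.&lt;br /&gt;

```python
STEP = 23  # value added by: add word [bp+si], byte 23


def advance(offset, step=STEP):
    """One addition to the 16 bit animation word (wraps at 0x10000)."""
    return (offset + step) % 0x10000


def whole_pixels(offset):
    """Top 8 bits of the word: the offset measured in pixels."""
    return offset // 256


if __name__ == '__main__':
    offset = 0  # hypothetical start value
    for count in range(1, 13):
        offset = advance(offset)
        print(count, offset, whole_pixels(offset))
```

With a step of 23 the pixel part first changes on the twelfth addition, since 11 * 23 = 253 is still below 256 while 12 * 23 = 276 is not.&lt;br /&gt;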
&lt;br /&gt;
* Optimizing the check for the next frame&lt;br /&gt;
Normally, a check like &amp;lt;code&amp;gt;test di,di&amp;lt;/code&amp;gt; followed directly by a branch is necessary. The approach used here allows branching directly after &amp;lt;code&amp;gt;mul di&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;jo&amp;lt;/code&amp;gt;, since the overflow flag is set for all but two values of &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; per frame. This saves two bytes, but also requires adjusting the animation constant, because it is now added twice per frame. A further benefit is that in one of these two cases &amp;lt;code&amp;gt;AX&amp;lt;/code&amp;gt; is zero, which saves a further byte on the following ESC check (&amp;lt;code&amp;gt;dec ax&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;dec al&amp;lt;/code&amp;gt;).&lt;br /&gt;
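&lt;br /&gt;
Which values of &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; fall through the &amp;lt;code&amp;gt;jo&amp;lt;/code&amp;gt; can be checked with a simplified Python model of the 16 bit &amp;lt;code&amp;gt;mul&amp;lt;/code&amp;gt;, which sets the overflow flag whenever the upper half of the product in &amp;lt;code&amp;gt;DX&amp;lt;/code&amp;gt; is nonzero. The names &amp;lt;code&amp;gt;mul16&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;fall_through_offsets&amp;lt;/code&amp;gt; are invented for this sketch.&lt;br /&gt;

```python
MAGIC = 0xCCCD  # multiplier loaded into AX before mul di


def mul16(ax, operand):
    """Model of the 16 bit mul: returns (AX, DX, OF) for DX:AX = ax * operand."""
    product = ax * operand
    dx, new_ax = divmod(product, 0x10000)
    return new_ax, dx, dx != 0


def fall_through_offsets():
    """All DI values for which jo is NOT taken, i.e. OF is clear after mul di."""
    return [di for di in range(0x10000) if not mul16(MAGIC, di)[2]]


if __name__ == '__main__':
    for di in fall_through_offsets():
        ax, dx, overflow = mul16(MAGIC, di)
        print(di, hex(ax), dx, overflow)
```

Under this model the &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; is reached only for DI = 0 and DI = 1, so the animation word advances by 2 * 23 = 46 per pass over the 64K segment, and only for DI = 0 is &amp;lt;code&amp;gt;AX&amp;lt;/code&amp;gt; zero before the keyboard check.&lt;br /&gt;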
&lt;br /&gt;
&lt;br /&gt;
Now if we abandon all the comfort, alignment, smoothness and convenience, and optimize this straight for size, we end up with a 52 byte version. The color tuning (2 bytes) could be dropped as well, but it is kept here; after all, the effect is supposed to look at least somewhat appealing ;)&lt;br /&gt;
&lt;br /&gt;
[[File:Neontube 52b optimized.png|thumb|Neontube 52b optimized]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov		al,0x13&lt;br /&gt;
int		0x10&lt;br /&gt;
X: &lt;br /&gt;
or		al, [bp+si]&lt;br /&gt;
xor		al, 0x68&lt;br /&gt;
mov		dx, 0x79F&lt;br /&gt;
pusha&lt;br /&gt;
fild 	word	[bx-9]		; x&lt;br /&gt;
fild 	word	[bx-8]		; y x&lt;br /&gt;
fpatan						; arc&lt;br /&gt;
fst 	st1					; arc arc&lt;br /&gt;
fcos						; cos(arc) arc&lt;br /&gt;
fimul	dword	[si]		; l*cos(arc) arc&lt;br /&gt;
fidiv	word	[bx-8]		; l*cos(arc)/x arc&lt;br /&gt;
fistp	dword	[bx-4]		; arc&lt;br /&gt;
fimul	word	[bx]		; scarc&lt;br /&gt;
fistp	word	[bx-5]		; -&lt;br /&gt;
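popa						; pop all registers from stack&lt;br /&gt;
sub		ah, [bp+si]			; subtract the byte at [bp+si] (animation variable) from ah&lt;br /&gt;
xor		al, ah				; XOR scaled_arc with distance&lt;br /&gt;
and		al, 16 + 8 + 4		; sub selecting palette part&lt;br /&gt;
stosb						; writing to screen&lt;br /&gt;
mov		ax, 0xCCCD			; Performing the famous&lt;br /&gt;
mul		di					; Rrrola trick&lt;br /&gt;
jmp 	short X-1			; loop back to the operand byte of int 0x10 (one byte before X), so the bytes that follow decode differently than on the first pass&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;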
&lt;br /&gt;
''to be continued''&lt;/div&gt;</summary>
		<author><name>Trixter</name></author>	</entry>

	</feed>