<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://www.sizecoding.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=HellMood</id>
		<title>SizeCoding - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="http://www.sizecoding.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=HellMood"/>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/wiki/Special:Contributions/HellMood"/>
		<updated>2026-05-03T10:45:55Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.27.0</generator>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Resources&amp;diff=1625</id>
		<title>Resources</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Resources&amp;diff=1625"/>
				<updated>2025-02-23T17:12:40Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: The HardCode update now contains releases up to 2025; description changed accordingly&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Discussions ==&lt;br /&gt;
&lt;br /&gt;
Pouet's [http://www.pouet.net/topic.php?which=8791&amp;amp;page=1 Tiny Intro Toolbox Thread] was one of sizecoding.org's main inspirations.&lt;br /&gt;
&lt;br /&gt;
We also have an active Discord server dedicated to sizecoding at: https://discord.gg/pZE5rAQrHx&lt;br /&gt;
&lt;br /&gt;
== Seminars ==&lt;br /&gt;
&lt;br /&gt;
Seminars on various sizecoding topics have been presented at the Lovebyte demoparty, and can be watched here: [https://www.youtube.com/playlist?list=PLr4JCYhdWe8mq3aDvAtL0ooO7hAC9gU2s Lovebyte seminars on YouTube]&lt;br /&gt;
&lt;br /&gt;
In recent years 256b intro coding has been covered in presentation sessions at other demo parties. The recordings are available on YouTube. There are &lt;br /&gt;
[https://www.youtube.com/watch?v=WbWYBE9vjdk part 1] and [https://www.youtube.com/watch?v=51BOp5Pi4mU part 2] of Řrřola's talk &amp;quot;Magic of 256-byte x86 intros&amp;quot; held at Demobit 2017 and 2018 respectively. A similar talk [https://www.youtube.com/watch?v=V_ntnQKemWA &amp;quot;256 byte intros on modern PCs&amp;quot;] was held by TomCat/Abaddon at the Function 2017 party. Supplemental files can be downloaded from [http://www.pouet.net/prod.php?which=72454 Pouet.net]. The presentations cover everything from setting up the coding environment to all kinds of code optimization, constant reuse, flag tricks, palette generation, and how to get from an idea to the final intro. TomCat also provides a [https://www.abaddon.hu/usbdos/ dev tools page] and his [https://www.abaddon.hu/256b/ SizeCoding Blog] with lots of optimization ideas.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
The Hugi demoscene diskmag has been kind to the sizecoding movement.  It has included [http://www.hugi.scene.org/online/coding/hugi%20se%204%20-%20index%20sorted%20by%20topic.htm many programming articles] since it started in 1996.  Additionally, the [http://www.hugi.scene.org/compo/compoold.htm Hugi Size Coding Competition Series] is the gold standard for learning how to sizecode:  Each competition set a specific task, and participants then submitted their results.  ([http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=13#pid13 All of the entries with their source code] are available, as are [http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=396#pid396 the rules for each challenge].)  Hugi #35 also provided a [http://hugi.scene.org/online/hugi35/hugi%2035%20-%20demoscene%20forum%20adok%20256b%20intros%20round-table.htm 256b intros round-table] between Baze, Digimind, &amp;amp;#344;r&amp;amp;#345;ola, Pirx, Optimus, and Seven.&lt;br /&gt;
&lt;br /&gt;
fysnet.net has an extensive [http://www.fysnet.net/yourhelp.htm list of DOS .COM register starting values].  If your environment isn't listed, [http://www.fysnet.net/yourhelp.com a program] is also available that can print out all of the register starting values.  See also [https://web.archive.org/web/20170418182707/http://pferrie.host22.com/misc/lowlevel12.htm DOS Register Starting Values] to understand why the register starting values for .COM files are set the way that they are.&lt;br /&gt;
&lt;br /&gt;
The [http://sparksandflames.com/files/x86InstructionChart.html Sparks And Flames x86 Instruction Chart] is awesome once you can &amp;quot;read&amp;quot; it. Find out what the bright green (1-byte) instructions do, and use them. For extreme sizecoding (&amp;lt;=32 bytes) it's also very useful to have the hex values in sight.&lt;br /&gt;
&lt;br /&gt;
The [http://ref.x86asm.net/geek.html Geek Edition x86asm-ref] is comprehensive and dense.  It might be the only reference you'll need.  The [http://ref.x86asm.net/geek.html#modrm_byte_16 last part] covers ModR/M bytes, which is useful.&lt;br /&gt;
&lt;br /&gt;
[https://www.felixcloutier.com/x86/ Yet another reference] by Félix Cloutier.&lt;br /&gt;
&lt;br /&gt;
[http://www.techhelpmanual.com/2-main_menu.html Tech Help Manual] includes a dense reference of DOS interrupts and x86 instructions.&lt;br /&gt;
&lt;br /&gt;
[http://c9x.me/x86/ c9x.me]: An x86 instruction reference&lt;br /&gt;
&lt;br /&gt;
[https://kernfunny.org/x86/ Rene Jeschke's preservation of siyobik.com's x86 Documentation] is one of the most compact and understandable references to the x86 instruction set you'll ever see.&lt;br /&gt;
&lt;br /&gt;
Intel IA-32 Software Developer's Manual Instruction Set Reference [http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/whgdata/whlstt2.htm hypertext version].&lt;br /&gt;
&lt;br /&gt;
[http://www.rcollins.org/secrets/OpCodes.html Robert Collins' Undocumented Opcodes] page has some gems you can exploit depending on your target environment.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/PC_Speaker PC Speaker Basics] For everything from simple bleeps to sophisticated synthesizers.&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix] : what's in front of our .COM code, and why.&lt;br /&gt;
&lt;br /&gt;
Never underestimate [http://www.hugi.scene.org/online/coding/hugi%2017%20-%20coaax.htm The Hidden Power of BCD Instructions].&lt;br /&gt;
&lt;br /&gt;
[http://www.agner.org/optimize/optimizing_assembly.pdf An optimization guide for x86 platforms by Agner Fog] Chapter 10: Optimizing for size.&lt;br /&gt;
&lt;br /&gt;
[http://home.sch.bme.hu/~ervin/codegems.html Code Gems by Ervin/AbaddoN] Nice collection based on the Imphobia diskmagazine article.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/Programmable_Interval_Timer Programmable Interval Timer]: Timing is everything, if you're targeting a slow system. Very old but good text [http://www.phatcode.net/res/246/files/pctim003.txt PCTIM003.TXT]&lt;br /&gt;
&lt;br /&gt;
[http://stanislavs.org/helppc/bios_data_area.html BIOS data area]: Direct system communication overview.  There are environment values here you may be able to utilize.&lt;br /&gt;
&lt;br /&gt;
[https://www.midi.org/specifications/item/table-1-summary-of-midi-message Summary of MIDI messages] : Summary of all basic MIDI messages.&lt;br /&gt;
&lt;br /&gt;
[http://www.voidaudio.net/percussion.html Roland GS/GM2 Drum Set List and Note Map] : Detailed description of all MIDI drum kits.&lt;br /&gt;
&lt;br /&gt;
[http://www.h-schmidt.net/FloatConverter/IEEE754.html Online IEEE 754 Converter] : Excellent tool for optimizing/reusing FPU constants&lt;br /&gt;
&lt;br /&gt;
[https://baseconvert.com/ieee-754-floating-point Base Convert IEEE 754] : Alternative tool for optimizing/reusing FPU constants with live update&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode Tinycode on Reddit]: Collection of many different size coding productions incl. x86 based ones with discussion (mostly provided by [https://www.reddit.com/user/Hell__Mood/ HellMood] himself)&lt;br /&gt;
&lt;br /&gt;
[https://marqueedesign.demoscene.com/blog Superogue's sizecoding microblog]: Collection of demoscene- and sizecoding-related articles and development writeups of [https://marqueedesign.demoscene.com various Marquee Design tiny intros]&lt;br /&gt;
&lt;br /&gt;
[https://www.xtof.info/Timing-on-PC-familly-under-DOS.html Timing on PC family under DOS]: A complete and very detailed guide to timing, timers, callbacks, and interrupts under MS-DOS&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Tinyprogs are so small that you sometimes don't need the source code to understand them:  Just pop them into a disassembler (like IDA) or a debugger and look at them.  If you want to grab specimens for study, here are some places to start.&lt;br /&gt;
&lt;br /&gt;
Pouet can be searched for the latest tinyprogs:  [http://www.pouet.net/prodlist.php?type%5B%5D=256b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 256b], [http://www.pouet.net/prodlist.php?type%5B%5D=128b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 128b], [http://www.pouet.net/prodlist.php?type%5B%5D=64b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 64b], and even [http://www.pouet.net/prodlist.php?type%5B%5D=32b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 32b and smaller].&lt;br /&gt;
&lt;br /&gt;
[http://hardcode.untergrund.net/ HardCode] is a repository of 7800 64KB and smaller intros.&lt;br /&gt;
&lt;br /&gt;
* [https://files.scene.org/get/demos/compilations/hardcode/hardcode.zip Scene.org link] to this archive&lt;br /&gt;
&lt;br /&gt;
* [https://files.scene.org/get/demos/compilations/hardcode/hc_update.zip Scene.org link] to an archive update from 2017 to 2025&lt;br /&gt;
&lt;br /&gt;
[http://256bytes.untergrund.net 256b.com] is a website dedicated primarily to 256b and smaller releases.&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Paint16b&amp;diff=1624</id>
		<title>Paint16b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Paint16b&amp;diff=1624"/>
				<updated>2025-02-17T16:26:16Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: added screenshot&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Tinyprog|title=paint16b|author=Hellmood|bsize=16|link=http://www.pouet.net/prod.php?which=63826}}&lt;br /&gt;
[[Category:Case Study]]&lt;br /&gt;
&lt;br /&gt;
[[File:Pa!nt 16b - Screenshot.jpg|frame|center|Pa!nt 16b - Screenshot]]&lt;br /&gt;
&lt;br /&gt;
paint16b is a mouse-driven drawing program that displays the mouse cursor and can exit back to DOS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
org 100h        ; code starts at 0x100&lt;br /&gt;
mov al,0x12     ; assume ah = 0 ; set graphics mode to 640*480&lt;br /&gt;
inc bx          ; assume bx = 0 ; set to 1 (show cursor)&lt;br /&gt;
mloop:	&lt;br /&gt;
int 0x10        ; first loop, switch to graphic mode&lt;br /&gt;
                ; further loops, set pixel		&lt;br /&gt;
xchg bx,ax      ; first loop, set AX to 1 (show cursor)&lt;br /&gt;
                ; further loops, restore old calling mode		&lt;br /&gt;
xor al,0x02     ; switch modes : show cursor &amp;lt;-&amp;gt; get mouse state&lt;br /&gt;
                ; updating XY every second loop plus drawing&lt;br /&gt;
                ; one pixel left results in thicker lines		&lt;br /&gt;
int 0x33        ; call the mouse interrupt&lt;br /&gt;
xchg bx,ax      ; store the button state in AL for drawing&lt;br /&gt;
                ; remember the current calling mode&lt;br /&gt;
                ; for switching it later (in BX)			&lt;br /&gt;
mov ah,0x0C     ; set mode to &amp;quot;set pixel&amp;quot;&lt;br /&gt;
loop mloop      ; dec CX -&amp;gt; draw one pixel left from cursor&lt;br /&gt;
                ; basically enables drawing pixels&lt;br /&gt;
                ; while the cursor is active&lt;br /&gt;
                ; allows exit if the mouse is leftmost&lt;br /&gt;
ret             ; assume [[FFFE]] = [0] = CD20 = int 20h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=File:Pa!nt_16b_-_Screenshot.jpg&amp;diff=1623</id>
		<title>File:Pa!nt 16b - Screenshot.jpg</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=File:Pa!nt_16b_-_Screenshot.jpg&amp;diff=1623"/>
				<updated>2025-02-17T16:25:50Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Pa!nt 16b - Screenshot&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=User:KathaleenHandy0&amp;diff=1614</id>
		<title>User:KathaleenHandy0</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=User:KathaleenHandy0&amp;diff=1614"/>
				<updated>2024-08-02T08:06:25Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: Blanked the page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Resources&amp;diff=1574</id>
		<title>Resources</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Resources&amp;diff=1574"/>
				<updated>2024-04-16T20:06:29Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Discussions ==&lt;br /&gt;
&lt;br /&gt;
Pouet's [http://www.pouet.net/topic.php?which=8791&amp;amp;page=1 Tiny Intro Toolbox Thread] was one of sizecoding.org's main inspirations.&lt;br /&gt;
&lt;br /&gt;
We also have an active Discord server dedicated to sizecoding at: https://discord.gg/pZE5rAQrHx&lt;br /&gt;
&lt;br /&gt;
== Seminars ==&lt;br /&gt;
&lt;br /&gt;
Seminars on various sizecoding topics have been presented at the Lovebyte demoparty, and can be watched here: [https://www.youtube.com/playlist?list=PLr4JCYhdWe8mq3aDvAtL0ooO7hAC9gU2s Lovebyte seminars on YouTube]&lt;br /&gt;
&lt;br /&gt;
In recent years 256b intro coding has been covered in presentation sessions at other demo parties. The recordings are available on YouTube. There are &lt;br /&gt;
[https://www.youtube.com/watch?v=WbWYBE9vjdk part 1] and [https://www.youtube.com/watch?v=51BOp5Pi4mU part 2] of Řrřola's talk &amp;quot;Magic of 256-byte x86 intros&amp;quot; held at Demobit 2017 and 2018 respectively. A similar talk [https://www.youtube.com/watch?v=V_ntnQKemWA &amp;quot;256 byte intros on modern PCs&amp;quot;] was held by TomCat/Abaddon at the Function 2017 party. Supplemental files can be downloaded from [http://www.pouet.net/prod.php?which=72454 Pouet.net]. The presentations cover everything from setting up the coding environment to all kinds of code optimization, constant reuse, flag tricks, palette generation, and how to get from an idea to the final intro. TomCat also provides a [https://www.abaddon.hu/usbdos/ dev tools page] and his [https://www.abaddon.hu/256b/ SizeCoding Blog] with lots of optimization ideas.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
The Hugi demoscene diskmag has been kind to the sizecoding movement.  It has included [http://www.hugi.scene.org/online/coding/hugi%20se%204%20-%20index%20sorted%20by%20topic.htm many programming articles] since it started in 1996.  Additionally, the [http://www.hugi.scene.org/compo/compoold.htm Hugi Size Coding Competition Series] is the gold standard for learning how to sizecode:  Each competition set a specific task, and participants then submitted their results.  ([http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=13#pid13 All of the entries with their source code] are available, as are [http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=396#pid396 the rules for each challenge].)  Hugi #35 also provided a [http://hugi.scene.org/online/hugi35/hugi%2035%20-%20demoscene%20forum%20adok%20256b%20intros%20round-table.htm 256b intros round-table] between Baze, Digimind, &amp;amp;#344;r&amp;amp;#345;ola, Pirx, Optimus, and Seven.&lt;br /&gt;
&lt;br /&gt;
fysnet.net has an extensive [http://www.fysnet.net/yourhelp.htm list of DOS .COM register starting values].  If your environment isn't listed, [http://www.fysnet.net/yourhelp.com a program] is also available that can print out all of the register starting values.  See also [https://web.archive.org/web/20170418182707/http://pferrie.host22.com/misc/lowlevel12.htm DOS Register Starting Values] to understand why the register starting values for .COM files are set the way that they are.&lt;br /&gt;
&lt;br /&gt;
The [http://sparksandflames.com/files/x86InstructionChart.html Sparks And Flames x86 Instruction Chart] is awesome once you can &amp;quot;read&amp;quot; it. Find out what the bright green (1-byte) instructions do, and use them. For extreme sizecoding (&amp;lt;=32 bytes) it's also very useful to have the hex values in sight.&lt;br /&gt;
&lt;br /&gt;
The [http://ref.x86asm.net/geek.html Geek Edition x86asm-ref] is comprehensive and dense.  It might be the only reference you'll need.  The [http://ref.x86asm.net/geek.html#modrm_byte_16 last part] covers ModR/M bytes, which is useful.&lt;br /&gt;
&lt;br /&gt;
[https://www.felixcloutier.com/x86/ Yet another reference] by Félix Cloutier.&lt;br /&gt;
&lt;br /&gt;
[http://www.techhelpmanual.com/2-main_menu.html Tech Help Manual] includes a dense reference of DOS interrupts and x86 instructions.&lt;br /&gt;
&lt;br /&gt;
[http://c9x.me/x86/ c9x.me]: An x86 instruction reference&lt;br /&gt;
&lt;br /&gt;
[https://kernfunny.org/x86/ Rene Jeschke's preservation of siyobik.com's x86 Documentation] is one of the most compact and understandable references to the x86 instruction set you'll ever see.&lt;br /&gt;
&lt;br /&gt;
Intel IA-32 Software Developer's Manual Instruction Set Reference [http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/whgdata/whlstt2.htm hypertext version].&lt;br /&gt;
&lt;br /&gt;
[http://www.rcollins.org/secrets/OpCodes.html Robert Collins' Undocumented Opcodes] page has some gems you can exploit depending on your target environment.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/PC_Speaker PC Speaker Basics] For everything from simple bleeps to sophisticated synthesizers.&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix] : what's in front of our .COM code, and why.&lt;br /&gt;
&lt;br /&gt;
Never underestimate [http://www.hugi.scene.org/online/coding/hugi%2017%20-%20coaax.htm The Hidden Power of BCD Instructions].&lt;br /&gt;
&lt;br /&gt;
[http://www.agner.org/optimize/optimizing_assembly.pdf An optimization guide for x86 platforms by Agner Fog] Chapter 10: Optimizing for size.&lt;br /&gt;
&lt;br /&gt;
[http://home.sch.bme.hu/~ervin/codegems.html Code Gems by Ervin/AbaddoN] Nice collection based on the Imphobia diskmagazine article.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/Programmable_Interval_Timer Programmable Interval Timer]: Timing is everything, if you're targeting a slow system. Very old but good text [http://www.phatcode.net/res/246/files/pctim003.txt PCTIM003.TXT]&lt;br /&gt;
&lt;br /&gt;
[http://stanislavs.org/helppc/bios_data_area.html BIOS data area]: Direct system communication overview.  There are environment values here you may be able to utilize.&lt;br /&gt;
&lt;br /&gt;
[https://www.midi.org/specifications/item/table-1-summary-of-midi-message Summary of MIDI messages] : Summary of all basic MIDI messages.&lt;br /&gt;
&lt;br /&gt;
[http://www.voidaudio.net/percussion.html Roland GS/GM2 Drum Set List and Note Map] : Detailed description of all MIDI drum kits.&lt;br /&gt;
&lt;br /&gt;
[http://www.h-schmidt.net/FloatConverter/IEEE754.html Online IEEE 754 Converter] : Excellent tool for optimizing/reusing FPU constants&lt;br /&gt;
&lt;br /&gt;
[https://baseconvert.com/ieee-754-floating-point Base Convert IEEE 754] : Alternative tool for optimizing/reusing FPU constants with live update&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode Tinycode on Reddit]: Collection of many different size coding productions incl. x86 based ones with discussion (mostly provided by [https://www.reddit.com/user/Hell__Mood/ HellMood] himself)&lt;br /&gt;
&lt;br /&gt;
[https://marquee.revival-studios.com/blog Superogue's sizecoding microblog]: Collection of demoscene- and sizecoding-related articles and development writeups of [https://www.pouet.net/groups.php?which=14389 various Marquee Design tiny intros]&lt;br /&gt;
&lt;br /&gt;
[https://www.xtof.info/Timing-on-PC-familly-under-DOS.html Timing on PC family under DOS]: A complete and very detailed guide to timing, timers, callbacks, and interrupts under MS-DOS&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Tinyprogs are so small that you sometimes don't need the source code to understand them:  Just pop them into a disassembler (like IDA) or a debugger and look at them.  If you want to grab specimens for study, here are some places to start.&lt;br /&gt;
&lt;br /&gt;
Pouet can be searched for the latest tinyprogs:  [http://www.pouet.net/prodlist.php?type%5B%5D=256b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 256b], [http://www.pouet.net/prodlist.php?type%5B%5D=128b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 128b], [http://www.pouet.net/prodlist.php?type%5B%5D=64b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 64b], and even [http://www.pouet.net/prodlist.php?type%5B%5D=32b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 32b and smaller].&lt;br /&gt;
&lt;br /&gt;
[http://hardcode.untergrund.net/ HardCode] is a repository of 7800 64KB and smaller intros.&lt;br /&gt;
&lt;br /&gt;
* [https://files.scene.org/get/demos/compilations/hardcode/hardcode.zip Scene.org link] to this archive&lt;br /&gt;
&lt;br /&gt;
* [https://files.scene.org/get/demos/compilations/hardcode/hc_update.zip Scene.org link] to an archive update from 2017 to 2022&lt;br /&gt;
&lt;br /&gt;
[http://256bytes.untergrund.net 256b.com] is a website dedicated primarily to 256b and smaller releases.&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Resources&amp;diff=1187</id>
		<title>Resources</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Resources&amp;diff=1187"/>
				<updated>2022-09-07T17:33:05Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Discussions ==&lt;br /&gt;
&lt;br /&gt;
Pouet's [http://www.pouet.net/topic.php?which=8791&amp;amp;page=1 Tiny Intro Toolbox Thread] was one of sizecoding.org's main inspirations.&lt;br /&gt;
&lt;br /&gt;
We also have an active Discord server dedicated to sizecoding at: https://discord.gg/pZE5rAQrHx&lt;br /&gt;
&lt;br /&gt;
== Seminars ==&lt;br /&gt;
&lt;br /&gt;
Seminars on various sizecoding topics have been presented at the Lovebyte demoparty, and can be watched here: [https://www.youtube.com/playlist?list=PLr4JCYhdWe8mq3aDvAtL0ooO7hAC9gU2s Lovebyte seminars on YouTube]&lt;br /&gt;
&lt;br /&gt;
In recent years 256b intro coding has been covered in presentation sessions at other demo parties. The recordings are available on YouTube. There are &lt;br /&gt;
[https://www.youtube.com/watch?v=WbWYBE9vjdk part 1] and [https://www.youtube.com/watch?v=51BOp5Pi4mU part 2] of Řrřola's talk &amp;quot;Magic of 256-byte x86 intros&amp;quot; held at Demobit 2017 and 2018 respectively. A similar talk [https://www.youtube.com/watch?v=V_ntnQKemWA &amp;quot;256 byte intros on modern PCs&amp;quot;] was held by TomCat/Abaddon at the Function 2017 party. Supplemental files can be downloaded from [http://www.pouet.net/prod.php?which=72454 Pouet.net]. The presentations cover everything from setting up the coding environment to all kinds of code optimization, constant reuse, flag tricks, palette generation, and how to get from an idea to the final intro. TomCat also provides a [https://www.abaddon.hu/usbdos/ dev tools page] and his [https://www.abaddon.hu/256b/ SizeCoding Blog] with lots of optimization ideas.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
The Hugi demoscene diskmag has been kind to the sizecoding movement.  It has included [http://www.hugi.scene.org/online/coding/hugi%20se%204%20-%20index%20sorted%20by%20topic.htm many programming articles] since it started in 1996.  Additionally, the [http://www.hugi.scene.org/compo/compoold.htm Hugi Size Coding Competition Series] is the gold standard for learning how to sizecode:  Each competition set a specific task, and participants then submitted their results.  ([http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=13#pid13 All of the entries with their source code] are available, as are [http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=396#pid396 the rules for each challenge].)  Hugi #35 also provided a [http://hugi.scene.org/online/hugi35/hugi%2035%20-%20demoscene%20forum%20adok%20256b%20intros%20round-table.htm 256b intros round-table] between Baze, Digimind, &amp;amp;#344;r&amp;amp;#345;ola, Pirx, Optimus, and Seven.&lt;br /&gt;
&lt;br /&gt;
fysnet.net has an extensive [http://www.fysnet.net/yourhelp.htm list of DOS .COM register starting values].  If your environment isn't listed, [http://www.fysnet.net/yourhelp.com a program] is also available that can print out all of the register starting values.  See also [https://web.archive.org/web/20170418182707/http://pferrie.host22.com/misc/lowlevel12.htm DOS Register Starting Values] to understand why the register starting values for .COM files are set the way that they are.&lt;br /&gt;
&lt;br /&gt;
The [http://sparksandflames.com/files/x86InstructionChart.html Sparks And Flames x86 Instruction Chart] is awesome once you can &amp;quot;read&amp;quot; it. Find out what the bright green (1-byte) instructions do, and use them. For extreme sizecoding (&amp;lt;=32 bytes) it's also very useful to have the hex values in sight.&lt;br /&gt;
&lt;br /&gt;
The [http://ref.x86asm.net/geek.html Geek Edition x86asm-ref] is comprehensive and dense.  It might be the only reference you'll need.  The [http://ref.x86asm.net/geek.html#modrm_byte_16 last part] covers ModR/M bytes, which is useful.&lt;br /&gt;
&lt;br /&gt;
[https://www.felixcloutier.com/x86/ Yet another reference] by Félix Cloutier.&lt;br /&gt;
&lt;br /&gt;
[http://www.techhelpmanual.com/2-main_menu.html Tech Help Manual] includes a dense reference of DOS interrupts and x86 instructions.&lt;br /&gt;
&lt;br /&gt;
[http://c9x.me/x86/ c9x.me]: An x86 instruction reference&lt;br /&gt;
&lt;br /&gt;
[https://kernfunny.org/x86/ Rene Jeschke's preservation of siyobik.com's x86 Documentation] is one of the most compact and understandable references to the x86 instruction set you'll ever see.&lt;br /&gt;
&lt;br /&gt;
Intel IA-32 Software Developer's Manual Instruction Set Reference [http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/whgdata/whlstt2.htm hypertext version].&lt;br /&gt;
&lt;br /&gt;
[http://www.rcollins.org/secrets/OpCodes.html Robert Collins' Undocumented Opcodes] page has some gems you can exploit depending on your target environment.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/PC_Speaker PC Speaker Basics] For everything from simple bleeps to sophisticated synthesizers.&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix] : what's in front of our .COM code, and why.&lt;br /&gt;
&lt;br /&gt;
Never underestimate [http://www.hugi.scene.org/online/coding/hugi%2017%20-%20coaax.htm The Hidden Power of BCD Instructions].&lt;br /&gt;
&lt;br /&gt;
[http://www.agner.org/optimize/optimizing_assembly.pdf An optimization guide for x86 platforms by Agner Fog] Chapter 10: Optimizing for size.&lt;br /&gt;
&lt;br /&gt;
[http://home.sch.bme.hu/~ervin/codegems.html Code Gems by Ervin/AbaddoN] Nice collection based on the Imphobia diskmagazine article.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/Programmable_Interval_Timer Programmable Interval Timer]: Timing is everything, if you're targeting a slow system. Very old but good text [http://www.phatcode.net/res/246/files/pctim003.txt PCTIM003.TXT]&lt;br /&gt;
&lt;br /&gt;
[http://stanislavs.org/helppc/bios_data_area.html BIOS data area]: Direct system communication overview.  There are environment values here you may be able to utilize.&lt;br /&gt;
&lt;br /&gt;
[https://www.midi.org/specifications/item/table-1-summary-of-midi-message Summary of MIDI messages] : Summary of all basic MIDI messages.&lt;br /&gt;
&lt;br /&gt;
[http://www.voidaudio.net/percussion.html Roland GS/GM2 Drum Set List and Note Map] : Detailed description of all MIDI drum kits.&lt;br /&gt;
&lt;br /&gt;
[http://www.h-schmidt.net/FloatConverter/IEEE754.html Online IEEE 754 Converter] : Excellent tool for optimizing/reusing FPU constants&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode Tinycode on Reddit]: Collection of many different size coding productions incl. x86 based ones with discussion (mostly provided by [https://www.reddit.com/user/Hell__Mood/ HellMood] himself)&lt;br /&gt;
&lt;br /&gt;
[https://marquee.revival-studios.com/blog Superogue's sizecoding microblog]: Collection of demoscene- and sizecoding-related articles and development writeups of [https://www.pouet.net/groups.php?which=14389 various Marquee Design tiny intros]&lt;br /&gt;
&lt;br /&gt;
[https://www.xtof.info/Timing-on-PC-familly-under-DOS.html Timing on PC family under DOS]: A complete and very detailed guide to timing, timers, callbacks, and interrupts under MS-DOS&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Tinyprogs are so small that you sometimes don't need the source code to understand them:  Just pop them into a disassembler (like IDA) or a debugger and look at them.  If you want to grab specimens for study, here are some places to start.&lt;br /&gt;
&lt;br /&gt;
Pouet can be searched for the latest tinyprogs:  [http://www.pouet.net/prodlist.php?type%5B%5D=256b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 256b], [http://www.pouet.net/prodlist.php?type%5B%5D=128b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 128b], [http://www.pouet.net/prodlist.php?type%5B%5D=64b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 64b], and even [http://www.pouet.net/prodlist.php?type%5B%5D=32b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 32b and smaller].&lt;br /&gt;
&lt;br /&gt;
[http://hardcode.untergrund.net/ HardCode] is a repository of 7800 64KB and smaller intros.&lt;br /&gt;
&lt;br /&gt;
* [https://files.scene.org/get/demos/compilations/hardcode/hardcode.zip Scene.org link] to this archive&lt;br /&gt;
&lt;br /&gt;
* [https://files.scene.org/get/demos/compilations/hardcode/hc_update.zip Scene.org link] to an archive update from 2017 to 2022&lt;br /&gt;
&lt;br /&gt;
[http://256bytes.untergrund.net 256b.com] is a website dedicated primarily to 256b and smaller releases.&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Resources&amp;diff=1169</id>
		<title>Resources</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Resources&amp;diff=1169"/>
				<updated>2022-06-16T07:25:29Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: format changes to clarify relations&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Discussions ==&lt;br /&gt;
&lt;br /&gt;
Pouet's [http://www.pouet.net/topic.php?which=8791&amp;amp;page=1 Tiny Intro Toolbox Thread] was one of sizecoding.org's main inspirations.&lt;br /&gt;
&lt;br /&gt;
We also have an active discord server dedicated to sizecoding at: https://discord.gg/pZE5rAQrHx&lt;br /&gt;
&lt;br /&gt;
== Seminars ==&lt;br /&gt;
&lt;br /&gt;
Seminars on various sizecoding topics have been presented at the Lovebyte demoparty, and can be watched here: [https://www.youtube.com/playlist?list=PLr4JCYhdWe8mq3aDvAtL0ooO7hAC9gU2s Lovebyte seminars on YouTube]&lt;br /&gt;
&lt;br /&gt;
In recent years 256b intro coding has been covered in presentation sessions at other demo parties. The recordings are available on YouTube. There are &lt;br /&gt;
[https://www.youtube.com/watch?v=WbWYBE9vjdk part 1] and [https://www.youtube.com/watch?v=51BOp5Pi4mU part 2] of Řrřola's talk &amp;quot;Magic of 256-byte x86 intros&amp;quot; held at Demobit 2017 and 2018 respectively. A similar talk [https://www.youtube.com/watch?v=V_ntnQKemWA &amp;quot;256 byte intros on modern PCs&amp;quot;] was held by TomCat/Abaddon at the Function 2017 party. Supplemental files can be downloaded from [http://www.pouet.net/prod.php?which=72454 Pouet.net]. The presentations cover everything from setting up the coding environment to all kinds of code optimization, constant reuse, flag tricks, palette generation, and how to get from an idea to the final intro. TomCat also provides a [https://www.abaddon.hu/usbdos/ dev tools page] and his [https://www.abaddon.hu/256b/ SizeCoding Blog] with lots of optimization ideas.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
The Hugi demoscene diskmag has been kind to the sizecoding movement.  It has published [http://www.hugi.scene.org/online/coding/hugi%20se%204%20-%20index%20sorted%20by%20topic.htm many programming articles] since it started in 1996.  Additionally, the [http://www.hugi.scene.org/compo/compoold.htm Hugi Size Coding Competition Series] is the gold standard for learning how to sizecode:  Each competition set a specific task, then participants submitted their results.  ([http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=13#pid13 All of the entries with their source code] are available, as are [http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=396#pid396 the rules for each challenge].)  Hugi #35 also provided a [http://hugi.scene.org/online/hugi35/hugi%2035%20-%20demoscene%20forum%20adok%20256b%20intros%20round-table.htm 256b intros round-table] between Baze, Digimind, Řrřola, Pirx, Optimus, and Seven.&lt;br /&gt;
&lt;br /&gt;
fysnet.net has an extensive [http://www.fysnet.net/yourhelp.htm list of DOS .COM register starting values].  If your environment isn't listed, [http://www.fysnet.net/yourhelp.com a program] is also available that can print out all of the register starting values.  See also [https://web.archive.org/web/20170418182707/http://pferrie.host22.com/misc/lowlevel12.htm DOS Register Starting Values] to understand why the register starting values for .COM files are set the way that they are.&lt;br /&gt;
&lt;br /&gt;
The [http://sparksandflames.com/files/x86InstructionChart.html Sparks And Flames x86 Instruction Chart] is awesome, once you can &amp;quot;read&amp;quot; it. Find out what the bright green (1-byte) commands do, and use them. For the extreme sizecoding (&amp;lt;=32 bytes) it's also very useful to have the HEX value in sight.&lt;br /&gt;
&lt;br /&gt;
The [http://ref.x86asm.net/geek.html Geek Edition x86asm-ref] is comprehensive and dense.  It might be the only reference you'll need.  The [http://ref.x86asm.net/geek.html#modrm_byte_16 last part] covers modr/m bytes, which is useful.&lt;br /&gt;
&lt;br /&gt;
[https://www.felixcloutier.com/x86/ Yet another reference] by Félix Cloutier.&lt;br /&gt;
&lt;br /&gt;
[http://www.techhelpmanual.com/2-main_menu.html Tech Help Manual] includes a dense reference of DOS interrupts and x86 instructions.&lt;br /&gt;
&lt;br /&gt;
[http://c9x.me/x86/ c9x.me]: An x86 instruction reference&lt;br /&gt;
&lt;br /&gt;
[https://kernfunny.org/x86/ Rene Jeschke's preservation of siyobik.com's x86 Documentation] is one of the most compact and understandable references to the x86 instruction set you'll ever see.&lt;br /&gt;
&lt;br /&gt;
Intel IA-32 Software Developer's Manual Instruction Set Reference [http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/whgdata/whlstt2.htm hypertext version].&lt;br /&gt;
&lt;br /&gt;
[http://www.rcollins.org/secrets/OpCodes.html Robert Collins' Undocumented Opcodes] page has some gems you can exploit depending on your target environment.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/PC_Speaker PC Speaker Basics] For everything from simple bleeps to sophisticated synthesizers.&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix] : what's in front of our .COM code, and why.&lt;br /&gt;
&lt;br /&gt;
Never underestimate [http://www.hugi.scene.org/online/coding/hugi%2017%20-%20coaax.htm The Hidden Power of BCD Instructions].&lt;br /&gt;
&lt;br /&gt;
[http://www.agner.org/optimize/optimizing_assembly.pdf An optimization guide for x86 platforms By Agner Fog] Chapter 10: Optimizing for size.&lt;br /&gt;
&lt;br /&gt;
[http://home.sch.bme.hu/~ervin/codegems.html Code Gems by Ervin/AbaddoN] Nice collection based on the Imphobia diskmagazine article.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/Programmable_Interval_Timer Programmable Interval Timer]: Timing is everything, if you're targeting a slow system. Very old but good text [http://www.phatcode.net/res/246/files/pctim003.txt PCTIM003.TXT]&lt;br /&gt;
&lt;br /&gt;
[http://stanislavs.org/helppc/bios_data_area.html BIOS data area]: Direct system communication overview.  There are environment values here you may be able to utilize.&lt;br /&gt;
&lt;br /&gt;
[https://www.midi.org/specifications/item/table-1-summary-of-midi-message Summary of MIDI messages] : Summary of all basic MIDI messages.&lt;br /&gt;
&lt;br /&gt;
[http://www.voidaudio.net/percussion.html Roland GS/GM2 Drum Set List and Note Map] : Detailed Description of all MIDI drumkits.&lt;br /&gt;
&lt;br /&gt;
[http://www.h-schmidt.net/FloatConverter/IEEE754.html Online IEEE 754 Converter] : Excellent tool for optimizing/reusing FPU constants&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode Tinycode on Reddit]: Collection of many different size coding productions incl. x86 based ones with discussion (mostly provided by [https://www.reddit.com/user/Hell__Mood/ HellMood] himself)&lt;br /&gt;
&lt;br /&gt;
[https://marquee.revival-studios.com/blog Superogue's sizecoding microblog]: Collection of demoscene and sizecoding related articles and development writeups of [https://www.pouet.net/groups.php?which=14389 various Marquee Design tiny intros]&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Tinyprogs are so small that you sometimes don't need the source code to understand them:  Just pop them into a disassembler (like IDA) or a debugger and look at them.  If you want to grab specimens for study, here are some places to start.&lt;br /&gt;
&lt;br /&gt;
Pouet can be searched for the latest tinyprogs:  [http://www.pouet.net/prodlist.php?type%5B%5D=256b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 256b], [http://www.pouet.net/prodlist.php?type%5B%5D=128b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 128b], [http://www.pouet.net/prodlist.php?type%5B%5D=64b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 64b], and even [http://www.pouet.net/prodlist.php?type%5B%5D=32b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 32b and smaller].&lt;br /&gt;
&lt;br /&gt;
[http://hardcode.untergrund.net/ HardCode] is a repository of 7800 64KB and smaller intros.&lt;br /&gt;
&lt;br /&gt;
* [https://files.scene.org/get/demos/compilations/hardcode/hardcode.zip Scene.org link] to this archive&lt;br /&gt;
&lt;br /&gt;
* [https://files.scene.org/get/demos/compilations/hardcode/hc_update.zip Scene.org link] to an archive update from 2017 to 2022&lt;br /&gt;
&lt;br /&gt;
[http://256bytes.untergrund.net 256b.com] is a website dedicated primarily to 256b and smaller releases.&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Resources&amp;diff=1168</id>
		<title>Resources</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Resources&amp;diff=1168"/>
				<updated>2022-06-16T07:19:33Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: Added scene org links to HardCode archive and the HardCode archive update&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Discussions ==&lt;br /&gt;
&lt;br /&gt;
Pouet's [http://www.pouet.net/topic.php?which=8791&amp;amp;page=1 Tiny Intro Toolbox Thread] was one of sizecoding.org's main inspirations.&lt;br /&gt;
&lt;br /&gt;
We also have an active discord server dedicated to sizecoding at: https://discord.gg/pZE5rAQrHx&lt;br /&gt;
&lt;br /&gt;
== Seminars ==&lt;br /&gt;
&lt;br /&gt;
Seminars on various sizecoding topics have been presented at the Lovebyte demoparty, and can be watched here: [https://www.youtube.com/playlist?list=PLr4JCYhdWe8mq3aDvAtL0ooO7hAC9gU2s Lovebyte seminars on YouTube]&lt;br /&gt;
&lt;br /&gt;
In recent years 256b intro coding has been covered in presentation sessions at other demo parties. The recordings are available on YouTube. There are &lt;br /&gt;
[https://www.youtube.com/watch?v=WbWYBE9vjdk part 1] and [https://www.youtube.com/watch?v=51BOp5Pi4mU part 2] of Řrřola's talk &amp;quot;Magic of 256-byte x86 intros&amp;quot; held at Demobit 2017 and 2018 respectively. A similar talk [https://www.youtube.com/watch?v=V_ntnQKemWA &amp;quot;256 byte intros on modern PCs&amp;quot;] was held by TomCat/Abaddon at the Function 2017 party. Supplemental files can be downloaded from [http://www.pouet.net/prod.php?which=72454 Pouet.net]. The presentations cover everything from setting up the coding environment to all kinds of code optimization, constant reuse, flag tricks, palette generation, and how to get from an idea to the final intro. TomCat also provides a [https://www.abaddon.hu/usbdos/ dev tools page] and his [https://www.abaddon.hu/256b/ SizeCoding Blog] with lots of optimization ideas.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
The Hugi demoscene diskmag has been kind to the sizecoding movement.  It has published [http://www.hugi.scene.org/online/coding/hugi%20se%204%20-%20index%20sorted%20by%20topic.htm many programming articles] since it started in 1996.  Additionally, the [http://www.hugi.scene.org/compo/compoold.htm Hugi Size Coding Competition Series] is the gold standard for learning how to sizecode:  Each competition set a specific task, then participants submitted their results.  ([http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=13#pid13 All of the entries with their source code] are available, as are [http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=396#pid396 the rules for each challenge].)  Hugi #35 also provided a [http://hugi.scene.org/online/hugi35/hugi%2035%20-%20demoscene%20forum%20adok%20256b%20intros%20round-table.htm 256b intros round-table] between Baze, Digimind, Řrřola, Pirx, Optimus, and Seven.&lt;br /&gt;
&lt;br /&gt;
fysnet.net has an extensive [http://www.fysnet.net/yourhelp.htm list of DOS .COM register starting values].  If your environment isn't listed, [http://www.fysnet.net/yourhelp.com a program] is also available that can print out all of the register starting values.  See also [https://web.archive.org/web/20170418182707/http://pferrie.host22.com/misc/lowlevel12.htm DOS Register Starting Values] to understand why the register starting values for .COM files are set the way that they are.&lt;br /&gt;
&lt;br /&gt;
The [http://sparksandflames.com/files/x86InstructionChart.html Sparks And Flames x86 Instruction Chart] is awesome, once you can &amp;quot;read&amp;quot; it. Find out what the bright green (1-byte) commands do, and use them. For the extreme sizecoding (&amp;lt;=32 bytes) it's also very useful to have the HEX value in sight.&lt;br /&gt;
&lt;br /&gt;
The [http://ref.x86asm.net/geek.html Geek Edition x86asm-ref] is comprehensive and dense.  It might be the only reference you'll need.  The [http://ref.x86asm.net/geek.html#modrm_byte_16 last part] covers modr/m bytes, which is useful.&lt;br /&gt;
&lt;br /&gt;
[https://www.felixcloutier.com/x86/ Yet another reference] by Félix Cloutier.&lt;br /&gt;
&lt;br /&gt;
[http://www.techhelpmanual.com/2-main_menu.html Tech Help Manual] includes a dense reference of DOS interrupts and x86 instructions.&lt;br /&gt;
&lt;br /&gt;
[http://c9x.me/x86/ c9x.me]: An x86 instruction reference&lt;br /&gt;
&lt;br /&gt;
[https://kernfunny.org/x86/ Rene Jeschke's preservation of siyobik.com's x86 Documentation] is one of the most compact and understandable references to the x86 instruction set you'll ever see.&lt;br /&gt;
&lt;br /&gt;
Intel IA-32 Software Developer's Manual Instruction Set Reference [http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/whgdata/whlstt2.htm hypertext version].&lt;br /&gt;
&lt;br /&gt;
[http://www.rcollins.org/secrets/OpCodes.html Robert Collins' Undocumented Opcodes] page has some gems you can exploit depending on your target environment.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/PC_Speaker PC Speaker Basics] For everything from simple bleeps to sophisticated synthesizers.&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix] : what's in front of our .COM code, and why.&lt;br /&gt;
&lt;br /&gt;
Never underestimate [http://www.hugi.scene.org/online/coding/hugi%2017%20-%20coaax.htm The Hidden Power of BCD Instructions].&lt;br /&gt;
&lt;br /&gt;
[http://www.agner.org/optimize/optimizing_assembly.pdf An optimization guide for x86 platforms By Agner Fog] Chapter 10: Optimizing for size.&lt;br /&gt;
&lt;br /&gt;
[http://home.sch.bme.hu/~ervin/codegems.html Code Gems by Ervin/AbaddoN] Nice collection based on the Imphobia diskmagazine article.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/Programmable_Interval_Timer Programmable Interval Timer]: Timing is everything, if you're targeting a slow system. Very old but good text [http://www.phatcode.net/res/246/files/pctim003.txt PCTIM003.TXT]&lt;br /&gt;
&lt;br /&gt;
[http://stanislavs.org/helppc/bios_data_area.html BIOS data area]: Direct system communication overview.  There are environment values here you may be able to utilize.&lt;br /&gt;
&lt;br /&gt;
[https://www.midi.org/specifications/item/table-1-summary-of-midi-message Summary of MIDI messages] : Summary of all basic MIDI messages.&lt;br /&gt;
&lt;br /&gt;
[http://www.voidaudio.net/percussion.html Roland GS/GM2 Drum Set List and Note Map] : Detailed Description of all MIDI drumkits.&lt;br /&gt;
&lt;br /&gt;
[http://www.h-schmidt.net/FloatConverter/IEEE754.html Online IEEE 754 Converter] : Excellent tool for optimizing/reusing FPU constants&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode Tinycode on Reddit]: Collection of many different size coding productions incl. x86 based ones with discussion (mostly provided by [https://www.reddit.com/user/Hell__Mood/ HellMood] himself)&lt;br /&gt;
&lt;br /&gt;
[https://marquee.revival-studios.com/blog Superogue's sizecoding microblog]: Collection of demoscene and sizecoding related articles and development writeups of [https://www.pouet.net/groups.php?which=14389 various Marquee Design tiny intros]&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Tinyprogs are so small that you sometimes don't need the source code to understand them:  Just pop them into a disassembler (like IDA) or a debugger and look at them.  If you want to grab specimens for study, here are some places to start.&lt;br /&gt;
&lt;br /&gt;
Pouet can be searched for the latest tinyprogs:  [http://www.pouet.net/prodlist.php?type%5B%5D=256b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 256b], [http://www.pouet.net/prodlist.php?type%5B%5D=128b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 128b], [http://www.pouet.net/prodlist.php?type%5B%5D=64b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 64b], and even [http://www.pouet.net/prodlist.php?type%5B%5D=32b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 32b and smaller].&lt;br /&gt;
&lt;br /&gt;
[http://hardcode.untergrund.net/ HardCode] is a repository of 7800 64KB and smaller intros.&lt;br /&gt;
&lt;br /&gt;
[https://files.scene.org/get/demos/compilations/hardcode/hardcode.zip Scene.org link] to this archive&lt;br /&gt;
&lt;br /&gt;
[https://files.scene.org/get/demos/compilations/hardcode/hc_update.zip Scene.org link] to an archive update from 2017 to 2022&lt;br /&gt;
&lt;br /&gt;
[http://256bytes.untergrund.net 256b.com] is a website dedicated primarily to 256b and smaller releases.&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Resources&amp;diff=1165</id>
		<title>Resources</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Resources&amp;diff=1165"/>
				<updated>2022-06-10T14:58:57Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: updated discord link, unlimited duration&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Discussions ==&lt;br /&gt;
&lt;br /&gt;
Pouet's [http://www.pouet.net/topic.php?which=8791&amp;amp;page=1 Tiny Intro Toolbox Thread] was one of sizecoding.org's main inspirations.&lt;br /&gt;
&lt;br /&gt;
We also have an active discord server dedicated to sizecoding at: https://discord.gg/pZE5rAQrHx&lt;br /&gt;
&lt;br /&gt;
== Seminars ==&lt;br /&gt;
&lt;br /&gt;
Seminars on various sizecoding topics have been presented at the Lovebyte demoparty, and can be watched here: [https://www.youtube.com/playlist?list=PLr4JCYhdWe8mq3aDvAtL0ooO7hAC9gU2s Lovebyte seminars on YouTube]&lt;br /&gt;
&lt;br /&gt;
In recent years 256b intro coding has been covered in presentation sessions at other demo parties. The recordings are available on YouTube. There are &lt;br /&gt;
[https://www.youtube.com/watch?v=WbWYBE9vjdk part 1] and [https://www.youtube.com/watch?v=51BOp5Pi4mU part 2] of Řrřola's talk &amp;quot;Magic of 256-byte x86 intros&amp;quot; held at Demobit 2017 and 2018 respectively. A similar talk [https://www.youtube.com/watch?v=V_ntnQKemWA &amp;quot;256 byte intros on modern PCs&amp;quot;] was held by TomCat/Abaddon at the Function 2017 party. Supplemental files can be downloaded from [http://www.pouet.net/prod.php?which=72454 Pouet.net]. The presentations cover everything from setting up the coding environment to all kinds of code optimization, constant reuse, flag tricks, palette generation, and how to get from an idea to the final intro. TomCat also provides a [https://www.abaddon.hu/usbdos/ dev tools page] and his [https://www.abaddon.hu/256b/ SizeCoding Blog] with lots of optimization ideas.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
The Hugi demoscene diskmag has been kind to the sizecoding movement.  It has published [http://www.hugi.scene.org/online/coding/hugi%20se%204%20-%20index%20sorted%20by%20topic.htm many programming articles] since it started in 1996.  Additionally, the [http://www.hugi.scene.org/compo/compoold.htm Hugi Size Coding Competition Series] is the gold standard for learning how to sizecode:  Each competition set a specific task, then participants submitted their results.  ([http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=13#pid13 All of the entries with their source code] are available, as are [http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=396#pid396 the rules for each challenge].)  Hugi #35 also provided a [http://hugi.scene.org/online/hugi35/hugi%2035%20-%20demoscene%20forum%20adok%20256b%20intros%20round-table.htm 256b intros round-table] between Baze, Digimind, Řrřola, Pirx, Optimus, and Seven.&lt;br /&gt;
&lt;br /&gt;
fysnet.net has an extensive [http://www.fysnet.net/yourhelp.htm list of DOS .COM register starting values].  If your environment isn't listed, [http://www.fysnet.net/yourhelp.com a program] is also available that can print out all of the register starting values.  See also [https://web.archive.org/web/20170418182707/http://pferrie.host22.com/misc/lowlevel12.htm DOS Register Starting Values] to understand why the register starting values for .COM files are set the way that they are.&lt;br /&gt;
&lt;br /&gt;
The [http://sparksandflames.com/files/x86InstructionChart.html Sparks And Flames x86 Instruction Chart] is awesome, once you can &amp;quot;read&amp;quot; it. Find out what the bright green (1-byte) commands do, and use them. For the extreme sizecoding (&amp;lt;=32 bytes) it's also very useful to have the HEX value in sight.&lt;br /&gt;
&lt;br /&gt;
The [http://ref.x86asm.net/geek.html Geek Edition x86asm-ref] is comprehensive and dense.  It might be the only reference you'll need.  The [http://ref.x86asm.net/geek.html#modrm_byte_16 last part] covers modr/m bytes, which is useful.&lt;br /&gt;
&lt;br /&gt;
[https://www.felixcloutier.com/x86/ Yet another reference] by Félix Cloutier.&lt;br /&gt;
&lt;br /&gt;
[http://www.techhelpmanual.com/2-main_menu.html Tech Help Manual] includes a dense reference of DOS interrupts and x86 instructions.&lt;br /&gt;
&lt;br /&gt;
[http://c9x.me/x86/ c9x.me]: An x86 instruction reference&lt;br /&gt;
&lt;br /&gt;
[https://kernfunny.org/x86/ Rene Jeschke's preservation of siyobik.com's x86 Documentation] is one of the most compact and understandable references to the x86 instruction set you'll ever see.&lt;br /&gt;
&lt;br /&gt;
Intel IA-32 Software Developer's Manual Instruction Set Reference [http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/whgdata/whlstt2.htm hypertext version].&lt;br /&gt;
&lt;br /&gt;
[http://www.rcollins.org/secrets/OpCodes.html Robert Collins' Undocumented Opcodes] page has some gems you can exploit depending on your target environment.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/PC_Speaker PC Speaker Basics] For everything from simple bleeps to sophisticated synthesizers.&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix] : what's in front of our .COM code, and why.&lt;br /&gt;
&lt;br /&gt;
Never underestimate [http://www.hugi.scene.org/online/coding/hugi%2017%20-%20coaax.htm The Hidden Power of BCD Instructions].&lt;br /&gt;
&lt;br /&gt;
[http://www.agner.org/optimize/optimizing_assembly.pdf An optimization guide for x86 platforms By Agner Fog] Chapter 10: Optimizing for size.&lt;br /&gt;
&lt;br /&gt;
[http://home.sch.bme.hu/~ervin/codegems.html Code Gems by Ervin/AbaddoN] Nice collection based on the Imphobia diskmagazine article.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/Programmable_Interval_Timer Programmable Interval Timer]: Timing is everything, if you're targeting a slow system. Very old but good text [http://www.phatcode.net/res/246/files/pctim003.txt PCTIM003.TXT]&lt;br /&gt;
&lt;br /&gt;
[http://stanislavs.org/helppc/bios_data_area.html BIOS data area]: Direct system communication overview.  There are environment values here you may be able to utilize.&lt;br /&gt;
&lt;br /&gt;
[https://www.midi.org/specifications/item/table-1-summary-of-midi-message Summary of MIDI messages] : Summary of all basic MIDI messages.&lt;br /&gt;
&lt;br /&gt;
[http://www.voidaudio.net/percussion.html Roland GS/GM2 Drum Set List and Note Map] : Detailed Description of all MIDI drumkits.&lt;br /&gt;
&lt;br /&gt;
[http://www.h-schmidt.net/FloatConverter/IEEE754.html Online IEEE 754 Converter] : Excellent tool for optimizing/reusing FPU constants&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode Tinycode on Reddit]: Collection of many different size coding productions incl. x86 based ones with discussion (mostly provided by [https://www.reddit.com/user/Hell__Mood/ HellMood] himself)&lt;br /&gt;
&lt;br /&gt;
[https://marquee.revival-studios.com/blog Superogue's sizecoding microblog]: Collection of demoscene and sizecoding related articles and development writeups of [https://www.pouet.net/groups.php?which=14389 various Marquee Design tiny intros]&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Tinyprogs are so small that you sometimes don't need the source code to understand them:  Just pop them into a disassembler (like IDA) or a debugger and look at them.  If you want to grab specimens for study, here are some places to start.&lt;br /&gt;
&lt;br /&gt;
Pouet can be searched for the latest tinyprogs:  [http://www.pouet.net/prodlist.php?type%5B%5D=256b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 256b], [http://www.pouet.net/prodlist.php?type%5B%5D=128b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 128b], [http://www.pouet.net/prodlist.php?type%5B%5D=64b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 64b], and even [http://www.pouet.net/prodlist.php?type%5B%5D=32b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 32b and smaller].&lt;br /&gt;
&lt;br /&gt;
[http://hardcode.untergrund.net/ HardCode] is a repository of 7800 64KB and smaller intros.&lt;br /&gt;
&lt;br /&gt;
[http://256bytes.untergrund.net 256b.com] is a website dedicated primarily to 256b and smaller releases.&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Output&amp;diff=1021</id>
		<title>Output</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Output&amp;diff=1021"/>
				<updated>2022-02-17T20:06:23Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Repeated piano notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Outputting to the screen ==&lt;br /&gt;
&lt;br /&gt;
First, be aware of the [http://img.tfd.com/cde/MEMMAP.GIF MSDOS memory layout]&lt;br /&gt;
&lt;br /&gt;
=== Outputting in Textmode (80x25) ===&lt;br /&gt;
&lt;br /&gt;
==== Hello World / High Level function ====&lt;br /&gt;
&lt;br /&gt;
Here's an obligatory &amp;quot;Hello World&amp;quot; program in text mode, using a [http://www.ctyme.com/intr/rb-2562.htm &amp;quot;high level&amp;quot; MS-DOS function]. With a small optimization already included (using &amp;lt;code&amp;gt;XCHG BP,AX&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;MOV AH,09h&amp;lt;/code&amp;gt;), this snippet is 20 bytes in size.&lt;br /&gt;
&lt;br /&gt;
[[File:Hello world.png|thumb|Hello World!]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
org 100h			; we start at CS:100h&lt;br /&gt;
xchg 	bp,ax		; already a trick, puts 09h into AH&lt;br /&gt;
mov		dx,text		; DX expects the address of a $ terminated string&lt;br /&gt;
int 	21h			; call the DOS function (AH = 09h)&lt;br /&gt;
ret					; quit&lt;br /&gt;
text:&lt;br /&gt;
db 'Hello World!$'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Of course, this gets shorter with each byte you remove from the text itself. Now let's look into arbitrary screen access. Right after the start of your program you are in mode 3, that is 80x25 characters in 16 colors (see the [http://www.columbia.edu/~em36/wpdos/videomodes.txt Video Modes List]). [[File:Drawchar example.png|thumb|draw char example]] To show something on the screen, set a segment register to 0xB800, then write values into this segment.&lt;br /&gt;
&lt;br /&gt;
==== Low level access ====&lt;br /&gt;
&lt;br /&gt;
The following three snippets show three different ways to draw a red smiley. All example snippets are meant to be standalone programs, starting with the first instruction and nothing before it. The target coordinate (40,12) is about the middle of the screen. We need a multiplier of 2 since each character cell occupies two bytes in memory (one for the character, one for the color). The high byte 0x04 means red (4) on black (0), while the low byte 0x01 is character code 1, a smiley.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop ds&lt;br /&gt;
mov bx,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
mov [bx],ax&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xb800&lt;br /&gt;
pop es&lt;br /&gt;
mov di,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
stosw&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push ss&lt;br /&gt;
push 0xb800&lt;br /&gt;
pop ss&lt;br /&gt;
mov sp,(80*12+40)*2&lt;br /&gt;
mov ax, 0x0401&lt;br /&gt;
push ax&lt;br /&gt;
pop ss&lt;br /&gt;
int 0x20&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You might notice that the ''push &amp;lt;word&amp;gt;'' + ''pop seg_reg'' combination is always the same and occupies four bytes altogether. If correct alignment is not important to you and you really just want ''any'' pointer to the screen, there is another way to get a valid one:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
les bx,[si]&lt;br /&gt;
nop&lt;br /&gt;
stosb&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That's also four bytes, but it already has the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; opcode (for putting something onto the screen) integrated and even one slot free for another one-byte instruction. It works because &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; initially points to the start of our code, so &amp;lt;code&amp;gt;les bx,[si]&amp;lt;/code&amp;gt; loads &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; from the third and fourth byte of the program itself: the &amp;lt;code&amp;gt;nop&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;90h&amp;lt;/code&amp;gt;) and the &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;, which has the hexadecimal representation &amp;lt;code&amp;gt;0AAh&amp;lt;/code&amp;gt;. After the first instruction, the segment register &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; therefore contains the value &amp;lt;code&amp;gt;0AA90h&amp;lt;/code&amp;gt;. If you repeatedly write something with &amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt; you will eventually reach the &amp;lt;code&amp;gt;0B800h&amp;lt;/code&amp;gt; segment and characters will appear on the screen. With a careful selection of the free one-byte opcode you can also reintroduce some alignment. This also works with the &amp;lt;code&amp;gt;stosw&amp;lt;/code&amp;gt; opcode &amp;lt;code&amp;gt;0ABh&amp;lt;/code&amp;gt;.&lt;br /&gt;
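&lt;br /&gt;
As a rough sketch of how this could look as a complete program (not taken from any particular intro), the following assumes the usual DOS .COM start value &amp;lt;code&amp;gt;SI = 0100h&amp;lt;/code&amp;gt; and a cleared direction flag. The label &amp;lt;code&amp;gt;fill&amp;lt;/code&amp;gt; and the value &amp;lt;code&amp;gt;01h&amp;lt;/code&amp;gt; in &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt; are illustrative choices only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
les bx,[si]       ; bytes C4 1C: BX = 1CC4h, ES = word at [si+2] = 0AA90h&lt;br /&gt;
nop               ; byte 90h, the low byte of the new ES value (the free slot)&lt;br /&gt;
fill: stosb       ; byte AAh, the high byte of the new ES value; stores AL at ES:DI&lt;br /&gt;
mov al,0x01       ; from the second pass on, every stored byte is 01h&lt;br /&gt;
jmp short fill    ; DI wraps around, so the whole 64 KB segment gets written&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Since &amp;lt;code&amp;gt;ES:0000&amp;lt;/code&amp;gt; corresponds to linear address &amp;lt;code&amp;gt;0AA900h&amp;lt;/code&amp;gt;, the text screen at &amp;lt;code&amp;gt;0B8000h&amp;lt;/code&amp;gt; is reached once &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; arrives at &amp;lt;code&amp;gt;0D700h&amp;lt;/code&amp;gt;. Character and attribute bytes are both set to &amp;lt;code&amp;gt;01h&amp;lt;/code&amp;gt; in this sketch, and the loop never exits.&lt;br /&gt;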
&lt;br /&gt;
==== Alternative high level functions  ====&lt;br /&gt;
&lt;br /&gt;
Besides accessing memory directly, there are other ways of getting characters onto the screen, for example:&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-4124.htm INT 29h]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2558.htm INT 21h AH=6]&lt;br /&gt;
* [http://www.ctyme.com/intr/rb-2562.htm INT 21h AH=9]&lt;br /&gt;
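&lt;br /&gt;
A minimal sketch of the first option, assuming only the documented interface of &amp;lt;code&amp;gt;INT 29h&amp;lt;/code&amp;gt; (character to print in &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt;); the character itself is an arbitrary choice:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov al,'A'        ; character to print&lt;br /&gt;
int 29h           ; DOS fast console output: prints AL at the cursor position&lt;br /&gt;
ret               ; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is 5 bytes in total and needs no segment register setup, at the price of giving up free positioning and color control.&lt;br /&gt;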
&lt;br /&gt;
=== Outputting in mode 13h (320x200) ===&lt;br /&gt;
&lt;br /&gt;
==== Basic pixel output ====&lt;br /&gt;
&lt;br /&gt;
The videomemory for mode 13h is located at segment 0xA000, so you need to assign this value to a segment register. Also, after the start of your program you are normally still in textmode, so you need to switch to the videomode. The following snippet does both:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13 &lt;br /&gt;
int 0x10     ; AH = 0 means : set video mode to AL = 0x13 (320 x 200 pixels in 256 colors)&lt;br /&gt;
push 0xA000  ; put value on the stack&lt;br /&gt;
pop es       ; pop the top stack value into segment register ES&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You're free to use any of the segment register / opcode combinations to write to the screen:&lt;br /&gt;
* &amp;lt;code&amp;gt;ES&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;stosb&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;DS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt;)&lt;br /&gt;
* &amp;lt;code&amp;gt;SS&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
Let's add some code that actually draws something on the screen. The following program occupies 23 bytes and draws a fullscreen XOR texture:&lt;br /&gt;
[[File:Mode13h-example-xor.png|left|bottom|thumb|mode13h-example-xor]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0x13&lt;br /&gt;
int 0x10&lt;br /&gt;
push 0xa000&lt;br /&gt;
pop es&lt;br /&gt;
X: cwd			; &amp;quot;clear&amp;quot; DX (works if AH &amp;lt; 0x80, i.e. AX is not negative)&lt;br /&gt;
mov ax,di		; get screen position into AX&lt;br /&gt;
mov bx,320		; get screen width into BX&lt;br /&gt;
div bx			; divide, to get row and column&lt;br /&gt;
xor ax,dx		; the famous XOR pattern&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
stosb			; finally, draw to the screen&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note that there is a different way of preparing the segment register, instead of :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
you can also do :&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov ah,0xA0&lt;br /&gt;
mov es,ax&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Both variations occupy 4 bytes, but the latter also runs on processors where ''push &amp;lt;word&amp;gt;'' is not available (the 8086 and 8088).&lt;br /&gt;
&lt;br /&gt;
==== Alternative way of pixel plotting and optimization ====&lt;br /&gt;
&lt;br /&gt;
Now let's optimize on the snippet. First, we can adapt the &amp;quot;LES&amp;quot; trick from the textmode section. We just exchange &lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;push 0xa000&lt;br /&gt;
pop es&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
with:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;les bx,[bx]&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
to save two bytes. This works because BX is 0x0000 at start, so the instruction accesses the region ''before'' our code, which is called the [https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix]. The two bytes that are put into the segment register ES are bytes 2 and 3 = ''&amp;quot;Segment of the first byte beyond the memory allocated to the program&amp;quot;'', which is usually 0x9FFF. That is just off by one from our desired 0xA000. Unfortunately that means a 16 pixel offset, so if screen alignment matters to you, you can't use this optimization. Also, said two bytes are not always 0x9FFF; for example, if resident programs are located above the ''&amp;quot;memory allocated to the program&amp;quot;'' (FreeDOS), their content is overwritten if we take their base as our video memory base.&lt;br /&gt;
&lt;br /&gt;
Second, we can use an alternative way of putting pixels to the screen, subfunction AH = 0x0C of int 0x10. Also, instead of constructing row and column from the screen pointer, we can use some interesting properties of the screenwidth regarding logical operations. This results in the following 16 byte program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             ; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov al,0x13&lt;br /&gt;
X: int 0x10		; set video mode AND draw pixel&lt;br /&gt;
inc cx			; increment column&lt;br /&gt;
mov ax,cx		; get column in AX&lt;br /&gt;
xor al,ah		; the famous XOR pattern&lt;br /&gt;
mov ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
and al,32+8		; a more interesting variation of it&lt;br /&gt;
jmp short X		; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first optimization is the double usage of the same &amp;quot;int 0x10&amp;quot; for setting the video mode and drawing the pixel. The subfunction AH = 0x0C expects row and column in DX and CX. Since the screen width is 320, which is 5 * 64, we can ignore the row and just work with the column, as long as we use logical operations and only use bits 0-5 of the result (64 = 2^6, so these bits repeat exactly five times per row). The subfunction AH = 0x0C allows for unbounded column values in CX (up to 65535) and correctly &amp;quot;wraps&amp;quot; them internally without an error.&lt;br /&gt;
&lt;br /&gt;
The major drawback of the &amp;quot;subfunction AH = 0x0C&amp;quot; approach is performance loss. While DosBox and many emulators perform just fine, real hardware will draw much slower, depending on the video BIOS.&lt;br /&gt;
&lt;br /&gt;
==== Basic animation and user interaction ====&lt;br /&gt;
&lt;br /&gt;
Now let's add the convenient check for the ESC key and also add a simple animation. The &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; register is used as a frame counter and is incremented after the pixel counter &amp;lt;code&amp;gt;CX&amp;lt;/code&amp;gt; has run through all 65536 values via &amp;lt;code&amp;gt;LOOP&amp;lt;/code&amp;gt;. This frame counter is then added to the column. The resulting program is now 25 bytes in size:&lt;br /&gt;
&lt;br /&gt;
[[File:Xor anim example.gif|thumb]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;cwd             	; &amp;quot;clear&amp;quot; DX for perfect alignment&lt;br /&gt;
mov 	al,0x13&lt;br /&gt;
X: 		int 0x10	; set video mode AND draw pixel&lt;br /&gt;
mov 	ax,cx		; get column in AX&lt;br /&gt;
add		ax,di		; offset by framecounter&lt;br /&gt;
xor 	al,ah		; the famous XOR pattern&lt;br /&gt;
and 	al,32+8		; a more interesting variation of it&lt;br /&gt;
mov 	ah,0x0C		; set subfunction &amp;quot;set pixel&amp;quot; for int 0x10&lt;br /&gt;
loop 	X			; loop 65536 times&lt;br /&gt;
inc 	di			; increment framecounter&lt;br /&gt;
in 		al,0x60		; check keyboard ...&lt;br /&gt;
dec 	al			; ... for ESC&lt;br /&gt;
jnz 	X			; rinse and repeat&lt;br /&gt;
ret					; quit program&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
( ↑ This example is the blueprint in the [[Floating-point_Opcodes#FPU_Basics| FPU Basics Section]].)&lt;br /&gt;
&lt;br /&gt;
=== Using Custom Colors ===&lt;br /&gt;
&lt;br /&gt;
==== Shades of Hue within the Default VGA palette ====&lt;br /&gt;
&lt;br /&gt;
You might have noticed there is a bit of structure to [https://i.stack.imgur.com/OSZ6D.png the default VGA Palette], which you can exploit for some interesting results. Looking at the palette, there is a rainbow of different hue values starting at index 32, which is repeated with a slightly different luma every 72 indices. If you are okay with limiting the number of shades you need, you can get a small color ramp for all kinds of hue values by simply calculating your color index like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;color=((shade%3)*72)+32+huevalue&amp;lt;/code&amp;gt; &lt;br /&gt;
&lt;br /&gt;
For an example of how this looks for all kinds of hue values, see [https://www.pouet.net/prod.php?which=63520 Popcast] by Hellmood/Desire.&lt;br /&gt;
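&lt;br /&gt;
As a minimal sketch of this formula in assembly, assuming a hypothetical register assignment (shade in &amp;lt;code&amp;gt;AL&amp;lt;/code&amp;gt;, hue offset in &amp;lt;code&amp;gt;BL&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ES:DI&amp;lt;/code&amp;gt; pointing into mode 13h video memory), the color index could be computed like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aam 3			; AL = shade mod 3 (AH = shade / 3, not needed)&lt;br /&gt;
mov ah,72		; distance between two color ramps&lt;br /&gt;
mul ah			; AX = (shade mod 3) * 72, i.e. 0, 72 or 144&lt;br /&gt;
add al,32		; the hue rainbow starts at index 32&lt;br /&gt;
add al,bl		; add the hue offset (assumed to be in BL)&lt;br /&gt;
stosb			; plot the pixel (assumes ES:DI points to the screen)&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;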
&lt;br /&gt;
==== Setting a Custom Palette ====&lt;br /&gt;
&lt;br /&gt;
Sometimes, when the [https://i.stack.imgur.com/OSZ6D.png Default VGA Palette] doesn't quite match the look you are after, it can be useful to set your own palette using the VGA registers. The basic setup loop looks like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
palloop:&lt;br /&gt;
mov ax,cx&lt;br /&gt;
mov dx,0x3c8&lt;br /&gt;
out dx,al    ; select palette color&lt;br /&gt;
inc dx&lt;br /&gt;
out dx,al    ; write red value (0..63)&lt;br /&gt;
out dx,al    ; write green value (0..63)&lt;br /&gt;
out dx,al    ; write blue value (0..63)&lt;br /&gt;
loop palloop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
The above code sets a simple grayscale palette (assuming the CX register is 0 at the start) and is compatible with all DOS platforms.&lt;br /&gt;
In some cases you can omit the &amp;lt;code&amp;gt;mov dx,0x3c8, out dx,al, inc dx&amp;lt;/code&amp;gt; and directly access the data register by just using &amp;lt;code&amp;gt;mov dx,0x3c9&amp;lt;/code&amp;gt; instead.&lt;br /&gt;
&lt;br /&gt;
To get more interesting colors than just grayscale, you can alter the value of the AL register in between setting the red, green and blue values, for example by shifting, adding, subtracting or performing logical operations. Just get creative and check if the result is sufficient for your use case.&lt;br /&gt;
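&lt;br /&gt;
As one possible illustration (a sketch, again assuming CX is 0 at the start), the following variation of the loop scales the index down to the 6-bit range and halves the value for each further component. This should give a ramp from black to an orange tint instead of grayscale:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
palloop:&lt;br /&gt;
mov ax,cx&lt;br /&gt;
mov dx,0x3c8&lt;br /&gt;
out dx,al    ; select palette color&lt;br /&gt;
inc dx&lt;br /&gt;
shr al,2     ; scale the index 0..255 down to 0..63&lt;br /&gt;
out dx,al    ; red   = index / 4&lt;br /&gt;
shr al,1&lt;br /&gt;
out dx,al    ; green = index / 8&lt;br /&gt;
shr al,1&lt;br /&gt;
out dx,al    ; blue  = index / 16&lt;br /&gt;
loop palloop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;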
&lt;br /&gt;
TomCat shows the most common color palettes grouped by functionality in his article: [https://abaddon.hu/256b/colors.html Colors (in tiny intros)]&lt;br /&gt;
&lt;br /&gt;
== Producing sound ==&lt;br /&gt;
&lt;br /&gt;
=== MIDI notes ===&lt;br /&gt;
&lt;br /&gt;
Creating sounds with MIDI requires a bit more preparation, but once you're familiar with it, it's even simpler than PC Speaker sound, because you basically don't have to ''create'' the sound, you just have to ''trigger'' it. To start, you have to know that there are a lot of [https://en.wikipedia.org/wiki/General_MIDI different instruments] and a [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined way of communication]. Imagine the MIDI interface as a keyboard: you tell it which button/key you want to press, which knob to twist, and sometimes how hard. By default, the active instrument is the [https://en.wikipedia.org/wiki/Piano Acoustic Grand Piano].&lt;br /&gt;
&lt;br /&gt;
==== Single piano note ====&lt;br /&gt;
&lt;br /&gt;
Let's start off with a simple example, playing a single note on the piano:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh		;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h	;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 		;	send !&lt;br /&gt;
dec 	dx			;	MIDI Data Port ( = 330h )&lt;br /&gt;
mov 	al,	90h		;	send note on channel ZERO - command&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
mov 	al,	56h		;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al		; 	send !&lt;br /&gt;
mov 	al,	67h		;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al		;	send !&lt;br /&gt;
ret					; 	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In short: you turn your keyboard on (switching to UART mode), then press a KEY with a certain VOLUME on channel ZERO, then exit. Besides switching to UART mode, all this communication uses the port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt;. This example will work on DosBox but not on Windows XP NTVDM: for [http://www.pouet.net/topic.php?which=10720&amp;amp;page=1 still unclear reasons], the NTVDM emulation delays the note until it receives a second one. The simplest way of at least hearing something is to repeatedly play notes, like in the following example :&lt;br /&gt;
&lt;br /&gt;
==== Repeated piano notes ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov 	al,	3Fh			;	set UART mode - command&lt;br /&gt;
mov 	dx,	331h		;	MIDI Control Port&lt;br /&gt;
out 	dx,	al 			;	send !&lt;br /&gt;
dec 	dx				;	MIDI Data Port ( = 330h )&lt;br /&gt;
main:&lt;br /&gt;
mov 	al,	90h			;	send note on channel 0 - command&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
mov 	al,	56h			;	data byte 1 : KEY = 56h&lt;br /&gt;
out 	dx,	al			; 	send !&lt;br /&gt;
mov 	al,	67h			;	data byte 2 : VOLUME = 67h&lt;br /&gt;
out 	dx,	al			;	send !&lt;br /&gt;
mov 	cl,5				;	wait for 5 interrupts (assumes CH = 0)&lt;br /&gt;
_wait:&lt;br /&gt;
hlt						;	sleep until the next interrupt (usually the timer)&lt;br /&gt;
loop _wait&lt;br /&gt;
in 		al,	0x60		;	check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			;	no? repeat&lt;br /&gt;
ret						;	quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example, enriched with synchronizing against the timer and checking for the ESC key. It works on both DosBox and Windows XP NTVDM and plays a note on the Piano repeatedly.&lt;br /&gt;
&lt;br /&gt;
==== Repeated notes of other instruments ====&lt;br /&gt;
&lt;br /&gt;
While hitting one key repeatedly is not really interesting in general, it can produce decent results when doing it with the right instrument activated, like it was done with the &amp;quot;French Horn&amp;quot; in [https://www.youtube.com/watch?v=tsbxdjP9l50 Timelord (by Baudsurfer)]. Apart from just changing the instrument, let's also optimize a little bit on the size: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
start:	&lt;br /&gt;
mov 	si,data			; init pointer for outsb&lt;br /&gt;
mov 	dx,330h			; change to data port&lt;br /&gt;
mov 	cl,5			; play our music data&lt;br /&gt;
rep 	outsb			; (see below at &amp;quot;data&amp;quot; label)&lt;br /&gt;
inc 	dx				; switch to control port&lt;br /&gt;
outsb					; change to mode &amp;quot;UART&amp;quot;&lt;br /&gt;
_wait:&lt;br /&gt;
mov 	al,[fs:0x46c]	; read timer value&lt;br /&gt;
cmp	 	al,bl			; wait until...&lt;br /&gt;
jz 		_wait			; ...timer value changed&lt;br /&gt;
xchg 	bx,ax			; save old timer value&lt;br /&gt;
in 		al,0x60			; check for ...&lt;br /&gt;
dec 	al				; ... ESC key&lt;br /&gt;
jnz 	start			; otherwise : repeat &lt;br /&gt;
dec 	dx				; switch to data port again&lt;br /&gt;
outsb					; stop all ... &lt;br /&gt;
outsb					; ... notes played ...&lt;br /&gt;
outsb					; ... on channel 3&lt;br /&gt;
data: 	&lt;br /&gt;
db		0c3h			; change instrument on channel 3&lt;br /&gt;
						; (is also &amp;quot;RET&amp;quot; for program quit)&lt;br /&gt;
db		60				; to &amp;quot;French Horn&amp;quot;&lt;br /&gt;
db		93h				; play note on channel 3&lt;br /&gt;
db		35				; deep &amp;quot;b&amp;quot; = note number 35&lt;br /&gt;
db		127				; play with volume = 127&lt;br /&gt;
db		3fh				; change mode to &amp;quot;UART&amp;quot;&lt;br /&gt;
db		0b3h			; control change on channel 3&lt;br /&gt;
db		123				; Channel Mode Message &amp;quot;All Notes Off&amp;quot;&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
↑ This is the previous example with a different instrument, the MIDI data structured into a data section, the output optimized by using &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;out dx,al&amp;lt;/code&amp;gt;, and a final special command to turn ''All Notes Off''. This is necessary for all instruments which don't stop by themselves. In all the previous examples, we sent the &amp;quot;NOTE ON&amp;quot; command (&amp;lt;code&amp;gt;9Xh&amp;lt;/code&amp;gt;), but not the corresponding &amp;quot;NOTE OFF&amp;quot; command (&amp;lt;code&amp;gt;8Xh&amp;lt;/code&amp;gt;). Also, the note is now played on channel &amp;lt;code&amp;gt;03h&amp;lt;/code&amp;gt;, since the command byte for changing the instrument on channel 3 is &amp;lt;code&amp;gt;0C3h&amp;lt;/code&amp;gt;, which is also the opcode of &amp;lt;code&amp;gt;RET&amp;lt;/code&amp;gt; and can be reused. If this looks complicated at first, always remember: it's just sending [https://www.midi.org/specifications/item/table-1-summary-of-midi-message defined commands] to a single port.&lt;br /&gt;
&lt;br /&gt;
==== The drum channel ====&lt;br /&gt;
&lt;br /&gt;
Now that you're aware that there are different channels (16 overall) to play notes on, how would you like a channel &amp;lt;code&amp;gt;09h&amp;lt;/code&amp;gt; specifically for 'Drums'? [http://www.voidaudio.net/percussion.html Ten different drumsets] with dozens of samples are available out of the box. By default, the &amp;quot;Standard Kit&amp;quot; is active. The following example plays a track of drum notes repeatedly, while further optimizing for size:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
aas						; 3fh = &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
cwd						; 99h = &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
db 		42,38,42,35		; the drum notes (42 = hihat, 38 = snare, 35 = kick)&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send &amp;quot;set UART mode&amp;quot;&lt;br /&gt;
dec 	dx				; switch to MIDI data port&lt;br /&gt;
outsb					; send &amp;quot;play note on drum channel&amp;quot; command&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
mov		al,[fs:0x46c]	; read timer&lt;br /&gt;
test	al,3&lt;br /&gt;
jnz 	main			; skip 3 values&lt;br /&gt;
inc 	byte [fs:0x46c] ; inc manually to prevent retrigger&lt;br /&gt;
&lt;br /&gt;
inc 	bx				; increment note counter&lt;br /&gt;
and 	bl,3			; truncate to 4 notes&lt;br /&gt;
mov 	al,[bx+si]		; read the drumnote (see above)&lt;br /&gt;
&lt;br /&gt;
out 	dx,al			; send the drum&lt;br /&gt;
mov 	al,127			; set volume to maximum&lt;br /&gt;
out 	dx,al			; send volume&lt;br /&gt;
&lt;br /&gt;
in 		al,0x60			; check for ESC&lt;br /&gt;
dec 	al				;&lt;br /&gt;
jnz 	main			; no? repeat&lt;br /&gt;
ret						; otherwise quit&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In contrast to the previous example, the data section is now at the start. That means it's executed as code! This is dangerous of course, but it also saves the bytes for assigning the &amp;lt;code&amp;gt;DATA&amp;lt;/code&amp;gt; offset to &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt;. Once the two initial &amp;lt;code&amp;gt;outsb&amp;lt;/code&amp;gt; instructions have incremented &amp;lt;code&amp;gt;SI&amp;lt;/code&amp;gt; twice, it stays fixed, and further reading from the drum data is done with &amp;lt;code&amp;gt;[BX+SI]&amp;lt;/code&amp;gt;. Unless you know exactly what you are doing, don't use that kind of &amp;quot;executing data&amp;quot; optimization! In this special case &amp;lt;code&amp;gt;AAS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;CWD&amp;lt;/code&amp;gt; do no harm, and the drum notes &amp;lt;code&amp;gt;42,38,42,35&amp;lt;/code&amp;gt; are carefully crafted and arranged to resemble the instruction &amp;lt;code&amp;gt;SUB AH,[232Ah]&amp;lt;/code&amp;gt;, which does no harm either.&lt;br /&gt;
&lt;br /&gt;
==== Further Midi instrument tuning by controllers and pitch ====&lt;br /&gt;
&lt;br /&gt;
If you are familiar with hardware synthesizers you'll definitely remember the typical pitch bend or modulation wheels beside the keys, usually two of them. Those are commonly assigned to a vibrato/tremolo effect and a +/-pitch to tune the played note. You can also use those functions in your intro code to affect the currently played midi instrument note.&lt;br /&gt;
&lt;br /&gt;
These parameters are accessed with the usual MIDI command sequence, as you can see here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,10110000b	;Controller command on Midi channel 0&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,00000001b	;0...127 data byte 1 =&amp;gt; '1' is the code for the modulation wheel typically assigned to vibrato/tremolo&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,01111111b	;0...127 data byte 2 =&amp;gt; e.g. '01111111' =&amp;gt; Maximum vibrato level&lt;br /&gt;
out dx,al&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In that example the maximum vibrato level is assigned to any instrument played on midi channel 0. This effect was used in the [http://www.mikusite.de/x86/cryscom.zip Crystal Comet 128 Byte intro by Kuemmel].&lt;br /&gt;
&lt;br /&gt;
For pitch bend the code would be like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,11100000b	;Pitch bend command on Midi channel 0&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,00000000b	;0...127 data byte 1 =&amp;gt; LSB (lower 7 bits) of the pitch value, here 0x00&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,01100000b	;0...127 data byte 2 =&amp;gt; MSB (upper 7 bits) of the pitch value, here 0x60 =&amp;gt; pitch 0x3000&lt;br /&gt;
out dx,al	&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Pitch bend uses a 14-bit value. The center is at 0x2000 (meaning no pitch change). Values from 0x2000 up to 0x3FFF raise the pitch, and values from 0x2000 down to 0x0000 lower it. A range of 0x2000 should correspond to 2 semitones, so you can bend by +/- 2 semitones. Please be aware that the value must be split into two 7-bit values. For example, 0x3000 becomes 0x60 (MSB) and 0x00 (LSB).&lt;br /&gt;
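&lt;br /&gt;
The split into two 7-bit values can also be done at runtime. A minimal sketch, assuming (hypothetically) that &amp;lt;code&amp;gt;BX&amp;lt;/code&amp;gt; holds the 14-bit pitch value and &amp;lt;code&amp;gt;DX&amp;lt;/code&amp;gt; already points to the MIDI data port &amp;lt;code&amp;gt;330h&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,11100000b	;Pitch bend command on Midi channel 0&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx			;BX = 14 bit pitch value (assumption of this sketch)&lt;br /&gt;
and al,01111111b	;data byte 1 = bits 0...6 (LSB)&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx&lt;br /&gt;
shl ax,1			;shift bit 7 into AH, so AH = bits 7...13&lt;br /&gt;
mov al,ah			;data byte 2 = MSB&lt;br /&gt;
out dx,al&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With BX = 0x3000 this sends 0x00 and then 0x60, matching the values given above.&lt;br /&gt;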
&lt;br /&gt;
Of course there are more midi controller options, e.g. you could change the stereo pan level. As a reference and for more detailed information please have a look at this [http://www.music-software-development.com/midi-tutorial.html Midi tutorial page].&lt;br /&gt;
==== Creating basic sound effects in 16 bytes ====&lt;br /&gt;
&lt;br /&gt;
In the [https://en.wikipedia.org/wiki/General_MIDI#Sound_Effects MIDI repertoire], there are already some sound effects available. With the &amp;quot;data execution&amp;quot; optimization above, let's fire a gunshot in 16 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas&lt;br /&gt;
les 	di,[bx-0x6C]&lt;br /&gt;
xor 	al,127&lt;br /&gt;
mov 	dx,0x331&lt;br /&gt;
outsb&lt;br /&gt;
dec 	dx&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first three instructions don't do anything (they do, but we don't care), it's just MIDI data.&lt;br /&gt;
* &amp;lt;code&amp;gt;aas&amp;lt;/code&amp;gt;&lt;br /&gt;
the command for switching to &amp;quot;UART&amp;quot; mode, for sending to port &amp;lt;code&amp;gt;0x331&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;les di,[bx-0x6C]&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0xc4&amp;lt;/code&amp;gt; (change instrument on channel 4), &amp;lt;code&amp;gt;0x7F&amp;lt;/code&amp;gt; (change it to &amp;quot;Gunshot&amp;quot;), &amp;lt;code&amp;gt;0x94&amp;lt;/code&amp;gt; (play note on channel 4)&lt;br /&gt;
* &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt;&lt;br /&gt;
assembles to &amp;lt;code&amp;gt;0x34&amp;lt;/code&amp;gt; (play THIS note), &amp;lt;code&amp;gt;0x7f&amp;lt;/code&amp;gt; (play it THAT loud, 127 is also the allowed maximum)&lt;br /&gt;
&lt;br /&gt;
The rest of the code basically just sends the MIDI data to the interface and exits. You can change the kind of sound effect by modifying the ModR/M byte of the second instruction (change BX to BP or SI etc.). Changing the volume is simpler: change the immediate byte of &amp;lt;code&amp;gt;xor al,127&amp;lt;/code&amp;gt; to any value between 0 and 127.&lt;br /&gt;
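&lt;br /&gt;
As an illustration of such a change: with BP instead of BX, the second byte of the instruction becomes &amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;, which is &amp;quot;Applause&amp;quot; in the General MIDI sound set. The volume of 100 is an arbitrary choice for this sketch, and the same start values as in the gunshot example are assumed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;aas						; 0x3F = set UART mode&lt;br /&gt;
les 	di,[bp-0x6C]	; 0xC4 0x7E 0x94 = instrument 0x7E on channel 4, then note on&lt;br /&gt;
xor 	al,100			; 0x34 0x64 = note 0x34 with volume 100&lt;br /&gt;
mov 	dx,0x331		; MIDI Control Port&lt;br /&gt;
outsb					; send the UART mode command&lt;br /&gt;
dec 	dx				; switch to MIDI Data Port&lt;br /&gt;
mov 	cl,5&lt;br /&gt;
rep 	outsb			; send the remaining 5 data bytes&lt;br /&gt;
ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;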
&lt;br /&gt;
==== Procedural MIDI music generation in 64 bytes ====&lt;br /&gt;
&lt;br /&gt;
With all the above you should now be able to follow the next snippet [http://www.pouet.net/prod.php?which=66313 Descent OST], a small framework for procedural MIDI sound generation in 64 bytes :&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;; &amp;quot;Descent OST&amp;quot;, a 62 byte MIDI music player for MSDOS&lt;br /&gt;
; created by HellMood/DESiRE (C)2015&lt;br /&gt;
; this is the extracted music routine used in &amp;quot;Descent&amp;quot; &lt;br /&gt;
; it is a procedural MIDI algorithm which sticks a&lt;br /&gt;
; subroutine to the DOS timer (interrupt 0x1C)&lt;br /&gt;
; the registered routine is called ~18.2 times per second&lt;br /&gt;
&lt;br /&gt;
; developed for use with &amp;quot;NASM&amp;quot;, &lt;br /&gt;
; see http://sourceforge.net/projects/nasm/files/&lt;br /&gt;
&lt;br /&gt;
%define		 	rhythmPattern			0b11&lt;br /&gt;
; with &amp;quot;rhythmPattern&amp;quot;, you define how often a note is played&lt;br /&gt;
; generally, higher values and values containing many &amp;quot;ones&amp;quot;&lt;br /&gt;
; in binary representation, will result in faster play&lt;br /&gt;
; for example &amp;quot;0b11&amp;quot; will play every 4th note&lt;br /&gt;
%define 		baseInstrument			9&lt;br /&gt;
; defines the number of the first instrument used.&lt;br /&gt;
; see http://www.midi.org/techspecs/gm1sound.php for a full list&lt;br /&gt;
; keep in mind, that there are only a few instrument blocks&lt;br /&gt;
; whose sounds stop after a while. You won't get good results&lt;br /&gt;
; from strings etc. just a mess of overlaid sounds&lt;br /&gt;
%define 		numInstruments			7&lt;br /&gt;
; defines how many instrument are used. keep in mind, that &amp;quot;rhythm-&lt;br /&gt;
; Pattern&amp;quot; has influence on the picked instrument. the instruments&lt;br /&gt;
; from 9 to 9+7 are called &amp;quot;chromatic percussion&amp;quot;&lt;br /&gt;
%define 		noteStep 				5&lt;br /&gt;
; defines the basic difference from one note to the next. recommended&lt;br /&gt;
; values here are (mainly) 3,4 and 5 for music theoretic reasons&lt;br /&gt;
; but feel free to play around =)&lt;br /&gt;
%define 		noteRange				12&lt;br /&gt;
; after adding the noteStep, the note value is &amp;quot;mod&amp;quot;ded with&lt;br /&gt;
; the &amp;quot;noteRange&amp;quot;. 12 means octave, which results in very harmonic&lt;br /&gt;
; scales&lt;br /&gt;
%define 		noteSpread				3&lt;br /&gt;
; the third step spreads the notes over the tonal spectrum, you may&lt;br /&gt;
; want to keep &amp;quot;noteSpread&amp;quot; * &amp;quot;noteRange&amp;quot; round about 30-60.&lt;br /&gt;
%define 		baseNote				40&lt;br /&gt;
; the general tone height of everything. some instruments don't play&lt;br /&gt;
; arbitrary deep notes correctly, and too high notes cause ear bleeding&lt;br /&gt;
; adjust with care ;)&lt;br /&gt;
&lt;br /&gt;
; WARNING : after exiting the program, the timer interrupt is still active&lt;br /&gt;
; i strongly recommend to reboot or restart DOSBOX!&lt;br /&gt;
&lt;br /&gt;
; ADVICE : Yes, there are music- and math-related things going on here&lt;br /&gt;
; if you're not into music theory, circle of fifths, and the like, it may be&lt;br /&gt;
; better to just play around with the parameters, rather than understanding them&lt;br /&gt;
; just change stuff slowly, and eventually you will get &amp;quot;there&amp;quot;&lt;br /&gt;
; wherever that is ;)&lt;br /&gt;
&lt;br /&gt;
org 0x100&lt;br /&gt;
xchg cx,ax					; set our second counter to zero&lt;br /&gt;
mov dx,music			&lt;br /&gt;
mov ax,0x251C				; mode &amp;quot;0x25&amp;quot; , &amp;quot;0x1C&amp;quot; = change address of timer interrupt&lt;br /&gt;
int 0x21					; see http://mprolab.teipir.gr/vivlio80X86/dosints.pdf&lt;br /&gt;
S:&lt;br /&gt;
in ax,0x60					; wait for &amp;quot;ESC&amp;quot; press, then exit&lt;br /&gt;
dec al						; music plays on anyway, this is just for&lt;br /&gt;
jnz S						; keeping the music exactly as in &amp;quot;Descent&amp;quot;&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
music:&lt;br /&gt;
inc bx						; increment our first counter (starts at zero)&lt;br /&gt;
test bl,byte rhythmPattern	; play a note every 4th time tick&lt;br /&gt;
jnz nomusic					; otherwise do nothing&lt;br /&gt;
mov dx,0x331&lt;br /&gt;
mov al,0x3F&lt;br /&gt;
out dx,al&lt;br /&gt;
dec dx&lt;br /&gt;
mov al,0xC0					; change instrument on channel 0...&lt;br /&gt;
out dx,al&lt;br /&gt;
mov ax,bx					&lt;br /&gt;
aam byte numInstruments&lt;br /&gt;
add al,byte baseInstrument	; ...to this instrument&lt;br /&gt;
out dx,al&lt;br /&gt;
mov al,0x90					; play note on channel 0 ...&lt;br /&gt;
out dx,al&lt;br /&gt;
add cl,byte noteStep			 &lt;br /&gt;
mov al,cl&lt;br /&gt;
aam byte noteRange&lt;br /&gt;
imul ax,noteSpread&lt;br /&gt;
add al,baseNote				; ... play THIS note&lt;br /&gt;
out dx,al&lt;br /&gt;
neg al						; (play deeper notes louder = add bass)&lt;br /&gt;
add al,127+39				; ... play it THAT loud&lt;br /&gt;
out dx,al&lt;br /&gt;
nomusic:&lt;br /&gt;
iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
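&lt;br /&gt;
The warning in the listing above can be addressed at the cost of some bytes. The following sketch is not part of &amp;quot;Descent OST&amp;quot;, and its label names are made up for this example. It saves the old handler of interrupt 0x1C with DOS function 0x35 and restores it before exiting; the handler itself is only a placeholder for the music routine:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 0x100&lt;br /&gt;
mov ax,0x351C				; DOS function 0x35 : get handler of interrupt 0x1C into ES:BX&lt;br /&gt;
int 0x21&lt;br /&gt;
push es&lt;br /&gt;
push bx						; keep the old handler on the stack&lt;br /&gt;
mov dx,handler				; DS:DX = new handler (DS = CS in a COM file)&lt;br /&gt;
mov ax,0x251C				; DOS function 0x25 : set handler of interrupt 0x1C&lt;br /&gt;
int 0x21&lt;br /&gt;
wait_esc:&lt;br /&gt;
in al,0x60					; wait for ESC&lt;br /&gt;
dec al&lt;br /&gt;
jnz wait_esc&lt;br /&gt;
pop dx&lt;br /&gt;
pop ds						; DS:DX = old handler&lt;br /&gt;
mov ax,0x251C&lt;br /&gt;
int 0x21					; restore it before leaving&lt;br /&gt;
ret							; return to prompt&lt;br /&gt;
handler:&lt;br /&gt;
iret						; placeholder, the music routine would go here&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;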
&lt;br /&gt;
=== PC Speaker ===&lt;br /&gt;
&lt;br /&gt;
Producing sound with the PC speaker is incredibly easy. Basically, you set a system timer to a desired frequency, then connect this timer to the speaker. [http://wiki.osdev.org/PC_Speaker The PC Speaker Article] from OSDEV Wiki has the details about it. An example for a tiny intro that uses PC speaker music is [http://www.pouet.net/prod.php?which=67833 SpeaCore].&lt;br /&gt;
&lt;br /&gt;
==== Basic example with melody pattern ====&lt;br /&gt;
&lt;br /&gt;
A very optimized and dirty variant of producing sound with the speaker is this 12 byte snippet (sound routine from [http://www.pouet.net/prod.php?which=67829 the tiny intro &amp;quot;darkweb&amp;quot;]):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;hlt				; sync to timer1&lt;br /&gt;
inc bx			; increment our counter&lt;br /&gt;
mov ax,bx		; work with a copy&lt;br /&gt;
or al,0x4B      ; melody pattern + 2 LSB for speaker link&lt;br /&gt;
out 0x42,al		; set new countdown for timer2 (two passes)&lt;br /&gt;
out 0x61,al		; link timer2 to PC speaker (2 LSBs are 1)&lt;br /&gt;
jmp si			; rinse and repeat&amp;lt;/syntaxhighlight&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Instead of sending low and high byte of our divisor directly in succession, we do it the &amp;quot;two pass&amp;quot; way. That reduces the number of possible frequencies to 255, which is still good enough for some rough sounds. Linking the timer to the PC speaker might not be obvious: normally you would read the value of port 0x61, set the two least significant bits and write the value back. You can save all of this if you just send the &amp;quot;two pass&amp;quot; value which you just used for the timer, provided that value already has the two least significant bits set (''or al,0x4B'' does this). Be aware that port 0x61 does many things apart from just connecting the timer to the speaker. A useful resource for ports in general is the [http://bochs.sourceforge.net/techspec/PORTS.LST Bochs Ports List]; for port 0x61 it displays:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
''0061	w	KB controller port B (ISA, EISA)   (PS/2 port A is at 0092)&lt;br /&gt;
&lt;br /&gt;
		system control port for compatibility with 8255&lt;br /&gt;
&lt;br /&gt;
		bit 7	(1= IRQ 0 reset )&lt;br /&gt;
&lt;br /&gt;
		bit 6-4    reserved&lt;br /&gt;
&lt;br /&gt;
		bit 3 = 1  channel check enable&lt;br /&gt;
&lt;br /&gt;
		bit 2 = 1  parity check enable&lt;br /&gt;
&lt;br /&gt;
		'''bit 1 = 1  speaker data enable'''&lt;br /&gt;
&lt;br /&gt;
		'''bit 0 = 1  timer 2 gate to speaker enable''' ''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So if you experience strange things with highly optimized PC speaker output, revert to the safe way. The described way works with real hardware and DosBox. Unfortunately, both Oracle VirtualBox with MS-DOS 6.22 and Windows XP NTVDM seem not to emulate the PC speaker properly (Investigation and citation needed here!)&lt;br /&gt;
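&lt;br /&gt;
For reference, the &amp;quot;safe way&amp;quot; could look like the following sketch. It programs timer 2 explicitly through the PIT command port 0x43 and only changes the two lowest bits of port 0x61. The frequency of 440 Hz is an arbitrary choice for this example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;mov al,0xB6			; PIT command : timer2, low byte then high byte, square wave&lt;br /&gt;
out 0x43,al&lt;br /&gt;
mov ax,1193182/440	; divisor = PIT clock / desired frequency&lt;br /&gt;
out 0x42,al			; send low byte of the divisor&lt;br /&gt;
mov al,ah&lt;br /&gt;
out 0x42,al			; send high byte of the divisor&lt;br /&gt;
in al,0x61			; read the current state of port 0x61&lt;br /&gt;
or al,3				; set only the two LSBs&lt;br /&gt;
out 0x61,al			; link timer2 to PC speaker&lt;br /&gt;
ret					; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The tone keeps playing after the program has quit; clearing the two bits of port 0x61 again turns it off.&lt;br /&gt;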
&lt;br /&gt;
==== Simple deep sound in 8 bytes ====&lt;br /&gt;
&lt;br /&gt;
One of the smallest possible PC speaker sound generators might be this 8 byte snippet:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;dec ax			; AX initially 0000h -&amp;gt; AL = 0xFF&lt;br /&gt;
out 42h,al		; change divisor of timer2 to 0xFFFF&lt;br /&gt;
out 42h,al		; resulting in a very low frequency&lt;br /&gt;
out 61h,al		; 2 LSBs are set, connect timer to speaker&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Note: This may fail on actual hardware, as there might not be time for the bus to settle between the consecutive &amp;lt;code&amp;gt;out 42h,al&amp;lt;/code&amp;gt; statements.)&lt;br /&gt;
&lt;br /&gt;
=== COVOX output (aka LPT DAC) ===&lt;br /&gt;
&lt;br /&gt;
It is possible to output to an LPT-connected DAC (&amp;quot;[https://en.wikipedia.org/wiki/Covox_Speech_Thing COVOX]&amp;quot;) in a tinyprog.  A proof-of-concept example is [http://www.pouet.net/prod.php?which=57991 Express Train 125] which uses COVOX for sound generation.  &lt;br /&gt;
&lt;br /&gt;
This method follows the [http://countercomplex.blogspot.com/2011/10/algorithmic-symphonies-from-one-line-of.html &amp;quot;audio from one line of C code&amp;quot;] style of sound generation.  [http://www.pouet.net/topic.php?which=8357&amp;amp;page=1 A pouet discussion] exists for more background information.&lt;br /&gt;
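&lt;br /&gt;
A minimal sketch of this style of output, under several assumptions: the DAC sits on LPT1 at the common default port 0x378, the formula &amp;quot;t AND (t SHR 5)&amp;quot; is just a placeholder one-liner, and the delay loop is a crude, machine-dependent substitute for proper timing:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
mov dx,0x378	; LPT1 data port (assumed default address)&lt;br /&gt;
play:&lt;br /&gt;
inc bx			; BX = sample counter t&lt;br /&gt;
mov ax,bx&lt;br /&gt;
shr ax,5		; AX = t SHR 5&lt;br /&gt;
and al,bl		; AL = low byte of t AND (t SHR 5)&lt;br /&gt;
out dx,al		; the DAC turns this byte into a voltage&lt;br /&gt;
mov cx,200		; crude delay between two samples&lt;br /&gt;
delay:&lt;br /&gt;
loop delay&lt;br /&gt;
in al,0x60		; check for ESC&lt;br /&gt;
dec al&lt;br /&gt;
jnz play		; no? next sample&lt;br /&gt;
ret				; quit&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;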
&lt;br /&gt;
=== Advanced PC Speaker and COVOX sound via interrupt ===&lt;br /&gt;
&lt;br /&gt;
For more advanced PC Speaker or COVOX output in tiny intros, especially when samples have to be played back&lt;br /&gt;
at a specific sample frequency, driving the output from a timer interrupt is recommended. To illustrate this,&lt;br /&gt;
we take a so-called bytebeat and turn it into a working code example for both PC Speaker and COVOX.&lt;br /&gt;
&lt;br /&gt;
The major difference between the two is resolution: COVOX offers 8 bits per sample, the PC Speaker usually only 6 bits.&lt;br /&gt;
Furthermore, setup and access differ, as shown in the previous sections. In terms of both code size and&lt;br /&gt;
sound quality, COVOX is preferable.&lt;br /&gt;
&lt;br /&gt;
The bytebeat we are using as an example can be seen and heard [http://wurstcaptures.untergrund.net/music/?oneliner=((t%264096)%3F((t*(t%5Et%25255)%7C(t%3E%3E4))%3E%3E1)%3A(t%3E%3E3)%7C((t%268192)%3Ft%20%3C%3C2%3At))&amp;amp;rate=11025 here.] It's based on JavaScript syntax: ((t&amp;amp;4096)?((t*(t^t%255)|(t&amp;gt;&amp;gt;4))&amp;gt;&amp;gt;1):(t&amp;gt;&amp;gt;3)|((t&amp;amp;8192)?t &amp;lt;&amp;lt;2:t))&lt;br /&gt;
&lt;br /&gt;
Bytebeat code like this can be ported directly to assembler by evaluating the individual subexpressions step by step, as the&lt;br /&gt;
implementations here show. The examples work in DOSBox and should also run on real hardware with FreeDOS. They don't&lt;br /&gt;
show any graphical output; they just play the bytebeat until a key is pressed. Your graphics routine should be placed right after&lt;br /&gt;
the 'main' label.&lt;br /&gt;
&lt;br /&gt;
==== PC Speaker variant ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
&lt;br /&gt;
 mov    ax,3508h			;21h, ah=35h get interrupt handler | al=08h interrupt number (PIT timer)&lt;br /&gt;
 int    21h					;return: es:bx&lt;br /&gt;
 push   es&lt;br /&gt;
 push   bx					;backup current interrupt handler&lt;br /&gt;
 mov    cx,63 + 108*256  	;ch = 108: PIT counter 0 divisor, cl = 63: speaker enable bits for init&lt;br /&gt;
 mov    bl,90h				;10010000b =&amp;gt; PIT control word on &amp;quot;init&amp;quot;&lt;br /&gt;
							;Bit0   =   0 binary (not BCD) counting&lt;br /&gt;
							;Bit3-1 = 000 mode 0 select&lt;br /&gt;
							;Bit5-4 =  01 read/write counter bits 0-7 only&lt;br /&gt;
							;Bit7-6 =  10 counter 2 select&lt;br /&gt;
&lt;br /&gt;
 mov    dx,irq				;new handler address&lt;br /&gt;
 call   init&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
 mov    ah,0&lt;br /&gt;
 int    16h					;ah = 0, int16h =&amp;gt; read keypress&lt;br /&gt;
&lt;br /&gt;
 pop    dx&lt;br /&gt;
 pop    ds					;restore handler address at exit&lt;br /&gt;
 xor    cx,cx				;PIT counter divisor = 0 and speaker disable for exit &lt;br /&gt;
 mov    bl,0b6h				;bl = 10110110b =&amp;gt; at exit&lt;br /&gt;
init:&lt;br /&gt;
 xchg   ax,cx &lt;br /&gt;
 out    61h,al				;al = 0 or 63 =&amp;gt; Bit0 = 1 timer 2 gate to speaker enable,&lt;br /&gt;
 mov    al,ah				;Bit1 = 1 speaker data enable ...or disable both at al = 0&lt;br /&gt;
 out    40h,al				;al = 0 or 108 =&amp;gt; write PIT counter 0 divisor low byte&lt;br /&gt;
 salc						;al = 0, assuming the carry flag is clear&lt;br /&gt;
 out    40h,al				;al = 0 =&amp;gt; write PIT counter 0 divisor high byte = 0&lt;br /&gt;
							;=&amp;gt; on init the interrupt is now called at about 11 kHz, as the PIT clock is&lt;br /&gt;
							;1.19318181818 MHz =&amp;gt; 1.19318181818 MHz / 108 = 11048 Hz (close to 11025 Hz)&lt;br /&gt;
 xchg   ax,bx				;al = bl = 90h on init, 0b6h (10110110b) at exit; for 0b6h:&lt;br /&gt;
 out    43h,al				;Bit0   =   0 binary (not BCD) counting&lt;br /&gt;
							;Bit3-1 = 011 mode 3 select, square wave generator&lt;br /&gt;
							;Bit5-4 =  11 read/write counter bits 0-7 first, then 8-15&lt;br /&gt;
							;Bit7-6 =  10 counter 2 select&lt;br /&gt;
 mov    ax,2508h			;21h, ah=25h set interrupt handler | al=08h interrupt number (PIT timer)  &lt;br /&gt;
 int    21h&lt;br /&gt;
retn&lt;br /&gt;
&lt;br /&gt;
;bytebeat: ((t&amp;amp;4096)?((t*(t^t%255)|(t&amp;gt;&amp;gt;4))&amp;gt;&amp;gt;1):(t&amp;gt;&amp;gt;3)|((t&amp;amp;8192)?t&amp;lt;&amp;lt;2:t))&lt;br /&gt;
irq:&lt;br /&gt;
 pusha&lt;br /&gt;
 mov    bp,255&lt;br /&gt;
 mov    ax,0            	;ax: t&lt;br /&gt;
.counter:&lt;br /&gt;
 mov    cx,ax&lt;br /&gt;
 shr    cx,3            	;cx: (t&amp;gt;&amp;gt;3)&lt;br /&gt;
 test   ax,4096         	;(t&amp;amp;4096)?&lt;br /&gt;
 jz     .1&lt;br /&gt;
 mov    bx,ax           	;bx: t&lt;br /&gt;
 sub    dx,dx           	;dx:ax t&lt;br /&gt;
 div    bp              	;dx: (t%255)&lt;br /&gt;
 xor    dx,bx           	;dx: (t^(t%255))&lt;br /&gt;
 shr    cx,1            	;cx: (t&amp;gt;&amp;gt;4)&lt;br /&gt;
 xchg   ax,bx           	;ax: t&lt;br /&gt;
 mul    dx              	;ax: t*(t^(t%255))&lt;br /&gt;
 or     ax,cx           	;ax: t*(t^(t%255))|(t&amp;gt;&amp;gt;4)&lt;br /&gt;
 shr    ax,1            	;ax: (t*(t^(t%255))|(t&amp;gt;&amp;gt;4))&amp;gt;&amp;gt;1&lt;br /&gt;
 jmp    .3&lt;br /&gt;
.1:&lt;br /&gt;
 test   ax,8192         	;(t&amp;amp;8192)?&lt;br /&gt;
 jz     .2&lt;br /&gt;
 shl    ax,2            	;ax: (t&amp;lt;&amp;lt;2)&lt;br /&gt;
.2:&lt;br /&gt;
 or     ax,cx           	;ax: ax|(t&amp;gt;&amp;gt;3)&lt;br /&gt;
.3:&lt;br /&gt;
 shr    al,2            	;downscale to 6 bits&lt;br /&gt;
 jz     .4&lt;br /&gt;
 out    42h,al				;write 6 Bit data to speaker (PIT counter 2)&lt;br /&gt;
.4:&lt;br /&gt;
&lt;br /&gt;
 inc    word [bp-255+irq.counter-2]	;t++: patch the immediate operand of &amp;quot;mov ax,0&amp;quot; above&lt;br /&gt;
 mov    al,20h				;00100000b&lt;br /&gt;
 out    20h,al				;Bit 5 = 1 send End Of Interrupt (EOI) signal&lt;br /&gt;
 popa&lt;br /&gt;
iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== COVOX variant ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;org 100h&lt;br /&gt;
&lt;br /&gt;
 mov    ax,3508h			;21h, ah=35h get interrupt handler | al=08h interrupt number (PIT timer)&lt;br /&gt;
 int    21h					;return: es:bx&lt;br /&gt;
 push   es&lt;br /&gt;
 push   bx					;backup current interrupt handler&lt;br /&gt;
 mov    al,108				;PIT counter divisor&lt;br /&gt;
 mov    dx,irq				;new handler address&lt;br /&gt;
 call   init&lt;br /&gt;
&lt;br /&gt;
main:&lt;br /&gt;
 mov    ah,0&lt;br /&gt;
 int    16h					;ah = 0, int16h =&amp;gt; read keypress&lt;br /&gt;
 pop    dx&lt;br /&gt;
 pop    ds					;restore handler address at exit&lt;br /&gt;
 salc						;al = 0 at exit&lt;br /&gt;
&lt;br /&gt;
init:&lt;br /&gt;
 out    40h,al				;al = 0 or 108 =&amp;gt; write PIT counter 0 divisor low byte&lt;br /&gt;
 salc						;al = 0, assuming the carry flag is clear&lt;br /&gt;
 out    40h,al				;al = 0 =&amp;gt; write PIT counter 0 divisor high byte = 0&lt;br /&gt;
							;=&amp;gt; on init the interrupt is now called at about 11 kHz, as the PIT clock is&lt;br /&gt;
							;1.19318181818 MHz =&amp;gt; 1.19318181818 MHz / 108 = 11048 Hz (close to 11025 Hz)&lt;br /&gt;
 mov    ax,2508h			;21h, ah=25h set interrupt handler | al=08h interrupt number (PIT timer)&lt;br /&gt;
 int    21h                                                             &lt;br /&gt;
 retn&lt;br /&gt;
&lt;br /&gt;
;bytebeat: ((t&amp;amp;4096)?((t*(t^t%255)|(t&amp;gt;&amp;gt;4))&amp;gt;&amp;gt;1):(t&amp;gt;&amp;gt;3)|((t&amp;amp;8192)?t&amp;lt;&amp;lt;2:t))&lt;br /&gt;
irq:&lt;br /&gt;
 pusha&lt;br /&gt;
 mov    bp,255&lt;br /&gt;
 mov    ax,0				;ax: t&lt;br /&gt;
.counter:&lt;br /&gt;
 mov    cx,ax&lt;br /&gt;
 shr    cx,3				;cx: (t&amp;gt;&amp;gt;3)&lt;br /&gt;
 test   ax,4096				;(t&amp;amp;4096)?&lt;br /&gt;
 jz     .1&lt;br /&gt;
 mov    bx,ax				;bx: t&lt;br /&gt;
 sub    dx,dx				;dx:ax t&lt;br /&gt;
 div    bp					;dx: (t%255)&lt;br /&gt;
 xor    dx,bx				;dx: (t^(t%255))&lt;br /&gt;
 shr    cx,1				;cx: (t&amp;gt;&amp;gt;4)&lt;br /&gt;
 xchg   ax,bx				;ax: t&lt;br /&gt;
 mul    dx					;ax: t*(t^(t%255))&lt;br /&gt;
 or     ax,cx				;ax: t*(t^(t%255))|(t&amp;gt;&amp;gt;4)&lt;br /&gt;
 shr    ax,1				;ax: (t*(t^(t%255))|(t&amp;gt;&amp;gt;4))&amp;gt;&amp;gt;1&lt;br /&gt;
 jmp   .3&lt;br /&gt;
.1:&lt;br /&gt;
 test   ax,8192				;(t&amp;amp;8192)?&lt;br /&gt;
 jz    .2&lt;br /&gt;
 shl    ax,2				;ax: (t&amp;lt;&amp;lt;2)&lt;br /&gt;
.2:&lt;br /&gt;
 or     ax,cx				;ax: ax|(t&amp;gt;&amp;gt;3)&lt;br /&gt;
.3:&lt;br /&gt;
 mov    dx,0378h			;LPT1 parallel port address&lt;br /&gt;
 out    dx,al				;write 8 Bit sample data&lt;br /&gt;
 inc    word [bp-255+irq.counter-2]	;t++: patch the immediate operand of &amp;quot;mov ax,0&amp;quot; above&lt;br /&gt;
 mov    al,20h              ;00100000b&lt;br /&gt;
 out    20h,al				;Bit 5 = 1 send End Of Interrupt (EOI) signal&lt;br /&gt;
 popa&lt;br /&gt;
 iret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Further notes on the two variants ====&lt;br /&gt;
&lt;br /&gt;
It's important to know and set the sample frequency you want. If, for example, you want to raise the rate from 11025 Hz to 18939 Hz&lt;br /&gt;
while keeping the same sound, you need to change the following parts of the code, shown here for COVOX. Note that the bytebeat parameters were also adjusted to fit the roughly doubled frequency:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;;...snip...&lt;br /&gt;
 mov    al,63				;PIT counter divisor instead of 108 =&amp;gt; 1.19318181818 MHz / 63 = 18939 Hz&lt;br /&gt;
;...snip...&lt;br /&gt;
;bytebeat: ((t&amp;amp;8192)?((t*(t^t%255)|(t&amp;gt;&amp;gt;5))&amp;gt;&amp;gt;1):(t&amp;gt;&amp;gt;4)|((t&amp;amp;16384)?t&amp;lt;&amp;lt;2:t))&lt;br /&gt;
;...snip...&lt;br /&gt;
.counter:&lt;br /&gt;
 mov    cx,ax&lt;br /&gt;
 shr    cx,4            	;cx: (t&amp;gt;&amp;gt;4)&lt;br /&gt;
 test   ax,8192         	;(t&amp;amp;8192)?&lt;br /&gt;
 jz     .1              	&lt;br /&gt;
 mov    bx,ax           	;bx: t&lt;br /&gt;
 sub    dx,dx           	;dx:ax t&lt;br /&gt;
 div    bp              	;dx: (t%255)&lt;br /&gt;
 xor    dx,bx           	;dx: (t^(t%255))&lt;br /&gt;
 shr    cx,1            	;cx: (t&amp;gt;&amp;gt;5)&lt;br /&gt;
 xchg   ax,bx           	;ax: t&lt;br /&gt;
 mul    dx              	;ax: t*(t^(t%255))&lt;br /&gt;
 or     ax,cx           	;ax: t*(t^(t%255))|(t&amp;gt;&amp;gt;5)&lt;br /&gt;
 shr    ax,1            	;ax: (t*(t^(t%255))|(t&amp;gt;&amp;gt;5))&amp;gt;&amp;gt;1&lt;br /&gt;
 jmp    .3&lt;br /&gt;
.1:&lt;br /&gt;
 test   ax,16384        	;(t&amp;amp;16384)?&lt;br /&gt;
 jz     .2&lt;br /&gt;
 shl    ax,2            	;ax: (t&amp;lt;&amp;lt;2)&lt;br /&gt;
.2:&lt;br /&gt;
 or     ax,cx           	;ax: ax|(t&amp;gt;&amp;gt;4)&lt;br /&gt;
;...snip...&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
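&lt;br /&gt;
The divisor for any other rate follows from the PIT clock divided by the desired sample frequency. As a small sketch for the COVOX variant, NASM can do this calculation at assembly time. The constant names are made up for this example, and the result has to stay below 256 because the init routine writes 0 as the high byte:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;PIT_CLOCK   equ 1193182			;PIT input clock in Hz (rounded)&lt;br /&gt;
SAMPLE_RATE equ 18939			;desired sample frequency in Hz&lt;br /&gt;
DIVISOR     equ (PIT_CLOCK + SAMPLE_RATE/2) / SAMPLE_RATE	;rounded to the nearest integer&lt;br /&gt;
;...snip...&lt;br /&gt;
 mov    al,DIVISOR			;PIT counter divisor&lt;br /&gt;
;...snip...&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;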
&lt;br /&gt;
The routine here uses a frequency of 18939 Hz. With the 16-bit counter used here, this gives a maximum length&lt;br /&gt;
of 65536/18939 = 3.46 seconds before everything loops. Usually that is enough for a drumbeat, but not for a complete song&lt;br /&gt;
or melody. In that case you have to use another register as a 'top' timer that triggers the changes in your sound.&lt;br /&gt;
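&lt;br /&gt;
One possible way to add such a 'top' timer, sketched here under the assumption that SI is free for this purpose, is to count the wraparounds of t with a second self-modified immediate. The labels &amp;lt;code&amp;gt;.pattern&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;.nowrap&amp;lt;/code&amp;gt; are names made up for this sketch:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;;...snip...&lt;br /&gt;
irq:&lt;br /&gt;
 pusha&lt;br /&gt;
 mov    bp,255&lt;br /&gt;
 mov    si,0				;si: 'top' timer, counts how often t has wrapped around&lt;br /&gt;
.pattern:&lt;br /&gt;
 mov    ax,0				;ax: t&lt;br /&gt;
.counter:&lt;br /&gt;
;...snip... (the sample generation can now also depend on si)&lt;br /&gt;
 inc    word [bp-255+irq.counter-2]	;t++, sets the zero flag when t wraps to 0&lt;br /&gt;
 jnz    .nowrap&lt;br /&gt;
 inc    word [bp-255+irq.pattern-2]	;advance the 'top' timer once per wraparound&lt;br /&gt;
.nowrap:&lt;br /&gt;
;...snip...&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;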
&lt;br /&gt;
One more thing to check, and possibly change, if the sound is imperfect is the exact moment at which each sample value is actually &amp;quot;played&amp;quot;.&lt;br /&gt;
Ideally, every sample is output at the same point within its interrupt period. But since your sample generation routine may need a different&lt;br /&gt;
number of CPU cycles on each interrupt call, the output time varies from call to call with code like the above.&lt;br /&gt;
&lt;br /&gt;
One solution is to output the sample calculated during the previous interrupt call right at the start of the next call.&lt;br /&gt;
You can do that with self-modifying code as shown here. It takes 5 bytes more:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;;...snip...&lt;br /&gt;
irq:&lt;br /&gt;
pusha&lt;br /&gt;
mov    dx,0378h&lt;br /&gt;
mov    al,0           &lt;br /&gt;
.sample:&lt;br /&gt;
out    dx,al&lt;br /&gt;
mov    bp,255&lt;br /&gt;
mov    ax,0            ; ax: t&lt;br /&gt;
.counter:&lt;br /&gt;
;...snip...&lt;br /&gt;
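inc    word [bp-255+irq.counter-2]		; t++: patch the immediate of &amp;quot;mov ax,0&amp;quot;&lt;br /&gt;
mov    byte [bp-255+irq.sample-1],al	; store the new sample in the immediate of &amp;quot;mov al,0&amp;quot; for the next call&lt;br /&gt;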
mov    al,20h&lt;br /&gt;
out    20h,al&lt;br /&gt;
;...snip...&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some remarks: for educational reasons, none of the code above is fully size-optimized.&lt;br /&gt;
Depending on your code and how it interacts with the interrupt routine, several size optimizations are possible.&lt;br /&gt;
&lt;br /&gt;
Instead of interrupt &amp;lt;code&amp;gt;08&amp;lt;/code&amp;gt;, the user-defined interrupt &amp;lt;code&amp;gt;1c&amp;lt;/code&amp;gt; could theoretically be&lt;br /&gt;
used as well, but so far this seems to work only in DOSBox and not on a real system with FreeDOS. Further tests&lt;br /&gt;
are needed to find out what the problem is. Using interrupt &amp;lt;code&amp;gt;1c&amp;lt;/code&amp;gt; would save 4 bytes, as the following&lt;br /&gt;
lines that acknowledge the interrupt could be omitted from the examples above:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;;...snip...&lt;br /&gt;
mov al,20h&lt;br /&gt;
out 20h,al&lt;br /&gt;
;...snip...&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What can be done with bytebeats? Basically everything :-) Have a look at this [http://www.pouet.net/topic.php?which=8357&amp;amp;page=1 thread] on pouet to get an idea, and check out the first PC Speaker intro with bytebeat: [http://www.pouet.net/prod.php?which=71766 TCTRONIC by TomCat/Abaddon].&lt;br /&gt;
 &lt;br /&gt;
Some basic waveforms can be encoded like this:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;;sawtooth wave&lt;br /&gt;
t &amp;amp; 127&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;;square wave&lt;br /&gt;
t &amp;amp; 128&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;;triangle wave&lt;br /&gt;
t ^ ((t &amp;amp; 128) ? 255 : 0)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
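&lt;br /&gt;
In assembler, generators like these take only a few bytes each. The following routines are a sketch that assumes the low byte of t is already in AL and returns the sample in AL; the label names are made up for this example. The triangle variant folds the upper half of an 8 bit sawtooth back down by inverting AL whenever bit 7 is set, and it overwrites AH:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;saw:&lt;br /&gt;
 and    al,127				;al: t &amp;amp; 127, ramps from 0 to 127&lt;br /&gt;
 ret&lt;br /&gt;
square:&lt;br /&gt;
 and    al,128				;al: t &amp;amp; 128, either 0 or 128&lt;br /&gt;
 ret&lt;br /&gt;
triangle:&lt;br /&gt;
 cbw						;ah = 0ffh if bit 7 of al is set, else 0&lt;br /&gt;
 xor    al,ah				;al: 0..127 rising, then 127..0 falling&lt;br /&gt;
 ret&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;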
&lt;br /&gt;
How would you go from here to a specific tone, e.g. an &amp;quot;A4&amp;quot; with a frequency of 440 Hz (see this&lt;br /&gt;
[https://en.wikipedia.org/wiki/Piano_key_frequencies list] for the frequencies of the notes)?&lt;br /&gt;
For that you have to relate the interrupt frequency to the tone frequency and to the length of one period of your&lt;br /&gt;
wave generator. A sawtooth of &amp;lt;code&amp;gt;t&amp;amp;127&amp;lt;/code&amp;gt; at 22050 Hz results in a tone of 22050/128 = 172.3 Hz.&lt;br /&gt;
To reach 440 Hz, we can simply scale the timer by 440/172.3 = 2.554 to hear the desired note:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
(t*2.554) &amp;amp; 127&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
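&lt;br /&gt;
Since the integer instructions cannot multiply by 2.554 directly, one option is fixed-point arithmetic. The sketch below assumes that t is in AX and approximates the factor by 654/256 (2.554 * 256 = 653.8); it overwrites BX and DX:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; mov    bx,654				;2.554 * 256, rounded&lt;br /&gt;
 mul    bx					;dx:ax = t*654&lt;br /&gt;
 mov    al,ah				;al: low byte of (t*654)&amp;gt;&amp;gt;8, about t*2.554&lt;br /&gt;
 and    al,127				;al: (t*2.554) &amp;amp; 127&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;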
&lt;br /&gt;
[https://naivesound.comglitch Glitch] is another online tool for creating bytebeat sounds, with an enhanced syntax. Several good reads on this tool, the theory behind it, and bytebeat in general can be found [https://medium.com/@naive_sound here].&lt;br /&gt;
&lt;br /&gt;
Of course the possibilities are endless, and the whole world of real-time sound calculation/generation is open to you. Here are some tiny intros that already use these techniques: [http://www.pouet.net/prod.php?which=77741 Plasmifier cover 256B], [http://www.pouet.net/prod.php?which=70599 2(56)unlimited], [http://www.pouet.net/prod.php?which=78505 somehow].&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Resources&amp;diff=914</id>
		<title>Resources</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Resources&amp;diff=914"/>
				<updated>2021-09-08T14:41:35Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Discussions ==&lt;br /&gt;
&lt;br /&gt;
Pouet's [http://www.pouet.net/topic.php?which=8791&amp;amp;page=1 Tiny Intro Toolbox Thread] was one of sizecoding.org's main inspirations.&lt;br /&gt;
&lt;br /&gt;
We also have an active discord server dedicated to sizecoding at: https://discord.gg/j6UqqrbkDX&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
The Hugi demoscene diskmag has been kind to the sizecoding movement.  They included [http://www.hugi.scene.org/online/coding/hugi%20se%204%20-%20index%20sorted%20by%20topic.htm many programming articles] since it started in 1996.  Additionally, the [http://www.hugi.scene.org/compo/compoold.htm Hugi Size Coding Competition Series] is the gold standard for learning how to sizecode:  Each competition was given a specific task, then participants submitted their results.  ([http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=13#pid13 All of the entries with their source code] are available, as are [http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=396#pid396 the rules for each challenge].)  Hugi #35 also provided a [http://hugi.scene.org/online/hugi35/hugi%2035%20-%20demoscene%20forum%20adok%20256b%20intros%20round-table.htm 256b intros round-table] between Baze, Digimind, &amp;amp;#344;r&amp;amp;#345;ola, Pirx, Optimus, and Seven.&lt;br /&gt;
&lt;br /&gt;
In recent years, 256b intro coding has also been a topic of presentations at several demo parties. The recordings are available on YouTube. There are &lt;br /&gt;
[https://www.youtube.com/watch?v=WbWYBE9vjdk part 1] and [https://www.youtube.com/watch?v=51BOp5Pi4mU part 2] of Řrřola's talk &amp;quot;Magic of 256-byte x86 intros&amp;quot; held at Demobit 2017 and 2018 respectively. A similar talk [https://www.youtube.com/watch?v=V_ntnQKemWA &amp;quot;256 byte intros on modern PCs&amp;quot;] was held by TomCat/Abaddon at the Function 2017 party. Supplemental files can be downloaded from [http://www.pouet.net/prod.php?which=72454 Pouet.net]. The presentations cover everything from setting up the coding environment to all kinds of code optimization, constant reuse, flag tricks, palette generation, and how to get from an idea to the final intro. TomCat also provides a [https://www.abaddon.hu/usbdos/ dev tools page] and his [https://www.abaddon.hu/256b/ SizeCoding Blog] with lots of optimization ideas.&lt;br /&gt;
&lt;br /&gt;
fysnet.net has an extensive [http://www.fysnet.net/yourhelp.htm list of DOS .COM register starting values].  If your environment isn't listed, [http://www.fysnet.net/yourhelp.com a program] is also available that can print out all of the register starting values.  See also [https://web.archive.org/web/20170418182707/http://pferrie.host22.com/misc/lowlevel12.htm DOS Register Starting Values] to understand why the register starting values for .COM files are set the way that they are.&lt;br /&gt;
&lt;br /&gt;
The [http://sparksandflames.com/files/x86InstructionChart.html Sparks And Flames x86 Instruction Chart] is awesome, once you can &amp;quot;read&amp;quot; it. Find out what the bright green (1-byte) commands do, and use them. For the extreme sizecoding (&amp;lt;=32 bytes) it's also very useful to have the HEX value in sight.&lt;br /&gt;
&lt;br /&gt;
The [http://ref.x86asm.net/geek.html Geek Edition x86asm-ref] is comprehensive and dense.  It might be the only reference you'll need.  The [http://ref.x86asm.net/geek.html#modrm_byte_16 last part] covers modr/m bytes, which is useful.&lt;br /&gt;
&lt;br /&gt;
[https://www.felixcloutier.com/x86/ Yet another reference] by Félix Cloutier.&lt;br /&gt;
&lt;br /&gt;
[http://www.techhelpmanual.com/2-main_menu.html Tech Help Manual] includes a dense reference of DOS interrupts and x86 instructions.&lt;br /&gt;
&lt;br /&gt;
[http://c9x.me/x86/ c9x.me]: An x86 instruction reference&lt;br /&gt;
&lt;br /&gt;
[https://kernfunny.org/x86/ Rene Jeschke's preservation of siyobik.com's x86 Documentation] is one of the most compact and understandable references to the x86 instruction set you'll ever see.&lt;br /&gt;
&lt;br /&gt;
Intel IA-32 Software Developer's Manual Instruction Set Reference [http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/whgdata/whlstt2.htm hypertext version].&lt;br /&gt;
&lt;br /&gt;
[http://www.rcollins.org/secrets/OpCodes.html Robert Collins' Undocumented Opcodes] page has some gems you can exploit depending on your target environment.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/PC_Speaker PC Speaker Basics] For everything from simple bleeps to sophisticated synthesizers.&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix] : what's in front of our .COM code, and why.&lt;br /&gt;
&lt;br /&gt;
Never underestimate [http://www.hugi.scene.org/online/coding/hugi%2017%20-%20coaax.htm The Hidden Power of BCD Instructions].&lt;br /&gt;
&lt;br /&gt;
[http://www.agner.org/optimize/optimizing_assembly.pdf An optimization guide for x86 platforms By Agner Fog] Chapter 10: Optimizing for size.&lt;br /&gt;
&lt;br /&gt;
[http://home.sch.bme.hu/~ervin/codegems.html Code Gems by Ervin/AbaddoN] Nice collection based on the Imphobia diskmagazine article.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/Programmable_Interval_Timer Programmable Interval Timer]: Timing is everything, if you're targeting a slow system. Very old but good text [http://www.phatcode.net/res/246/files/pctim003.txt PCTIM003.TXT]&lt;br /&gt;
&lt;br /&gt;
[http://stanislavs.org/helppc/bios_data_area.html BIOS data area]: Direct system communication overview.  There are environment values here you may be able to utilize.&lt;br /&gt;
&lt;br /&gt;
[https://www.midi.org/specifications/item/table-1-summary-of-midi-message Summary of MIDI messages] : Summary of all basic MIDI messages.&lt;br /&gt;
&lt;br /&gt;
[http://www.voidaudio.net/percussion.html Roland GS/GM2 Drum Set List and Note Map] : Detailed Description of all MIDI drumkits.&lt;br /&gt;
&lt;br /&gt;
[http://www.h-schmidt.net/FloatConverter/IEEE754.html Online IEEE 754 Converter] : Excellent tool for optimizing/reusing FPU constants&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode Tinycode on Reddit]: Collection of many different size coding productions incl. x86 based ones with discussion (mostly provided by [https://www.reddit.com/user/Hell__Mood/ HellMood] himself)&lt;br /&gt;
&lt;br /&gt;
[https://marquee.revival-studios.com/blog Superogue's sizecoding microblog]: Collection of demoscene and sizecoding related articles and development write-ups of [https://www.pouet.net/groups.php?which=14389 various Marquee Design tiny intros]&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Tinyprogs are so small that you sometimes don't need the source code to understand them:  Just pop them into a disassembler (like IDA) or a debugger and look at them.  If you want to grab specimens for study, here are some places to start.&lt;br /&gt;
&lt;br /&gt;
Pouet can be searched for the latest tinyprogs:  [http://www.pouet.net/prodlist.php?type%5B%5D=256b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 256b], [http://www.pouet.net/prodlist.php?type%5B%5D=128b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 128b], [http://www.pouet.net/prodlist.php?type%5B%5D=64b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 64b], and even [http://www.pouet.net/prodlist.php?type%5B%5D=32b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 32b and smaller].&lt;br /&gt;
&lt;br /&gt;
[http://hardcode.untergrund.net/ HardCode] is a repository of 7800 64KB and smaller intros.&lt;br /&gt;
&lt;br /&gt;
[http://256bytes.untergrund.net 256b.com] is a website dedicated primarily to 256b and smaller releases.&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Resources&amp;diff=913</id>
		<title>Resources</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Resources&amp;diff=913"/>
				<updated>2021-09-08T14:20:14Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Discussions ==&lt;br /&gt;
&lt;br /&gt;
Pouet's [http://www.pouet.net/topic.php?which=8791&amp;amp;page=1 Tiny Intro Toolbox Thread] was one of sizecoding.org's main inspirations.&lt;br /&gt;
&lt;br /&gt;
We also have an active discord server dedicated to sizecoding at: https://discord.gg/j6UqqrbkDX&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
The Hugi demoscene diskmag has been kind to the sizecoding movement.  They included [http://www.hugi.scene.org/online/coding/hugi%20se%204%20-%20index%20sorted%20by%20topic.htm many programming articles] since it started in 1996.  Additionally, the [http://www.hugi.scene.org/compo/compoold.htm Hugi Size Coding Competition Series] is the gold standard for learning how to sizecode:  Each competition was given a specific task, then participants submitted their results.  ([http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=13#pid13 All of the entries with their source code] are available, as are [http://retrolandia.net/hugi/showthread.php?tid=7&amp;amp;pid=396#pid396 the rules for each challenge].)  Hugi #35 also provided a [http://hugi.scene.org/online/hugi35/hugi%2035%20-%20demoscene%20forum%20adok%20256b%20intros%20round-table.htm 256b intros round-table] between Baze, Digimind, &amp;amp;#344;r&amp;amp;#345;ola, Pirx, Optimus, and Seven.&lt;br /&gt;
&lt;br /&gt;
In recent years, 256b intro coding has also been a topic of presentations at several demo parties. The recordings are available on YouTube. There are &lt;br /&gt;
[https://www.youtube.com/watch?v=WbWYBE9vjdk part 1] and [https://www.youtube.com/watch?v=51BOp5Pi4mU part 2] of Řrřola's talk &amp;quot;Magic of 256-byte x86 intros&amp;quot; held at Demobit 2017 and 2018 respectively. A similar talk [https://www.youtube.com/watch?v=V_ntnQKemWA &amp;quot;256 byte intros on modern PCs&amp;quot;] was held by TomCat/Abaddon at the Function 2017 party. Supplemental files can be downloaded from [http://www.pouet.net/prod.php?which=72454 Pouet.net]. The presentations cover everything from setting up the coding environment to all kinds of code optimization, constant reuse, flag tricks, palette generation, and how to get from an idea to the final intro. TomCat also provides a [https://www.abaddon.hu/usbdos/ dev tools page] and his [https://www.abaddon.hu/256b/ SizeCoding Blog] with lots of optimization ideas.&lt;br /&gt;
&lt;br /&gt;
fysnet.net has an extensive [http://www.fysnet.net/yourhelp.htm list of DOS .COM register starting values].  If your environment isn't listed, [http://www.fysnet.net/yourhelp.com a program] is also available that can print out all of the register starting values.  See also [https://web.archive.org/web/20170418182707/http://pferrie.host22.com/misc/lowlevel12.htm DOS Register Starting Values] to understand why the register starting values for .COM files are set the way that they are.&lt;br /&gt;
&lt;br /&gt;
The [http://sparksandflames.com/files/x86InstructionChart.html Sparks And Flames x86 Instruction Chart] is awesome, once you can &amp;quot;read&amp;quot; it. Find out what the bright green (1-byte) commands do, and use them. For the extreme sizecoding (&amp;lt;=32 bytes) it's also very useful to have the HEX value in sight.&lt;br /&gt;
&lt;br /&gt;
The [http://ref.x86asm.net/geek.html Geek Edition x86asm-ref] is comprehensive and dense.  It might be the only reference you'll need.  The [http://ref.x86asm.net/geek.html#modrm_byte_16 last part] covers modr/m bytes, which is useful.&lt;br /&gt;
&lt;br /&gt;
[https://www.felixcloutier.com/x86/ Yet another reference] by Félix Cloutier.&lt;br /&gt;
&lt;br /&gt;
[http://www.techhelpmanual.com/2-main_menu.html Tech Help Manual] includes a dense reference of DOS interrupts and x86 instructions.&lt;br /&gt;
&lt;br /&gt;
[http://c9x.me/x86/ c9x.me]: An x86 instruction reference&lt;br /&gt;
&lt;br /&gt;
[http://x86.renejeschke.de/ Rene Jeschke's preservation of siyobik.com's x86 Documentation] is one of the most compact and understandable references to the x86 instruction set you'll ever see.&lt;br /&gt;
&lt;br /&gt;
Intel IA-32 Software Developer's Manual Instruction Set Reference [http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/whgdata/whlstt2.htm hypertext version].&lt;br /&gt;
&lt;br /&gt;
[http://www.rcollins.org/secrets/OpCodes.html Robert Collins' Undocumented Opcodes] page has some gems you can exploit depending on your target environment.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/PC_Speaker PC Speaker Basics] For everything from simple bleeps to sophisticated synthesizers.&lt;br /&gt;
&lt;br /&gt;
[https://en.wikipedia.org/wiki/Program_Segment_Prefix Program Segment Prefix] : what's in front of our .COM code, and why.&lt;br /&gt;
&lt;br /&gt;
Never underestimate [http://www.hugi.scene.org/online/coding/hugi%2017%20-%20coaax.htm The Hidden Power of BCD Instructions].&lt;br /&gt;
&lt;br /&gt;
[http://www.agner.org/optimize/optimizing_assembly.pdf An optimization guide for x86 platforms By Agner Fog] Chapter 10: Optimizing for size.&lt;br /&gt;
&lt;br /&gt;
[http://home.sch.bme.hu/~ervin/codegems.html Code Gems by Ervin/AbaddoN] Nice collection based on the Imphobia diskmagazine article.&lt;br /&gt;
&lt;br /&gt;
[http://wiki.osdev.org/Programmable_Interval_Timer Programmable Interval Timer]: Timing is everything, if you're targeting a slow system. Very old but good text [http://www.phatcode.net/res/246/files/pctim003.txt PCTIM003.TXT]&lt;br /&gt;
&lt;br /&gt;
[http://stanislavs.org/helppc/bios_data_area.html BIOS data area]: Direct system communication overview.  There are environment values here you may be able to utilize.&lt;br /&gt;
&lt;br /&gt;
[https://www.midi.org/specifications/item/table-1-summary-of-midi-message Summary of MIDI messages] : Summary of all basic MIDI messages.&lt;br /&gt;
&lt;br /&gt;
[http://www.voidaudio.net/percussion.html Roland GS/GM2 Drum Set List and Note Map] : Detailed Description of all MIDI drumkits.&lt;br /&gt;
&lt;br /&gt;
[http://www.h-schmidt.net/FloatConverter/IEEE754.html Online IEEE 754 Converter] : Excellent tool for optimizing/reusing FPU constants&lt;br /&gt;
&lt;br /&gt;
[https://www.reddit.com/r/tinycode Tinycode on Reddit]: Collection of many different size coding productions incl. x86 based ones with discussion (mostly provided by [https://www.reddit.com/user/Hell__Mood/ HellMood] himself)&lt;br /&gt;
&lt;br /&gt;
[https://marquee.revival-studios.com/blog Superogue's sizecoding microblog]: Collection of demoscene and sizecoding related articles and development write-ups of [https://www.pouet.net/groups.php?which=14389 various Marquee Design tiny intros]&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Tinyprogs are so small that you sometimes don't need the source code to understand them:  Just pop them into a disassembler (like IDA) or a debugger and look at them.  If you want to grab specimens for study, here are some places to start.&lt;br /&gt;
&lt;br /&gt;
Pouet can be searched for the latest tinyprogs:  [http://www.pouet.net/prodlist.php?type%5B%5D=256b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 256b], [http://www.pouet.net/prodlist.php?type%5B%5D=128b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 128b], [http://www.pouet.net/prodlist.php?type%5B%5D=64b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 64b], and even [http://www.pouet.net/prodlist.php?type%5B%5D=32b&amp;amp;platform%5B%5D=MS-Dos&amp;amp;page=1 32b and smaller].&lt;br /&gt;
&lt;br /&gt;
[http://hardcode.untergrund.net/ HardCode] is a repository of 7800 64KB and smaller intros.&lt;br /&gt;
&lt;br /&gt;
[http://256bytes.untergrund.net 256b.com] is a website dedicated primarily to 256b and smaller releases.&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Memories&amp;diff=571</id>
		<title>Memories</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Memories&amp;diff=571"/>
				<updated>2020-06-04T18:56:56Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* External Links */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Update==&lt;br /&gt;
The final freedos version is complete; you can [https://www.youtube.com/watch?v=wlW84fEHngM watch it here]. It includes the Amiga Ball as an extra effect but lacks the music, since a normal modern PC booted to freedos cannot play MIDI that easily. The file size is still 256 bytes.&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
Hello! My name is &amp;quot;HellMood&amp;quot; and this is not a usual wiki style article. It will be written from my point of view, in the first person. This article is all about a tiny MS DOS program called &amp;quot;Memories&amp;quot;. This program has a size of 256 bytes and won the &amp;quot;PC 256 byte&amp;quot; competition of the [https://en.wikipedia.org/wiki/Demoscene demoscene] event [https://2020.revision-party.net/ &amp;quot;Revision&amp;quot;] in 2020, as well as the public choice award. You can watch a video of the output of this program here (https://www.youtube.com/watch?v=Imquk_3oFf4) and a video including live reactions of the online audience and moderators here (https://www.twitch.tv/videos/589179638?t=7h02m40s). You can download the release and comment here (https://www.pouet.net/prod.php?which=85227). This wiki article will give you an in-depth analysis of the program and point out historical references as well as development steps. In case you did not notice, you are now on the &amp;quot;sizecoding wiki&amp;quot;. This wiki will help you not only to understand the inner workings of &amp;quot;Memories&amp;quot;, but also enable you to produce something like this yourself. Feel free to look around! It is recommended that you start with the [http://www.sizecoding.org/wiki/Getting_Started basics of this wiki] first if you're new to sizecoding and/or x86 assembler. While the principles might be easy to grasp, details of the implementation might not.&lt;br /&gt;
&lt;br /&gt;
==Disclaimer==&lt;br /&gt;
This article discusses the submitted DosBox version (256 bytes). The available archive contains versions for FreeDos and Windows XP DOS which, at the time of this writing, don't work on all computers. These alternative versions were included as proof of concept, to show that this program does not only run on the emulator &amp;quot;DosBox&amp;quot;. For the &amp;quot;PC 256 bytes&amp;quot; competition of the event &amp;quot;Revision&amp;quot; in 2020, it was allowed to specify either &amp;quot;FreeDos&amp;quot; or &amp;quot;DosBox&amp;quot; as platform, the latter with a specific configuration. It is in fact possible to modify the DosBox version to work on FreeDos, MS DOS, WinXP and Win98, as the released alternative versions showcase, but this is not the focus of this article. Anyway, safe versions for all platforms and all computers are in the works. Regarding possible further optimizations, I will discuss/explain only the submitted version, although I already found several spots in the code that can be optimized. As the history of tiny effects shows, almost no code is perfect, and most of it can be brought down further in size. To avoid confusion between different versions, only the submitted version will be explained.&lt;br /&gt;
&lt;br /&gt;
==History of tiny effects==&lt;br /&gt;
[[File:Pouet categories.png|thumb|Size categories on http://www.pouet.net]]&lt;br /&gt;
Sizecoders think in size categories. For MS DOS, these categories are 256b, 128b, 64b and 32b. These are the standards of one of the biggest demoscene archives, http://www.pouet.net. There is no 16b category, although many tiny effects can be implemented in 16 bytes. Almost all of the effects of &amp;quot;Memories&amp;quot; have been coded and optimized by me before, and mostly the implementations were attempts to reduce an already existing effect in size, or do something similar in fewer bytes. Reducing the size in this case means reducing it to the next lower 2^N category (or one of the lower ones). For example I might have seen an effect done in 33 to 64 bytes, and then reduced the size to 32 bytes or less. Almost every time I moved an effect to a lower category, I submitted the resulting tiny program to a demoscene party which allowed remote entries in the 256b category and/or published the result to http://www.pouet.net . In this section I will introduce the effects and mention the origins and authors.&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Kasparov.gif|frame|kasparov, 16 bytes]]&lt;br /&gt;
The origin of this effect is my own &amp;quot;Kasparov 16b&amp;quot; from 2018 (https://www.pouet.net/prod.php?which=75912). Apparently I &amp;quot;had this piece lying around, stuck at 17 bytes and not that beautiful, until i found a trick ;)&amp;quot; (quote from the release notes). There was a similar effect in 32 bytes before: &amp;quot;ew&amp;quot; by &amp;quot;headcrash&amp;quot; from 2003 (https://www.pouet.net/prod.php?which=17567). In this case, my design goal was to show &amp;quot;real&amp;quot; chessboards of 8x8 fields and recognizable dark and light squares, as well as the correct orientation of the single chessboards, meaning that the bottom right corner (h1) has to be a [https://en.wikibooks.org/wiki/Chess/Arranging_The_Board white square]. For &amp;quot;Memories&amp;quot;, the effect had to be reimplemented with another approach of writing to the screen to work with the framework. The scrolling direction was also changed to be different from the &amp;quot;scrolling tilted plane&amp;quot; effect.&lt;br /&gt;
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
The zooming circles were supposed to be entered to a demoparty as a 32 bytes intro, but I never actually did it. There is no real predecessor in 64 bytes, because in the 64b category much more complex effects are possible. The zooming circles were the result of desperately trying to reach the 32b category for a circular &amp;quot;tunnel&amp;quot; effect, for which my own record is still 52 bytes (&amp;quot;Neontube&amp;quot; - 2016) (https://www.pouet.net/prod.php?which=66808), which in turn was the optimization of the all-time-classic 64 byte effect &amp;quot;constant evolution&amp;quot; from ryg/Farbrausch (2003) (https://www.pouet.net/prod.php?which=8697). In the zooming circles routine, the distance and the angle are removed/ignored, so that the 32b category could be reached.&lt;br /&gt;
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Floorcast variation.png|thumb|Floorcast, 32 bytes version, variation]]&lt;br /&gt;
The scrolling tilted plane is one of my own releases, &amp;quot;floorcast 32b&amp;quot; from 2018. Floorcasts have their own history in sizecoding: the effect was brought down stepwise from 256 bytes to 32 bytes. The versions differ in the number of planes, some show two planes, others only one. For the &amp;quot;floorcast 32b&amp;quot; release I specifically decided against the &amp;quot;XOR&amp;quot; texture, whereas in &amp;quot;Memories&amp;quot; I used it again, but disguised with another final touch of &amp;quot;AND&amp;quot;.&lt;br /&gt;
* &amp;quot;rain_storm&amp;quot; version 2008 - 256 bytes - https://www.pouet.net/prod.php?which=50479&lt;br /&gt;
* &amp;quot;org_100h&amp;quot; version 2008 - 128 bytes - https://www.pouet.net/prod.php?which=63271&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2013 - 86 bytes - https://www.pouet.net/prod.php?which=61070&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2014 - 64 bytes - https://www.pouet.net/prod.php?which=63305&lt;br /&gt;
* &amp;quot;HellMood&amp;quot; version 2018 - 32 bytes - https://www.pouet.net/prod.php?which=77774&lt;br /&gt;
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Projektbeschreibung.gif|frame|Projektbeschreibung, 32 bytes]]&lt;br /&gt;
The parallax checkerboards were released by me as the 32 bytes effect [https://www.pouet.net/prod.php?which=76788 &amp;quot;Projektbeschreibung&amp;quot;] in 2018. It was the direct attempt to bring [https://www.pouet.net/prod.php?which=28543 &amp;quot;Follow the light&amp;quot;] from &amp;quot;Digimind&amp;quot; (2006) or my own [https://www.pouet.net/prod.php?which=63293 &amp;quot;Lucy&amp;quot;] (2014) down to 32 bytes. Very helpful inspirations came from [https://www.pouet.net/prod.php?which=32281 &amp;quot;Rrrolas&amp;quot; &amp;quot;Paralaxa&amp;quot;] (32 bytes, 2007) and [https://www.pouet.net/prod.php?which=61075 &amp;quot;Sensenstahl&amp;quot;s &amp;quot;Byteropolis&amp;quot;] (32 bytes, 2013). In fact Rrrola's rendering approach was already close to the final solution; the code was modified to correct the arrangement of the planes, change the shape from triangles to checkerboards and improve on the colors. In &amp;quot;Memories&amp;quot;, the coloring of &amp;quot;Digimind&amp;quot;s version was used. Also, the effect has been modified to reduce shearing as much as possible.&lt;br /&gt;
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Colpinski.png|thumb|colpinski, 16 bytes]]&lt;br /&gt;
[[File:Rotastic.png|thumb|left|rotastic, 32 bytes]]&lt;br /&gt;
This consists of two effects, the rotozoomer and the sierpinski effect as a texture. The sierpinski effect is based on my own &amp;quot;Colpinski 16b&amp;quot; from 2013 (https://www.pouet.net/prod.php?which=62079) where &amp;quot;frag&amp;quot; of &amp;quot;fsqrt&amp;quot; helped pushing the limits. There is no real predecessor to that, because it is known that this effect can be done directly by combining X and Y, instead of an Iterated Function System or Cellular Automaton approach. The rotozoomer was released as 32b intro &amp;quot;rotastic&amp;quot; by me in 2017 (https://www.pouet.net/prod.php?which=70112), it is based on the ideas of &amp;quot;ryg&amp;quot; of &amp;quot;Farbrausch&amp;quot; (51 bytes, 2002, https://www.pouet.net/prod.php?which=5015) and &amp;quot;Gargaj&amp;quot; of &amp;quot;Conspiracy&amp;quot; (49 bytes, 2002, https://www.pouet.net/prod.php?which=6088).&lt;br /&gt;
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Intoanew.gif|frame|Into a new era, 64 bytes version]]&lt;br /&gt;
The raycast bent tunnel is a tweaked version of my own 64 bytes &amp;quot;Into a new era&amp;quot; release (2018, https://www.pouet.net/prod.php?which=78044). The custom colors were exchanged for colors of the standard palette, the geometry was tweaked a bit and changed accordingly, so that no depth effect was used for the calculation of texture values. The standalone version of this effect has a size of 50 bytes. The two main inspirations for the 64 bytes version were two 128 byte intros, &amp;quot;Spongy&amp;quot; by &amp;quot;TBC&amp;quot; (2009, https://www.pouet.net/prod.php?which=53871) and &amp;quot;Wolf128&amp;quot; from &amp;quot;Baudsurfer&amp;quot; (2014, https://www.pouet.net/prod.php?which=63518), while the particular algorithm was developed independently.&lt;br /&gt;
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Ocean.png|thumb|Ocean, 64b version]]&lt;br /&gt;
The ocean effect is based on my own 64 bytes release &amp;quot;Ocean&amp;quot; from 2016 (https://www.pouet.net/prod.php?which=66857). The custom color generation and music generation have been stripped from the original, both were not compatible with the main framework of &amp;quot;Memories&amp;quot; without consuming a LOT of extra space. The special effect of &amp;quot;dawn&amp;quot; happens as a byproduct of the general framework approach. This will be explained in the next chapter.&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
The transition between two effects is an effect itself, which has no real predecessor. It is rather an idea that evolved over the years, and might have been done by many others in similar ways. In a nutshell, while calculating a frame, a pseudorandom value is derived from each pixel position, scaled down, and then used to offset the time, which determines the effect to use. This allows the usage of the standard VGA palette (picture, source) rather than creating custom colors for real crossfading and thus saves space.&lt;br /&gt;
&lt;br /&gt;
==Tiny megademo framework==&lt;br /&gt;
The tiny megademo framework is based on ideas of [https://www.pouet.net/prod.php?which=68020 &amp;quot;Quatro&amp;quot;], done by &amp;quot;Sensenstahl&amp;quot; and me. For putting many tiny effects together into a &amp;quot;megademo&amp;quot;, they all have to use the same approach and they have to be mostly free of assumptions (contents of memory and registers). Also they have to use the same timing values and be tuned to a shared timing in general. Preparing the single effects to be includable in a framework like this took quite some time and required a lot of extra space initially. It has to be noted that - sadly - some of my most impressive effects (judged by audience reactions and views on social media) could not be included at all, because the approaches were not combinable without a HUGE overhead. Once all effects were streamlined, I could think about &amp;quot;factoring&amp;quot; out common calculations which repeatedly occur, resulting in a few saved bytes. The framework does the following:&lt;br /&gt;
* Setting up screen mode to 320 x 200 pixels in 256 colors&lt;br /&gt;
* Initializing a pointer to the screen&lt;br /&gt;
* Installing a callback for timing and music&lt;br /&gt;
* Mainloop&lt;br /&gt;
** Calculation of X and Y from the current screen location&lt;br /&gt;
*** with the [http://www.sizecoding.org/wiki/General_Coding_Tricks#Obtaining_X_and_Y_without_DIV_.28The_Rrrola_Trick.29 &amp;quot;Rrrola Trick&amp;quot;]&lt;br /&gt;
** P = Pseudorandom value from screen location&lt;br /&gt;
** Offset current time T by a scaled value P'&lt;br /&gt;
** Selection of effect number N according to T'&lt;br /&gt;
** Execution of effect N (X,Y) for current pixel&lt;br /&gt;
*** On entrance, XY is in register DX (DL,DH)&lt;br /&gt;
*** On entrance, the current time is in register BP&lt;br /&gt;
*** On exit, the pixel color is expected in AL&lt;br /&gt;
** Advance to next pixel&lt;br /&gt;
** Triple diagonal interlacing for smoothing&lt;br /&gt;
** Repeat until frame is complete&lt;br /&gt;
* Set timer to a tempo of ~35 FPS&lt;br /&gt;
* Check keyboard for ESC&lt;br /&gt;
** Quit on ESC, otherwise continue&lt;br /&gt;
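The ~35 FPS from the list can be checked by hand. The following worked example is only a sketch and rests on one assumption that is not stated in this article: the BIOS has left the timer chip in its default access mode, where port 40h takes the low byte and then the high byte of the divisor on alternating writes. The defines are the ones from the release code; in the real program there is only one OUT per frame, the second one here stands for the write of the following frame.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
%define targetFPS 35&lt;br /&gt;
%define tempo 1193182/256/targetFPS	; integer math: 4660/35 = 133 = 0x85&lt;br /&gt;
	mov al,tempo	; AL = 0x85&lt;br /&gt;
	out 40h,al		; after one frame: low byte of divisor (assumed)&lt;br /&gt;
	out 40h,al		; after the next frame: high byte (assumed)&lt;br /&gt;
	; divisor = 0x8585 = 34181 = 257 * 133&lt;br /&gt;
	; 1193182 / 34181 = 34.9 interrupts per second&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Under that assumption the same byte ends up in both halves of the divisor, which would explain why the define divides by 256 before dividing by the target rate.&lt;br /&gt;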
=== Code of framework===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
	mov al,0x13				; set AL to mode 320*200 in 256 colors&lt;br /&gt;
	int 0x10	 			; call BIOS to set mode&lt;br /&gt;
	xchg bp,ax				; set timing value to 0x13 &lt;br /&gt;
	push 0xa000-10			; write the screen address to register ES&lt;br /&gt;
	pop es					; works in conjunction with Rrrola trick&lt;br /&gt;
	mov ax,0x251c			; parameter for changing timer interrupt&lt;br /&gt;
	mov dl,timer			; address of timer routine, assume DH=1&lt;br /&gt;
	int 0x21				; install timer routine&lt;br /&gt;
top:&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
	call bx					; call the effect&lt;br /&gt;
	stosb					; write the return value and advance&lt;br /&gt;
	inc di					; triple interlace trick for after&lt;br /&gt;
	inc di					; effect and smoothing the animation&lt;br /&gt;
	jnz top					; repeat until the frame is complete&lt;br /&gt;
	mov al,tempo			; set AL to divider for timer&lt;br /&gt;
	out 40h,al				; set timing (dual pass)&lt;br /&gt;
	in al,0x60				; read keyboard&lt;br /&gt;
	dec al					; quit on ESC&lt;br /&gt;
	jnz top					; otherwise repeat loop&lt;br /&gt;
sounds: db 0xc3, 11, 0x93; 0xc3 is MIDI/RET; fx2-s is used as volume&lt;br /&gt;
table: 	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
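To experiment with a single effect outside of the full intro, the framework can be cut down to its pixel loop. The following is a minimal sketch and not part of the release: timer, music, fading and the effect table are left out, the label &amp;quot;effect&amp;quot; and the placeholder pattern are made up for illustration, and BP simply counts frames instead of timer ticks. Like the original, it assumes AH=0 at program start.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
	mov al,0x13			; mode 320*200 in 256 colors (assumes AH=0)&lt;br /&gt;
	int 0x10			; call BIOS to set mode&lt;br /&gt;
	push 0xa000-10		; same segment as in the framework&lt;br /&gt;
	pop es				; ES:DI now addresses the screen&lt;br /&gt;
top:&lt;br /&gt;
	mov ax,0xcccd		; Rrrola constant&lt;br /&gt;
	mul di				; DH = Y (row), DL = X (centered)&lt;br /&gt;
	call effect			; in: DX = XY, BP = time, out: AL = color&lt;br /&gt;
	stosb				; write the pixel, DI advances by one&lt;br /&gt;
	inc di				; two more, so every step skips&lt;br /&gt;
	inc di				; three bytes (interlacing)&lt;br /&gt;
	jnz top				; repeat until DI wraps to zero&lt;br /&gt;
	inc bp				; hypothetical: one time step per frame&lt;br /&gt;
	in al,0x60			; read keyboard&lt;br /&gt;
	dec al				; ESC has scancode 1&lt;br /&gt;
	jnz top				; otherwise render the next frame&lt;br /&gt;
	ret					; back to DOS, video mode is not restored&lt;br /&gt;
effect:&lt;br /&gt;
	mov al,dl			; placeholder effect: plain x xor y,&lt;br /&gt;
	xor al,dh			; replace with any effect body below&lt;br /&gt;
	ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
It can be built the same way as the release, for example with nasm harness.asm -fbin -o harness.com (the file name is arbitrary). Since 3 and 65536 share no common factor, stepping DI by three visits every screen address exactly once before DI becomes zero again.&lt;br /&gt;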
&lt;br /&gt;
==Explanation of effects==&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Array of chessboards.png|thumb|array of chessboards]]&lt;br /&gt;
The simplest effect and a good one to start with. After offsetting the row by the time, the classic XOR pattern is applied. To achieve the impression of a grid of chessboards, all bits but two are set in the color. The real trick lies in shifting to a &amp;quot;good&amp;quot; palette spot. The portion of the frame which is perceived as black isn't actually black, it's the dark part of the [https://en.wikipedia.org/wiki/Video_Graphics_Array#Color_palette standard VGA palette]. Shifting that way also makes the dark and light squares look like they are from an ancient chessboard.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	xchg dx,ax		; get XY into AX&lt;br /&gt;
	sub ax,bp		; subtract time from row&lt;br /&gt;
	xor al,ah		; XOR pattern (x xor y)&lt;br /&gt;
	or al,0xDB		; pattern for array of boards&lt;br /&gt;
	add al,13h		; shift to good palette spot&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
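The constant 0xDB is 11011011 in binary, so only bits 2 and 5 of the XOR pattern survive the OR, and only four different colors can come out. The trace below spells them out; the input value 0x24 is just an example picked for illustration.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov al,0x24		; example: x xor y with bits 2 and 5 set&lt;br /&gt;
	or al,0xDB		; AL = 0xFF&lt;br /&gt;
	add al,13h		; AL = 0x12, the carry is dropped&lt;br /&gt;
	; all four cases:&lt;br /&gt;
	; bit 5 = 0, bit 2 = 0 : 0xDB + 0x13 = 0xEE&lt;br /&gt;
	; bit 5 = 0, bit 2 = 1 : 0xDF + 0x13 = 0xF2&lt;br /&gt;
	; bit 5 = 1, bit 2 = 0 : 0xFB + 0x13 = 0x0E&lt;br /&gt;
	; bit 5 = 1, bit 2 = 1 : 0xFF + 0x13 = 0x12&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Bit 2 flips every 4 coordinate steps and bit 5 every 32, which gives 8 fields per board side. The addition wraps two of the four values around to the low palette indices, while the other two stay at the high end of the palette.&lt;br /&gt;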
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
[[File:Zooming circles.png|thumb|zooming circles]]&lt;br /&gt;
The distance D of a point (X,Y) to the center (0,0) is sqrt(X²+Y²). The framework already makes sure that DL contains a centered X coordinate, DH=Y has to be centered though. Performing a square root on x86 requires quite a bit of code, but this actually can be omitted. With a clever selection of colors, the impression of zooming circles without square root is quite decent.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov al,dh		; get Y in AL&lt;br /&gt;
	sub al,100		; align Y vertically&lt;br /&gt;
	imul al			; AX = Y²&lt;br /&gt;
	xchg dx,ax		; Y²/256 in DH, X in AL&lt;br /&gt;
	imul al			; AX = X² (AH = X²/256)&lt;br /&gt;
	add dh,ah		; DH = (X² + Y²)/256&lt;br /&gt;
	mov al,dh		; AL = (X² + Y²)/256&lt;br /&gt;
	add ax,bp		; offset color by time&lt;br /&gt;
	and al,8+16		; select special rings&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
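A trace for one pixel shows what the registers hold at each step. The pixel (row 140, centered X of 30) is an arbitrary example, and the first line only exists to set up that example.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov dx,0x8C1E	; example pixel: DH = 140, DL = 30&lt;br /&gt;
	mov al,dh		; AL = 140&lt;br /&gt;
	sub al,100		; AL = 40&lt;br /&gt;
	imul al			; AX = 1600 = 0x0640&lt;br /&gt;
	xchg dx,ax		; DX = 0x0640 (DH = 6), AL = 30&lt;br /&gt;
	imul al			; AX = 900 = 0x0384 (AH = 3)&lt;br /&gt;
	add dh,ah		; DH = 6 + 3 = 9&lt;br /&gt;
	mov al,dh		; AL = 9, exact: (1600 + 900)/256 = 9.77&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Because the square root is skipped, the color bits selected by the final AND change whenever X²+Y² passes a multiple of 2048 (8*256). The ring borders therefore sit at distances of roughly 45, 64, 78 and 90, so the rings get thinner towards the outside.&lt;br /&gt;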
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Tiltplane.png|thumb|scrolling tilted plane]]&lt;br /&gt;
This effect is achieved as follows. First, a large constant is divided by the row number Y, to simulate distance. Then the resulting value is used twice, a) scaled by a centered X value and b) offset by current time. These results are then combined with the XOR pattern of which a special pattern is selected.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov ax,0x1329	; initialize with constant&lt;br /&gt;
	add dh,al		; preventing divide overflow&lt;br /&gt;
	div dh			; reverse divide AL = C/Y'&lt;br /&gt;
	xchg dx,ax		; DL = C/Y', AL = X&lt;br /&gt;
	imul dl			; AH = CX/Y'&lt;br /&gt;
	sub dx,bp		; DL = C/Y'-T 	&lt;br /&gt;
	xor ah,dl		; AH = (CX/Y') ^ (C/Y'-T)&lt;br /&gt;
	mov al,ah		; move to AL&lt;br /&gt;
	and al,4+8+16	; select special pattern&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
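The comment about preventing divide overflow can be verified with a little arithmetic. DIV with an 8 bit operand faults when the divisor is zero or when the quotient does not fit into AL. The check below is a sketch that assumes the row number in DH runs from 0 to 204, which is every value the Rrrola trick can produce from a 16 bit DI.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov ax,0x1329	; AX = 4905, AL = 0x29 = 41&lt;br /&gt;
	; first row : DH = 0   + 41 = 41,  4905 / 41  = 119&lt;br /&gt;
	; last row  : DH = 204 + 41 = 245, 4905 / 245 = 20&lt;br /&gt;
	; both quotients fit into AL and DH never wraps around&lt;br /&gt;
	; without the ADD: DH = 0 would fault, and any DH below 20&lt;br /&gt;
	; would overflow, because 4905 / 19 = 258 exceeds 255&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
The low byte of the constant thus does double duty: it is part of the dividend and at the same time the offset that keeps the divisor in a safe range.&lt;br /&gt;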
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Paralax.png|thumb|parallax checker boards]]&lt;br /&gt;
This is a kind of raycasting with dynamic geometry. Each object plane is split horizontally because of the signed 16 bit multiplication and vertically because of an implicit logic column number operation. Additionally, implicit shearing is applied, so that the edges of the resulting grid (4 &amp;quot;solid&amp;quot; patches, 4 &amp;quot;transparent&amp;quot; patches, alternating) connect. If the ray hits one of the solid patches, the color becomes the iteration number (+palette offset into grayscale); if not, the plane is shifted by the screenpointer and the process is repeated until the maximum iteration number is reached.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cx,bp		; set initial point to time&lt;br /&gt;
	mov bx,-16		; limit to 16 iterations&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di		; offset point by screenpointer&lt;br /&gt;
	mov ax,819		; magic, related to Rrrola constant&lt;br /&gt;
	imul cx			; get X',Y' in DX&lt;br /&gt;
	ror dx,1		; set carry flag on &amp;quot;hit&amp;quot;&lt;br /&gt;
	inc bx			; increment iteration count&lt;br /&gt;
	ja fx3L			; loop until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	lea ax,[bx+31]	; map value to standard gray scale&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Sierpinski rotozoomer.png|thumb|sierpinski rotozoomer]]&lt;br /&gt;
[[File:1 div cos(atan(x)).png|thumb|1/cos(atan(x)), made with www.google.com]]&lt;br /&gt;
Rotozooming normally requires trigonometric functions, or good estimations of them. Let's look at the [https://en.wikipedia.org/wiki/Rotation_matrix#In_two_dimensions usual 2D rotation equation] and interpret it in a special way to get rid of the trigonometry in the calculation:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = x*cos(a) - y*sin(a)&lt;br /&gt;
y' = x*sin(a) + y*cos(a)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
If we add zooming to it, it looks like this:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = z * (x*cos(a) - y*sin(a))&lt;br /&gt;
y' = z * (x*sin(a) + y*cos(a))&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Now let's assume we don't specify z ourselves, and factor out cos(a):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = cos(a) * (x - y*tan(a))&lt;br /&gt;
y' = cos(a) * (x*tan(a) + y)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Finally, let's substitute tan(a) with our time variable, since the tangent function rises to infinity the closer we get to +-90°:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = cos(atan(T)) * (x - y*T)&lt;br /&gt;
y' = cos(atan(T)) * (x*T + y)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
As long as we don't care about specifying the zoomfactor and don't directly control the angle, we can now rotate between -90° and +90° without using trigonometric functions. As a byproduct, the zoomfactor is now tied to the time T. The function for the zoomfactor is shown in the image; it results in zooming from infinitely small to one (original) and back to infinitely small. Quite a few bytes have been spent to beautify the effect, correct the timing offset, speed up the animation, enlarge the sierpinski pixels and get decent colors, but I thought it was worth the space.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	lea cx,[bp-2048]; center time to pass zero&lt;br /&gt;
	sal cx,3		; speed up by factor 8!&lt;br /&gt;
	movzx ax,dh		; get X into AL&lt;br /&gt;
	movsx dx,dl		; get Y into DL&lt;br /&gt;
	mov bx,ax		; save X in BX&lt;br /&gt;
	imul bx,cx		; BX = X*T&lt;br /&gt;
	add bh,dl		; BH = X*T/256+Y&lt;br /&gt;
	imul dx,cx		; DX = Y*T&lt;br /&gt;
	sub al,dh		; AL = X-Y*T/256&lt;br /&gt;
	and al,bh		; AL = (X-Y*T/256)&amp;amp;(X*T/256+Y)&lt;br /&gt;
	and al,252		; thicker sierpinski&lt;br /&gt;
	salc			; set pixel value to black&lt;br /&gt;
	jnz fx4q		; leave black if not sierpinski&lt;br /&gt;
	mov al,0x2A		; otherwise: a nice orange&lt;br /&gt;
	fx4q:&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
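The zoom factor from the derivation has a closed form without trigonometry, which makes the curve in the image easy to check. This is a standard identity, added here as a worked example with sample values of T chosen for illustration. In the code above, T corresponds to CX/256, because only the high bytes of the products are used, and the common factor itself is never computed.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
cos(atan(T)) = 1 / sqrt(1 + T*T)&lt;br /&gt;
&lt;br /&gt;
T = 0 : angle =  0.0°, factor = 1.000&lt;br /&gt;
T = 1 : angle = 45.0°, factor = 0.707&lt;br /&gt;
T = 3 : angle = 71.6°, factor = 0.316&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
For negative T the factor is the same and only the sign of the angle changes.&lt;br /&gt;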
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Raycastbenttunnel.png|thumb|raycast bent tunnel]]&lt;br /&gt;
This is a variation of the [https://www.pouet.net/prod.php?which=78044 raycast tunnel &amp;quot;into a new era&amp;quot;] (see also above). A [https://www.reddit.com/r/tinycode/comments/de7dj9/moving_cubes_with_pathtracing_lighting_and/ detailed explanation] has been posted to reddit for the [https://www.pouet.net/prod.php?which=83204 similar production &amp;quot;Essence&amp;quot;]. The custom colors have been removed, the bend direction is different, and the geometry is a bit more closed, to increase the performance on slower computers and DosBox.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cl,-9		; start with depth 9 (moves backwards)&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx			; save DX, destroyed inside the loop&lt;br /&gt;
		mov al,dh	; Get Y into AL&lt;br /&gt;
		sub al,100	; Centering Y has to be done &amp;quot;manually&amp;quot;.&lt;br /&gt;
		imul cl		; Multiply AL=Y by the current distance, to get a projection(1)&lt;br /&gt;
		xchg ax,dx	; Get X into AL, while saving the result in DX (DH)&lt;br /&gt;
		add al,cl	; add distance to projection, (bend to the right)&lt;br /&gt;
		imul cl		; Multiply AL=X by the current distance, to get a projection(2)&lt;br /&gt;
		mov al,dh	; Get projection(1) in AL&lt;br /&gt;
		xor al,ah	; combine with projection(2)&lt;br /&gt;
		add al,4	; center the walls around 0&lt;br /&gt;
		test al,-8	; check if the wall is hit&lt;br /&gt;
	pop dx			; restore DX&lt;br /&gt;
	loopz fx5L		; repeat until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	sub cx,bp		; offset depth by time&lt;br /&gt;
	xor al,cl		; XOR pattern for texture &lt;br /&gt;
	aam 6			; irregular pattern with MOD 6&lt;br /&gt;
	add al,20		; offset into grayscale palette&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
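Two instructions in this routine are worth a closer look: the hit test and the texture lookup. The trace below is a sketch with made-up sample values. It shows that the loop keeps marching while the combined projection lies between -4 and 3, and that AAM 6 always leaves a value from 0 to 5 in AL.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov al,-3		; example: a point in free space&lt;br /&gt;
	add al,4		; AL = 1&lt;br /&gt;
	test al,-8		; 1 and 0xF8 = 0, ZF = 1: LOOPZ continues&lt;br /&gt;
	mov al,5		; example: a point inside the wall&lt;br /&gt;
	add al,4		; AL = 9&lt;br /&gt;
	test al,-8		; 9 and 0xF8 = 8, ZF = 0: LOOPZ stops&lt;br /&gt;
	mov al,29		; example texture value&lt;br /&gt;
	aam 6			; AH = 29 / 6 = 4, AL = 29 mod 6 = 5&lt;br /&gt;
	add al,20		; AL = 25, always within 20..25&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
LOOPZ also ends the loop when CX reaches zero, which is the &amp;quot;iter=max&amp;quot; case from the comment.&lt;br /&gt;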
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Oceannight.png|thumb|left|oceannight]]&lt;br /&gt;
[[File:Oceanday.png|thumb|right|oceanday]]&lt;br /&gt;
The ocean effect is a wonderful example of a [https://en.wikipedia.org/wiki/Bob_Ross#Style &amp;quot;happy accident&amp;quot;]. If you load a value into the FPU as integer, and store it as floating point value, and interpret it as integer again, you get a cool pattern. Combine that with a reverse divide and you get a neat wavy effect. This works in tight conjunction with the register DX, which has a sign flip at the position we want, so we can easily separate sky and sea. The final touch is the coloring though. By design of the global framework, the value of AL is determined on function entrance, it contains the effect address. With a bit of code shuffling, the color of the sky came &amp;quot;for free&amp;quot;, without using instructions, as well as the &amp;quot;fading to day&amp;quot;-color, which is also an effect address. And that is no happy accident. In other versions than the original submitted DosBox version, the sky color can vary for that reason.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	sub dh,120			; check if pixel is in the sky&lt;br /&gt;
	js fx6q				; quit if that's the case&lt;br /&gt;
	mov [bx+si],dx		; move XY to a memory location&lt;br /&gt;
	fild word [bx+si]	; read memory location as integer&lt;br /&gt;
	fidivr dword [bx+si]; reverse divide by constant&lt;br /&gt;
	fstp dword [bx+si-1]; store result as floating point&lt;br /&gt;
	mov ax,[bx+si]		; get the result into AX&lt;br /&gt;
	add ax,bp			; modify color by time&lt;br /&gt;
	and al,128			; threshold into two bands&lt;br /&gt;
	dec ax				; beautify colors to blue/black&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
For convenience, the fading effect from the global framework, isolated. Basically it is generating a pseudorandom value from the screenpointer, then offsetting the time by a scaled value of it, then calling the destined effect.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
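A short trace shows why this dissolves one effect into the next. After the XOR, the pseudorandom value P in AX lies between 0 and 255, so close to a multiple of 512 some pixels already select the next table entry while others still select the old one. The time value 1000 and the two values of P are examples picked for illustration.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov bp,1000		; example time, 1000 / 512 = 1&lt;br /&gt;
	mov ax,23		; pretend P = 23 for one pixel&lt;br /&gt;
	add ax,bp		; AX = 1023&lt;br /&gt;
	shr ax,9		; AX = 1: still the old effect&lt;br /&gt;
	mov ax,24		; pretend P = 24 for another pixel&lt;br /&gt;
	add ax,bp		; AX = 1024&lt;br /&gt;
	shr ax,9		; AX = 2: already the next effect&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
As the time advances, the share of pixels that cross the threshold grows, so each transition spans 256 of the 512 time steps that one effect occupies.&lt;br /&gt;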
&lt;br /&gt;
==MIDI music part==&lt;br /&gt;
This code section increments the timing value and produces sound. By choosing channel 3 we can reuse the &amp;quot;change channel instrument&amp;quot; instruction as &amp;quot;RET&amp;quot;. By shifting around the code of the effects, a reasonable value for the volume could be created, which saves another byte. Be aware that this code sequence only works if the MIDI device is already in UART mode, otherwise three more bytes have to be spent. Quite a few people from the audience and sceners told me afterwards that the melody sounds a bit like [https://en.wikipedia.org/wiki/Incantations_(album) Incantations by Mike Oldfield], but it is rather created by a simple approach. Starting with [https://www.pouet.net/prod.php?which=65604 &amp;quot;Hypnoteye&amp;quot; in 2015], I experimented with procedural MIDI and eventually released a [https://www.pouet.net/prod.php?which=66313 small MIDI framework (64 bytes)]. The basic principle is to jump through tonal space with fixed steps, and map high values back to lower ones (by modulo). With simple combinations of stepwidth and modvalue, interesting effects can be produced. For example stepwidth=3 for diminished chords which are never resolved, likewise stepwidth=4, or stepwidth=6 for tritonal effects. With a well chosen modvalue, these patterns can create progressions. I did not do an appropriate theoretical analysis yet, rather explored the tonal space and noted when something sounded interesting.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
sounds: db 0xc3, 11, 0x93, fx2-s&lt;br /&gt;
...&lt;br /&gt;
		inc bp				; increment timing value&lt;br /&gt;
		test bp, 7			; play a note every 8th step&lt;br /&gt;
		jnz nomuse			; quit if in between&lt;br /&gt;
		mov dx,0x330		; port number for MIDI&lt;br /&gt;
		mov si,sounds		; address for sound data&lt;br /&gt;
		outsb				; change instrument of channel 3&lt;br /&gt;
		outsb				; to vibraphone&lt;br /&gt;
		outsb				; play a note on channel 3&lt;br /&gt;
		imul ax,bp,-19*32*4	; the magic melody constant&lt;br /&gt;
		shr ax,10			; scale down and implicit &amp;quot;and 63&amp;quot;&lt;br /&gt;
		add al,22			; pitch base is 22&lt;br /&gt;
		out dx,al			; play THIS note on channel 3&lt;br /&gt;
		outsb				; play it with THIS volume&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
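The melody constant is easier to follow with numbers. A note is only played when BP is a multiple of 8, and 8 * (-19*32*4) = -19 * 1024, so after the shift by 10 every note moves the pitch by -19 modulo 64. The trace below is a worked example for the note at BP = 8; the first line only sets up that example.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov bp,8			; example: the note after BP = 0&lt;br /&gt;
	imul ax,bp,-19*32*4	; 8 * -2432 = -19456 = 0xB400 in 16 bits&lt;br /&gt;
	shr ax,10			; AX = 45, which is -19 modulo 64&lt;br /&gt;
	add al,22			; AL = 67, the MIDI note number&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Continuing this way, the first note numbers are 22, 67, 48, 29, 74, 55 and so on, which is the &amp;quot;fixed steps with modulo&amp;quot; principle described above with a stepwidth of -19 and a modvalue of 64.&lt;br /&gt;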
==Further tweaks==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==Differences between dos versions==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==External Links==&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Demoscene Demoscene]&lt;br /&gt;
*[https://www.youtube.com/watch?v=Imquk_3oFf4 Youtube Capture]&lt;br /&gt;
*[https://www.twitch.tv/videos/589179638?t=7h02m40s Twitch Live Reactions]&lt;br /&gt;
*[https://www.pouet.net/prod.php?which=85227 Production Page on Pouet]&lt;br /&gt;
*[https://www.reddit.com/r/programming/comments/g1qhf8/memories_256_byte_msdos_intro/ Reddit/Programming Thread I]&lt;br /&gt;
*[https://www.reddit.com/r/programming/comments/g4ewh7/how_its_made_memories_a_256_byte_dos_intro_with_8/ Reddit/Programming Thread II]&lt;br /&gt;
*[https://github.com/cesarmiquel/memories-256b-msdos-intro Analysis by &amp;quot;cesarmiquel&amp;quot;]&lt;br /&gt;
*[https://hackaday.com/2020/04/21/a-jaw-dropping-demo-in-only-256-bytes/ Hackaday Article]&lt;br /&gt;
*[https://github.com/simpassi/cocoamemories Simpassi's port to Mac (TIGR)] [https://www.youtube.com/watch?v=_EgG_ZaBB7o Youtube Capture]&lt;br /&gt;
&lt;br /&gt;
==Original release code (complete)==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
; &amp;quot;memories&amp;quot; by HellMood/DESiRE&lt;br /&gt;
; the tiny megademo, 256 byte msdos intro&lt;br /&gt;
; shown in April 2020 @ REVISION&lt;br /&gt;
;&lt;br /&gt;
;   (= WILL BE COMMENTED IN DETAIL LATER =)&lt;br /&gt;
;&lt;br /&gt;
; create : nasm.exe memories.asm -fbin -o memories.com&lt;br /&gt;
; CHOOSE YOUR TARGET PLATFORM (compo version is dosbox)&lt;br /&gt;
; be sure to use the dosbox.conf from this archive!&lt;br /&gt;
; only ONE of the defines should be active!&lt;br /&gt;
%define dosbox			; size : 256 bytes&lt;br /&gt;
;%define freedos		; size : 230 bytes&lt;br /&gt;
;%define winxpdos		; size : 263 bytes&lt;br /&gt;
&lt;br /&gt;
; DON'T TOUCH THESE UNLESS YOU KNOW WHAT YOU'RE DOING&lt;br /&gt;
%ifdef winxpdos&lt;br /&gt;
	%define music&lt;br /&gt;
	%define switch_uart&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
	%define safe_segment&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef dosbox&lt;br /&gt;
	%define music&lt;br /&gt;
	;%define safe_dx ; sometimes needed&lt;br /&gt;
%endif&lt;br /&gt;
&lt;br /&gt;
; GLOBAL PARAMETERS, TUNE WITH CARE!&lt;br /&gt;
%define volume 127	; not used on dosbox (optimization)&lt;br /&gt;
%define instrument 11&lt;br /&gt;
%define scale_mod -19*32*4; &lt;br /&gt;
%define time_mask 7&lt;br /&gt;
%define targetFPS 35&lt;br /&gt;
%define tempo 1193182/256/targetFPS		&lt;br /&gt;
%define sierp_color 0x2A&lt;br /&gt;
%define tunnel_base_color 20&lt;br /&gt;
%define tunnel_pattern 6&lt;br /&gt;
%define tilt_plate_pattern 4+8+16&lt;br /&gt;
%define circles_pattern 8+16&lt;br /&gt;
&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov fs,ax&lt;br /&gt;
	mov [fs:0x46c],ax&lt;br /&gt;
%endif&lt;br /&gt;
	mov al,0x13&lt;br /&gt;
	int 0x10	 &lt;br /&gt;
	xchg bp,ax&lt;br /&gt;
	push 0xa000-10&lt;br /&gt;
	pop es&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	mov ax,0x251c&lt;br /&gt;
	%ifdef safe_dx	&lt;br /&gt;
		mov dx,timer	&lt;br /&gt;
	%else ; assume DH=1, mostly true on DosBox&lt;br /&gt;
		mov dl,timer&lt;br /&gt;
	%endif&lt;br /&gt;
	int 0x21&lt;br /&gt;
%endif&lt;br /&gt;
top:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov bp,[fs:0x46c]&lt;br /&gt;
%endif	&lt;br /&gt;
	mov ax,0xcccd&lt;br /&gt;
	mul di&lt;br /&gt;
	add al,ah&lt;br /&gt;
	xor ah,ah&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	shr ax,9&lt;br /&gt;
	and al,15&lt;br /&gt;
	xchg bx,ax&lt;br /&gt;
	mov bh,1&lt;br /&gt;
	mov bl,[byte bx+table]&lt;br /&gt;
	call bx&lt;br /&gt;
	stosb&lt;br /&gt;
	inc di&lt;br /&gt;
	inc di&lt;br /&gt;
	jnz top&lt;br /&gt;
	mov al,tempo&lt;br /&gt;
	out 40h,al&lt;br /&gt;
	in al,0x60&lt;br /&gt;
	dec al&lt;br /&gt;
	jnz top&lt;br /&gt;
sounds:&lt;br /&gt;
	db 0xc3	; is MIDI/RET&lt;br /&gt;
%ifdef music&lt;br /&gt;
	db instrument,0x93&lt;br /&gt;
	%ifdef switch_uart&lt;br /&gt;
		db volume		; without switch, volume is in table&lt;br /&gt;
		db 0x3f &lt;br /&gt;
	%endif&lt;br /&gt;
%endif&lt;br /&gt;
table: ; first index is volume, change order with care!		    					&lt;br /&gt;
	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
stop:&lt;br /&gt;
	pop ax&lt;br /&gt;
	ret&lt;br /&gt;
timer:&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	%ifdef safe_segment&lt;br /&gt;
		push cs&lt;br /&gt;
		pop ds&lt;br /&gt;
	%endif&lt;br /&gt;
		inc bp&lt;br /&gt;
	%ifdef music	&lt;br /&gt;
		test bp, time_mask&lt;br /&gt;
		jnz nomuse&lt;br /&gt;
		mov dx,0x330&lt;br /&gt;
		mov si,sounds&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		imul ax,bp,scale_mod&lt;br /&gt;
		shr ax,10&lt;br /&gt;
		add al,22&lt;br /&gt;
		out dx,al&lt;br /&gt;
		outsb&lt;br /&gt;
		%ifdef switch_uart&lt;br /&gt;
			inc dx&lt;br /&gt;
			outsb&lt;br /&gt;
		%endif&lt;br /&gt;
	%endif&lt;br /&gt;
nomuse:&lt;br /&gt;
	iret&lt;br /&gt;
%endif	&lt;br /&gt;
fx0: ; tilted plane, scrolling&lt;br /&gt;
	mov ax,0x1329&lt;br /&gt;
	add dh,al&lt;br /&gt;
	div dh&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul dl&lt;br /&gt;
	sub dx,bp&lt;br /&gt;
	xor ah,dl&lt;br /&gt;
	mov al,ah&lt;br /&gt;
	and al,tilt_plate_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx2: ; board of chessboards&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	sub ax,bp&lt;br /&gt;
	xor al,ah&lt;br /&gt;
	or al,0xDB&lt;br /&gt;
	add al,13h&lt;br /&gt;
ret&lt;br /&gt;
fx1: ; circles, zooming&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	sub al,100&lt;br /&gt;
	imul al&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul al&lt;br /&gt;
	add dh,ah&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,circles_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx3: ; parallax checkerboards&lt;br /&gt;
	mov cx,bp&lt;br /&gt;
	mov bx,-16&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di&lt;br /&gt;
	mov ax,819&lt;br /&gt;
	imul cx	 &lt;br /&gt;
	ror dx,1	 &lt;br /&gt;
	inc bx	 &lt;br /&gt;
	ja fx3L&lt;br /&gt;
	lea ax,[bx+31]	 &lt;br /&gt;
ret&lt;br /&gt;
fx4: ; sierpinski rotozoomer	&lt;br /&gt;
	lea cx,[bp-2048]&lt;br /&gt;
	sal cx,3&lt;br /&gt;
	movzx ax,dh&lt;br /&gt;
	movsx dx,dl&lt;br /&gt;
	mov bx,ax&lt;br /&gt;
	imul bx,cx&lt;br /&gt;
	add bh,dl&lt;br /&gt;
	imul dx,cx&lt;br /&gt;
	sub al,dh&lt;br /&gt;
	and al,bh&lt;br /&gt;
	and al,0b11111100&lt;br /&gt;
	salc				; VERY slow on dosbox, but ok&lt;br /&gt;
	jnz fx4q&lt;br /&gt;
	mov al,sierp_color&lt;br /&gt;
	fx4q:&lt;br /&gt;
ret&lt;br /&gt;
fx5: ; raycast bent tunnel&lt;br /&gt;
	mov cl,-9&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		sub al,100&lt;br /&gt;
		imul cl&lt;br /&gt;
		xchg ax,dx	&lt;br /&gt;
		add al,cl&lt;br /&gt;
		imul cl&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		xor al,ah&lt;br /&gt;
		add al,4&lt;br /&gt;
		test al,-8&lt;br /&gt;
	pop dx&lt;br /&gt;
	loopz fx5L&lt;br /&gt;
	sub cx,bp&lt;br /&gt;
	xor al,cl&lt;br /&gt;
	aam tunnel_pattern; VERY slow on dosbox, but ok&lt;br /&gt;
	add al,tunnel_base_color&lt;br /&gt;
ret&lt;br /&gt;
fx6: ; ocean night / to day sky&lt;br /&gt;
	sub dh,120&lt;br /&gt;
	js fx6q&lt;br /&gt;
	mov [bx+si],dx&lt;br /&gt;
	fild word [bx+si]&lt;br /&gt;
	fidivr dword [bx+si]&lt;br /&gt;
	fstp dword [bx+si-1]&lt;br /&gt;
	mov ax,[bx+si]&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,128&lt;br /&gt;
	dec ax&lt;br /&gt;
fx6q:&lt;br /&gt;
ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Bonus - NFO/ASCII==&lt;br /&gt;
&lt;br /&gt;
  &amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
                                                             art : hammerfist&lt;br /&gt;
         ∂#MW%e                              _d$Ng,&lt;br /&gt;
         'B,  ∂b                   _jM@$QZb,cQ&amp;quot;  )@&lt;br /&gt;
  ,edRB$b,l@   Wk,yGR$KM&amp;amp;$b,     ,dP&amp;quot;     Wl ]bsd%UR8BG6&amp;amp;$@DSyG#ZKM&amp;amp;$b,&lt;br /&gt;
,dP      &amp;quot;T%L  'MGF      &amp;quot;*∂R_   Tg    &amp;quot;*4Zk,#I  YP   W&amp;quot;    7P      &amp;quot;*∂R&lt;br /&gt;
4M   gd@    ^   ∂@   d@b   dQ$#@Z@R3L_    &amp;quot;*GMj  'W      ,gd$   d@b   9Q$#%b&lt;br /&gt;
W#,  `M          Wb  `*  _4P   `Qk  *#N8L   `H5   @b   'QR7YK   `*  _4F&amp;quot;   Qk&lt;br /&gt;
`6@L             dML            '@          ,BK   'M    ∂B  *b,            '#L&lt;br /&gt;
  ^QBb,_     _,4&amp;amp;M∞∂@=,_       _dGL       _gQKM    GL    @k  'Mg,_         _dG,&lt;br /&gt;
    &amp;quot;*BN5W$2#MNP&amp;quot;   &amp;quot;*G3WRM8&amp;amp;B5P&amp;quot;`Y@QNW3Z5P&amp;quot; ∂#$W8BRM3XZN87    &amp;quot;*GW38M%EBDW5P&amp;quot;`&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
                              p r e s e n t s&lt;br /&gt;
&lt;br /&gt;
            4&lt;br /&gt;
           d@,&lt;br /&gt;
         _&amp;amp; `Wl&lt;br /&gt;
      _,aP   &amp;quot;#baedM$#@@K JP*&amp;quot;?ML&lt;br /&gt;
 ,ad@$#P&amp;quot;         ,d@NEWVB&amp;quot;     X,aQPYb,_&lt;br /&gt;
V@Mm,_          ,d@MW#BW'      EMP&amp;quot;   '¶R ,ngBP^fML&lt;br /&gt;
 ¶M@N@y        Y#BNW#M&amp;quot;       J9&amp;quot;      `MQ9&amp;quot;      &amp;quot;MgRBq  ,QBMg,&lt;br /&gt;
  VN#P` ,d@@    `WM@^                   7f         ¶F` 7kY&amp;quot;   ^G  _.eQNE1.&lt;br /&gt;
   ]B _G@MWN$,   `P                     '     4b       QP      ¶w@F*^  ^Qb&lt;br /&gt;
   ]O@NRM#W@MNB,         ;                    ^`      j        JP^       Yl&lt;br /&gt;
  J#NRNWM@#BcT&amp;quot;^        ,A  _J                     _q@                   `X&lt;br /&gt;
 '¶WM#B@WdY`,7        _G#YN#PM                 _,gG&amp;quot;                      M,&lt;br /&gt;
  *BN#WP&amp;quot;  dK       ,Q@NRMB&amp;quot;]9       ,      _,M@Q*                        #A&lt;br /&gt;
   &amp;quot;U^      V@h,   iNBW#NT  J'      J9     s@QN&amp;quot;         _;               'D,&lt;br /&gt;
             ¶RMBv&amp;amp;NMQR@9  .W      .K'     &amp;quot;9`         ,6BA   _JL          ]l&lt;br /&gt;
              Y#NE@W#NRP   #[      `¶8               _d@MW#B_jW#W          BN&lt;br /&gt;
               &amp;quot;GQ@MR#W    QL_      *B            _,p#NBW#NQMG@WY          3Q&lt;br /&gt;
                  &amp;quot;Y@F     ,XW@M%im,_Yb_     _,g5@#MW@QMNE@E@NRMB         ,WM&lt;br /&gt;
                    `  _,gP*&amp;quot;#REM#GB@N#MQbnd@N#M@MW#R8QSB^'WQERM@        ;4NB,&lt;br /&gt;
                     ,GYKL    ¶E#B8R8QSB@M@#BM#W@MNB&amp;quot;`_  ,  &amp;quot;^` N       ,dW@Ql&lt;br /&gt;
                   _Q`'W`*t    '¶@GS#MBQ#E@W#NQBW[     'LvQ_   ,K    _dNABGM#N&lt;br /&gt;
                  ,F   '          `^WAB@QGE9*&amp;quot;9^*@L    jP7FY,  ¶h,_.jWM#BR#GBM,&lt;br /&gt;
                 J;    ,   _                  '       '   &amp;quot;LL  YxE#B8R8QSBNW@W;&lt;br /&gt;
                AP   _,Ag6^          _   J                  ¶A  `&amp;quot;Q#M@MW#R8E#P&lt;br /&gt;
               j@   `&amp;quot;XQW[            'LvK,_      'L_,/      @t    Y#NE@WNR&amp;quot;&lt;br /&gt;
              :M/     9^*@L           jP7F&amp;quot;       _PYKL     _,A;     ¶RSNQ&amp;quot;&lt;br /&gt;
              dKL     '     `        '   &amp;quot;L      &amp;quot;`'W`*t   `&amp;quot;XQb      `W^&lt;br /&gt;
              Q`8t            'L_,/         ,   _   '        9^Q&lt;br /&gt;
             ,W               _PYKL       _,Ag6^             ' W,     _ ,#N&amp;amp;&lt;br /&gt;
             !N  _   J       &amp;quot;`'W`*t     `&amp;quot;XQW[       _  J     N!_JG9^RwQ' *t&lt;br /&gt;
             `W,  'LvK,_        '        _gGB8@L   _   'LvK,_ ,WgB'    V    7L&lt;br /&gt;
         _.,gm&amp;amp;@B&amp;amp;wBZF&amp;quot;                j@'`  &amp;quot;WL _gML  jZd7Yb lN&amp;quot;          dBWl&lt;br /&gt;
      ,g&amp;amp;QB*&amp;quot;^`    `&amp;quot;*G@g, .gR&amp;amp;k,_   ,N&amp;quot;      '@QF  ¶k;gMF  *QvQ     jQ, ,@N@B#,&lt;br /&gt;
   .eQF*`              `Yb@&amp;quot;  &amp;quot;*6Qg,gF     ,   7     XMN&amp;quot;    'MNB,    ^¶QWSER@N;&lt;br /&gt;
 ,gP&amp;quot;           qy,      W'       ^Q'     &amp;amp;L      ,g@W'       `QMEL     `&amp;quot;WBNWP&lt;br /&gt;
g7              ¶9      ,X         M?     9&amp;quot;   _q8MSK           ¶EMt       *@K&lt;br /&gt;
Vh   _,m#L             _AH        le         ,GBDNE9^A,          *@F        NMg&lt;br /&gt;
 ¶L,qQ@ND           _.m@Bl        We      ,gM@B8#Q'   ¶h_                   lWE,&lt;br /&gt;
  W9NHW@`          JWM#B@]        @e     4WR@NGF^      'QL                  dRWl&lt;br /&gt;
   VMd*            &amp;quot;@BE@PM        'N      *UP&amp;quot;           VW,               JRSB;&lt;br /&gt;
  ,@F       j       `¶WK W,        ¶t                     XNt            _A@E#N&lt;br /&gt;
_JP       ,6&amp;amp;         &amp;quot;GLdM         XD,               _.g8NMA@k,_    _,gG#NMGR;&lt;br /&gt;
&amp;quot;Z      .JRER           'VMi     _jNB#W&amp;amp;_         _,j@E@W#Nl ¶MBGMNQGNQMG@QBW9&lt;br /&gt;
 ¶h   ,G@NRMBl            `&amp;quot;   ,d#R@M$F ¶Mg,_.gp&amp;amp;@@NEWVBWBMG  *QMN8R8SBN$E@WF&lt;br /&gt;
  Vb dW#R8QSRb,                 *YM@EQ,_ 'MENBW#NQMG#B@R@MW#l   &amp;quot;BM@QNENRQG'&lt;br /&gt;
   *WGS#MBMNEYL                    `^&amp;quot;*8M@Q@NRM#W@BWSNW@QBF&amp;quot;`     `^*@QBF^ [HFT]&lt;br /&gt;
    ^M@MW#Q9 ^Wt                           `^¶RQ@W8NQGP*`&lt;br /&gt;
     ¶Q#@P     Vk                            lA `&amp;quot;^`&lt;br /&gt;
      Y&amp;quot;       `MA                           J#,&lt;br /&gt;
                *R@,                        ,MQl&lt;br /&gt;
                 Y#Wk,                      GWM8L&lt;br /&gt;
                  W8RQSt,_                 AQ@MR#,&lt;br /&gt;
                  `@M@#SB@Mbm.,_          QNBW#NW&lt;br /&gt;
                    ¶QB8R8SBN$WNRM@#GNtwg@NMQR@B'&lt;br /&gt;
                     *MBQ#8R8QS@NE@WNBW#NQMG@NR;&lt;br /&gt;
                      `WGS#MBQ#R8QSB@NE@W#NQBW9&lt;br /&gt;
                        *OMW@QMNE@E@NRMW@QMB@*&lt;br /&gt;
                          `^&amp;quot;YQW@Q#SB#NE@EGP&lt;br /&gt;
                               `^&amp;quot;*8R@GBQF`&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Memories&amp;diff=570</id>
		<title>Memories</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Memories&amp;diff=570"/>
				<updated>2020-06-04T10:38:57Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* External Links */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Update==&lt;br /&gt;
The final FreeDos version is complete; you can [https://www.youtube.com/watch?v=wlW84fEHngM watch it here]. It includes the Amiga Ball as an extra effect but lacks the music, since a normal modern PC booted into FreeDos cannot play MIDI that easily. The file size is still 256 bytes.&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
Hello! My name is &amp;quot;HellMood&amp;quot; and this is not a usual wiki-style article. It is written from my point of view, in first person. This article is all about a tiny MS DOS program called &amp;quot;Memories&amp;quot;. This program has a size of 256 bytes and won the &amp;quot;PC 256 byte&amp;quot; competition of the [https://en.wikipedia.org/wiki/Demoscene demoscene] event [https://2020.revision-party.net/ &amp;quot;Revision&amp;quot;] in 2020, as well as the public choice award. You can watch a video of the output of this program here (https://www.youtube.com/watch?v=Imquk_3oFf4) and a video including live reactions of the online audience and moderators here (https://www.twitch.tv/videos/589179638?t=7h02m40s). You can download the release and comment here (https://www.pouet.net/prod.php?which=85227). This wiki article gives you an in-depth analysis of the program and points out historical references as well as development steps. In case you did not notice, you are now on the &amp;quot;sizecoding wiki&amp;quot;. This will help you not only to understand the inner workings of &amp;quot;Memories&amp;quot;, but also enable you to produce something like this yourself. Feel free to look around! If you're new to sizecoding and/or x86 assembler, it is recommended that you start with the [http://www.sizecoding.org/wiki/Getting_Started basics of this wiki] first. While the principles might be easy to grasp, details of the implementation might not be.&lt;br /&gt;
&lt;br /&gt;
==Disclaimer==&lt;br /&gt;
This article discusses the submitted DosBox version (256 bytes). The available archive contains versions for FreeDos and Windows XP DOS which, at the time of this writing, don't work on all computers. These alternative versions were included as a proof of concept, to show that this program does not only run on the emulator &amp;quot;DosBox&amp;quot;. For the &amp;quot;PC 256 bytes&amp;quot; competition of the event &amp;quot;Revision&amp;quot; in 2020, it was allowed to specify either &amp;quot;FreeDos&amp;quot; or &amp;quot;DosBox&amp;quot; as platform, the latter with a specific configuration. It is in fact possible to modify the DosBox version to work on FreeDos, MS DOS, WinXP and Win98, as the released alternative versions showcase, but this is not the focus of this article. Safe versions for all platforms and all computers are in the works. Regarding possible further optimizations: I have already found several spots in the code that can be optimized, and as the history of tiny effects shows, almost no code is perfect and most of it can be brought down further in size. To avoid confusion between different versions, only the submitted version will be explained.&lt;br /&gt;
&lt;br /&gt;
==History of tiny effects==&lt;br /&gt;
[[File:Pouet categories.png|thumb|Size categories on http://www.pouet.net]]&lt;br /&gt;
Sizecoders think in size categories. For MS DOS, these categories are 256b, 128b, 64b and 32b. These are the standards of one of the biggest demoscene archives, http://www.pouet.net. There is no 16b category, although many tiny effects can be implemented in 16 bytes. Almost all of the effects of &amp;quot;Memories&amp;quot; have been coded and optimized by me before, and mostly the implementations were attempts to reduce an already existing effect in size, or to do something similar in less space. Reducing the size in this case means reducing it to (one of) the next lower 2^N categories. For example, I might have seen an effect done in 33 to 64 bytes, and then reduced the size to 32 bytes or less. Almost every time I moved an effect to a lower category, I submitted the resulting tiny program to a demoscene party which allowed remote entries in the 256b category and/or published the result on http://www.pouet.net . In this section I will introduce the effects and mention their origins and authors.&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Kasparov.gif|frame|kasparov, 16 bytes]]&lt;br /&gt;
The origin of this effect is my own &amp;quot;Kasparov 16b&amp;quot; from 2018 (https://www.pouet.net/prod.php?which=75912). Apparently I &amp;quot;had this piece lying around, stuck at 17 bytes and not that beautiful, until i found a trick ;)&amp;quot; (quote from the release notes). There was a similar effect in 32 bytes before: &amp;quot;ew&amp;quot; by &amp;quot;headcrash&amp;quot; from 2003 (https://www.pouet.net/prod.php?which=17567). In this case, my design goal was to show &amp;quot;real&amp;quot; chessboards of 8x8 fields with recognizable dark and light squares, as well as the correct orientation of the single chessboards, meaning that the bottom right corner (h1) has to be a [https://en.wikibooks.org/wiki/Chess/Arranging_The_Board white square]. For &amp;quot;Memories&amp;quot;, the effect had to be reimplemented with another approach of writing to the screen to work with the framework; also, the scrolling direction was changed to be different from the &amp;quot;scrolling tilted plane&amp;quot; effect.&lt;br /&gt;
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
The zooming circles were supposed to be entered at a demoparty as a 32 byte intro, but I never actually did it. There is no real predecessor in 64 bytes, because in the 64b category much more complex effects are possible. The zooming circles were the result of desperately trying to reach the 32b category for a circular &amp;quot;tunnel&amp;quot; effect, for which my own record is still 52 bytes (&amp;quot;Neontube&amp;quot; - 2016)(https://www.pouet.net/prod.php?which=66808), which in turn was the optimization of the all-time classic 64 byte effect &amp;quot;constant evolution&amp;quot; from ryg/Farbrausch (2003) (https://www.pouet.net/prod.php?which=8697). In the zooming circles routine, the distance and the angle are removed/ignored, so that the 32b category could be reached.&lt;br /&gt;
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Floorcast variation.png|thumb|Floorcast, 32 bytes version, variation]]&lt;br /&gt;
The scrolling tilted plane is one of my own releases, &amp;quot;floorcast 32b&amp;quot; from 2018. Floorcasts have their own history in sizecoding; the effect was brought down stepwise from 256 bytes to 32 bytes. The versions differ in the number of planes: some show two planes, others only one. For the &amp;quot;floorcast 32b&amp;quot; release I specifically decided against the &amp;quot;XOR&amp;quot; texture, whereas in &amp;quot;Memories&amp;quot; I used it again, but disguised with another final touch of &amp;quot;AND&amp;quot;.&lt;br /&gt;
* &amp;quot;rain_storm&amp;quot; version 2008 - 256 bytes - https://www.pouet.net/prod.php?which=50479&lt;br /&gt;
* &amp;quot;org_100h&amp;quot; version 2008 - 128 bytes - https://www.pouet.net/prod.php?which=63271&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2013 - 86 bytes - https://www.pouet.net/prod.php?which=61070&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2014 - 64 bytes - https://www.pouet.net/prod.php?which=63305&lt;br /&gt;
* &amp;quot;HellMood&amp;quot; version 2018 - 32 bytes - https://www.pouet.net/prod.php?which=77774&lt;br /&gt;
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Projektbeschreibung.gif|frame|Projektbeschreibung, 32 bytes]]&lt;br /&gt;
The parallax checkerboards have been released by me as the 32 byte effect [https://www.pouet.net/prod.php?which=76788 &amp;quot;Projektbeschreibung&amp;quot;] in 2018. It was the direct attempt to bring [https://www.pouet.net/prod.php?which=28543 &amp;quot;Follow the light&amp;quot;] from &amp;quot;Digimind&amp;quot; (2006) or my own [https://www.pouet.net/prod.php?which=63293 &amp;quot;Lucy&amp;quot;] (2014) down to 32 bytes. Very helpful inspirations came from [https://www.pouet.net/prod.php?which=32281 &amp;quot;Rrrola&amp;quot;'s &amp;quot;Paralaxa&amp;quot;] (32 bytes, 2007) and [https://www.pouet.net/prod.php?which=61075 &amp;quot;Sensenstahl&amp;quot;'s &amp;quot;Byteropolis&amp;quot;] (32 bytes, 2013). In fact, Rrrola's rendering approach was already close to the final solution; the code was modified to correct the arrangement of the planes, change the shape from triangles to checkerboards and improve on the colors. In &amp;quot;Memories&amp;quot;, the coloring of &amp;quot;Digimind&amp;quot;'s version was used. Also, the effect has been modified to reduce shearing as much as possible.&lt;br /&gt;
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Colpinski.png|thumb|colpinski, 16 bytes]]&lt;br /&gt;
[[File:Rotastic.png|thumb|left|rotastic, 32 bytes]]&lt;br /&gt;
This consists of two effects, the rotozoomer and the sierpinski effect as a texture. The sierpinski effect is based on my own &amp;quot;Colpinski 16b&amp;quot; from 2013 (https://www.pouet.net/prod.php?which=62079), where &amp;quot;frag&amp;quot; of &amp;quot;fsqrt&amp;quot; helped push the limits. There is no real predecessor to that, because it is known that this effect can be done directly by combining X and Y, instead of an Iterated Function System or Cellular Automaton approach. The rotozoomer was released as the 32b intro &amp;quot;rotastic&amp;quot; by me in 2017 (https://www.pouet.net/prod.php?which=70112); it is based on the ideas of &amp;quot;ryg&amp;quot; of &amp;quot;Farbrausch&amp;quot; (51 bytes, 2002, https://www.pouet.net/prod.php?which=5015) and &amp;quot;Gargaj&amp;quot; of &amp;quot;Conspiracy&amp;quot; (49 bytes, 2002, https://www.pouet.net/prod.php?which=6088).&lt;br /&gt;
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Intoanew.gif|frame|Into a new era, 64 bytes version]]&lt;br /&gt;
The raycast bent tunnel is a tweaked version of my own 64 bytes &amp;quot;Into a new era&amp;quot; release (2018, https://www.pouet.net/prod.php?which=78044). The custom colors were exchanged for colors of the standard palette, the geometry was tweaked a bit and changed accordingly, so that no depth effect was used for the calculation of texture values. The standalone version of this effect has a size of 50 bytes. The two main inspirations for the 64 bytes version were two 128 byte intros, &amp;quot;Spongy&amp;quot; by &amp;quot;TBC&amp;quot; (2009, https://www.pouet.net/prod.php?which=53871) and &amp;quot;Wolf128&amp;quot; from &amp;quot;Baudsurfer&amp;quot; (2014, https://www.pouet.net/prod.php?which=63518), while the particular algorithm was developed independently.&lt;br /&gt;
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Ocean.png|thumb|Ocean, 64b version]]&lt;br /&gt;
The ocean effect is based on my own 64 bytes release &amp;quot;Ocean&amp;quot; from 2016 (https://www.pouet.net/prod.php?which=66857). The custom color generation and music generation have been stripped from the original; neither was compatible with the main framework of &amp;quot;Memories&amp;quot; without consuming a LOT of extra space. The special effect of &amp;quot;dawn&amp;quot; happens as a byproduct of the general framework approach. This will be explained in the next chapter.&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
The transition between two effects is an effect itself, which has no real predecessor. It is rather an idea that evolved over the years, and might have been done by many others in similar ways. In a nutshell, while calculating a frame, a pseudorandom value is derived from each pixel position, and the time - which determines the effect to use - is offset by this value after it has been scaled down. Pixels therefore switch to the next effect at slightly different times. This allows the usage of the standard VGA palette (picture, source) rather than creating custom colors for real crossfading and thus saves space.&lt;br /&gt;
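&lt;br /&gt;
The offsetting can be modelled in a few lines. The following Python sketch is an illustration only: the function name is hypothetical, and the arithmetic is taken from the framework code listed further down in this article (pseudorandom value from the low word of the multiplication, time added, division by 512, 16 effect slots).&lt;br /&gt;

```python
def effect_number(di, t):
    # Models: mov ax,0xcccd / mul di -> the product; AX holds its low word
    product = di * 0xCCCD
    al = product % 256             # low byte of AX
    ah = (product >> 8) % 256      # high byte of AX
    # Models: add al,ah / xor ah,ah -> pseudorandom value P in 0..255
    p = (al + ah) % 256
    # Models: add ax,bp -> offset the time by P (16 bit wrap-around)
    ax = (p + t) % 65536
    # Models: shr ax,9 / and al,15 -> one of 16 slots, 512 ticks each
    return (ax >> 9) % 16


if __name__ == '__main__':
    # Two neighbouring screen pointers can be in different effects at once
    print(effect_number(0, 400), effect_number(1, 400))
```

In this model, a pixel with P = 0 switches exactly at a multiple of 512 ticks, while a pixel with P = 255 switches 255 ticks earlier. During that window both effects are visible at the same time, which produces the fade.&lt;br /&gt;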
&lt;br /&gt;
==Tiny megademo framework==&lt;br /&gt;
The tiny megademo framework is based on ideas of [https://www.pouet.net/prod.php?which=68020 &amp;quot;Quatro&amp;quot;], done by &amp;quot;Sensenstahl&amp;quot; and me. To put many tiny effects together into a &amp;quot;megademo&amp;quot;, they all have to use the same approach and they have to be mostly free of assumptions (contents of memory and registers). They also have to use the same timing values and be tuned to a shared timing in general. Preparing the single effects to be includable in a framework like this took quite some time and required a lot of extra space initially. Sadly, some of my most impressive effects (judged by audience reactions and views on social media) could not be included at all, because the approaches were not combinable without a HUGE overhead. Once all effects were streamlined, I could think about &amp;quot;factoring out&amp;quot; common calculations which occur repeatedly, resulting in a few saved bytes. The framework does the following:&lt;br /&gt;
* Setting up screen mode to 320 x 200 pixels in 256 colors&lt;br /&gt;
* Initializing a pointer to the screen&lt;br /&gt;
* Installing a callback for timing and music&lt;br /&gt;
* Mainloop&lt;br /&gt;
** Calculation of X and Y from the current screen location&lt;br /&gt;
*** with the [http://www.sizecoding.org/wiki/General_Coding_Tricks#Obtaining_X_and_Y_without_DIV_.28The_Rrrola_Trick.29 &amp;quot;Rrrola Trick&amp;quot;]&lt;br /&gt;
** P = Pseudorandom value from screen location&lt;br /&gt;
** Offset current time T by a scaled value P'&lt;br /&gt;
** Selection of effect number N according to T'&lt;br /&gt;
** Execution of effect N (X,Y) for current pixel&lt;br /&gt;
*** On entrance, XY is in register DX (DL,DH)&lt;br /&gt;
*** On entrance, the current time is in register BP&lt;br /&gt;
*** On exit, the pixel color is expected in AL&lt;br /&gt;
** Advance to next pixel&lt;br /&gt;
** Triple diagonal interlacing for smoothing&lt;br /&gt;
** Repeat until frame is complete&lt;br /&gt;
* Set timer to a tempo of ~35 FPS&lt;br /&gt;
* Check keyboard for ESC&lt;br /&gt;
** Quit on ESC, otherwise continue&lt;br /&gt;
=== Code of framework===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
	mov al,0x13				; set AL to mode 320*200 in 256 colors&lt;br /&gt;
	int 0x10	 			; call BIOS to set mode&lt;br /&gt;
	xchg bp,ax				; set timing value to 0x13 &lt;br /&gt;
	push 0xa000-10			; write the screen address to register ES&lt;br /&gt;
	pop es					; works in conjunction with Rrrola trick&lt;br /&gt;
	mov ax,0x251c			; parameter for changing timer interrupt&lt;br /&gt;
	mov dl,timer			; address of timer routine, assume DH=1&lt;br /&gt;
	int 0x21				; install timer routine&lt;br /&gt;
top:&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
	call bx					; call the effect&lt;br /&gt;
	stosb					; write the return value and advance&lt;br /&gt;
	inc di					; triple interlace trick for after&lt;br /&gt;
	inc di					; effect and smoothing the animation&lt;br /&gt;
	jnz top					; repeat until the frame is complete&lt;br /&gt;
	mov al,tempo			; set AL to divider for timer&lt;br /&gt;
	out 40h,al				; set timing (dual pass)&lt;br /&gt;
	in al,0x60				; read keyboard&lt;br /&gt;
	dec al					; quit on ESC&lt;br /&gt;
	jnz top					; otherwise repeat loop&lt;br /&gt;
sounds: db 0xc3, 11, 0x93; 0xc3 is MIDI/RET; fx2-s is used as volume&lt;br /&gt;
table: 	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
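&lt;br /&gt;
How the multiplication turns the screen pointer into X and Y can be tried out with a small model. The following Python sketch is not part of the intro; the function names are hypothetical, and it assumes a 16 bit screen pointer DI and the segment 0xa000-10 from the listing above, which places the screen 160 bytes (half a row) behind offset 0.&lt;br /&gt;

```python
def rrrola_xy(di):
    # Models: mov ax,0xcccd / mul di -> DX receives the high word
    dx = (di * 0xCCCD) >> 16
    # DL is the column scaled to 0..255, DH is the row
    return dx % 256, dx >> 8


def pointer_for_pixel(sx, sy):
    # DI that addresses screen pixel (sx, sy) when ES = 0xa000-10:
    # the segment is 10 paragraphs = 160 bytes below the screen start
    return (320 * sy + sx + 160) % 65536


def signed_byte(value):
    # Interpret a byte as a signed value, as imul and movsx do
    return value - 256 if value >= 128 else value


if __name__ == '__main__':
    for sx in (0, 160, 319):
        dl, dh = rrrola_xy(pointer_for_pixel(sx, 100))
        print(sx, signed_byte(dl), dh)
```

In this model the left screen edge yields DL = 128 (which is -128 as a signed byte), the middle of the screen yields 0 and the right edge 127, so DL can be used as a centered X coordinate. As a side effect, DH is one higher on the right half of the screen.&lt;br /&gt;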
&lt;br /&gt;
==Explanation of effects==&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Array of chessboards.png|thumb|array of chessboards]]&lt;br /&gt;
The simplest effect and a good one to start with. After offsetting the row by the time, the classic XOR pattern is applied. To achieve the impression of a grid of chessboards, all bits but two are set in the color. The real trick lies in shifting to a &amp;quot;good&amp;quot; palette spot. The portion of the frame which is perceived as black isn't actually black; it's the dark part of the [https://en.wikipedia.org/wiki/Video_Graphics_Array#Color_palette standard VGA palette]. Shifting that way also makes the dark and light squares look like they are from an ancient chessboard.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	xchg dx,ax		; get XY into AX&lt;br /&gt;
	sub ax,bp		; subtract time from row&lt;br /&gt;
	xor al,ah		; XOR pattern (x xor y)&lt;br /&gt;
	or al,0xDB		; pattern for array of boards&lt;br /&gt;
	add al,13h		; shift to good palette spot&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
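&lt;br /&gt;
To see which palette indices this routine can produce, the five instructions can be modelled directly. The following Python sketch is an illustration only; the function name is made up, and DL, DH and the time are passed in as plain numbers.&lt;br /&gt;

```python
def chessboard_color(dl, dh, t):
    # Models: xchg dx,ax / sub ax,bp -> 16 bit subtraction of the time
    ax = (dh * 256 + dl - t) % 65536
    # Models: xor al,ah -> the classic XOR pattern
    al = (ax % 256) ^ (ax >> 8)
    # Models: or al,0xDB -> every bit except bits 2 and 5 is forced to 1
    al = al | 0xDB
    # Models: add al,13h -> shift to the chosen palette spot
    return (al + 0x13) % 256


if __name__ == '__main__':
    colors = {chessboard_color(x, y, 0) for x in range(256) for y in range(256)}
    print(sorted(colors))
```

Only bits 2 and 5 of the XOR pattern survive, so the model yields just four palette indices (14, 18, 238 and 242). Bit 2 flips every 4 coordinate units and bit 5 every 32, that is every 8 squares, which gives boards of 8 x 8 squares.&lt;br /&gt;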
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
[[File:Zooming circles.png|thumb|zooming circles]]&lt;br /&gt;
The distance D of a point (X,Y) to the center (0,0) is sqrt(X²+Y²). The framework already makes sure that DL contains a centered X coordinate; DH=Y has to be centered though. Performing a square root on x86 requires quite a bit of code, but it can actually be omitted. With a clever selection of colors, the impression of zooming circles without square root is quite decent.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov al,dh		; get Y in AL&lt;br /&gt;
	sub al,100		; align Y vertically&lt;br /&gt;
	imul al			; AL = Y²&lt;br /&gt;
	xchg dx,ax		; Y²/256 in DH, X in AL&lt;br /&gt;
	imul al			; AL = X²&lt;br /&gt;
	add dh,ah		; DH = (X² + Y²)/256&lt;br /&gt;
	mov al,dh		; AL = (X² + Y²)/256&lt;br /&gt;
	add ax,bp		; offset color by time&lt;br /&gt;
	and al,8+16		; select special rings&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
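&lt;br /&gt;
The calculation without the square root can be followed step by step in a small model. The following Python sketch is not part of the intro; the function names are hypothetical, DL is taken as the centered column and DH as the row in the range 0 to 199.&lt;br /&gt;

```python
def signed_byte(value):
    # Interpret a byte as a signed value, as imul does
    return value - 256 if value >= 128 else value


def circles_color(dl, dh, t):
    # Models: mov al,dh / sub al,100 -> centered row
    ys = signed_byte((dh - 100) % 256)
    # Models: imul al -> AX = Y*Y, later moved to DX by xchg dx,ax
    y2 = ys * ys
    # Models: second imul al -> AX = X*X with the signed column
    xs = signed_byte(dl)
    x2 = xs * xs
    # Models: add dh,ah -> sum of the two high bytes, (X*X + Y*Y)/256
    d = ((y2 >> 8) + (x2 >> 8)) % 256
    # Models: mov al,dh / add ax,bp -> only AL is used afterwards
    al = (d + t) % 256
    # Keeps bits 3 and 4 only, like and al,8+16
    return (al >> 3) % 4 * 8


if __name__ == '__main__':
    # One row through the center, sampled every 16 columns
    print([circles_color(x % 256, 100, 0) for x in range(-128, 128, 16)])
```

Because the square root is skipped, the value grows with the squared distance, so the rings in this model get narrower towards the border. Increasing the time shifts all rings, which gives the zooming impression.&lt;br /&gt;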
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Tiltplane.png|thumb|scrolling tilted plane]]&lt;br /&gt;
This effect is achieved as follows. First, a large constant is divided by the row number Y to simulate distance. The resulting value is then used twice: a) scaled by a centered X value and b) offset by the current time. These two results are combined with the XOR pattern, from which a special pattern is selected.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov ax,0x1329	; initialize with constant&lt;br /&gt;
	add dh,al		; preventing divide overflow&lt;br /&gt;
	div dh			; reverse divide AL = C/Y'&lt;br /&gt;
	xchg dx,ax		; DL = C/Y', AL = X&lt;br /&gt;
	imul dl			; AH = CX/Y'&lt;br /&gt;
	sub dx,bp		; DL = C/Y'-T 	&lt;br /&gt;
	xor ah,dl		; AH = (CX/Y') ^ (C/Y'-T)&lt;br /&gt;
	mov al,ah		; move to AL&lt;br /&gt;
	and al,4+8+16	; select special pattern&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Paralax.png|thumb|parallax checker boards]]&lt;br /&gt;
This is a kind of raycasting with dynamic geometry. Each object plane is split horizontally because of the signed 16-bit multiplication and vertically because of an implicit logic column number operation. Additionally, implicit shearing is applied, so that the edges of the resulting grid (4 &amp;quot;solid&amp;quot; patches, 4 &amp;quot;transparent&amp;quot; patches, alternating) connect. If the ray hits one of the solid patches, the color becomes the iteration number (plus a palette offset into the grayscale range); if not, the plane is shifted by the screenpointer and the process is repeated until the maximum iteration number is reached.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cx,bp		; set initial point to time&lt;br /&gt;
	mov bx,-16		; limit to 16 iterations&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di		; offset point by screenpointer&lt;br /&gt;
	mov ax,819		; magic, related to Rrrola constant&lt;br /&gt;
	imul cx			; get X',Y' in DX&lt;br /&gt;
	ror dx,1		; set carry flag on &amp;quot;hit&amp;quot;&lt;br /&gt;
	inc bx			; increment iteration count&lt;br /&gt;
	ja fx3L			; loop until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	lea ax,[bx+31]	; map value to standard gray scale&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Sierpinski rotozoomer.png|thumb|sierpinski rotozoomer]]&lt;br /&gt;
[[File:1 div cos(atan(x)).png|thumb|1/cos(atan(x)), made with www.google.com]]&lt;br /&gt;
Rotozooming normally requires trigonometric functions, or good estimations of them. Let's look at the [https://en.wikipedia.org/wiki/Rotation_matrix#In_two_dimensions usual 2D rotation equation] and interpret it in a special way to get rid of the trigonometry in the calculation:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = x*cos(a) - y*sin(a)&lt;br /&gt;
y' = x*sin(a) + y*cos(a)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
If we add zooming to it, it looks like this:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = z * (x*cos(a) - y*sin(a))&lt;br /&gt;
y' = z * (x*sin(a) + y*cos(a))&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Now let's assume we don't specify z ourselves, and factor out cos(a):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = cos(a) * (x - y*tan(a))&lt;br /&gt;
y' = cos(a) * (x*tan(a) + y)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Finally, let's substitute tan(a) with our time variable, since the tangent function rises to infinity the closer we get to ±90°:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = cos(atan(T)) * (x - y*T)&lt;br /&gt;
y' = cos(atan(T)) * (x*T + y)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
As long as we don't care about specifying the zoom factor and don't directly control the angle, we can now rotate between -90° and +90° without using trigonometric functions. As a byproduct, the zoom factor is now tied to the time T. The function for the zoom factor is shown in the image; it results in zooming from infinitely small to one (original size) and back to infinitely small. Quite a few bytes have been spent to beautify the effect (correcting the timing offset, speeding up the animation, enlarging the Sierpinski pixels and getting decent colors), but I thought it was worth the space.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	lea cx,[bp-2048]; center time to pass zero&lt;br /&gt;
	sal cx,3		; speed up by factor 8!&lt;br /&gt;
	movzx ax,dh		; get X into AL&lt;br /&gt;
	movsx dx,dl		; sign-extend Y (DL) into DX&lt;br /&gt;
	mov bx,ax		; save X in BX&lt;br /&gt;
	imul bx,cx		; BX = X*T&lt;br /&gt;
	add bh,dl		; BH = X*T/256+Y&lt;br /&gt;
	imul dx,cx		; DX = Y*T&lt;br /&gt;
	sub al,dh		; AL = X-Y*T/256&lt;br /&gt;
	and al,bh		; AL = (X-Y*T/256)&amp;amp;(X*T/256+Y)&lt;br /&gt;
	and al,252		; thicker sierpinski&lt;br /&gt;
	salc			; set pixel value to black&lt;br /&gt;
	jnz fx4q		; leave black if not sierpinski&lt;br /&gt;
	mov al,0x2A		; otherwise: a nice orange&lt;br /&gt;
	fx4q:&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
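The zoom behaviour can be made explicit with a short worked identity. This is standard trigonometry applied to the formulas above, not something taken from the release itself. The code only evaluates the bracketed terms and never multiplies by cos(atan(T)), so with a = atan(T):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
cos(atan(T)) = 1 / sqrt(1 + T²)&lt;br /&gt;
x - y*T = sqrt(1 + T²) * (x*cos(a) - y*sin(a))&lt;br /&gt;
x*T + y = sqrt(1 + T²) * (x*sin(a) + y*cos(a))&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
The texture coordinates are therefore a pure rotation stretched by sqrt(1 + T²) = 1/cos(atan(T)). This factor is 1 at T = 0 and grows without bound as |T| increases, which is why the pattern appears at original size in the middle of the effect and shrinks towards both ends.&lt;br /&gt;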
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Raycastbenttunnel.png|thumb|raycast bent tunnel]]&lt;br /&gt;
This is a variation of the [https://www.pouet.net/prod.php?which=78044 raycast tunnel &amp;quot;into a new era&amp;quot;] (see also above). A [https://www.reddit.com/r/tinycode/comments/de7dj9/moving_cubes_with_pathtracing_lighting_and/ detailed explanation] has been posted to reddit for the [https://www.pouet.net/prod.php?which=83204 similar production &amp;quot;Essence&amp;quot;]. The custom colors have been removed, the bend direction is different, and the geometry is a bit more closed, to increase the performance on slower computers and DosBox.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cl,-9		; start with depth 9 (moves backwards)&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx			; save DX, destroyed inside the loop&lt;br /&gt;
		mov al,dh	; Get Y into AL&lt;br /&gt;
		sub al,100	; Centering Y has to be done &amp;quot;manually&amp;quot;.&lt;br /&gt;
		imul cl		; Multiply AL=Y by the current distance, to get a projection(1)&lt;br /&gt;
		xchg ax,dx	; Get X into AL, while saving the result in DX (DH)&lt;br /&gt;
		add al,cl	; add distance to projection, (bend to the right)&lt;br /&gt;
		imul cl		; Multiply AL=X by the current distance, to get a projection(2)&lt;br /&gt;
		mov al,dh	; Get projection(1) in AL&lt;br /&gt;
		xor al,ah	; combine with projection(2)&lt;br /&gt;
		add al,4	; center the walls around 0&lt;br /&gt;
		test al,-8	; check if the wall is hit&lt;br /&gt;
	pop dx			; restore DX&lt;br /&gt;
	loopz fx5L		; repeat until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	sub cx,bp		; offset depth by time&lt;br /&gt;
	xor al,cl		; XOR pattern for texture &lt;br /&gt;
	aam 6			; irregular pattern with MOD 6&lt;br /&gt;
	add al,20		; offset into grayscale palette&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Oceannight.png|thumb|left|oceannight]]&lt;br /&gt;
[[File:Oceanday.png|thumb|right|oceanday]]&lt;br /&gt;
The ocean effect is a wonderful example of a [https://en.wikipedia.org/wiki/Bob_Ross#Style &amp;quot;happy accident&amp;quot;]. If you load a value into the FPU as an integer, store it as a floating point value, and interpret it as an integer again, you get a cool pattern. Combine that with a reverse divide and you get a neat wavy effect. This works in tight conjunction with the register DX, which has a sign flip at the position we want, so we can easily separate sky and sea. The final touch is the coloring, though. By design of the global framework, the value of AL is determined on function entry: it contains the effect address. With a bit of code shuffling, the color of the sky came &amp;quot;for free&amp;quot;, without using instructions, as did the &amp;quot;fading to day&amp;quot; color, which is also an effect address. And that is no happy accident. For that reason, the sky color can vary in versions other than the originally submitted DosBox version.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	sub dh,120			; check if pixel is in the sky&lt;br /&gt;
	js fx6q				; quit if that's the case&lt;br /&gt;
	mov [bx+si],dx		; move XY to a memory location&lt;br /&gt;
	fild word [bx+si]	; read memory location as integer&lt;br /&gt;
	fidivr dword [bx+si]; reverse divide by constant&lt;br /&gt;
	fstp dword [bx+si-1]; store result as floating point&lt;br /&gt;
	mov ax,[bx+si]		; get the result into AX&lt;br /&gt;
	add ax,bp			; modify color by time&lt;br /&gt;
	and al,128			; threshold into two bands&lt;br /&gt;
	dec ax				; beautify colors to blue/black&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
For convenience, here is the fading effect from the global framework in isolation. Basically, it generates a pseudorandom value from the screenpointer, offsets the time by a scaled version of it, and then calls the selected effect.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
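Put differently, with r being the pseudorandom byte (0..255) of a pixel and T the time in BP, the selected table entry is ((T + r) / 512) mod 16. A worked example with a hypothetical time value of T = 900, using plain arithmetic on the code above:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; r =   0 : (900 +   0) shr 9 = 1 , pixel still shows table entry 1&lt;br /&gt;
; r = 123 : (900 + 123) shr 9 = 1 , pixel still shows table entry 1&lt;br /&gt;
; r = 124 : (900 + 124) shr 9 = 2 , pixel already shows table entry 2&lt;br /&gt;
; r = 255 : (900 + 255) shr 9 = 2 , pixel already shows table entry 2&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
During the last 255 ticks of every 512 tick slot, a growing share of the pixels therefore already shows the next effect, which produces the dissolve between effects.&lt;br /&gt;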
&lt;br /&gt;
==MIDI music part==&lt;br /&gt;
This code section increments the timing value and produces sound. By choosing channel 3 we can reuse the &amp;quot;change channel instrument&amp;quot; instruction again as &amp;quot;RET&amp;quot;. By shifting around the code of the effects, a reasonable value for the volume could be created, which saves another byte. Be aware that this code sequence only works if the MIDI device is already in UART mode; otherwise three more bytes have to be spent. Quite a few people from the audience and sceners told me afterwards that the melody sounds a bit like [https://en.wikipedia.org/wiki/Incantations_(album) Incantations by Mike Oldfield], but it is actually created by a simple approach. Starting with [https://www.pouet.net/prod.php?which=65604 &amp;quot;Hypnoteye&amp;quot; in 2015], I experimented with procedural MIDI and eventually released a [https://www.pouet.net/prod.php?which=66313 small MIDI framework (64 bytes)]. The basic principle is to jump through tonal space with fixed steps, and map high values back to lower ones (by modulo). With simple combinations of stepwidth and modvalue, interesting effects can be produced. For example, stepwidth=3 gives diminished chords which are never resolved (stepwidth=4 behaves similarly), and stepwidth=6 gives tritonal effects. With a well chosen modvalue, these patterns can create progressions. I have not done a proper theoretical analysis yet; rather, I explored the tonal space and noted when something sounded interesting.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
sounds: db 0xc3, 11, 0x93, fx2-s&lt;br /&gt;
...&lt;br /&gt;
		inc bp				; increment timing value&lt;br /&gt;
		test bp, 7			; play a note every 8th step&lt;br /&gt;
		jnz nomuse			; quit if in between&lt;br /&gt;
		mov dx,0x330		; port number for MIDI&lt;br /&gt;
		mov si,sounds		; address for sound data&lt;br /&gt;
		outsb				; change instrument of channel 3&lt;br /&gt;
		outsb				; to vibraphone&lt;br /&gt;
		outsb				; play a note on channel 3&lt;br /&gt;
		imul ax,bp,-19*32*4	; the magic melody constant&lt;br /&gt;
		shr ax,10			; scale down and implicit &amp;quot;and 63&amp;quot;&lt;br /&gt;
		add al,22			; pitch base is 22&lt;br /&gt;
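		; worked example (plain arithmetic on the three lines above):&lt;br /&gt;
		; BP = 8*k here, so BP*(-19*32*4) = -19*k*1024 in 16 bit, and&lt;br /&gt;
		; AL = 22 + ((-19*k) mod 64), a step of -19 wrapped at 64;&lt;br /&gt;
		; for k = 0,1,2,3,4 that gives the notes 22, 67, 48, 29, 74&lt;br /&gt;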
		out dx,al			; play THIS note on channel 3&lt;br /&gt;
		outsb				; play it with THIS volume&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
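If the MIDI device is not yet in UART mode, it has to be switched first. The following is a minimal standalone sketch, assuming an MPU-401 compatible interface at the usual base address 0x330 (command port 0x331, command 0x3F for UART mode). It is not the byte-optimal form: the complete release code instead appends 0x3F to the sound data and sends it with INC DX and OUTSB inside its switch_uart define.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov dx,0x331		; MPU-401 command port (assumed base 0x330)&lt;br /&gt;
	mov al,0x3f			; command byte: switch to UART mode&lt;br /&gt;
	out dx,al			; send it, no waiting for the interface&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
A more careful version would poll the status port before writing; that is left out here, as it is in the release.&lt;br /&gt;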
==Further tweaks==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==Differences between dos versions==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==External Links==&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Demoscene Demoscene]&lt;br /&gt;
*[https://www.youtube.com/watch?v=Imquk_3oFf4 Youtube Capture]&lt;br /&gt;
*[https://www.twitch.tv/videos/589179638?t=7h02m40s Twitch Live Reactions]&lt;br /&gt;
*[https://www.pouet.net/prod.php?which=85227 Production Page on Pouet]&lt;br /&gt;
*[https://www.reddit.com/r/programming/comments/g1qhf8/memories_256_byte_msdos_intro/ Reddit/Programming Thread I]&lt;br /&gt;
*[https://www.reddit.com/r/programming/comments/g4ewh7/how_its_made_memories_a_256_byte_dos_intro_with_8/ Reddit/Programming Thread II]&lt;br /&gt;
*[https://github.com/cesarmiquel/memories-256b-msdos-intro Analysis by &amp;quot;cesarmiquel&amp;quot;]&lt;br /&gt;
*[https://hackaday.com/2020/04/21/a-jaw-dropping-demo-in-only-256-bytes/ Hackaday Article]&lt;br /&gt;
*[https://github.com/simpassi/cocoamemories Simpassis Port to MAC (TIGR)]&lt;br /&gt;
&lt;br /&gt;
==Original release code (complete)==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
; &amp;quot;memories&amp;quot; by HellMood/DESiRE&lt;br /&gt;
; the tiny megademo, 256 byte msdos intro&lt;br /&gt;
; shown in April 2020 @ REVISION&lt;br /&gt;
;&lt;br /&gt;
;   (= WILL BE COMMENTED IN DETAIL LATER =)&lt;br /&gt;
;&lt;br /&gt;
; create : nasm.exe memories.asm -fbin -o memories.com&lt;br /&gt;
; CHOOSE YOUR TARGET PLATFORM (compo version is dosbox)&lt;br /&gt;
; be sure to use the dosbox.conf from this archive!&lt;br /&gt;
; only ONE of the defines should be active!&lt;br /&gt;
%define dosbox			; size : 256 bytes&lt;br /&gt;
;%define freedos		; size : 230 bytes&lt;br /&gt;
;%define winxpdos		; size : 263 bytes&lt;br /&gt;
&lt;br /&gt;
; DON'T TOUCH THESE UNLESS YOU KNOW WHAT YOU'RE DOING&lt;br /&gt;
%ifdef winxpdos&lt;br /&gt;
	%define music&lt;br /&gt;
	%define switch_uart&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
	%define safe_segment&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef dosbox&lt;br /&gt;
	%define music&lt;br /&gt;
	;%define safe_dx ; sometimes needed&lt;br /&gt;
%endif&lt;br /&gt;
&lt;br /&gt;
; GLOBAL PARAMETERS, TUNE WITH CARE!&lt;br /&gt;
%define volume 127	; not used on dosbox (optimization)&lt;br /&gt;
%define instrument 11&lt;br /&gt;
%define scale_mod -19*32*4; &lt;br /&gt;
%define time_mask 7&lt;br /&gt;
%define targetFPS 35&lt;br /&gt;
%define tempo 1193182/256/targetFPS		&lt;br /&gt;
%define sierp_color 0x2A&lt;br /&gt;
%define tunnel_base_color 20&lt;br /&gt;
%define tunnel_pattern 6&lt;br /&gt;
%define tilt_plate_pattern 4+8+16&lt;br /&gt;
%define circles_pattern 8+16&lt;br /&gt;
&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov fs,ax&lt;br /&gt;
	mov [fs:0x46c],ax&lt;br /&gt;
%endif&lt;br /&gt;
	mov al,0x13&lt;br /&gt;
	int 0x10	 &lt;br /&gt;
	xchg bp,ax&lt;br /&gt;
	push 0xa000-10&lt;br /&gt;
	pop es&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	mov ax,0x251c&lt;br /&gt;
	%ifdef safe_dx	&lt;br /&gt;
		mov dx,timer	&lt;br /&gt;
	%else ; assume DH=1, mostly true on DosBox&lt;br /&gt;
		mov dl,timer&lt;br /&gt;
	%endif&lt;br /&gt;
	int 0x21&lt;br /&gt;
%endif&lt;br /&gt;
top:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov bp,[fs:0x46c]&lt;br /&gt;
%endif	&lt;br /&gt;
	mov ax,0xcccd&lt;br /&gt;
	mul di&lt;br /&gt;
	add al,ah&lt;br /&gt;
	xor ah,ah&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	shr ax,9&lt;br /&gt;
	and al,15&lt;br /&gt;
	xchg bx,ax&lt;br /&gt;
	mov bh,1&lt;br /&gt;
	mov bl,[byte bx+table]&lt;br /&gt;
	call bx&lt;br /&gt;
	stosb&lt;br /&gt;
	inc di&lt;br /&gt;
	inc di&lt;br /&gt;
	jnz top&lt;br /&gt;
	mov al,tempo&lt;br /&gt;
	out 40h,al&lt;br /&gt;
	in al,0x60&lt;br /&gt;
	dec al&lt;br /&gt;
	jnz top&lt;br /&gt;
sounds:&lt;br /&gt;
	db 0xc3	; is MIDI/RET&lt;br /&gt;
%ifdef music&lt;br /&gt;
	db instrument,0x93&lt;br /&gt;
	%ifdef switch_uart&lt;br /&gt;
		db volume		; without switch, volume is in table&lt;br /&gt;
		db 0x3f &lt;br /&gt;
	%endif&lt;br /&gt;
%endif&lt;br /&gt;
table: ; first index is volume, change order with care!		    					&lt;br /&gt;
	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
stop:&lt;br /&gt;
	pop ax&lt;br /&gt;
	ret&lt;br /&gt;
timer:&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	%ifdef safe_segment&lt;br /&gt;
		push cs&lt;br /&gt;
		pop ds&lt;br /&gt;
	%endif&lt;br /&gt;
		inc bp&lt;br /&gt;
	%ifdef music	&lt;br /&gt;
		test bp, time_mask&lt;br /&gt;
		jnz nomuse&lt;br /&gt;
		mov dx,0x330&lt;br /&gt;
		mov si,sounds&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		imul ax,bp,scale_mod&lt;br /&gt;
		shr ax,10&lt;br /&gt;
		add al,22&lt;br /&gt;
		out dx,al&lt;br /&gt;
		outsb&lt;br /&gt;
		%ifdef switch_uart&lt;br /&gt;
			inc dx&lt;br /&gt;
			outsb&lt;br /&gt;
		%endif&lt;br /&gt;
	%endif&lt;br /&gt;
nomuse:&lt;br /&gt;
	iret&lt;br /&gt;
%endif	&lt;br /&gt;
fx0: ; tilted plane, scrolling&lt;br /&gt;
	mov ax,0x1329&lt;br /&gt;
	add dh,al&lt;br /&gt;
	div dh&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul dl&lt;br /&gt;
	sub dx,bp&lt;br /&gt;
	xor ah,dl&lt;br /&gt;
	mov al,ah&lt;br /&gt;
	and al,tilt_plate_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx2: ; board of chessboards&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	sub ax,bp&lt;br /&gt;
	xor al,ah&lt;br /&gt;
	or al,0xDB&lt;br /&gt;
	add al,13h&lt;br /&gt;
ret&lt;br /&gt;
fx1: ; circles, zooming&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	sub al,100&lt;br /&gt;
	imul al&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul al&lt;br /&gt;
	add dh,ah&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,circles_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx3: ; parallax checkerboards&lt;br /&gt;
	mov cx,bp&lt;br /&gt;
	mov bx,-16&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di&lt;br /&gt;
	mov ax,819&lt;br /&gt;
	imul cx	 &lt;br /&gt;
	ror dx,1	 &lt;br /&gt;
	inc bx	 &lt;br /&gt;
	ja fx3L&lt;br /&gt;
	lea ax,[bx+31]	 &lt;br /&gt;
ret&lt;br /&gt;
fx4: ; sierpinski rotozoomer	&lt;br /&gt;
	lea cx,[bp-2048]&lt;br /&gt;
	sal cx,3&lt;br /&gt;
	movzx ax,dh&lt;br /&gt;
	movsx dx,dl&lt;br /&gt;
	mov bx,ax&lt;br /&gt;
	imul bx,cx&lt;br /&gt;
	add bh,dl&lt;br /&gt;
	imul dx,cx&lt;br /&gt;
	sub al,dh&lt;br /&gt;
	and al,bh&lt;br /&gt;
	and al,0b11111100&lt;br /&gt;
	salc				; VERY slow on dosbox, but ok&lt;br /&gt;
	jnz fx4q&lt;br /&gt;
	mov al,sierp_color&lt;br /&gt;
	fx4q:&lt;br /&gt;
ret&lt;br /&gt;
fx5: ; raycast bent tunnel&lt;br /&gt;
	mov cl,-9&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		sub al,100&lt;br /&gt;
		imul cl&lt;br /&gt;
		xchg ax,dx	&lt;br /&gt;
		add al,cl&lt;br /&gt;
		imul cl&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		xor al,ah&lt;br /&gt;
		add al,4&lt;br /&gt;
		test al,-8&lt;br /&gt;
	pop dx&lt;br /&gt;
	loopz fx5L&lt;br /&gt;
	sub cx,bp&lt;br /&gt;
	xor al,cl&lt;br /&gt;
	aam tunnel_pattern; VERY slow on dosbox, but ok&lt;br /&gt;
	add al,tunnel_base_color&lt;br /&gt;
ret&lt;br /&gt;
fx6: ; ocean night / to day sky&lt;br /&gt;
	sub dh,120&lt;br /&gt;
	js fx6q&lt;br /&gt;
	mov [bx+si],dx&lt;br /&gt;
	fild word [bx+si]&lt;br /&gt;
	fidivr dword [bx+si]&lt;br /&gt;
	fstp dword [bx+si-1]&lt;br /&gt;
	mov ax,[bx+si]&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,128&lt;br /&gt;
	dec ax&lt;br /&gt;
fx6q:&lt;br /&gt;
ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Bonus - NFO/ASCII==&lt;br /&gt;
&lt;br /&gt;
  &amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
                                                             art : hammerfist&lt;br /&gt;
         ∂#MW%e                              _d$Ng,&lt;br /&gt;
         'B,  ∂b                   _jM@$QZb,cQ&amp;quot;  )@&lt;br /&gt;
  ,edRB$b,l@   Wk,yGR$KM&amp;amp;$b,     ,dP&amp;quot;     Wl ]bsd%UR8BG6&amp;amp;$@DSyG#ZKM&amp;amp;$b,&lt;br /&gt;
,dP      &amp;quot;T%L  'MGF      &amp;quot;*∂R_   Tg    &amp;quot;*4Zk,#I  YP   W&amp;quot;    7P      &amp;quot;*∂R&lt;br /&gt;
4M   gd@    ^   ∂@   d@b   dQ$#@Z@R3L_    &amp;quot;*GMj  'W      ,gd$   d@b   9Q$#%b&lt;br /&gt;
W#,  `M          Wb  `*  _4P   `Qk  *#N8L   `H5   @b   'QR7YK   `*  _4F&amp;quot;   Qk&lt;br /&gt;
`6@L             dML            '@          ,BK   'M    ∂B  *b,            '#L&lt;br /&gt;
  ^QBb,_     _,4&amp;amp;M∞∂@=,_       _dGL       _gQKM    GL    @k  'Mg,_         _dG,&lt;br /&gt;
    &amp;quot;*BN5W$2#MNP&amp;quot;   &amp;quot;*G3WRM8&amp;amp;B5P&amp;quot;`Y@QNW3Z5P&amp;quot; ∂#$W8BRM3XZN87    &amp;quot;*GW38M%EBDW5P&amp;quot;`&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
                              p r e s e n t s&lt;br /&gt;
&lt;br /&gt;
            4&lt;br /&gt;
           d@,&lt;br /&gt;
         _&amp;amp; `Wl&lt;br /&gt;
      _,aP   &amp;quot;#baedM$#@@K JP*&amp;quot;?ML&lt;br /&gt;
 ,ad@$#P&amp;quot;         ,d@NEWVB&amp;quot;     X,aQPYb,_&lt;br /&gt;
V@Mm,_          ,d@MW#BW'      EMP&amp;quot;   '¶R ,ngBP^fML&lt;br /&gt;
 ¶M@N@y        Y#BNW#M&amp;quot;       J9&amp;quot;      `MQ9&amp;quot;      &amp;quot;MgRBq  ,QBMg,&lt;br /&gt;
  VN#P` ,d@@    `WM@^                   7f         ¶F` 7kY&amp;quot;   ^G  _.eQNE1.&lt;br /&gt;
   ]B _G@MWN$,   `P                     '     4b       QP      ¶w@F*^  ^Qb&lt;br /&gt;
   ]O@NRM#W@MNB,         ;                    ^`      j        JP^       Yl&lt;br /&gt;
  J#NRNWM@#BcT&amp;quot;^        ,A  _J                     _q@                   `X&lt;br /&gt;
 '¶WM#B@WdY`,7        _G#YN#PM                 _,gG&amp;quot;                      M,&lt;br /&gt;
  *BN#WP&amp;quot;  dK       ,Q@NRMB&amp;quot;]9       ,      _,M@Q*                        #A&lt;br /&gt;
   &amp;quot;U^      V@h,   iNBW#NT  J'      J9     s@QN&amp;quot;         _;               'D,&lt;br /&gt;
             ¶RMBv&amp;amp;NMQR@9  .W      .K'     &amp;quot;9`         ,6BA   _JL          ]l&lt;br /&gt;
              Y#NE@W#NRP   #[      `¶8               _d@MW#B_jW#W          BN&lt;br /&gt;
               &amp;quot;GQ@MR#W    QL_      *B            _,p#NBW#NQMG@WY          3Q&lt;br /&gt;
                  &amp;quot;Y@F     ,XW@M%im,_Yb_     _,g5@#MW@QMNE@E@NRMB         ,WM&lt;br /&gt;
                    `  _,gP*&amp;quot;#REM#GB@N#MQbnd@N#M@MW#R8QSB^'WQERM@        ;4NB,&lt;br /&gt;
                     ,GYKL    ¶E#B8R8QSB@M@#BM#W@MNB&amp;quot;`_  ,  &amp;quot;^` N       ,dW@Ql&lt;br /&gt;
                   _Q`'W`*t    '¶@GS#MBQ#E@W#NQBW[     'LvQ_   ,K    _dNABGM#N&lt;br /&gt;
                  ,F   '          `^WAB@QGE9*&amp;quot;9^*@L    jP7FY,  ¶h,_.jWM#BR#GBM,&lt;br /&gt;
                 J;    ,   _                  '       '   &amp;quot;LL  YxE#B8R8QSBNW@W;&lt;br /&gt;
                AP   _,Ag6^          _   J                  ¶A  `&amp;quot;Q#M@MW#R8E#P&lt;br /&gt;
               j@   `&amp;quot;XQW[            'LvK,_      'L_,/      @t    Y#NE@WNR&amp;quot;&lt;br /&gt;
              :M/     9^*@L           jP7F&amp;quot;       _PYKL     _,A;     ¶RSNQ&amp;quot;&lt;br /&gt;
              dKL     '     `        '   &amp;quot;L      &amp;quot;`'W`*t   `&amp;quot;XQb      `W^&lt;br /&gt;
              Q`8t            'L_,/         ,   _   '        9^Q&lt;br /&gt;
             ,W               _PYKL       _,Ag6^             ' W,     _ ,#N&amp;amp;&lt;br /&gt;
             !N  _   J       &amp;quot;`'W`*t     `&amp;quot;XQW[       _  J     N!_JG9^RwQ' *t&lt;br /&gt;
             `W,  'LvK,_        '        _gGB8@L   _   'LvK,_ ,WgB'    V    7L&lt;br /&gt;
         _.,gm&amp;amp;@B&amp;amp;wBZF&amp;quot;                j@'`  &amp;quot;WL _gML  jZd7Yb lN&amp;quot;          dBWl&lt;br /&gt;
      ,g&amp;amp;QB*&amp;quot;^`    `&amp;quot;*G@g, .gR&amp;amp;k,_   ,N&amp;quot;      '@QF  ¶k;gMF  *QvQ     jQ, ,@N@B#,&lt;br /&gt;
   .eQF*`              `Yb@&amp;quot;  &amp;quot;*6Qg,gF     ,   7     XMN&amp;quot;    'MNB,    ^¶QWSER@N;&lt;br /&gt;
 ,gP&amp;quot;           qy,      W'       ^Q'     &amp;amp;L      ,g@W'       `QMEL     `&amp;quot;WBNWP&lt;br /&gt;
g7              ¶9      ,X         M?     9&amp;quot;   _q8MSK           ¶EMt       *@K&lt;br /&gt;
Vh   _,m#L             _AH        le         ,GBDNE9^A,          *@F        NMg&lt;br /&gt;
 ¶L,qQ@ND           _.m@Bl        We      ,gM@B8#Q'   ¶h_                   lWE,&lt;br /&gt;
  W9NHW@`          JWM#B@]        @e     4WR@NGF^      'QL                  dRWl&lt;br /&gt;
   VMd*            &amp;quot;@BE@PM        'N      *UP&amp;quot;           VW,               JRSB;&lt;br /&gt;
  ,@F       j       `¶WK W,        ¶t                     XNt            _A@E#N&lt;br /&gt;
_JP       ,6&amp;amp;         &amp;quot;GLdM         XD,               _.g8NMA@k,_    _,gG#NMGR;&lt;br /&gt;
&amp;quot;Z      .JRER           'VMi     _jNB#W&amp;amp;_         _,j@E@W#Nl ¶MBGMNQGNQMG@QBW9&lt;br /&gt;
 ¶h   ,G@NRMBl            `&amp;quot;   ,d#R@M$F ¶Mg,_.gp&amp;amp;@@NEWVBWBMG  *QMN8R8SBN$E@WF&lt;br /&gt;
  Vb dW#R8QSRb,                 *YM@EQ,_ 'MENBW#NQMG#B@R@MW#l   &amp;quot;BM@QNENRQG'&lt;br /&gt;
   *WGS#MBMNEYL                    `^&amp;quot;*8M@Q@NRM#W@BWSNW@QBF&amp;quot;`     `^*@QBF^ [HFT]&lt;br /&gt;
    ^M@MW#Q9 ^Wt                           `^¶RQ@W8NQGP*`&lt;br /&gt;
     ¶Q#@P     Vk                            lA `&amp;quot;^`&lt;br /&gt;
      Y&amp;quot;       `MA                           J#,&lt;br /&gt;
                *R@,                        ,MQl&lt;br /&gt;
                 Y#Wk,                      GWM8L&lt;br /&gt;
                  W8RQSt,_                 AQ@MR#,&lt;br /&gt;
                  `@M@#SB@Mbm.,_          QNBW#NW&lt;br /&gt;
                    ¶QB8R8SBN$WNRM@#GNtwg@NMQR@B'&lt;br /&gt;
                     *MBQ#8R8QS@NE@WNBW#NQMG@NR;&lt;br /&gt;
                      `WGS#MBQ#R8QSB@NE@W#NQBW9&lt;br /&gt;
                        *OMW@QMNE@E@NRMW@QMB@*&lt;br /&gt;
                          `^&amp;quot;YQW@Q#SB#NE@EGP&lt;br /&gt;
                               `^&amp;quot;*8R@GBQF`&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Memories&amp;diff=556</id>
		<title>Memories</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Memories&amp;diff=556"/>
				<updated>2020-05-05T07:54:18Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Update==&lt;br /&gt;
The final FreeDos version is complete; you can [https://www.youtube.com/watch?v=wlW84fEHngM watch it here]. It includes the Amiga Ball as an extra effect but lacks the music, since a normal modern PC booted into FreeDos cannot play MIDI that easily. The file size is still 256 bytes.&lt;br /&gt;
==Introduction==&lt;br /&gt;
&lt;br /&gt;
Hello! My name is &amp;quot;HellMood&amp;quot; and this is not a usual wiki-style article. It will be written from my point of view, in the first person. This article is all about a tiny MS DOS program called &amp;quot;Memories&amp;quot;. This program has a size of 256 bytes and won the &amp;quot;PC 256 byte&amp;quot; competition of the [https://en.wikipedia.org/wiki/Demoscene demoscene] event [https://2020.revision-party.net/ &amp;quot;Revision&amp;quot;] in 2020, as well as the public choice award. You can watch a video of the output of this program here (https://www.youtube.com/watch?v=Imquk_3oFf4) and a video including live reactions of the online audience and moderators here (https://www.twitch.tv/videos/589179638?t=7h02m40s). You can download the release and comment here (https://www.pouet.net/prod.php?which=85227). This wiki article will give you an in-depth analysis of the program and point out historical references as well as development steps. In case you did not notice, you are now on the &amp;quot;sizecoding wiki&amp;quot;. This wiki will help you not only to understand the internals of &amp;quot;Memories&amp;quot;, but also enable you to produce something like this yourself. Feel free to look around! If you're new to sizecoding and/or x86 assembler, it is recommended that you start with the [http://www.sizecoding.org/wiki/Getting_Started basics of this wiki]. While the principles might be easy to grasp, details of the implementation might not.&lt;br /&gt;
&lt;br /&gt;
==Disclaimer==&lt;br /&gt;
This article discusses the submitted DosBox version (256 bytes). The available archive contains versions for FreeDos and Windows XP DOS which, at the time of this writing, don't work on all computers. These alternative versions were included as a proof of concept, to show that this program does not only run on the emulator &amp;quot;DosBox&amp;quot;. For the &amp;quot;PC 256 bytes&amp;quot; competition of the event &amp;quot;Revision&amp;quot; in 2020, it was allowed to specify either &amp;quot;FreeDos&amp;quot; or &amp;quot;DosBox&amp;quot; as the platform, the latter with a specific configuration. It is in fact possible to modify the DosBox version to work on FreeDos, MS DOS, WinXP and Win98, as the released alternative versions showcase, but this is not the focus of this article. Anyway, safe versions for all platforms and all computers are in the works. Regarding possible further optimizations, I will discuss and explain only the submitted version, although I have already found several spots in the code that can be optimized. As the history of tiny effects shows, almost no code is perfect; most of it can be brought down further in size. To avoid confusion between different versions, only the submitted version will be explained.&lt;br /&gt;
&lt;br /&gt;
==History of tiny effects==&lt;br /&gt;
[[File:Pouet categories.png|thumb|Size categories on http://www.pouet.net]]&lt;br /&gt;
Sizecoders think in size categories. For MS DOS, these categories are 256b, 128b, 64b and 32b. These are the standards of one of the biggest demoscene archives, http://www.pouet.net. There is no 16b category, although many tiny effects can be implemented in 16 bytes. Almost all of the effects of &amp;quot;Memories&amp;quot; have been coded and optimized by me before, and most of the implementations were attempts to reduce the size of an already existing effect, or to do something similar in fewer bytes. Reducing the size in this case means reducing it to (one of) the next lower 2^N categories. For example, I might have seen an effect done in 33 to 64 bytes, and then reduced the size to 32 bytes or less. Almost every time I moved an effect to a lower category, I submitted the resulting tiny program to a demoscene party which allowed remote entries in the 256b category and/or published the result on http://www.pouet.net. In this section I will introduce the effects and mention the origins and authors.&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Kasparov.gif|frame|kasparov, 16 bytes]]&lt;br /&gt;
The origin of this effect is my own &amp;quot;Kasparov 16b&amp;quot; from 2018 (https://www.pouet.net/prod.php?which=75912). Apparently I &amp;quot;had this piece lying around, stuck at 17 bytes and not that beautiful, until i found a trick ;)&amp;quot; (quote from the release notes). There was a similar effect in 32 bytes before: &amp;quot;ew&amp;quot; by &amp;quot;headcrash&amp;quot; from 2003 (https://www.pouet.net/prod.php?which=17567). In this case, my design goal was to show &amp;quot;real&amp;quot; chessboards of 8x8 fields with recognizable dark and light squares, as well as the correct orientation of the single chessboards, meaning that the bottom right corner (h1) has to be a [https://en.wikibooks.org/wiki/Chess/Arranging_The_Board white square]. For &amp;quot;Memories&amp;quot;, the effect had to be reimplemented with another approach of writing to the screen to work with the framework; the scrolling direction was also changed to be different from the &amp;quot;scrolling tilted plane&amp;quot; effect.&lt;br /&gt;
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
The zooming circles were supposed to be entered at a demoparty as a 32 byte intro, but I never actually did it. There is no real predecessor in 64 bytes, because in the 64b category much more complex effects are possible. The zooming circles were the result of desperately trying to reach the 32b category for a circular &amp;quot;tunnel&amp;quot; effect, for which my own record is still 52 bytes (&amp;quot;Neontube&amp;quot;, 2016, https://www.pouet.net/prod.php?which=66808), which in turn was an optimization of the all-time classic 64 byte effect &amp;quot;constant evolution&amp;quot; by ryg/Farbrausch (2003, https://www.pouet.net/prod.php?which=8697). In the zooming circles routine, the distance and the angle are removed/ignored, so that the 32b category could be reached.&lt;br /&gt;
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Floorcast variation.png|thumb|Floorcast, 32 bytes version, variation]]&lt;br /&gt;
The scrolling tilted plane is one of my own releases, &amp;quot;floorcast 32b&amp;quot; from 2018. Floorcasts have their own history in sizecoding: the effect was brought down stepwise from 256 bytes to 32 bytes. The versions differ in the number of planes, some show two planes, others only one. For the &amp;quot;floorcast 32b&amp;quot; release I specifically decided against the &amp;quot;XOR&amp;quot; texture, whereas in &amp;quot;Memories&amp;quot; I used it again, but disguised with another final touch of &amp;quot;AND&amp;quot;.&lt;br /&gt;
* &amp;quot;rain_storm&amp;quot; version 2008 - 256 bytes - https://www.pouet.net/prod.php?which=50479&lt;br /&gt;
* &amp;quot;org_100h&amp;quot; version 2008 - 128 bytes - https://www.pouet.net/prod.php?which=63271&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2013 - 86 bytes - https://www.pouet.net/prod.php?which=61070&lt;br /&gt;
* &amp;quot;Baudsurfer&amp;quot; version 2014 - 64 bytes - https://www.pouet.net/prod.php?which=63305&lt;br /&gt;
* &amp;quot;HellMood&amp;quot; version 2018 - 32 bytes - https://www.pouet.net/prod.php?which=77774&lt;br /&gt;
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Projektbeschreibung.gif|frame|Projektbeschreibung, 32 bytes]]&lt;br /&gt;
The parallax checkerboards were released by me as the 32 byte effect [https://www.pouet.net/prod.php?which=76788 &amp;quot;Projektbeschreibung&amp;quot;] in 2018. It was a direct attempt to bring [https://www.pouet.net/prod.php?which=28543 &amp;quot;Follow the light&amp;quot;] from &amp;quot;Digimind&amp;quot; (2006) or my own [https://www.pouet.net/prod.php?which=63293 &amp;quot;Lucy&amp;quot;] (2014) down to 32 bytes. Very helpful inspiration came from [https://www.pouet.net/prod.php?which=32281 &amp;quot;Rrrolas&amp;quot; &amp;quot;Paralaxa&amp;quot;] (32 bytes, 2007) and [https://www.pouet.net/prod.php?which=61075 &amp;quot;Sensenstahl&amp;quot;s &amp;quot;Byteropolis&amp;quot;] (32 bytes, 2013). In fact, Rrrola's rendering approach was already close to the final solution; the code was modified to correct the arrangement of the planes, change the shape from triangles to checkerboards and improve the colors. In &amp;quot;Memories&amp;quot;, the coloring of &amp;quot;Digimind&amp;quot;s version was used. Also, the effect has been modified to reduce shearing as much as possible.&lt;br /&gt;
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Colpinski.png|thumb|colpinski, 16 bytes]]&lt;br /&gt;
[[File:Rotastic.png|thumb|left|rotastic, 32 bytes]]&lt;br /&gt;
This consists of two effects: the rotozoomer, and the Sierpinski effect as a texture. The Sierpinski effect is based on my own &amp;quot;Colpinski 16b&amp;quot; from 2013 (https://www.pouet.net/prod.php?which=62079) where &amp;quot;frag&amp;quot; of &amp;quot;fsqrt&amp;quot; helped push the limits. There is no real predecessor to that, because it is known that this effect can be done directly by combining X and Y, instead of an Iterated Function System or Cellular Automaton approach. The rotozoomer was released by me as the 32b intro &amp;quot;rotastic&amp;quot; in 2017 (https://www.pouet.net/prod.php?which=70112). It is based on the ideas of &amp;quot;ryg&amp;quot; of &amp;quot;Farbrausch&amp;quot; (51 bytes, 2002, https://www.pouet.net/prod.php?which=5015) and &amp;quot;Gargaj&amp;quot; of &amp;quot;Conspiracy&amp;quot; (49 bytes, 2002, https://www.pouet.net/prod.php?which=6088).&lt;br /&gt;
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Intoanew.gif|frame|Into a new era, 64 bytes version]]&lt;br /&gt;
The raycast bent tunnel is a tweaked version of my own 64 bytes &amp;quot;Into a new era&amp;quot; release (2018, https://www.pouet.net/prod.php?which=78044). The custom colors were exchanged for colors of the standard palette, the geometry was tweaked a bit and changed accordingly, so that no depth effect was used for the calculation of texture values. The standalone version of this effect has a size of 50 bytes. The two main inspirations for the 64 bytes version were two 128 byte intros, &amp;quot;Spongy&amp;quot; by &amp;quot;TBC&amp;quot; (2009, https://www.pouet.net/prod.php?which=53871) and &amp;quot;Wolf128&amp;quot; from &amp;quot;Baudsurfer&amp;quot; (2014, https://www.pouet.net/prod.php?which=63518), while the particular algorithm was developed independently.&lt;br /&gt;
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Ocean.png|thumb|Ocean, 64b version]]&lt;br /&gt;
The ocean effect is based on my own 64 bytes release &amp;quot;Ocean&amp;quot; from 2016 (https://www.pouet.net/prod.php?which=66857). The custom color generation and music generation have been stripped from the original, since neither was compatible with the main framework of &amp;quot;Memories&amp;quot; without consuming a LOT of extra space. The special effect of &amp;quot;dawn&amp;quot; happens as a byproduct of the general framework approach. This will be explained in the next chapter.&lt;br /&gt;
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
The transition between two effects is an effect in itself, which has no real predecessor. It is rather an idea that evolved over the years, and might have been done by many others in similar ways. In a nutshell, while calculating a frame, a pseudorandom value is derived from each pixel position, and the time, which determines the effect to use, is offset by a scaled version of this value. This allows the usage of the standard VGA palette (picture, source) rather than creating custom colors for real crossfading and thus saves space.&lt;br /&gt;
&lt;br /&gt;
==Tiny megademo framework==&lt;br /&gt;
The tiny megademo framework is based on ideas from [https://www.pouet.net/prod.php?which=68020 &amp;quot;Quatro&amp;quot;], done by &amp;quot;Sensenstahl&amp;quot; and me. To put many tiny effects together into a &amp;quot;megademo&amp;quot;, they all have to use the same approach and they have to be mostly free of assumptions (about the contents of memory and registers). They also have to use the same timing values and be tuned to a shared timing in general. Preparing the single effects for inclusion in a framework like this took quite some time and required a lot of extra space initially. It has to be noted that, sadly, some of my most impressive effects (judged by audience reactions and views on social media) could not be included at all, because the approaches could not be combined without a HUGE overhead. Once all effects were streamlined, I could think about &amp;quot;factoring out&amp;quot; common calculations which occur repeatedly, resulting in a few saved bytes. The framework does the following:&lt;br /&gt;
* Setting up screen mode to 320 x 200 pixels in 256 colors&lt;br /&gt;
* Initializing a pointer to the screen&lt;br /&gt;
* Installing a callback for timing and music&lt;br /&gt;
* Mainloop&lt;br /&gt;
** Calculation of X and Y from the current screen location&lt;br /&gt;
*** with the [http://www.sizecoding.org/wiki/General_Coding_Tricks#Obtaining_X_and_Y_without_DIV_.28The_Rrrola_Trick.29 &amp;quot;Rrrola Trick&amp;quot;]&lt;br /&gt;
** P = Pseudorandom value from screen location&lt;br /&gt;
** Offset current time T by a scaled value P'&lt;br /&gt;
** Selection of effect number N according to T'&lt;br /&gt;
** Execution of effect N (X,Y) for current pixel&lt;br /&gt;
*** On entrance, XY is in register DX (DL,DH)&lt;br /&gt;
*** On entrance, the current time is in register BP&lt;br /&gt;
*** On exit, the pixel color is expected in AL&lt;br /&gt;
** Advance to next pixel&lt;br /&gt;
** Triple diagonal interlacing for smoothing&lt;br /&gt;
** Repeat until frame is complete&lt;br /&gt;
* Set timer to a tempo of ~35 FPS&lt;br /&gt;
* Check keyboard for ESC&lt;br /&gt;
** Quit on ESC, otherwise continue&lt;br /&gt;
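The X and Y calculation of the main loop can be tried outside of DOS. The following Python sketch models the 16 bit multiplication with the constant 0xCCCD together with the screen segment that starts 10 paragraphs (160 bytes) early. The function names screen_pointer and rrrola_xy are made up for this illustration and are not part of the intro.&lt;br /&gt;

```python
RRROLA = 0xCCCD   # 52429, close to 65536 * 256 / 320


def screen_pointer(x, y):
    # DI for pixel (x, y): ES = 0xA000 - 10 starts 160 bytes before the screen
    return y * 320 + x + 160


def rrrola_xy(di):
    # mov ax,0xcccd / mul di: the high word of the 32 bit product lands in DX
    dx = ((di % 65536) * RRROLA) >> 16
    return dx % 256, dx >> 8   # (DL, DH) = (scaled X, row Y)


print(rrrola_xy(screen_pointer(0, 100)))     # left border of row 100
print(rrrola_xy(screen_pointer(160, 100)))   # middle of row 100
```

The left border of row 100 gives DL=128 (which is -128 as a signed byte) and DH=100, the middle of the row gives DL=0. In this model DH already reads 101 there, because the 160 byte offset pushes the right half of each line into the next multiple of 320.&lt;br /&gt;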
=== Code of framework===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
	mov al,0x13				; set AL to mode 320*200 in 256 colors&lt;br /&gt;
	int 0x10	 			; call BIOS to set mode&lt;br /&gt;
	xchg bp,ax				; set timing value to 0x13 &lt;br /&gt;
	push 0xa000-10			; write the screen address to register ES&lt;br /&gt;
	pop es					; works in conjunction with Rrrola trick&lt;br /&gt;
	mov ax,0x251c			; parameter for changing timer interrupt&lt;br /&gt;
	mov dl,timer			; address of timer routine, assume DH=1&lt;br /&gt;
	int 0x21				; install timer routine&lt;br /&gt;
top:&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
	call bx					; call the effect&lt;br /&gt;
	stosb					; write the return value and advance&lt;br /&gt;
	inc di					; triple interlace trick for after&lt;br /&gt;
	inc di					; effect and smoothing the animation&lt;br /&gt;
	jnz top					; repeat until the frame is complete&lt;br /&gt;
	mov al,tempo			; set AL to divider for timer&lt;br /&gt;
	out 40h,al				; set timing (dual pass)&lt;br /&gt;
	in al,0x60				; read keyboard&lt;br /&gt;
	dec al					; quit on ESC&lt;br /&gt;
	jnz top					; otherwise repeat loop&lt;br /&gt;
sounds: db 0xc3, 11, 0x93; 0xc3 is MIDI/RET; fx2-s is used as volume&lt;br /&gt;
table: 	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
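The value of tempo is not defined in this excerpt. The complete source defines it as 1193182/256/targetFPS with targetFPS set to 35. The Python sketch below repeats that arithmetic. It assumes that the timer channel takes its 16 bit divider as a low byte followed by a high byte, so that sending the same byte on two consecutive frames (the dual pass) fills both halves. That reading is an interpretation of the comment, not something the source states.&lt;br /&gt;

```python
PIT_CLOCK_HZ = 1193182   # input clock of the PC timer chip
TARGET_FPS = 35

# integer division, as in the assembler expression 1193182/256/targetFPS
tempo = PIT_CLOCK_HZ // 256 // TARGET_FPS

# assumption: the same byte ends up as low and as high half of the divider
divider = tempo * 256 + tempo
rate_hz = PIT_CLOCK_HZ / divider

print(tempo, divider, round(rate_hz, 2))
```

Under this assumption the single byte 133 yields a timer rate slightly below 35 ticks per second.&lt;br /&gt;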
&lt;br /&gt;
==Explanation of effects==&lt;br /&gt;
===Array of chessboards===&lt;br /&gt;
[[File:Array of chessboards.png|thumb|array of chessboards]]&lt;br /&gt;
The simplest effect and a good one to start with. After offsetting the row by the time, the classic XOR pattern is applied. To achieve the impression of a grid of chessboards, all bits but two are set in the color. The real trick lies in shifting to a &amp;quot;good&amp;quot; palette spot. The portion of the frame which is perceived as black isn't actually black, it's the dark part of the [https://en.wikipedia.org/wiki/Video_Graphics_Array#Color_palette standard VGA palette]. Shifting that way also makes the dark and light squares look like they are from an ancient chessboard.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	xchg dx,ax		; get XY into AX&lt;br /&gt;
	sub ax,bp		; subtract time from row&lt;br /&gt;
	xor al,ah		; XOR pattern (x xor y)&lt;br /&gt;
	or al,0xDB		; pattern for array of boards&lt;br /&gt;
	add al,13h		; shift to good palette spot&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
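As a cross-check, the five instructions can be modelled with plain integers. This is a sketch in Python, with the hypothetical function name chessboards, taking DL, DH and the time as arguments:&lt;br /&gt;

```python
def chessboards(dl, dh, t):
    ax = (dh * 256 + dl - t) % 65536   # xchg dx,ax / sub ax,bp
    al = (ax % 256) ^ (ax >> 8)        # xor al,ah
    al |= 0xDB                         # or al,0xDB leaves only bits 2 and 5 free
    return (al + 0x13) % 256           # add al,13h wraps around at 256


colors = sorted({chessboards(x, y, 0) for x in range(256) for y in range(256)})
print([hex(c) for c in colors])
```

Because only two bits survive the OR, the whole effect uses exactly four palette entries: 0x0E, 0x12, 0xEE and 0xF2.&lt;br /&gt;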
&lt;br /&gt;
===Zooming circles===&lt;br /&gt;
[[File:Zooming circles.png|thumb|zooming circles]]&lt;br /&gt;
The distance D of a point (X,Y) to the center (0,0) is sqrt(X²+Y²). The framework already makes sure that DL contains a centered X coordinate, but DH=Y still has to be centered. Performing a square root on x86 requires quite a bit of code, but it can actually be omitted. With a clever selection of colors, the impression of zooming circles without a square root is quite decent.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov al,dh		; get Y in AL&lt;br /&gt;
	sub al,100		; align Y vertically&lt;br /&gt;
	imul al			; AX = Y²&lt;br /&gt;
	xchg dx,ax		; Y²/256 in DH, X in AL&lt;br /&gt;
	imul al			; AX = X²&lt;br /&gt;
	add dh,ah		; DH = (X² + Y²)/256&lt;br /&gt;
	mov al,dh		; AL = (X² + Y²)/256&lt;br /&gt;
	add ax,bp		; offset color by time&lt;br /&gt;
	and al,8+16		; select special rings&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
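The same calculation as an integer model in Python. The names circles and signed8 are chosen for this sketch only. Note that the routine adds the two high bytes of the squares, which is close to, but not always identical with, the high byte of the sum.&lt;br /&gt;

```python
from operator import and_   # and_(a, b) is the bitwise AND of a and b


def signed8(v):
    # interpret the low 8 bits of v as a signed byte
    v %= 256
    return v - 256 if v >= 128 else v


def circles(dl, dh, t):
    y = signed8(dh - 100)                   # mov al,dh / sub al,100
    x = signed8(dl)                         # X is already centered
    d = ((y * y) >> 8) + ((x * x) >> 8)     # imul al twice / add dh,ah
    return and_((d + t) % 256, 8 + 16)      # add ax,bp / and al,8+16


print([circles(x, 100, 0) for x in range(0, 128, 16)])
```

Increasing the time argument shifts the rings, since it is added to the squared distance before the two color bits are selected.&lt;br /&gt;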
&lt;br /&gt;
===Scrolling tilted plane===&lt;br /&gt;
[[File:Tiltplane.png|thumb|scrolling tilted plane]]&lt;br /&gt;
This effect is achieved as follows. First, a large constant is divided by the row number Y to simulate distance. Then the resulting value is used twice: a) scaled by a centered X value and b) offset by the current time. These two results are then combined by XOR, and a special bit pattern of the result is selected.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
	mov ax,0x1329	; initialize with constant&lt;br /&gt;
	add dh,al		; preventing divide overflow&lt;br /&gt;
	div dh			; reverse divide AL = C/Y'&lt;br /&gt;
	xchg dx,ax		; DL = C/Y', AL = X&lt;br /&gt;
	imul dl			; AH = CX/Y'&lt;br /&gt;
	sub dx,bp		; DL = C/Y'-T 	&lt;br /&gt;
	xor ah,dl		; AH = (CX/Y') ^ (C/Y'-T)&lt;br /&gt;
	mov al,ah		; move to AL&lt;br /&gt;
	and al,4+8+16	; select special pattern&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
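A Python model of the routine, as a sketch. The function name tilted_plane is made up, and the model is only meant for rows that occur on screen (DH from 0 to 204), where the divisor stays between 41 and 245 and the 8 bit division cannot overflow.&lt;br /&gt;

```python
from operator import and_   # and_(a, b) is the bitwise AND of a and b


def signed8(v):
    # interpret the low 8 bits of v as a signed byte
    v %= 256
    return v - 256 if v >= 128 else v


def tilted_plane(dl, dh, t):
    row = (dh + 0x29) % 256                      # add dh,al (AL = 0x29)
    depth = 0x1329 // row                        # div dh, quotient in AL
    u = ((signed8(dl) * signed8(depth)) >> 8) % 256   # imul dl, result in AH
    v = (depth - t) % 256                        # sub dx,bp, only DL is used
    return and_(u ^ v, 4 + 8 + 16)               # xor ah,dl / and al,4+8+16


print([tilted_plane(x, 60, 0) for x in range(0, 256, 32)])
```

The constant does double duty here: its low byte 0x29 is the row offset that keeps the divisor away from zero, while the full word 0x1329 is the dividend.&lt;br /&gt;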
&lt;br /&gt;
===Parallax checkerboards===&lt;br /&gt;
[[File:Paralax.png|thumb|parallax checker boards]]&lt;br /&gt;
This is a kind of raycasting with dynamic geometry. Each object plane is split horizontally because of the signed 16 bit multiplication and vertically because of an implicit logic column number operation. Additionally, implicit shearing is applied, so that the edges of the resulting grid (4 &amp;quot;solid&amp;quot; patches, 4 &amp;quot;transparent&amp;quot; patches, alternating) connect. If the ray hits one of the solid patches, the color becomes the iteration number (plus a palette offset into the grayscale). If not, the plane is shifted by the screen pointer and the process is repeated until the maximum iteration number is reached.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cx,bp		; set initial point to time&lt;br /&gt;
	mov bx,-16		; limit to 16 iterations&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di		; offset point by screenpointer&lt;br /&gt;
	mov ax,819		; magic, related to Rrrola constant&lt;br /&gt;
	imul cx			; get X',Y' in DX&lt;br /&gt;
	ror dx,1		; set carry flag on &amp;quot;hit&amp;quot;&lt;br /&gt;
	inc bx			; increment iteration count&lt;br /&gt;
	ja fx3L			; loop until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	lea ax,[bx+31]	; map value to standard gray scale&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
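The loop relies on two flags at once: the rotate puts bit 0 of DX into the carry flag, the increment sets the zero flag when the counter reaches zero, and the conditional jump only loops while both are clear. A Python sketch of that logic, with the hypothetical names parallax and signed16:&lt;br /&gt;

```python
def signed16(v):
    # interpret the low 16 bits of v as a signed word
    v %= 65536
    return v - 65536 if v >= 32768 else v


def parallax(di, t):
    cx = t % 65536                           # mov cx,bp
    bx = -16                                 # mov bx,-16
    while True:
        cx = (cx + di) % 65536               # add cx,di
        dx = (819 * signed16(cx)) >> 16      # imul cx, high word of the product
        hit = dx % 2 == 1                    # ror dx,1 moves bit 0 into carry
        bx += 1                              # inc bx, zero flag when BX is 0
        if hit or bx == 0:                   # ja loops only without carry and zero
            return bx + 31                   # lea ax,[bx+31]


print(parallax(81, 0), parallax(80, 0), parallax(0, 0))
```

An early hit returns 16 and a ray that never hits returns 31, so the result always stays within the 16 gray entries of the standard palette.&lt;br /&gt;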
&lt;br /&gt;
===Sierpinski rotozoomer===&lt;br /&gt;
[[File:Sierpinski rotozoomer.png|thumb|sierpinski rotozoomer]]&lt;br /&gt;
[[File:1 div cos(atan(x)).png|thumb|1/cos(atan(x)), made with www.google.com]]&lt;br /&gt;
Rotozooming normally requires trigonometric functions, or good estimations of them. Let's look at the [https://en.wikipedia.org/wiki/Rotation_matrix#In_two_dimensions usual 2D rotation equation] and interpret it in a special way to get rid of the trigonometry in the calculation:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = x*cos(a) - y*sin(a)&lt;br /&gt;
y' = x*sin(a) + y*cos(a)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
If we add zooming to it, it looks like this:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = z * (x*cos(a) - y*sin(a))&lt;br /&gt;
y' = z * (x*sin(a) + y*cos(a))&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Now let's assume we don't specify z ourselves, and factor out cos(a):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = cos(a) * (x - y*tan(a))&lt;br /&gt;
y' = cos(a) * (x*tan(a) + y)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Finally, let's substitute tan(a) with our time variable, since the tangent function rises to infinity the closer we get to +-90°:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
x' = cos(atan(T)) * (x - y*T)&lt;br /&gt;
y' = cos(atan(T)) * (x*T + y)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
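The substitution can be checked numerically. The Python sketch below compares the trigonometry-free pair (x - y*T, x*T + y), which is what remains when the common cosine factor is left out, with an ordinary rotation by atan(T) that is scaled by sqrt(1 + T*T), the same value as 1/cos(atan(T)). The function names rotate and rotozoom are made up for this check.&lt;br /&gt;

```python
import math


def rotate(x, y, a):
    # the usual 2D rotation by the angle a
    return x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a)


def rotozoom(x, y, t):
    # trigonometry-free form with T in place of tan(a), cosine factor left out
    return x - y * t, x * t + y


t = 0.5
zoom = math.sqrt(1.0 + t * t)   # equals 1 / cos(atan(t))
print(rotozoom(3.0, -2.0, t))
print(tuple(zoom * c for c in rotate(3.0, -2.0, math.atan(t))))
```

Both lines print the same point up to rounding, for any finite T.&lt;br /&gt;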
As long as we don't care about specifying the zoom factor and don't directly control the angle, we can now rotate between -90° and +90° without using trigonometric functions. As a byproduct, the zoom factor is now tied to the time T. The function for the zoom factor is shown in the image; it results in zooming from infinitely small to one (original) and back to infinitely small. Quite a few bytes have been spent to beautify the effect: correcting the timing offset, speeding up the animation, enlarging the Sierpinski pixels and getting decent colors, but I thought it was worth the space.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	lea cx,[bp-2048]; center time to pass zero&lt;br /&gt;
	sal cx,3		; speed up by factor 8!&lt;br /&gt;
	movzx ax,dh		; get X into AL&lt;br /&gt;
	movsx dx,dl		; get Y into DL&lt;br /&gt;
	mov bx,ax		; save X in BX&lt;br /&gt;
	imul bx,cx		; BX = X*T&lt;br /&gt;
	add bh,dl		; BH = X*T/256+Y&lt;br /&gt;
	imul dx,cx		; DX = Y*T&lt;br /&gt;
	sub al,dh		; AL = X-Y*T/256&lt;br /&gt;
	and al,bh		; AL = (X-Y*T/256)&amp;amp;(X*T/256+Y)&lt;br /&gt;
	and al,252		; thicker sierpinski&lt;br /&gt;
	salc			; set pixel value to black&lt;br /&gt;
	jnz fx4q		; leave black if not sierpinski&lt;br /&gt;
	mov al,0x2A		; otherwise: a nice orange&lt;br /&gt;
	fx4q:&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
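An integer model of this routine in Python, as a sketch with the made-up names sierpinski and signed8. At time 2048 the scaled time is zero, so the rotation vanishes and the plain pattern (both coordinates combined by AND, with the two lowest bits masked out) becomes visible.&lt;br /&gt;

```python
from operator import and_   # and_(a, b) is the bitwise AND of a and b


def signed8(v):
    # interpret the low 8 bits of v as a signed byte
    v %= 256
    return v - 256 if v >= 128 else v


def sierpinski(dl, dh, t):
    cx = ((t - 2048) * 8) % 65536              # lea cx,[bp-2048] / sal cx,3
    bh = (((dh * cx) >> 8) + dl) % 256         # imul bx,cx / add bh,dl
    dh2 = ((signed8(dl) * cx) >> 8) % 256      # movsx dx,dl / imul dx,cx
    al = (dh - dh2) % 256                      # sub al,dh
    masked = and_(and_(al, bh), 252)           # and al,bh / and al,252
    return 0x2A if masked == 0 else 0          # salc / jnz fx4q / mov al,0x2A


for row in range(0, 64, 4):
    print(''.join('#' if sierpinski(col, row, 2048) else '.' for col in range(0, 64, 4)))
```

The printout is a coarse text rendering of the unrotated texture. Other time values feed the same AND test with the rotated and scaled coordinates.&lt;br /&gt;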
&lt;br /&gt;
===Raycast bent tunnel===&lt;br /&gt;
[[File:Raycastbenttunnel.png|thumb|raycast bent tunnel]]&lt;br /&gt;
This is a variation of the [https://www.pouet.net/prod.php?which=78044 raycast tunnel &amp;quot;into a new era&amp;quot;] (see also above). A [https://www.reddit.com/r/tinycode/comments/de7dj9/moving_cubes_with_pathtracing_lighting_and/ detailed explanation] has been posted to reddit for the [https://www.pouet.net/prod.php?which=83204 similar production &amp;quot;Essence&amp;quot;]. The custom colors have been removed, the bend direction is different, and the geometry is a bit more closed, to increase the performance on slower computers and DosBox.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov cl,-9		; start with depth 9 (moves backwards)&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx			; save DX, destroyed inside the loop&lt;br /&gt;
		mov al,dh	; Get Y into AL&lt;br /&gt;
		sub al,100	; Centering Y has to be done &amp;quot;manually&amp;quot;.&lt;br /&gt;
		imul cl		; Multiply AL=Y by the current distance, to get a projection(1)&lt;br /&gt;
		xchg ax,dx	; Get X into AL, while saving the result in DX (DH)&lt;br /&gt;
		add al,cl	; add distance to projection, (bend to the right)&lt;br /&gt;
		imul cl		; Multiply AL=X by the current distance, to get a projection(2)&lt;br /&gt;
		mov al,dh	; Get projection(1) in AL&lt;br /&gt;
		xor al,ah	; combine with projection(2)&lt;br /&gt;
		add al,4	; center the walls around 0&lt;br /&gt;
		test al,-8	; check if the wall is hit&lt;br /&gt;
	pop dx			; restore DX&lt;br /&gt;
	loopz fx5L		; repeat until &amp;quot;hit&amp;quot; or &amp;quot;iter=max&amp;quot;&lt;br /&gt;
	sub cx,bp		; offset depth by time&lt;br /&gt;
	xor al,cl		; XOR pattern for texture &lt;br /&gt;
	aam 6			; irregular pattern with MOD 6&lt;br /&gt;
	add al,20		; offset into grayscale palette&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
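The ray march can be modelled in Python as follows. This is a sketch under one loud assumption: the routine only loads CL, while the loop instruction counts in the full CX, so the model takes the upper half CH as a parameter and assumes it to be 0. The names tunnel, signed8 and ch are made up for the illustration.&lt;br /&gt;

```python
from operator import and_   # and_(a, b) is the bitwise AND of a and b


def signed8(v):
    # interpret the low 8 bits of v as a signed byte
    v %= 256
    return v - 256 if v >= 128 else v


def tunnel(dl, dh, t, ch=0):
    cx = ch * 256 + (-9) % 256                 # mov cl,-9 (CH is an assumption)
    while True:
        cl = signed8(cx)
        p1 = signed8(dh - 100) * cl            # centered Y times depth
        p2 = signed8(dl + cl) * cl             # bent X times depth
        al = (((p1 >> 8) ^ (p2 >> 8)) + 4) % 256   # xor of both AH / add al,4
        inside = and_(al, 0xF8) == 0           # test al,-8 sets the zero flag
        cx = (cx - 1) % 65536                  # loopz decrements CX first
        if not (inside and cx != 0):
            break
    cx = (cx - t) % 65536                      # sub cx,bp
    return ((al ^ cx) % 256) % 6 + 20          # xor al,cl / aam 6 / add al,20


print(tunnel(0, 100, 0))
```

Whatever the inputs, the remainder of the division by 6 keeps the result between 20 and 25, a narrow band inside the gray ramp of the standard palette.&lt;br /&gt;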
&lt;br /&gt;
===Ocean night to day===&lt;br /&gt;
[[File:Oceannight.png|thumb|left|oceannight]]&lt;br /&gt;
[[File:Oceanday.png|thumb|right|oceanday]]&lt;br /&gt;
The ocean effect is a wonderful example of a [https://en.wikipedia.org/wiki/Bob_Ross#Style &amp;quot;happy accident&amp;quot;]. If you load a value into the FPU as an integer, store it as a floating point value, and interpret it as an integer again, you get a cool pattern. Combine that with a reverse divide and you get a neat wavy effect. This works in tight conjunction with the register DX, which has a sign flip at the position we want, so we can easily separate sky and sea. The final touch is the coloring though. By design of the global framework, the value of AL is determined on function entrance: it contains the effect address. With a bit of code shuffling, the color of the sky came &amp;quot;for free&amp;quot;, without using instructions, as did the &amp;quot;fading to day&amp;quot; color, which is also an effect address. And that is no happy accident. In versions other than the originally submitted DosBox version, the sky color can vary for that reason.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	sub dh,120			; check if pixel is in the sky&lt;br /&gt;
	js fx6q				; quit if that's the case&lt;br /&gt;
	mov [bx+si],dx		; move XY to a memory location&lt;br /&gt;
	fild word [bx+si]	; read memory location as integer&lt;br /&gt;
	fidivr dword [bx+si]; reverse divide by constant&lt;br /&gt;
	fstp dword [bx+si-1]; store result as floating point&lt;br /&gt;
	mov ax,[bx+si]		; get the result into AX&lt;br /&gt;
	add ax,bp			; modify color by time&lt;br /&gt;
	and al,128			; threshold into two bands&lt;br /&gt;
	dec ax				; beautify colors to blue/black&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
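The integer to float to integer step on its own can be reproduced in a few lines of Python. This sketch only shows the reinterpretation for small integers. It does not model the reverse divide or the store at a byte offset that the routine adds on top, and the function name int_as_float_bits is made up.&lt;br /&gt;

```python
import struct


def int_as_float_bits(n):
    # store the integer n as a 32 bit float, then read the same bytes as an integer
    return int.from_bytes(struct.pack('>f', float(n)), 'big')


for n in range(1, 9):
    print(n, hex(int_as_float_bits(n)))
```

The exponent field jumps at every power of two while the mantissa bits ramp up in between, so consecutive inputs produce bands that double in width.&lt;br /&gt;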
&lt;br /&gt;
===Fading effect===&lt;br /&gt;
For convenience, here is the fading effect from the global framework in isolation. Basically it generates a pseudorandom value from the screen pointer, then offsets the time by a scaled value of it, then calls the destined effect.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
	mov ax,0xcccd			; load magic Rrrola constant&lt;br /&gt;
	mul di					; transform screen pointer to X, Y&lt;br /&gt;
	add al,ah				; use transformation garbage as&lt;br /&gt;
	xor ah,ah				; pseudorandom value and clear AH&lt;br /&gt;
	add ax,bp				; add time value to random value&lt;br /&gt;
	shr ax,9				; divide by 512 (basically the speed)&lt;br /&gt;
	and al,15				; filter effect number&lt;br /&gt;
	xchg bx,ax				; move effect number to BX&lt;br /&gt;
	mov bh,1				; reset BH to align with start of code&lt;br /&gt;
	mov bl,[byte bx+table]	; read the effect address from the table&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
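The selection logic as a Python sketch, with the hypothetical function name effect_index. It returns the table index that a pixel at screen pointer DI uses at time T, up to the point where the table is read.&lt;br /&gt;

```python
def effect_index(di, t):
    ax = ((di % 65536) * 0xCCCD) % 65536     # low word of mul di
    p = ((ax % 256) + (ax >> 8)) % 256       # add al,ah / xor ah,ah
    shifted = ((p + t) % 65536) >> 9         # add ax,bp / shr ax,9
    return shifted % 16                      # and al,15


t = 400
ahead = sum(1 for di in range(0, 65536, 3) if effect_index(di, t) == 1)
print(ahead, 'of', len(range(0, 65536, 3)), 'sampled pixels already show the next effect')
```

Since the pseudorandom value lies between 0 and 255 and one effect lasts 512 ticks, each pixel switches somewhere within the last 256 ticks before the nominal change, which produces the dissolve.&lt;br /&gt;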
&lt;br /&gt;
==MIDI music part==&lt;br /&gt;
This code section increments the timing value and produces sound. By choosing channel 3 we can reuse the &amp;quot;change channel instrument&amp;quot; instruction again as &amp;quot;RET&amp;quot;. By shifting around the code of the effects, a reasonable value for the volume could be created, which saves another byte. Be aware that this code sequence only works if the MIDI device is already in UART mode, otherwise three more bytes have to be spent. Quite a few people from the audience and sceners told me afterwards that the melody sounds a bit like [https://en.wikipedia.org/wiki/Incantations_(album) Incantations from Mike Oldfield], but it is rather created by a simple approach. Starting with [https://www.pouet.net/prod.php?which=65604 &amp;quot;Hypnoteye&amp;quot; in 2015], I experimented with procedural MIDI and eventually released a [https://www.pouet.net/prod.php?which=66313 small MIDI framework (64 bytes)]. The basic principle is to jump through tonal space with fixed steps, and map high values back to lower ones (by modulo). With simple combinations of step width and mod value, interesting effects can be produced. For example, step width 3 gives diminished chords which are never resolved, likewise step width 4, and step width 6 gives tritone effects. With a well chosen mod value, these patterns can create progressions. I have not done an appropriate theoretical analysis yet, but rather explored the tonal space and noted when something sounded interesting.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
sounds: db 0xc3, 11, 0x93, fx2-s&lt;br /&gt;
...&lt;br /&gt;
		inc bp				; increment timing value&lt;br /&gt;
		test bp, 7			; play a note every 8th step&lt;br /&gt;
		jnz nomuse			; quit if in between&lt;br /&gt;
		mov dx,0x330		; port number for MIDI&lt;br /&gt;
		mov si,sounds		; address for sound data&lt;br /&gt;
		outsb				; change instrument of channel 3&lt;br /&gt;
		outsb				; to vibraphone&lt;br /&gt;
		outsb				; play a note on channel 3&lt;br /&gt;
		imul ax,bp,-19*32*4	; the magic melody constant&lt;br /&gt;
		shr ax,10			; scale down and implicit &amp;quot;and 63&amp;quot;&lt;br /&gt;
		add al,22			; pitch base is 22&lt;br /&gt;
		out dx,al			; play THIS note on channel 3&lt;br /&gt;
		outsb				; play it with THIS volume&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
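The pitch calculation can be followed in Python. This sketch uses the made-up names note and melody and only models the three arithmetic instructions. For times that are multiples of 8 the product is a multiple of 1024, so the shift simply leaves the step count times -19, taken modulo 64.&lt;br /&gt;

```python
SCALE_MOD = -19 * 32 * 4   # the magic melody constant


def note(t):
    ax = (t * SCALE_MOD) % 65536    # imul ax,bp,-19*32*4 keeps 16 bits
    return (ax >> 10) + 22          # shr ax,10 / add al,22


def melody(steps):
    # a note is played whenever the low three bits of the time are zero
    return [note(8 * k) for k in range(steps)]


print(melody(8))
```

The sequence begins 22, 67, 48, 29, 74: each step moves 19 semitones down within a window of 64 and wraps around, which is the fixed step plus modulo principle described above.&lt;br /&gt;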
==Further tweaks==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==Differences between dos versions==&lt;br /&gt;
To be done ;)&lt;br /&gt;
==External Links==&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Demoscene Demoscene]&lt;br /&gt;
*[https://www.youtube.com/watch?v=Imquk_3oFf4 Youtube Capture]&lt;br /&gt;
*[https://www.twitch.tv/videos/589179638?t=7h02m40s Twitch Live Reactions]&lt;br /&gt;
*[https://www.pouet.net/prod.php?which=85227 Production Page on Pouet]&lt;br /&gt;
*[https://www.reddit.com/r/programming/comments/g1qhf8/memories_256_byte_msdos_intro/ Reddit/Programming Thread I]&lt;br /&gt;
*[https://www.reddit.com/r/programming/comments/g4ewh7/how_its_made_memories_a_256_byte_dos_intro_with_8/ Reddit/Programming Thread II]&lt;br /&gt;
*[https://github.com/cesarmiquel/memories-256b-msdos-intro Analysis by &amp;quot;cesarmiquel&amp;quot;]&lt;br /&gt;
*[https://hackaday.com/2020/04/21/a-jaw-dropping-demo-in-only-256-bytes/ Hackaday Article]&lt;br /&gt;
&lt;br /&gt;
==Original release code (complete)==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
; &amp;quot;memories&amp;quot; by HellMood/DESiRE&lt;br /&gt;
; the tiny megademo, 256 byte msdos intro&lt;br /&gt;
; shown in April 2020 @ REVISION&lt;br /&gt;
;&lt;br /&gt;
;   (= WILL BE COMMENTED IN DETAIL LATER =)&lt;br /&gt;
;&lt;br /&gt;
; create : nasm.exe memories.asm -fbin -o memories.com&lt;br /&gt;
; CHOOSE YOUR TARGET PLATFORM (compo version is dosbox)&lt;br /&gt;
; be sure to use the dosbox.conf from this archive!&lt;br /&gt;
; only ONE of the defines should be active!&lt;br /&gt;
%define dosbox			; size : 256 bytes&lt;br /&gt;
;%define freedos		; size : 230 bytes&lt;br /&gt;
;%define winxpdos		; size : 263 bytes&lt;br /&gt;
&lt;br /&gt;
; DON'T TOUCH THESE UNLESS YOU KNOW WHAT YOU'RE DOING&lt;br /&gt;
%ifdef winxpdos&lt;br /&gt;
	%define music&lt;br /&gt;
	%define switch_uart&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
	%define safe_segment&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	%define safe_dx&lt;br /&gt;
%endif&lt;br /&gt;
%ifdef dosbox&lt;br /&gt;
	%define music&lt;br /&gt;
	;%define safe_dx ; sometimes needed&lt;br /&gt;
%endif&lt;br /&gt;
&lt;br /&gt;
; GLOBAL PARAMETERS, TUNE WITH CARE!&lt;br /&gt;
%define volume 127	; not used on dosbox (optimization)&lt;br /&gt;
%define instrument 11&lt;br /&gt;
%define scale_mod -19*32*4; &lt;br /&gt;
%define time_mask 7&lt;br /&gt;
%define targetFPS 35&lt;br /&gt;
%define tempo 1193182/256/targetFPS		&lt;br /&gt;
%define sierp_color 0x2A&lt;br /&gt;
%define tunnel_base_color 20&lt;br /&gt;
%define tunnel_pattern 6&lt;br /&gt;
%define tilt_plate_pattern 4+8+16&lt;br /&gt;
%define circles_pattern 8+16&lt;br /&gt;
&lt;br /&gt;
org 100h&lt;br /&gt;
s:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov fs,ax&lt;br /&gt;
	mov [fs:0x46c],ax&lt;br /&gt;
%endif&lt;br /&gt;
	mov al,0x13&lt;br /&gt;
	int 0x10	 &lt;br /&gt;
	xchg bp,ax&lt;br /&gt;
	push 0xa000-10&lt;br /&gt;
	pop es&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	mov ax,0x251c&lt;br /&gt;
	%ifdef safe_dx	&lt;br /&gt;
		mov dx,timer	&lt;br /&gt;
	%else ; assume DH=1, mostly true on DosBox&lt;br /&gt;
		mov dl,timer&lt;br /&gt;
	%endif&lt;br /&gt;
	int 0x21&lt;br /&gt;
%endif&lt;br /&gt;
top:&lt;br /&gt;
%ifdef freedos&lt;br /&gt;
	mov bp,[fs:0x46c]&lt;br /&gt;
%endif	&lt;br /&gt;
	mov ax,0xcccd&lt;br /&gt;
	mul di&lt;br /&gt;
	add al,ah&lt;br /&gt;
	xor ah,ah&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	shr ax,9&lt;br /&gt;
	and al,15&lt;br /&gt;
	xchg bx,ax&lt;br /&gt;
	mov bh,1&lt;br /&gt;
	mov bl,[byte bx+table]&lt;br /&gt;
	call bx&lt;br /&gt;
	stosb&lt;br /&gt;
	inc di&lt;br /&gt;
	inc di&lt;br /&gt;
	jnz top&lt;br /&gt;
	mov al,tempo&lt;br /&gt;
	out 40h,al&lt;br /&gt;
	in al,0x60&lt;br /&gt;
	dec al&lt;br /&gt;
	jnz top&lt;br /&gt;
sounds:&lt;br /&gt;
	db 0xc3	; is MIDI/RET&lt;br /&gt;
%ifdef music&lt;br /&gt;
	db instrument,0x93&lt;br /&gt;
	%ifdef switch_uart&lt;br /&gt;
		db volume		; without switch, volume is in table&lt;br /&gt;
		db 0x3f &lt;br /&gt;
	%endif&lt;br /&gt;
%endif&lt;br /&gt;
table: ; first index is volume, change order with care!		    					&lt;br /&gt;
	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s&lt;br /&gt;
stop:&lt;br /&gt;
	pop ax&lt;br /&gt;
	ret&lt;br /&gt;
timer:&lt;br /&gt;
%ifndef freedos&lt;br /&gt;
	%ifdef safe_segment&lt;br /&gt;
		push cs&lt;br /&gt;
		pop ds&lt;br /&gt;
	%endif&lt;br /&gt;
		inc bp&lt;br /&gt;
	%ifdef music	&lt;br /&gt;
		test bp, time_mask&lt;br /&gt;
		jnz nomuse&lt;br /&gt;
		mov dx,0x330&lt;br /&gt;
		mov si,sounds&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		outsb&lt;br /&gt;
		imul ax,bp,scale_mod&lt;br /&gt;
		shr ax,10&lt;br /&gt;
		add al,22&lt;br /&gt;
		out dx,al&lt;br /&gt;
		outsb&lt;br /&gt;
		%ifdef switch_uart&lt;br /&gt;
			inc dx&lt;br /&gt;
			outsb&lt;br /&gt;
		%endif&lt;br /&gt;
	%endif&lt;br /&gt;
nomuse:&lt;br /&gt;
	iret&lt;br /&gt;
%endif	&lt;br /&gt;
fx0: ; tilted plane, scrolling&lt;br /&gt;
	mov ax,0x1329&lt;br /&gt;
	add dh,al&lt;br /&gt;
	div dh&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul dl&lt;br /&gt;
	sub dx,bp&lt;br /&gt;
	xor ah,dl&lt;br /&gt;
	mov al,ah&lt;br /&gt;
	and al,tilt_plate_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx2: ; board of chessboards&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	sub ax,bp&lt;br /&gt;
	xor al,ah&lt;br /&gt;
	or al,0xDB&lt;br /&gt;
	add al,13h&lt;br /&gt;
ret&lt;br /&gt;
fx1: ; circles, zooming&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	sub al,100&lt;br /&gt;
	imul al&lt;br /&gt;
	xchg dx,ax&lt;br /&gt;
	imul al&lt;br /&gt;
	add dh,ah&lt;br /&gt;
	mov al,dh&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,circles_pattern&lt;br /&gt;
ret&lt;br /&gt;
fx3: ; parallax checkerboards&lt;br /&gt;
	mov cx,bp&lt;br /&gt;
	mov bx,-16&lt;br /&gt;
fx3L:&lt;br /&gt;
	add cx,di&lt;br /&gt;
	mov ax,819&lt;br /&gt;
	imul cx	 &lt;br /&gt;
	ror dx,1	 &lt;br /&gt;
	inc bx	 &lt;br /&gt;
	ja fx3L&lt;br /&gt;
	lea ax,[bx+31]	 &lt;br /&gt;
ret&lt;br /&gt;
fx4: ; sierpinski rotozoomer	&lt;br /&gt;
	lea cx,[bp-2048]&lt;br /&gt;
	sal cx,3&lt;br /&gt;
	movzx ax,dh&lt;br /&gt;
	movsx dx,dl&lt;br /&gt;
	mov bx,ax&lt;br /&gt;
	imul bx,cx&lt;br /&gt;
	add bh,dl&lt;br /&gt;
	imul dx,cx&lt;br /&gt;
	sub al,dh&lt;br /&gt;
	and al,bh&lt;br /&gt;
	and al,0b11111100&lt;br /&gt;
	salc				; VERY slow on dosbox, but ok&lt;br /&gt;
	jnz fx4q&lt;br /&gt;
	mov al,sierp_color&lt;br /&gt;
	fx4q:&lt;br /&gt;
ret&lt;br /&gt;
fx5: ; raycast bent tunnel&lt;br /&gt;
	mov cl,-9&lt;br /&gt;
	fx5L: &lt;br /&gt;
	push dx&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		sub al,100&lt;br /&gt;
		imul cl&lt;br /&gt;
		xchg ax,dx	&lt;br /&gt;
		add al,cl&lt;br /&gt;
		imul cl&lt;br /&gt;
		mov al,dh&lt;br /&gt;
		xor al,ah&lt;br /&gt;
		add al,4&lt;br /&gt;
		test al,-8&lt;br /&gt;
	pop dx&lt;br /&gt;
	loopz fx5L&lt;br /&gt;
	sub cx,bp&lt;br /&gt;
	xor al,cl&lt;br /&gt;
	aam tunnel_pattern; VERY slow on dosbox, but ok&lt;br /&gt;
	add al,tunnel_base_color&lt;br /&gt;
ret&lt;br /&gt;
fx6: ; ocean night / to day sky&lt;br /&gt;
	sub dh,120&lt;br /&gt;
	js fx6q&lt;br /&gt;
	mov [bx+si],dx&lt;br /&gt;
	fild word [bx+si]&lt;br /&gt;
	fidivr dword [bx+si]&lt;br /&gt;
	fstp dword [bx+si-1]&lt;br /&gt;
	mov ax,[bx+si]&lt;br /&gt;
	add ax,bp&lt;br /&gt;
	and al,128&lt;br /&gt;
	dec ax&lt;br /&gt;
fx6q:&lt;br /&gt;
ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Bonus - NFO/ASCII==&lt;br /&gt;
&lt;br /&gt;
  &amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt; &lt;br /&gt;
                                                             art : hammerfist&lt;br /&gt;
         ∂#MW%e                              _d$Ng,&lt;br /&gt;
         'B,  ∂b                   _jM@$QZb,cQ&amp;quot;  )@&lt;br /&gt;
  ,edRB$b,l@   Wk,yGR$KM&amp;amp;$b,     ,dP&amp;quot;     Wl ]bsd%UR8BG6&amp;amp;$@DSyG#ZKM&amp;amp;$b,&lt;br /&gt;
,dP      &amp;quot;T%L  'MGF      &amp;quot;*∂R_   Tg    &amp;quot;*4Zk,#I  YP   W&amp;quot;    7P      &amp;quot;*∂R&lt;br /&gt;
4M   gd@    ^   ∂@   d@b   dQ$#@Z@R3L_    &amp;quot;*GMj  'W      ,gd$   d@b   9Q$#%b&lt;br /&gt;
W#,  `M          Wb  `*  _4P   `Qk  *#N8L   `H5   @b   'QR7YK   `*  _4F&amp;quot;   Qk&lt;br /&gt;
`6@L             dML            '@          ,BK   'M    ∂B  *b,            '#L&lt;br /&gt;
  ^QBb,_     _,4&amp;amp;M∞∂@=,_       _dGL       _gQKM    GL    @k  'Mg,_         _dG,&lt;br /&gt;
    &amp;quot;*BN5W$2#MNP&amp;quot;   &amp;quot;*G3WRM8&amp;amp;B5P&amp;quot;`Y@QNW3Z5P&amp;quot; ∂#$W8BRM3XZN87    &amp;quot;*GW38M%EBDW5P&amp;quot;`&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
                              p r e s e n t s&lt;br /&gt;
&lt;br /&gt;
            4&lt;br /&gt;
           d@,&lt;br /&gt;
         _&amp;amp; `Wl&lt;br /&gt;
      _,aP   &amp;quot;#baedM$#@@K JP*&amp;quot;?ML&lt;br /&gt;
 ,ad@$#P&amp;quot;         ,d@NEWVB&amp;quot;     X,aQPYb,_&lt;br /&gt;
V@Mm,_          ,d@MW#BW'      EMP&amp;quot;   '¶R ,ngBP^fML&lt;br /&gt;
 ¶M@N@y        Y#BNW#M&amp;quot;       J9&amp;quot;      `MQ9&amp;quot;      &amp;quot;MgRBq  ,QBMg,&lt;br /&gt;
  VN#P` ,d@@    `WM@^                   7f         ¶F` 7kY&amp;quot;   ^G  _.eQNE1.&lt;br /&gt;
   ]B _G@MWN$,   `P                     '     4b       QP      ¶w@F*^  ^Qb&lt;br /&gt;
   ]O@NRM#W@MNB,         ;                    ^`      j        JP^       Yl&lt;br /&gt;
  J#NRNWM@#BcT&amp;quot;^        ,A  _J                     _q@                   `X&lt;br /&gt;
 '¶WM#B@WdY`,7        _G#YN#PM                 _,gG&amp;quot;                      M,&lt;br /&gt;
  *BN#WP&amp;quot;  dK       ,Q@NRMB&amp;quot;]9       ,      _,M@Q*                        #A&lt;br /&gt;
   &amp;quot;U^      V@h,   iNBW#NT  J'      J9     s@QN&amp;quot;         _;               'D,&lt;br /&gt;
             ¶RMBv&amp;amp;NMQR@9  .W      .K'     &amp;quot;9`         ,6BA   _JL          ]l&lt;br /&gt;
              Y#NE@W#NRP   #[      `¶8               _d@MW#B_jW#W          BN&lt;br /&gt;
               &amp;quot;GQ@MR#W    QL_      *B            _,p#NBW#NQMG@WY          3Q&lt;br /&gt;
                  &amp;quot;Y@F     ,XW@M%im,_Yb_     _,g5@#MW@QMNE@E@NRMB         ,WM&lt;br /&gt;
                    `  _,gP*&amp;quot;#REM#GB@N#MQbnd@N#M@MW#R8QSB^'WQERM@        ;4NB,&lt;br /&gt;
                     ,GYKL    ¶E#B8R8QSB@M@#BM#W@MNB&amp;quot;`_  ,  &amp;quot;^` N       ,dW@Ql&lt;br /&gt;
                   _Q`'W`*t    '¶@GS#MBQ#E@W#NQBW[     'LvQ_   ,K    _dNABGM#N&lt;br /&gt;
                  ,F   '          `^WAB@QGE9*&amp;quot;9^*@L    jP7FY,  ¶h,_.jWM#BR#GBM,&lt;br /&gt;
                 J;    ,   _                  '       '   &amp;quot;LL  YxE#B8R8QSBNW@W;&lt;br /&gt;
                AP   _,Ag6^          _   J                  ¶A  `&amp;quot;Q#M@MW#R8E#P&lt;br /&gt;
               j@   `&amp;quot;XQW[            'LvK,_      'L_,/      @t    Y#NE@WNR&amp;quot;&lt;br /&gt;
              :M/     9^*@L           jP7F&amp;quot;       _PYKL     _,A;     ¶RSNQ&amp;quot;&lt;br /&gt;
              dKL     '     `        '   &amp;quot;L      &amp;quot;`'W`*t   `&amp;quot;XQb      `W^&lt;br /&gt;
              Q`8t            'L_,/         ,   _   '        9^Q&lt;br /&gt;
             ,W               _PYKL       _,Ag6^             ' W,     _ ,#N&amp;amp;&lt;br /&gt;
             !N  _   J       &amp;quot;`'W`*t     `&amp;quot;XQW[       _  J     N!_JG9^RwQ' *t&lt;br /&gt;
             `W,  'LvK,_        '        _gGB8@L   _   'LvK,_ ,WgB'    V    7L&lt;br /&gt;
         _.,gm&amp;amp;@B&amp;amp;wBZF&amp;quot;                j@'`  &amp;quot;WL _gML  jZd7Yb lN&amp;quot;          dBWl&lt;br /&gt;
      ,g&amp;amp;QB*&amp;quot;^`    `&amp;quot;*G@g, .gR&amp;amp;k,_   ,N&amp;quot;      '@QF  ¶k;gMF  *QvQ     jQ, ,@N@B#,&lt;br /&gt;
   .eQF*`              `Yb@&amp;quot;  &amp;quot;*6Qg,gF     ,   7     XMN&amp;quot;    'MNB,    ^¶QWSER@N;&lt;br /&gt;
 ,gP&amp;quot;           qy,      W'       ^Q'     &amp;amp;L      ,g@W'       `QMEL     `&amp;quot;WBNWP&lt;br /&gt;
g7              ¶9      ,X         M?     9&amp;quot;   _q8MSK           ¶EMt       *@K&lt;br /&gt;
Vh   _,m#L             _AH        le         ,GBDNE9^A,          *@F        NMg&lt;br /&gt;
 ¶L,qQ@ND           _.m@Bl        We      ,gM@B8#Q'   ¶h_                   lWE,&lt;br /&gt;
  W9NHW@`          JWM#B@]        @e     4WR@NGF^      'QL                  dRWl&lt;br /&gt;
   VMd*            &amp;quot;@BE@PM        'N      *UP&amp;quot;           VW,               JRSB;&lt;br /&gt;
  ,@F       j       `¶WK W,        ¶t                     XNt            _A@E#N&lt;br /&gt;
_JP       ,6&amp;amp;         &amp;quot;GLdM         XD,               _.g8NMA@k,_    _,gG#NMGR;&lt;br /&gt;
&amp;quot;Z      .JRER           'VMi     _jNB#W&amp;amp;_         _,j@E@W#Nl ¶MBGMNQGNQMG@QBW9&lt;br /&gt;
 ¶h   ,G@NRMBl            `&amp;quot;   ,d#R@M$F ¶Mg,_.gp&amp;amp;@@NEWVBWBMG  *QMN8R8SBN$E@WF&lt;br /&gt;
  Vb dW#R8QSRb,                 *YM@EQ,_ 'MENBW#NQMG#B@R@MW#l   &amp;quot;BM@QNENRQG'&lt;br /&gt;
   *WGS#MBMNEYL                    `^&amp;quot;*8M@Q@NRM#W@BWSNW@QBF&amp;quot;`     `^*@QBF^ [HFT]&lt;br /&gt;
    ^M@MW#Q9 ^Wt                           `^¶RQ@W8NQGP*`&lt;br /&gt;
     ¶Q#@P     Vk                            lA `&amp;quot;^`&lt;br /&gt;
      Y&amp;quot;       `MA                           J#,&lt;br /&gt;
                *R@,                        ,MQl&lt;br /&gt;
                 Y#Wk,                      GWM8L&lt;br /&gt;
                  W8RQSt,_                 AQ@MR#,&lt;br /&gt;
                  `@M@#SB@Mbm.,_          QNBW#NW&lt;br /&gt;
                    ¶QB8R8SBN$WNRM@#GNtwg@NMQR@B'&lt;br /&gt;
                     *MBQ#8R8QS@NE@WNBW#NQMG@NR;&lt;br /&gt;
                      `WGS#MBQ#R8QSB@NE@W#NQBW9&lt;br /&gt;
                        *OMW@QMNE@E@NRMW@QMB@*&lt;br /&gt;
                          `^&amp;quot;YQW@Q#SB#NE@EGP&lt;br /&gt;
                               `^&amp;quot;*8R@GBQF`&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=555</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=555"/>
				<updated>2020-05-04T10:27:00Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Switching to Textmode : 38 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;You can [https://www.pouet.net/prod.php?which=85485 download and comment on] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basics section of this wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
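The reinterpreted rules can be checked against the standard ones with a few lines of Python. This is only a sketch for illustration and not part of the intro; the function names are made up here.&lt;br /&gt;

```python
def standard_rule(alive, neighbours):
    """Conway's rules: birth on 3 neighbours, survival on 2 or 3."""
    if alive:
        return neighbours in (2, 3)
    return neighbours == 3


def nine_cell_rule(alive, total):
    """Same decision, but `total` counts all nine cells including the centre."""
    if total == 3:
        return True
    if total == 4:
        return alive
    return False


# Compare both formulations for every possible neighbourhood.
for alive in (False, True):
    for neighbours in range(9):
        total = neighbours + int(alive)
        assert nine_cell_rule(alive, total) == standard_rule(alive, neighbours)
```

The loop at the end runs through all 18 combinations of cell state and neighbour count, so the two formulations agree everywhere.&lt;br /&gt;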
As in other (trivial) implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the game of life cannot be computed &amp;quot;in situ&amp;quot; (updating the current cell instantly would lead to wrong results in the following calculations), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation works as usual: three adds accumulate the 3 cells of one column, and the loop around them moves from the right column (+1) to the left one (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; is already happening. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Even if an arbitrary number of cells has this one bit set, that won't hurt the calculation because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value. By extracting bit 5 (the 6th bit) of the rotated value with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;, we get exactly the value demanded by the rules above: the carry (1) after a rotation by 3, the old cell value after a rotation by 4, and 0 otherwise. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds a value of 32 or 33 and is thus visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
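The carry trick is easier to follow when it is simulated. The following Python sketch models &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;rcr al,cl&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;, the marking and the later correction for a single cell. It assumes clean cell values of 0 or 1; the helper names are hypothetical and exist only for this illustration. Besides the 5-bit count mask quoted above, it relies on the fact that a byte-sized &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; rotates through 9 bits (carry plus 8 data bits), so the count wraps modulo 9.&lt;br /&gt;

```python
def rcr8(al, carry, count):
    """Rotate the 9-bit ring carry:al to the right, like x86 `rcr al,cl`."""
    count = (count % 32) % 9          # 5-bit count mask, then the 9-bit ring wraps
    ring = carry * 256 + al
    for _ in range(count):
        ring = (ring >> 1) + (ring % 2) * 256
    return ring % 256, ring // 256    # new al, new carry


def mark_bit(cell, total):
    """Return 32 if the cell is alive in the next generation, else 0."""
    al, _ = rcr8(cell, 1, total)      # stc / rcr al,cl
    return (al // 32) % 2 * 32        # and al,0x20 keeps only bit 5


def step_cell(cell, total):
    """Return (marked value, corrected value) for one cell."""
    marked = cell + mark_bit(cell, total)   # or: cell is 0/1, mark is 0/32
    corrected = marked >> 5                 # the delayed shr ...,5
    return marked, corrected
```

Marked neighbours add multiples of 32 to the sum, which the count mask removes again, so &amp;lt;code&amp;gt;step_cell(1, 3 + 64)&amp;lt;/code&amp;gt; behaves exactly like &amp;lt;code&amp;gt;step_cell(1, 3)&amp;lt;/code&amp;gt;.&lt;br /&gt;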
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left, parts of it could be reintegrated later. Only tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might only work in DosBox.) The activity there (visible on the screen) helps spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so in this version it is removed. That is directly punished with an additional byte: because &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, an optimization had to be removed. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. In between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable by one segment. The idea is now to reuse parts of the code as the segment address, which is possible when the instruction is one of those shown above. If there is such an instruction, it can be placed at the 4th byte: &amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, in little-endian order (the 4th byte becomes the high byte). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the high byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
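What &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; actually loads can be worked out by hand or with a small Python sketch like the one below. It is only an illustration with made-up helper names; the opcode bytes used (&amp;lt;code&amp;gt;C5 1C&amp;lt;/code&amp;gt; for &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;F9&amp;lt;/code&amp;gt; for &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;AD&amp;lt;/code&amp;gt; for &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;) are the standard 16-bit x86 encodings, and the check assumes an 80x25 color text screen at segment 0xb800.&lt;br /&gt;

```python
TEXT_SEGMENT = 0xB800          # start of colour text-mode video memory
TEXT_BYTES = 80 * 25 * 2       # one character byte plus one colour byte per cell


def lds_result(code):
    """Return (register, segment) as loaded by `lds reg,[si]` at the code start."""
    register = code[0] + code[1] * 256   # little endian: low byte first
    segment = code[2] + code[3] * 256
    return register, segment


def screen_offset(segment):
    """Offset of the visible text screen inside `segment`, or None if out of reach."""
    offset = (TEXT_SEGMENT - segment) * 16
    if offset >= 0 and 65536 >= offset + TEXT_BYTES:
        return offset
    return None


# First four bytes of the 36-byte version: lds bx,[si] / stc / lodsw
bx, ds = lds_result(bytes([0xC5, 0x1C, 0xF9, 0xAD]))
print(hex(ds), hex(screen_offset(ds)))
```

With these bytes DS becomes 0xADF9, and the whole visible screen still lies inside the 64 kilobytes reachable from that segment, which is all the loop needs.&lt;br /&gt;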
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes were the result of just one idea: how to optimize the cleanup step? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can be delayed by a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX were already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
Now there are very short instructions available to advance the registers SI and DI, some of them at the same time, and one of them is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, so that the alignment of the screen in text mode is perfectly matched.&amp;lt;br/&amp;gt;&lt;br /&gt;
Using &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since there is no simple instruction to advance SI in the opposite direction (without involving the direction flag). It has therefore been changed back to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; to align DI and SI, so that the cleanup step always stays the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 is true on almost every DOS system. As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (like in the original) to save one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Combining exchange with alignment : 33 bytes ==&lt;br /&gt;
When thinking about &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; and how to skip one row to get rid of one of the two &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, my own production &amp;quot;M8trix&amp;quot; (2015) came to mind, where I did [https://www.pouet.net/prod.php?which=63126#c713378 pretty much the same as here]: pulling the &amp;lt;code&amp;gt;xchg&amp;lt;/code&amp;gt; into the loop and counting alternately, so that &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; counts the actual cells, while &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is never actually used (it &amp;quot;counts&amp;quot; the colors). To make that little dance work, &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; has to start at 7.&lt;br /&gt;
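The alternating count can be traced with a simplified Python model of the loop. This is a sketch under assumptions: the three memory operands are reduced to one hypothetical &amp;lt;code&amp;gt;base&amp;lt;/code&amp;gt; address plus or minus one text row of 160 bytes, and only the low bytes of AX and CX are modelled.&lt;br /&gt;

```python
def acc_loop(memory, base, al, cl, start=7):
    """Model of: AccLoop: xchg cx,ax / three adds to al / dec bx / jnz AccLoop."""
    row = 160                                   # bytes per text row (80 cells * 2)
    for bx in range(start, 0, -1):
        al, cl = cl, al                         # xchg cx,ax (low bytes only)
        for delta in (-row, 0, row):            # row above, current row, row below
            al = (al + memory[base + delta + bx]) % 256
    return al, cl


# Text-mode layout: character bytes at even addresses, colour bytes at odd ones.
screen = bytearray(1024)
for address in range(0, 1024, 2):
    screen[address] = 1        # every cell alive (smiley)
    screen[address + 1] = 7    # gray on black
al, cl = acc_loop(screen, 400, 0, 0)
```

Because of the swap at the top of each pass, the even values of BX (6, 4, 2) always add character bytes to one sum and the odd values (7, 5, 3, 1) add color bytes to the other. After seven passes the cell count ends up in &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt;, ready for &amp;lt;code&amp;gt;rcr al,cl&amp;lt;/code&amp;gt;, and the color sum sits in &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;, where it is simply overwritten.&lt;br /&gt;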
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,7 &lt;br /&gt;
AccLoop: xchg cx,ax&lt;br /&gt;
		 add  al,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  al,[si+bx-4]&lt;br /&gt;
         add  al,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Modbyte tuning, jumping into modbytes, code path alignment : 32 bytes==&lt;br /&gt;
[[File:Modbyte to 100h.png|frame|right|lds_modbyte_instructions]]&lt;br /&gt;
Sometimes, an instruction has several degrees of &amp;quot;freedom&amp;quot;. That means that the effect of the instruction can also be achieved by an alternative encoding of it. In this case, the &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; instruction, which puts two bytes of the code into the segment register DS, also loads two bytes into a register we (almost) don't care about. The only requirement is that &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; points to the start of the code, which can be done with either [SI] or [BX+SI]. The image on the right shows which modbyte values would be suitable (highlighted green). Now, this selection can be applied to the instruction table below (highlighted red). It becomes clear that the first byte of the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;, which is already in use, would mean &amp;lt;code&amp;gt;SP,[SI]&amp;lt;/code&amp;gt; when interpreted as a modbyte, and thus it is possible to jump into this modbyte and execute it as an instruction. &amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
To be more clear: &amp;lt;code&amp;gt;lds sp,[si]&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;0xc5 0x24&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;and al, 0x20&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;0x24 0x20&amp;lt;/code&amp;gt;, so TWO &amp;lt;code&amp;gt;0x24&amp;lt;/code&amp;gt; are merged into ONE. &amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;To make this work, the &amp;quot;host&amp;quot; instruction must be executed only once (it would not work in a loop). Also, the operand of the injected instruction has to be placed &amp;quot;behind&amp;quot; the &amp;quot;host&amp;quot; instruction (a single &amp;lt;code&amp;gt;db 32&amp;lt;/code&amp;gt; in the code). Finally, it has to be made sure that this second code path realigns with the rest of the code and does no damage to the intended effect (for example, critical registers could be modified, or worse, illegal instructions could be created that way). In this case the new code path consists of &amp;lt;code&amp;gt;and [bp+di+0807h],dh&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;add al,0a7h&amp;lt;/code&amp;gt;, after which it aligns normally. These instructions are executed only once and do not modify critical registers.&amp;lt;br/&amp;gt;&lt;br /&gt;
Sometimes, a bit of code shuffling has to be performed to make such a trick work. Here, &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt;. The critical role of also being a good segment value has been taken over by &amp;lt;code&amp;gt;mov bl&amp;lt;/code&amp;gt; (see table above).&amp;lt;br/&amp;gt;&lt;br /&gt;
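The two overlapping code paths can be laid side by side in a short Python sketch. It is not a disassembler: the instruction boundaries are written down by hand from the listing and the text, and the byte values for &amp;lt;code&amp;gt;mov bl,7&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;or [si],al&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; are the standard 16-bit x86 encodings, filled in here for illustration.&lt;br /&gt;

```python
# (machine code, mnemonic) pairs for the first 8 bytes of the 32-byte version.
FIRST_PASS = [                               # entered once, at offset 0
    (bytes([0xC5, 0x24]), "lds sp,[si]"),
    (bytes([0x20, 0xB3, 0x07, 0x08]), "and [bp+di+0807h],dh"),
    (bytes([0x04, 0xA7]), "add al,0a7h"),
]
LOOP_PASS = [                                # entered by jmp short X-1, at offset 1
    (bytes([0x24, 0x20]), "and al,20h"),
    (bytes([0xB3, 0x07]), "mov bl,7"),
    (bytes([0x08, 0x04]), "or [si],al"),
    (bytes([0xA7]), "cmpsw"),
]


def stream(path):
    """Concatenate the machine code of all instructions on a path."""
    return b"".join(code for code, _ in path)


def boundaries(path, entry):
    """Offsets at which the instructions of `path` start, plus the end offset."""
    offsets = [entry]
    for code, _ in path:
        offsets.append(offsets[-1] + len(code))
    return offsets


first = boundaries(FIRST_PASS, 0)
loop = boundaries(LOOP_PASS, 1)
```

Both paths read the very same bytes, only shifted by one, and they meet again at offset 8, where &amp;lt;code&amp;gt;shr byte [di],5&amp;lt;/code&amp;gt; starts. The sketch also shows that DS receives the bytes 0x20 and 0xB3, that is, segment 0xB320.&lt;br /&gt;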
[[File:Instruction selection.png|frame|center|instruction table for modbytes]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
lds sp,[si]&lt;br /&gt;
X: db 32&lt;br /&gt;
mov bl,7                    ; O: 3 iterations&lt;br /&gt;
or [si],al                  ; O: Add in new cell&lt;br /&gt;
cmpsw&lt;br /&gt;
shr byte [di],5             ; O: Shift previous value &lt;br /&gt;
C: xchg cx,ax&lt;br /&gt;
add al,[di+bx+94]           ; O: Add in this column&lt;br /&gt;
add al,[si+bx-4]&lt;br /&gt;
add al,[si+bx+156]&lt;br /&gt;
dec bx                      ; O: Loop back&lt;br /&gt;
jnz C&lt;br /&gt;
mov al,[si]                 ; O: 3 = birth, 4 = stay (tricky): &lt;br /&gt;
stc                         ; O: 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
rcr al,cl                   ; O:          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
jmp short X-1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=554</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=554"/>
				<updated>2020-04-30T21:54:38Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;You can [https://www.pouet.net/prod.php?which=85485 download and comment on] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It will help to understand what the core algorithm does, before optimizing it. I will not go into the details of random number generation and key handling since these parts are removed in the final version anyway. Setting up screen mode and putting pixels to the screen is described in the basic sectionn of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only eight neighbour cells inside a 3x3 neighbourhood , ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
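That this reinterpretation agrees with the usual rules can be checked by brute force. The following Python sketch is not part of the intro; the function names &amp;lt;code&amp;gt;conway&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;nine_cell&amp;lt;/code&amp;gt; are made up for illustration.&lt;br /&gt;
&lt;br /&gt;

```python
# Conway's usual rule, phrased on the 8 neighbours of a cell.
def conway(center, neighbours):
    if center == 1:
        return 1 if neighbours in (2, 3) else 0
    return 1 if neighbours == 3 else 0


# The reinterpreted rule from the list above, phrased on all 9 cells.
def nine_cell(center, total):
    if total == 3:
        return 1
    if total == 4:
        return center
    return 0


# Compare both formulations for every centre state and neighbour count.
mismatches = [
    (center, neighbours)
    for center in (0, 1)
    for neighbours in range(9)
    if conway(center, neighbours) != nine_cell(center, center + neighbours)
]
print(mismatches)
```

&lt;br /&gt;
An empty list means that both formulations agree in all 18 cases.&lt;br /&gt;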
Like other (trivial) implementations, the 2D space is scanned cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation works as usual: the three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions sum the 3 cells of one column, and the loop around them moves from right (+1) to left (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; is already happening. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Likewise, any number of the summed cells may have this one bit set. Neither hurts the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit (bit 5) of this rotated value with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;, we get exactly the value according to the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
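&lt;br /&gt;
The rotate trick can be replayed outside of DOS. The Python sketch below models an 8-bit &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; as a 9-bit rotate through the carry flag, assuming the documented count handling (masked to 5 bits, then reduced modulo 9 for byte operands). The helper names &amp;lt;code&amp;gt;rcr8&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;next_state&amp;lt;/code&amp;gt; are hypothetical, and the arithmetic deliberately uses division and modulo in place of shift and mask operators.&lt;br /&gt;
&lt;br /&gt;

```python
def rcr8(al, carry, count):
    """Model of 8-bit `rcr al,cl`: a 9-bit rotate through the carry flag."""
    count = (count % 32) % 9          # count is masked to 5 bits, then taken modulo 9
    value = carry * 256 + al          # the carry sits to the left of bit 7
    low = value % (2 ** count)        # bits that fall off the right end ...
    value = value // (2 ** count) + low * 2 ** (9 - count)  # ... re-enter on the left
    return value % 256, value // 256  # new al, new carry


def next_state(center, total):
    """`stc` / `rcr al,cl` / `and al,0x20`, reduced to a 0/1 result."""
    al, _ = rcr8(center, 1, total)
    return (al // 32) % 2             # bit 5 is what `and al,0x20` keeps


# Next state for every centre value and every possible 3x3 sum.
table = {(c, n): next_state(c, n) for c in (0, 1) for n in range(10)}
print(table)
```

&lt;br /&gt;
Adding multiples of 32 to the count, as happens when marked cells are summed, leaves the result unchanged in this model.&lt;br /&gt;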
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that a cell marked as alive temporarily holds a value of 32 or 33, and is therefore visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
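&lt;br /&gt;
The marking and delayed correction can be modelled in a few lines of Python. This is a simplified sketch, not a port of the intro: it works on a small row-major list, treats positions outside the list as dead, and corrects a cell as soon as no later 3x3 block reads it. All function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;

```python
def rule(center, total):
    # 3 live cells in the 3x3 block: alive; 4: unchanged; otherwise dead.
    return 1 if total == 3 else center if total == 4 else 0


def offsets(width):
    # The nine linear offsets of a 3x3 block in a row-major buffer.
    return [dy * width + dx for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def step_reference(cells, width):
    # Plain two-buffer update; positions outside the buffer count as dead.
    size = len(cells)
    return [
        rule(cells[i], sum(cells[i + o] for o in offsets(width) if i + o in range(size)))
        for i in range(size)
    ]


def step_in_place(cells, width):
    # Same update in one buffer: bit 0 keeps the old state, bit 5 holds the new one.
    size = len(cells)
    for i in range(size):
        total = sum(cells[i + o] for o in offsets(width) if i + o in range(size))
        total %= 32                                  # the 5-bit count mask drops the marks
        cells[i] += 32 * rule(cells[i], total)       # mark, like `or [si-1],al`
        done = i - width - 1                         # last cell that no later block reads
        if done in range(size):
            cells[done] //= 32                       # correct, like `shr ...,5`
    for i in range(max(size - width - 1, 0), size):
        cells[i] //= 32                              # flush the cells that are still marked
    return cells


blinker = [0] * 25
blinker[11] = blinker[12] = blinker[13] = 1          # horizontal bar in a 5x5 field
print(step_in_place(list(blinker), 5) == step_reference(blinker, 5))
```

&lt;br /&gt;
The intro itself never flushes; it simply keeps scanning, so the correction always trails the calculation by a fixed distance.&lt;br /&gt;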
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it could be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might work only in DosBox.) The activity there (visible on the screen) helps spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so in this version it is removed. That is directly punished with an additional byte: because &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, an optimization had to be removed. The assumption is that the computer this runs on is already in text mode (40x25 chars, colors). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in textmode one cell occupies TWO bytes (one for color information). Luckily, the color information is by default set to &amp;quot;gray on black&amp;quot;. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. In between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; is also available, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in textmode is just a tiny part of the 64 kilobytes addressable through one segment. The idea is to reuse parts of the code as the segment address, which is possible when one of the instructions is among those shown above. Such an instruction has to sit at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, in little-endian order). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the high byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be placed there. The whole process saves two bytes.&lt;br /&gt;
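&lt;br /&gt;
To see what &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; actually loads, the first four bytes of the program can be decoded by hand. The Python sketch below assumes the encodings C5 1C for &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt;, F9 for &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt; and AD for &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;; the helper names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;

```python
TEXT_SCREEN = 0xB8000  # linear address of the colour text buffer


def lds_result(code):
    """Return (register, segment) as `lds reg,[si]` would load them from `code`."""
    register = int.from_bytes(code[0:2], "little")
    segment = int.from_bytes(code[2:4], "little")
    return register, segment


def screen_offset(segment):
    """Offset of the text buffer inside `segment`, or None if its start is out of reach."""
    offset = TEXT_SCREEN - segment * 16
    return offset if offset in range(0x10000) else None


# Assumed encodings: lds bx,[si] = C5 1C, stc = F9, lodsw = AD.
start = bytes([0xC5, 0x1C, 0xF9, 0xAD])
bx, ds = lds_result(start)
print(hex(bx), hex(ds), screen_offset(ds))
```

&lt;br /&gt;
Under these assumptions DS ends up below &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt;, and the text buffer starts at an even offset inside that segment, so character bytes stay on even addresses.&lt;br /&gt;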
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes resulted from just one idea: how to optimize the cleanup step? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can just as well wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX were already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
Now there are very short instructions available to advance the registers SI and DI, some of them at the same time, and one of them is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, so that the alignment of the screen in text mode is perfectly matched.&amp;lt;br/&amp;gt;&lt;br /&gt;
Using &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since there is no simple instruction to move SI in the opposite direction (without involving the direction flag). It has therefore been changed back to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; to align DI and SI, so that the cleanup step always stays at the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 is true on almost every DOS system. As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (like in the original), which saves one byte.&lt;br /&gt;
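&lt;br /&gt;
The address arithmetic behind DI = SI - 258 can be verified quickly. The Python sketch below assumes the start-up values SI = 0100h and DI = FFFEh quoted in the 65-byte listing and the 160-byte row stride used by the text mode listings; &amp;lt;code&amp;gt;wrap16&amp;lt;/code&amp;gt; is a hypothetical helper for 16-bit wraparound.&lt;br /&gt;
&lt;br /&gt;

```python
SI, DI = 0x0100, 0xFFFE          # start-up values, as noted in the 65-byte listing
ROW = 160                        # bytes per text row, as used by the listings


def wrap16(value):
    # 16-bit address arithmetic wraps around at 64 KiB.
    return value % 0x10000


gap = wrap16(SI - DI)            # distance by which DI trails SI

for bx in (6, 4, 2):
    via_di = wrap16(DI + bx + 94)
    via_si = wrap16(SI + bx - ROW - 4)
    assert via_di == via_si      # both forms address the same byte of the row above

print(gap, divmod(gap, ROW))
```

&lt;br /&gt;
Because &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; advances both registers by two, the gap stays constant for the whole run.&lt;br /&gt;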
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Combining exchange with alignment : 33 bytes ==&lt;br /&gt;
When thinking about &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; and how to skip the color bytes in order to get rid of one of the two &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, my own production &amp;quot;M8trix&amp;quot; (2015) came to mind, where I did [https://www.pouet.net/prod.php?which=63126#c713378 pretty much the same as here]: pulling the &amp;lt;code&amp;gt;xchg&amp;lt;/code&amp;gt; into the loop and counting alternately, so that &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; counts the actual cells, while &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is never actually used (it &amp;quot;counts&amp;quot; the colors). To make that little dance work, &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; has to start at 7.&lt;br /&gt;
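&lt;br /&gt;
The alternating counting is easy to trace by hand. The following Python sketch models only the low bytes of AX and CX and lumps the three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions of one pass into a single read. The memory contents are hypothetical: passes with even BX are assumed to hit character bytes (value 1) and passes with odd BX attribute bytes (value 7).&lt;br /&gt;
&lt;br /&gt;

```python
def acc_loop(al, cl, read, start=7):
    """Model of `AccLoop: xchg cx,ax / add al,... / dec bx / jnz AccLoop` on the low bytes."""
    bx = start
    while bx != 0:
        al, cl = cl, al                       # xchg cx,ax (only the low bytes matter here)
        al = (al + read(bx)) % 256            # the three adds of this pass, lumped together
        bx -= 1
    return al, cl


# Hypothetical memory: even bx sees character bytes, odd bx sees attribute bytes.
CELL, ATTRIBUTE = 1, 7


def read(bx):
    return 3 * (CELL if bx % 2 == 0 else ATTRIBUTE)


al, cl = acc_loop(0, 0, read)
print(al, cl)
```

&lt;br /&gt;
In this model &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; ends up with the three character passes only, while the four attribute passes pile up in &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;, which is overwritten right afterwards.&lt;br /&gt;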
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,7 &lt;br /&gt;
AccLoop: xchg cx,ax&lt;br /&gt;
		 add  al,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  al,[si+bx-4]&lt;br /&gt;
         add  al,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Modbyte tuning, jumping into modbytes, code path alignment : 32 bytes==&lt;br /&gt;
[[File:Modbyte to 100h.png|frame|right|lds_modbyte_instructions]]&lt;br /&gt;
Sometimes, an instruction has several degrees of &amp;quot;freedom&amp;quot;, meaning that its effect can also be achieved by an alternative version of that instruction. In this case, the &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; instruction, which puts two bytes of the code into the segment register DS, also loads two bytes into a register we (almost) don't care about. The only requirement is that &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; points to the start of the code, which can be done with either [SI] or [BX+SI]. The image on the right shows which modbyte values satisfy this (highlighted green). This selection can now be applied to the instruction table below (highlighted red). It becomes clear that the first byte of the already used instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; would, interpreted as a modbyte, mean &amp;lt;code&amp;gt;SP,[SI]&amp;lt;/code&amp;gt;, and thus it is possible to jump into this modbyte and execute from there. &amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
To be more clear: &amp;lt;code&amp;gt;lds sp,[si]&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;0xc5 0x24&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;and al, 0x20&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;0x24 0x20&amp;lt;/code&amp;gt;, so TWO &amp;lt;code&amp;gt;0x24&amp;lt;/code&amp;gt; are merged into ONE. &amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;To make this work, the &amp;quot;host&amp;quot; instruction must be executed only once (it would not work in a loop). Also, the parameter of the injected instruction has to be put &amp;quot;behind&amp;quot; the &amp;quot;host&amp;quot; instruction (a single &amp;lt;code&amp;gt;db 32&amp;lt;/code&amp;gt; in the code). Finally, it has to be made sure that this second code path aligns with the rest of the code and does no damage to the intended effect (for example, critical registers could be modified, or worse, illegal instructions could be created that way). In this case the new code path consists of &amp;lt;code&amp;gt;and [bp+di+0807h],dh&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;add al,0a7h&amp;lt;/code&amp;gt;, after which it aligns normally. These instructions are executed only once and do not modify critical registers.&amp;lt;br/&amp;gt;&lt;br /&gt;
Sometimes, a bit of code shuffling is needed to make such a trick work. Here, &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt;. The critical role of also being a good segment value has been taken over by &amp;lt;code&amp;gt;mov bl&amp;lt;/code&amp;gt; (see the instruction table).&amp;lt;br/&amp;gt;&lt;br /&gt;
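&lt;br /&gt;
The two code paths become easier to follow when the bytes are written out. The Python sketch below lists a hand assembly of the final listing; the encodings are my own assumption rather than assembler output, and the two dictionaries only record at which byte offset each reading of the first bytes starts an instruction.&lt;br /&gt;
&lt;br /&gt;

```python
# Hand-assembled bytes of the 32-byte listing (assumed encodings, not assembler output).
program = bytes([
    0xC5, 0x24,              # lds sp,[si]
    0x20,                    # X: db 32
    0xB3, 0x07,              # mov bl,7
    0x08, 0x04,              # or [si],al
    0xA7,                    # cmpsw
    0xC0, 0x2D, 0x05,        # shr byte [di],5
    0x91,                    # C: xchg cx,ax
    0x02, 0x41, 0x5E,        # add al,[di+bx+94]
    0x02, 0x40, 0xFC,        # add al,[si+bx-4]
    0x02, 0x80, 0x9C, 0x00,  # add al,[si+bx+156]
    0x4B,                    # dec bx
    0x75, 0xF2,              # jnz C
    0x8A, 0x04,              # mov al,[si]
    0xF9,                    # stc
    0xD2, 0xD8,              # rcr al,cl
    0xEB, 0xE1,              # jmp short X-1
])

# Offset of each instruction when execution starts at byte 0 (first pass only) ...
first_pass = {0: "lds sp,[si]", 2: "and [bp+di+0807h],dh", 6: "add al,0a7h", 8: "shr byte [di],5"}
# ... and when the jump at the end re-enters at byte 1 (every later pass).
loop_pass = {1: "and al,20h", 3: "mov bl,7", 5: "or [si],al", 7: "cmpsw", 8: "shr byte [di],5"}

segment = int.from_bytes(program[2:4], "little")   # what lds puts into DS
jump_target = len(program) + program[-1] - 256     # rel8 counts from the end of the jmp
print(len(program), hex(segment), jump_target)
```

&lt;br /&gt;
Both readings meet again at the &amp;lt;code&amp;gt;shr&amp;lt;/code&amp;gt; instruction, and the byte of &amp;lt;code&amp;gt;mov bl&amp;lt;/code&amp;gt; provides the high byte of the segment.&lt;br /&gt;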
[[File:Instruction selection.png|frame|center|instruction table for modbytes]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
lds sp,[si]&lt;br /&gt;
X: db 32&lt;br /&gt;
mov bl,7                    ; O: 3 iterations&lt;br /&gt;
or [si],al                  ; O: Add in new cell&lt;br /&gt;
cmpsw&lt;br /&gt;
shr byte [di],5             ; O: Shift previous value &lt;br /&gt;
C: xchg cx,ax&lt;br /&gt;
add al,[di+bx+94]           ; O: Add in this column&lt;br /&gt;
add al,[si+bx-4]&lt;br /&gt;
add al,[si+bx+156]&lt;br /&gt;
dec bx                      ; O: Loop back&lt;br /&gt;
jnz C&lt;br /&gt;
mov al,[si]                 ; O: 3 = birth, 4 = stay (tricky): &lt;br /&gt;
stc                         ; O: 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
rcr al,cl                   ; O:          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
jmp short X-1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=553</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=553"/>
				<updated>2020-04-30T18:52:25Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: clearer explanation, thanks gargaj&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like other (trivial) implementations, the 2D space is scanned cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation works as usual: the three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions sum the 3 cells of one column, and the loop around them moves from right (+1) to left (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; is already happening. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Likewise, any number of the summed cells may have this one bit set. Neither hurts the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit (bit 5) of this rotated value with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;, we get exactly the value according to the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that a cell marked as alive temporarily holds a value of 32 or 33, and is therefore visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it could be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might work only in DosBox.) The activity there (visible on the screen) helps spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access takes quite a bit of space, so this version removes both. That is directly punished with an additional byte: &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, so an optimization that relied on it had to be dropped. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors; the listing relies on 160 bytes per screen row). Text mode also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, because there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells: in text mode, one cell occupies TWO bytes (one for the character, one for the color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changes (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Second step: a cell is 2 bytes wide&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable through one segment. The idea is to reuse parts of the code as the segment address, which is possible when the instruction in question is one of those shown in the image above. If there is such an instruction, it can be placed at the 4th byte: &amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS (little-endian, so the 4th byte becomes the high byte). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; is reused as the high byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can take that place (here &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The whole process saves two bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Second step: a cell is 2 bytes wide&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes resulted from just one idea: how can the cleanup step be optimized? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can just as well wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX are already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
There are very short instructions that advance the registers SI and DI, some of them both at the same time, and one of these is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, which perfectly matches the two-byte cell layout of the screen in text mode.&amp;lt;br/&amp;gt;&lt;br /&gt;
Using &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since there is no simple instruction that moves SI in the opposite direction (without involving the direction flag). It is therefore changed back to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; that undoes its increment. SI and DI now advance in lockstep, so the cleanup step always stays at the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 is true on almost every DOS system (SI = 0100h and DI = FFFEh at startup). As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (like in the original), which saves one byte: &amp;lt;code&amp;gt;[di+bx+94]&amp;lt;/code&amp;gt; addresses the same byte as &amp;lt;code&amp;gt;[si+bx-160-4]&amp;lt;/code&amp;gt;, because -258 + 94 = -164.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Second step: a cell is 2 bytes wide&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Combining exchange with alignment : 33 bytes ==&lt;br /&gt;
While thinking about &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; and how to get rid of one of the two &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, my own production &amp;quot;M8trix&amp;quot; (2015) came to mind, where I did [https://www.pouet.net/prod.php?which=63126#c713378 pretty much the same as here]: the &amp;lt;code&amp;gt;xchg&amp;lt;/code&amp;gt; is pulled into the loop, so that the two registers take turns accumulating. With &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; running from 7 down to 1, the even values address character bytes and the odd values address color bytes, so &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; ends up counting the actual cells, while &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is never actually used (it &amp;quot;counts&amp;quot; the colors). To make that little dance work, &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; has to start at 7.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,7 &lt;br /&gt;
AccLoop: xchg cx,ax&lt;br /&gt;
		 add  al,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  al,[si+bx-4]&lt;br /&gt;
         add  al,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Modbyte tuning, jumping into modbytes, code path alignment : 32 bytes==&lt;br /&gt;
[[File:Modbyte to 100h.png|frame|right|lds_modbyte_instructions]]&lt;br /&gt;
Sometimes an instruction has several degrees of &amp;quot;freedom&amp;quot;, meaning that its effect can also be achieved by an alternative encoding of the same instruction. In this case, the &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; instruction, which puts two bytes of the code into the segment register DS, also loads two bytes into a register we (almost) don't care about. The only requirement is that &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; points to the start of the code, which can be done with either [SI] or [BX+SI]. The image on the right shows which modbyte values satisfy this (highlighted green). This selection can now be applied to the instruction table below (highlighted red). It becomes clear that the opcode of the already used instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; would, interpreted as a modbyte, be &amp;lt;code&amp;gt;SP,[SI]&amp;lt;/code&amp;gt;, so it is possible to jump into this modbyte and execute it as an instruction. &amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
To be more precise: &amp;lt;code&amp;gt;lds sp,[si]&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;0xc5 0x24&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;and al, 0x20&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;0x24 0x20&amp;lt;/code&amp;gt;, so TWO &amp;lt;code&amp;gt;0x24&amp;lt;/code&amp;gt; are merged into ONE. &amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;To make this work, the &amp;quot;host&amp;quot; instruction must be executed only once (it would not work in a loop). Also, the operand of the injected instruction has to be placed &amp;quot;behind&amp;quot; the &amp;quot;host&amp;quot; instruction (a single &amp;lt;code&amp;gt;db 32&amp;lt;/code&amp;gt; in the code). Finally, it has to be ensured that this second code path realigns with the rest of the code and does no damage to the intended effect (for example, critical registers could be modified or, worse, illegal instructions could be created that way). In this case the new code path consists of &amp;lt;code&amp;gt;and [bp+di+0807h],dh&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;add al,0a7h&amp;lt;/code&amp;gt;, after which it realigns with the normal instruction stream. These instructions are executed only once and do not modify critical registers.&amp;lt;br/&amp;gt;&lt;br /&gt;
Sometimes a bit of code shuffling is needed to make such a trick work. Here, &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt;. The critical role of also being a good segment value has been taken over by &amp;lt;code&amp;gt;mov bl&amp;lt;/code&amp;gt; (see table above).&amp;lt;br/&amp;gt;&lt;br /&gt;
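The two ways of decoding the same bytes can be written out side by side. This is only an illustrative byte view of the start of the final listing, using the standard opcode encodings; the second decoding is given as comments so that the bytes are emitted only once.&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; First pass, decoded from offset 100h&lt;br /&gt;
        db 0C5h, 024h                ; lds sp,[si]           (DS = 0B320h)&lt;br /&gt;
        db 020h, 0B3h, 007h, 008h    ; and [bp+di+0807h],dh&lt;br /&gt;
        db 004h, 0A7h                ; add al,0A7h&lt;br /&gt;
; Every later pass, decoded from offset 101h (jmp short X-1):&lt;br /&gt;
;          24h 20h                   ; and al,20h&lt;br /&gt;
;          B3h 07h                   ; mov bl,7&lt;br /&gt;
;          08h 04h                   ; or [si],al&lt;br /&gt;
;          A7h                       ; cmpsw&lt;br /&gt;
; Both paths continue with C0h 2Dh 05h = shr byte [di],5&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;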
[[File:Instruction selection.png|frame|center|instruction table for modbytes]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
lds sp,[si]&lt;br /&gt;
X: db 32&lt;br /&gt;
mov bl,7                    ; 7 steps: 3 character columns, 4 color columns&lt;br /&gt;
or [si],al                  ; O: Add in new cell&lt;br /&gt;
cmpsw&lt;br /&gt;
shr byte [di],5             ; O: Shift previous value &lt;br /&gt;
C: xchg cx,ax&lt;br /&gt;
add al,[di+bx+94]           ; O: Add in this column&lt;br /&gt;
add al,[si+bx-4]&lt;br /&gt;
add al,[si+bx+156]&lt;br /&gt;
dec bx                      ; O: Loop back&lt;br /&gt;
jnz C&lt;br /&gt;
mov al,[si]                 ; O: 3 = birth, 4 = stay (tricky): &lt;br /&gt;
stc                         ; O: 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
rcr al,cl                   ; O:          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
jmp short X-1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=552</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=552"/>
				<updated>2020-04-30T18:32:32Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Modbyte tuning, jumping into modbytes, code path alignment : 32b */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is traversed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly would lead to wrong results in the following calculations), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation works as usual: an inner step adds up the 3 cells of one column, and an outer loop shifts from right (+1) to left (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
The first &amp;quot;trick&amp;quot; already happens at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Even if an arbitrary number of cells have this one bit set, that does not hurt the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. Extracting the 6th bit of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;) yields exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds a value of 32 or 33 and is thus visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
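&amp;lt;br /&amp;gt;&lt;br /&gt;
The rule evaluation described above can be isolated in a small routine. This is a minimal sketch: the label &amp;lt;code&amp;gt;NextState&amp;lt;/code&amp;gt; and its calling convention are assumptions made for illustration and are not part of the intro, and the sketch assumes that &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; holds only 0 or 1. The final &amp;lt;code&amp;gt;shr&amp;lt;/code&amp;gt; merely normalizes the result; the intro itself postpones that shift to its cleanup step.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Hypothetical helper, for illustration only (not part of the intro)&lt;br /&gt;
; in:  al = center cell (0 or 1), cl = live cells in the 3x3 block (0..9)&lt;br /&gt;
; out: al = next state (0 or 1)&lt;br /&gt;
NextState:&lt;br /&gt;
        stc                     ; carry = 1, the birth bit&lt;br /&gt;
        rcr     al,cl           ; 9-bit rotate right through carry&lt;br /&gt;
        and     al,20h          ; bit 5 = carry if cl=3, old cell if cl=4, else 0&lt;br /&gt;
        shr     al,5            ; normalize 20h to 1 (needs a 186 or later)&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
For every other count, bit 5 receives one of the zero bits of &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;, so the cell dies or stays dead.&lt;br /&gt;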
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left in the end, parts of it could be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location (note: this might only work in DosBox). The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access takes quite a bit of space, so this version removes both. That is directly punished with an additional byte: &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, so an optimization that relied on it had to be dropped. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors; the listing relies on 160 bytes per screen row). Text mode also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, because there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells: in text mode, one cell occupies TWO bytes (one for the character, one for the color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changes (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
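&lt;br /&gt;
The displacements used in the listing below follow from the cell layout. As a sketch, assuming the 80-column text page implied by the 160-byte row stride, the layout can be written with named constants; the names &amp;lt;code&amp;gt;COLS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;CELL&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;ROW&amp;lt;/code&amp;gt; are hypothetical and do not appear in the intro.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Assumed layout of the text page (constant names are illustrative only)&lt;br /&gt;
COLS    equ 80                  ; characters per row&lt;br /&gt;
CELL    equ 2                   ; bytes per cell: character, then color&lt;br /&gt;
ROW     equ COLS*CELL           ; 160 bytes per screen row&lt;br /&gt;
; Neighbourhood of the cell at [si], character bytes only:&lt;br /&gt;
;   [si-ROW-CELL] [si-ROW] [si-ROW+CELL]&lt;br /&gt;
;   [si-CELL]     [si]     [si+CELL]&lt;br /&gt;
;   [si+ROW-CELL] [si+ROW] [si+ROW+CELL]&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;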
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Second step: a cell is 2 bytes wide&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable through one segment. The idea is to reuse parts of the code as the segment address, which is possible when the instruction in question is one of those shown in the image above. If there is such an instruction, it can be placed at the 4th byte: &amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS (little-endian, so the 4th byte becomes the high byte). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; is reused as the high byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can take that place (here &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The whole process saves two bytes.&lt;br /&gt;
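&lt;br /&gt;
The byte layout can be made explicit by writing the first four bytes of the following listing as data. This is only an illustrative view using the standard opcode encodings; the screen offset in the last comment is derived from them under the assumption that the text page starts at B800:0000.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Byte view of the start of the program (offset 100h, where SI points)&lt;br /&gt;
        db 0C5h, 01Ch           ; lds bx,[si]  : BX = 1CC5h&lt;br /&gt;
        db 0F9h                 ; stc          : low byte of DS&lt;br /&gt;
        db 0ADh                 ; lodsw        : high byte of DS, DS = 0ADF9h&lt;br /&gt;
; B8000h - ADF90h = A070h, so the text screen starts at DS:A070h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;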
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Second step: a cell is 2 bytes wide&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes resulted from just one idea: how can the cleanup step be optimized? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can just as well wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX are already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
There are very short instructions that advance the registers SI and DI, some of them both at the same time, and one of these is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, which perfectly matches the two-byte cell layout of the screen in text mode.&amp;lt;br/&amp;gt;&lt;br /&gt;
Using &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since there is no simple instruction that moves SI in the opposite direction (without involving the direction flag). It is therefore changed back to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; that undoes its increment. SI and DI now advance in lockstep, so the cleanup step always stays at the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 is true on almost every DOS system (SI = 0100h and DI = FFFEh at startup). As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (like in the original), which saves one byte: &amp;lt;code&amp;gt;[di+bx+94]&amp;lt;/code&amp;gt; addresses the same byte as &amp;lt;code&amp;gt;[si+bx-160-4]&amp;lt;/code&amp;gt;, because -258 + 94 = -164.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Second step: a cell is 2 bytes wide&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Combining exchange with alignment : 33 bytes ==&lt;br /&gt;
While thinking about &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; and how to get rid of one of the two &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, my own production &amp;quot;M8trix&amp;quot; (2015) came to mind, where I did [https://www.pouet.net/prod.php?which=63126#c713378 pretty much the same as here]: the &amp;lt;code&amp;gt;xchg&amp;lt;/code&amp;gt; is pulled into the loop, so that the two registers take turns accumulating. With &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; running from 7 down to 1, the even values address character bytes and the odd values address color bytes, so &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; ends up counting the actual cells, while &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is never actually used (it &amp;quot;counts&amp;quot; the colors). To make that little dance work, &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; has to start at 7.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,7 &lt;br /&gt;
AccLoop: xchg cx,ax&lt;br /&gt;
		 add  al,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  al,[si+bx-4]&lt;br /&gt;
         add  al,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Modbyte tuning, jumping into modbytes, code path alignment : 32 bytes==&lt;br /&gt;
[[File:Modbyte to 100h.png|frame|right|lds_modbyte_instructions]]&lt;br /&gt;
Sometimes, an instruction has several degrees of &amp;quot;freedom&amp;quot;. That means that the effect of the instruction can also be achieved by an alternative encoding of it. In this case, the &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; instruction, which puts two bytes of the code into the segment register DS, also loads two bytes into a register we (almost) don't care about. The only requirement is that &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; points to the start of the code, which can be done with either [SI] or [BX+SI]. The image on the right shows which modbyte values qualify (highlighted green). Now, this selection can be applied to the instruction table below (highlighted red). It becomes clear that the opcode of the already used instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; would, interpreted as a modbyte, mean &amp;lt;code&amp;gt;SP,[SI]&amp;lt;/code&amp;gt;, and thus it is possible to jump into this modbyte and execute it as an instruction. &lt;br /&gt;
&amp;lt;br/&amp;gt;To make this work, the &amp;quot;host&amp;quot; instruction must be executed only once (it would not work inside a loop). Also, the parameter of the injected instruction has to be placed &amp;quot;behind&amp;quot; the &amp;quot;host&amp;quot; instruction (a single &amp;lt;code&amp;gt;db 32&amp;lt;/code&amp;gt; in the code). Finally, it has to be made sure that this second code path realigns with the rest of the code and does no damage to the intended effect (for example, critical registers could be modified, or worse, illegal instructions could be created that way). In this case the new code path consists of &amp;lt;code&amp;gt;and [bp+di+0807h],dh&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;add al,0a7h&amp;lt;/code&amp;gt;, after which it aligns normally. These instructions are executed only once and do not modify critical registers.&amp;lt;br/&amp;gt;&lt;br /&gt;
Sometimes, a bit of code shuffling is needed to make such a trick work. Here, &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt;. The critical function of also being a good segment value has been taken over by &amp;lt;code&amp;gt;mov bl&amp;lt;/code&amp;gt; (see table above).&amp;lt;br/&amp;gt;&lt;br /&gt;
[[File:Instruction selection.png|frame|center|instruction table for modbytes]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
lds sp,[si]&lt;br /&gt;
X: db 32&lt;br /&gt;
mov bl,7                    ; O: 3 iterations&lt;br /&gt;
or [si],al                  ; O: Add in new cell&lt;br /&gt;
cmpsw&lt;br /&gt;
shr byte [di],5             ; O: Shift previous value &lt;br /&gt;
C: xchg cx,ax&lt;br /&gt;
add al,[di+bx+94]           ; O: Add in this column&lt;br /&gt;
add al,[si+bx-4]&lt;br /&gt;
add al,[si+bx+156]&lt;br /&gt;
dec bx                      ; O: Loop back&lt;br /&gt;
jnz C&lt;br /&gt;
mov al,[si]                 ; O: 3 = birth, 4 = stay (tricky): &lt;br /&gt;
stc                         ; O: 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
rcr al,cl                   ; O:          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
jmp short X-1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=551</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=551"/>
				<updated>2020-04-30T18:05:02Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Original version : 65 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment on] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
As in other (trivial) implementations, the 2D space is traversed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly would lead to wrong results in the following calculations), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation is done as usual: each pass of the loop adds up the 3 cells of one column, and the loop moves from the right column (+1) to the left one (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; already happens. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared, but at this point it can only contain 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Even if an arbitrary number of the summed cells has this one bit set, that won't hurt the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction. [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily has a value of 32 or 33, and is thus visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
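The rule extraction described above can be checked with a small model outside the intro. The following Python sketch is not part of the original code: the helper names &amp;lt;code&amp;gt;rcr8&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;next_state&amp;lt;/code&amp;gt; are made up for illustration, and it assumes that the center cell is read before it is marked, so its value is 0 or 1.&lt;br /&gt;

```python
def rcr8(al, carry, count):
    """Rotate the 9-bit value carry:al right by count, like x86 rcr on an 8-bit operand."""
    count = (count % 32) % 9        # the CPU keeps 5 count bits; a 9-bit rotation repeats every 9 steps
    for _ in range(count):
        new_carry = al % 2          # bit 0 drops out into the carry flag
        al = carry * 128 + al // 2  # the old carry enters at bit 7
        carry = new_carry
    return al, carry


def next_state(center, total):
    """Model of stc / rcr al,cl / and al,20h for a center cell (0 or 1) and a 3x3 sum."""
    al, _ = rcr8(center, 1, total)  # stc sets the carry before rotating
    return (al // 32) % 2           # and al,20h keeps bit 5 only


# Columns: 3x3 sum, result for a dead center cell, result for a living center cell
for total in range(10):
    print(total, next_state(0, total), next_state(1, total))
```

In this model a living result only appears for a sum of 3 (birth) and for a sum of 4 when the center cell is already alive (keep state). Since only the 5 low bits of the count are used, marker bits (32) that were added into the sum drop out.&lt;br /&gt;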
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left in the end, parts of it can be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might only work in DOSBox.) The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and accessing pixels requires quite a bit of space, so in this version it is removed. That is directly punished with an additional byte: since &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, an optimization had to be removed. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors; one row occupies the 160 bytes used as row distance in the code). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one of them for the color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changes (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. In between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable through one segment. The idea is to reuse parts of the code as the segment address, which is possible when the instruction in question is one of those shown above. If there is such an instruction, it can be placed at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, low byte first, so the 4th byte becomes the high byte). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the high byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
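To see which segment the code bytes actually produce, the load can be modelled outside the intro. The following Python sketch is only an illustration: the helper name &amp;lt;code&amp;gt;lds_result&amp;lt;/code&amp;gt; is made up, and the four bytes are the standard x86 encodings of the first instructions of the listing that follows (C5 1C for &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt;, F9 for &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;, AD for &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;). It also assumes an 80x25 text screen of 4000 bytes.&lt;br /&gt;

```python
TEXT_SCREEN = 0xB8000   # linear address of the text mode screen (segment 0xB800)
SCREEN_BYTES = 4000     # assumption: 80x25 cells, two bytes per cell

# First four bytes of the 36-byte version: lds bx,[si] / stc / lodsw
code = bytes([0xC5, 0x1C, 0xF9, 0xAD])


def lds_result(mem):
    """Return (register, segment) as lds reads them from four bytes, low byte first."""
    register = mem[0] + 256 * mem[1]
    segment = mem[2] + 256 * mem[3]
    return register, segment


bx, ds = lds_result(code)
screen_offset = TEXT_SCREEN - ds * 16   # where the screen starts inside segment DS
last_offset = screen_offset + SCREEN_BYTES - 1
# The whole screen is reachable if first and last byte both lie inside the 64K segment
reachable = screen_offset // 65536 == 0 and last_offset // 65536 == 0
print(hex(bx), hex(ds), hex(screen_offset), reachable)
```

With these bytes DS becomes 0xADF9, and the screen starts at offset 0xA070 inside that segment, so all screen cells stay addressable through DS even though the segment value is not exactly 0xB800.&lt;br /&gt;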
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes resulted from just one idea: how can the cleanup step be optimized? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can just as well wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still takes four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX are already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
Now there are very short instructions available to advance the registers SI and DI, some of them at the same time, and one of them is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, so that the alignment of the screen in text mode is perfectly matched.&amp;lt;br/&amp;gt;&lt;br /&gt;
Using &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since there is no simple instruction that moves SI back in the opposite direction (without involving the direction flag). It is therefore changed to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; that undoes the increment, so that DI and SI stay aligned and the cleanup step always stays at the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 holds on almost every DOS system. As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (as in the original), which saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Combining exchange with alignment : 33 bytes ==&lt;br /&gt;
When thinking about &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; and how to skip one row to get rid of one of the two &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt; instructions, my own production &amp;quot;M8trix&amp;quot; (2015) came to mind, where I did [https://www.pouet.net/prod.php?which=63126#c713378 pretty much the same as here]: pulling the &amp;lt;code&amp;gt;xchg&amp;lt;/code&amp;gt; into the loop and counting alternately, so that &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; counts the actual cells, while the sum in &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is never actually used (it &amp;quot;counts&amp;quot; the colors). To make that little dance work, &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; has to start at 7.&lt;br /&gt;
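The alternating counting is easier to follow in a small model. The following Python sketch is not part of the intro: the function name &amp;lt;code&amp;gt;acc_loop&amp;lt;/code&amp;gt; and the test screen are made up, and it assumes DI = SI - 258, an even SI, character bytes of 0 or 1 and color bytes of 7 (gray on black).&lt;br /&gt;

```python
def acc_loop(screen, si, di, al, cl):
    """Model of the 33-byte AccLoop: bx runs 7..1 and xchg swaps al and cl on every pass."""
    for bx in range(7, 0, -1):
        al, cl = cl, al              # xchg cx,ax
        al += screen[di + bx + 94]   # row above
        al += screen[si + bx - 4]    # current row
        al += screen[si + bx + 156]  # row below
        al %= 256                    # al is an 8-bit register
    return al, cl


# Text screen model: even offsets hold characters (cells), odd offsets hold colors
screen = [0, 7] * 2000
si = 1000
di = si - 258
for offset in (si - 160, si + 2, si + 162):  # three living cells inside the 3x3 block
    screen[offset] = 1

# On entry al holds the marker (here 32) and cl a stale sum (here 5)
al, cl = acc_loop(screen, si, di, 32, 5)
print(al, cl, cl % 32)
```

In this model the passes with even &amp;lt;code&amp;gt;bx&amp;lt;/code&amp;gt; hit character bytes and the passes with odd &amp;lt;code&amp;gt;bx&amp;lt;/code&amp;gt; hit color bytes. Because the registers swap on every pass and the last pass is an odd one, the cell sum ends up in &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; (marker included) and the color sum in &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;, which is overwritten right afterwards.&lt;br /&gt;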
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,7 &lt;br /&gt;
AccLoop: xchg cx,ax&lt;br /&gt;
		 add  al,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  al,[si+bx-4]&lt;br /&gt;
         add  al,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Modbyte tuning, jumping into modbytes, code path alignment : 32b==&lt;br /&gt;
[[File:Modbyte to 100h.png|frame|right|lds_modbyte_instructions]]&lt;br /&gt;
Sometimes, an instruction has several degrees of &amp;quot;freedom&amp;quot;. That means that the effect of the instruction can also be achieved by an alternative encoding of it. In this case, the &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; instruction, which puts two bytes of the code into the segment register DS, also loads two bytes into a register we (almost) don't care about. The only requirement is that &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; points to the start of the code, which can be done with either [SI] or [BX+SI]. The image on the right shows which modbyte values qualify (highlighted green). Now, this selection can be applied to the instruction table below (highlighted red). It becomes clear that the opcode of the already used instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; would, interpreted as a modbyte, mean &amp;lt;code&amp;gt;SP,[SI]&amp;lt;/code&amp;gt;, and thus it is possible to jump into this modbyte and execute it as an instruction. &lt;br /&gt;
&amp;lt;br/&amp;gt;To make this work, the &amp;quot;host&amp;quot; instruction must be executed only once (it would not work inside a loop). Also, the parameter of the injected instruction has to be placed &amp;quot;behind&amp;quot; the &amp;quot;host&amp;quot; instruction (a single &amp;lt;code&amp;gt;db 32&amp;lt;/code&amp;gt; in the code). Finally, it has to be made sure that this second code path realigns with the rest of the code and does no damage to the intended effect (for example, critical registers could be modified, or worse, illegal instructions could be created that way). In this case the new code path consists of &amp;lt;code&amp;gt;and [bp+di+0807h],dh&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;add al,0a7h&amp;lt;/code&amp;gt;, after which it aligns normally. These instructions are executed only once and do not modify critical registers.&amp;lt;br/&amp;gt;&lt;br /&gt;
Sometimes, a bit of code shuffling is needed to make such a trick work. Here, &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt;. The critical function of also being a good segment value has been taken over by &amp;lt;code&amp;gt;mov bl&amp;lt;/code&amp;gt; (see table above).&amp;lt;br/&amp;gt;&lt;br /&gt;
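The modbyte selection can also be reproduced with a few lines outside the intro. The following Python sketch is only an illustration of the 16-bit ModR/M layout (2 bits mod, 3 bits register, 3 bits memory form): the helper names &amp;lt;code&amp;gt;split_modrm&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;lds_operands&amp;lt;/code&amp;gt; are made up, and it assumes BX is 0 at program start, so [BX+SI] and [SI] address the same bytes.&lt;br /&gt;

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
RM16 = ["[bx+si]", "[bx+di]", "[bp+si]", "[bp+di]", "[si]", "[di]", "[bp]", "[bx]"]


def split_modrm(byte):
    """Split a ModR/M byte into its mod (2 bits), reg (3 bits) and rm (3 bits) fields."""
    return byte // 64, (byte // 8) % 8, byte % 8


def lds_operands(byte):
    """Operands of lds for a ModR/M byte with mod = 0 (rm 6 would mean a direct address)."""
    mod, reg, rm = split_modrm(byte)
    return REG16[reg], RM16[rm]


# ModR/M bytes that let lds read from the start of the code: [si], or [bx+si] while BX is 0
candidates = [b for b in range(64) if b % 8 in (0, 4)]
print([hex(b) for b in candidates])
print(lds_operands(0x1C))   # encoding used so far: lds bx,[si]
print(lds_operands(0x24))   # 0x24 is also the opcode byte of and al,imm8
```

Among the 16 candidate bytes, 0x24 is the one that doubles as the opcode of &amp;lt;code&amp;gt;and al,imm8&amp;lt;/code&amp;gt;, which is why the listing below starts with &amp;lt;code&amp;gt;lds sp,[si]&amp;lt;/code&amp;gt; followed by &amp;lt;code&amp;gt;db 32&amp;lt;/code&amp;gt;.&lt;br /&gt;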
[[File:Instruction selection.png|frame|center|instruction table for modbytes]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
lds sp,[si]&lt;br /&gt;
X: db 32&lt;br /&gt;
mov bl,7                    ; O: 3 iterations&lt;br /&gt;
or [si],al                  ; O: Add in new cell&lt;br /&gt;
cmpsw&lt;br /&gt;
shr byte [di],5             ; O: Shift previous value &lt;br /&gt;
C: xchg cx,ax&lt;br /&gt;
add al,[di+bx+94]           ; O: Add in this column&lt;br /&gt;
add al,[si+bx-4]&lt;br /&gt;
add al,[si+bx+156]&lt;br /&gt;
dec bx                      ; O: Loop back&lt;br /&gt;
jnz C&lt;br /&gt;
mov al,[si]                 ; O: 3 = birth, 4 = stay (tricky): &lt;br /&gt;
stc                         ; O: 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
rcr al,cl                   ; O:          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
jmp short X-1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=550</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=550"/>
				<updated>2020-04-30T17:57:28Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Modbyte tuning, jumping into modbytes, code path alignment : 32b */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment on] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
As in other (trivial) implementations, the 2D space is traversed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly would lead to wrong results in the following calculations), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation is done as usual: each pass of the loop adds up the 3 cells of one column, and the loop moves from the right column (+1) to the left one (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; already happens. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared, but at this point it can only contain 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Even if an arbitrary number of the summed cells has this one bit set, that won't hurt the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction. [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily has a value of 32 or 33, and is thus visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. In case there is space left, parts of it could be reintegrated again. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might only work in DOSBox.) The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
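The one-byte saving can be accounted for as follows. This is a sketch under an assumption that is not taken from the original source: the variant with mov and inc stores the new cell with the two-byte form or [si],al, while the lodsb variant needs the three-byte form or [si-1],al because SI has already moved on. The encodings in the comments are the standard 16-bit ones; the block is meant to be assembled and inspected in a listing file, not to be run for effect.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
        org  100h&lt;br /&gt;
; variant A (assumed): 5 bytes&lt;br /&gt;
        mov  al,[si]         ; 8A 04&lt;br /&gt;
        or   [si],al         ; 08 04&lt;br /&gt;
        inc  si              ; 46&lt;br /&gt;
; variant B (as in the 44 byte listing): 4 bytes&lt;br /&gt;
        lodsb                ; AC&lt;br /&gt;
        or   [si-1],al       ; 08 44 FF&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;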
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so in this version it is removed. That is directly punished with an additional byte, because &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, so an optimization had to be removed. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors; one row occupies the 160 bytes used as row offset in the code). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up) and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is by default set to &amp;quot;gray on black&amp;quot;. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
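The two-byte cell layout can be illustrated with a minimal sketch that places a single living cell on the screen. It assumes NASM syntax, a .COM file, a 186 or later CPU and a color text mode with 80 columns, which matches the row offset of 160 bytes in the listing above; the constants COLS, ROW and COL are made up for this example.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
        org  100h&lt;br /&gt;
COLS    equ  80              ; assumed: 80 columns, so one row is 160 bytes&lt;br /&gt;
ROW     equ  12              ; example position, chosen freely&lt;br /&gt;
COL     equ  40&lt;br /&gt;
        push 0xb800&lt;br /&gt;
        pop  ds              ; DS = text mode video memory&lt;br /&gt;
        mov  byte [ROW*COLS*2 + COL*2],1  ; character byte: 1 = smiley&lt;br /&gt;
        ; the byte after it is the attribute (color) of the same cell&lt;br /&gt;
        ; neighbours: left and right at -2 and +2, above and below at -160 and +160&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;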
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable by one segment. The idea is now to reuse parts of the code as the segment address, which is possible when the instruction is one of those shown above. If there is such an instruction, it can be placed at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, in little-endian order, so the 4th byte becomes the high byte). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the high byte of the segment. The 3rd byte only matters for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
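The resulting segment value can be worked out by hand. The following sketch (NASM syntax, .COM file) assumes SI = 0100h at startup and a color text mode whose visible page starts at linear address 0B8000h; the final mov is only added here to make the result visible.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
        org  100h&lt;br /&gt;
        lds  bx,[si]         ; C5 1C: BX = 1CC5h, DS = word at [si+2]&lt;br /&gt;
        stc                  ; F9: low byte of the new DS&lt;br /&gt;
        lodsw                ; AD: high byte of the new DS, so DS = 0ADF9h&lt;br /&gt;
        ; 0B8000h - 0ADF90h = 0A070h: the screen starts at this offset&lt;br /&gt;
        mov  byte [0xA070],1 ; smiley in the top left corner&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Since SI keeps running through the whole segment in the intro, the exact start offset does not matter there, as long as the screen lies somewhere inside the 64 kilobytes.&lt;br /&gt;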
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes were the result of just one idea: how to optimize the cleanup step? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can also wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses at least four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX were already in use, and the usage of BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
Now there are very short instructions available to advance the registers SI and DI, some of them at the same time, and one of them is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, so that the alignment of the screen in text mode is perfectly matched.&amp;lt;br/&amp;gt;&lt;br /&gt;
The usage of &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since there is no simple instruction to advance SI in the opposite direction (without involving the direction flag). It has been changed back to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; to keep DI and SI aligned, so that the cleanup step always stays at the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 is true on almost every DOS system. As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (like in the original), which saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
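The size difference between the addressing forms can be read off the encodings. The following lines are a sketch in NASM syntax, meant to be assembled and inspected in a listing file rather than run for effect; the byte values in the comments are the standard 16-bit encodings.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
        org  100h&lt;br /&gt;
        shr  byte [si-166],5 ; C0 AC 5A FF 05: 16 bit displacement, 5 bytes&lt;br /&gt;
        shr  byte [di-65],5  ; C0 6D BF 05:     8 bit displacement, 4 bytes&lt;br /&gt;
        shr  byte [di],5     ; C0 2D 05:        no displacement,    3 bytes&lt;br /&gt;
        cmpsw                ; A7: compares [si] with es:[di], then adds 2 to SI and DI&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
The advance by 2 holds as long as the direction flag is clear, which is the usual state at program start.&lt;br /&gt;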
== Combining exchange with alignment : 33 bytes ==&lt;br /&gt;
When thinking about &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; and how to skip one row to get rid of one of the double &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, my own production &amp;quot;M8trix&amp;quot; (2015) came to mind, where I did [https://www.pouet.net/prod.php?which=63126#c713378 pretty much the same as here]: pulling the &amp;lt;code&amp;gt;xchg&amp;lt;/code&amp;gt; into the loop and counting alternately, so that &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; counts the actual cells, while &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is never actually used (it &amp;quot;counts&amp;quot; the colors). To make that little dance work, &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; has to start at 7.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,7 &lt;br /&gt;
AccLoop: xchg cx,ax&lt;br /&gt;
		 add  al,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  al,[si+bx-4]&lt;br /&gt;
         add  al,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
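The alternating counting can be followed on a small array. This sketch (NASM syntax, .COM file) is an illustration only: the labels sum and cells and the seven data bytes are made up, with 1 standing for a living cell and 7 for the gray-on-black attribute.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
        org  100h&lt;br /&gt;
        xor  ax,ax&lt;br /&gt;
        xor  cx,cx&lt;br /&gt;
        mov  bx,7            ; seven bytes, starting and ending with an attribute&lt;br /&gt;
sum:    xchg cx,ax           ; swap the two accumulators on every step&lt;br /&gt;
        add  al,[cells+bx-1] ; odd bx adds an attribute, even bx adds a character&lt;br /&gt;
        dec  bx&lt;br /&gt;
        jnz  sum&lt;br /&gt;
        ret                  ; cl = 3 (characters), al = 28 (attributes)&lt;br /&gt;
cells:  db   7, 1,7, 1,7, 1,7 ; attribute, then character and attribute pairs&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Starting with an odd count makes the last exchange leave the character sum in cl, which is where the following rcr expects it.&lt;br /&gt;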
==Modbyte tuning, jumping into modbytes, code path alignment : 32b==&lt;br /&gt;
[[File:Modbyte to 100h.png|frame|right|lds_modbyte_instructions]]&lt;br /&gt;
Sometimes, an instruction has several degrees of &amp;quot;freedom&amp;quot;. That means that the effect of the instruction can also be achieved by an alternative encoding of it. In this case, the &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; instruction, which puts two bytes of the code into the segment register DS, also loads two bytes into a register we (almost) don't care about. The only requirement is that &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; points to the start of the code, which can be done either by [SI] or by [BX+SI]. The image on the right shows which modbyte values satisfy this (highlighted green). Now, this selection can be applied to the instruction table below (highlighted red). It becomes clear that the opcode of the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;, which is used anyway, would, interpreted as a modbyte, mean &amp;lt;code&amp;gt;SP,[SI]&amp;lt;/code&amp;gt;, and thus it is possible to jump into this modbyte and execute it. &lt;br /&gt;
&amp;lt;br/&amp;gt;To make this work, the &amp;quot;host&amp;quot; instruction has to be only executed once (it would not work in a loop). Also, the parameter of the injected instruction has to be put &amp;quot;behind&amp;quot; the &amp;quot;host&amp;quot; instruction (a single &amp;lt;code&amp;gt;db 32&amp;lt;/code&amp;gt; in the code). Finally, it has to be made sure that this second code path aligns with the rest of the code, and does no damage to the intended effect (for example, critical registers could be modified, or worse, illegal instructions could be created that way). In this case the new codepath consists of &amp;lt;code&amp;gt;and [bp+di+0807h],dh&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;add al,0a7h&amp;lt;/code&amp;gt;, after which it aligns normally. These instructions are executed only once and do not modify critical registers.&amp;lt;br/&amp;gt;&lt;br /&gt;
Sometimes, a bit of code shuffling has to be performed to make such a trick work. Here, the &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt;. The critical function of also being a good segment value has been taken over by &amp;lt;code&amp;gt;mov bl&amp;lt;/code&amp;gt; (see table above).&amp;lt;br/&amp;gt;&lt;br /&gt;
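Jumping into the middle of an instruction can be tried safely outside the intro. In the following sketch (NASM syntax, .COM file; the labels host and hidden are made up for this example) the host instruction &amp;lt;code&amp;gt;lds sp,[si]&amp;lt;/code&amp;gt; is never executed. Execution enters at its modbyte 24h, which the CPU then decodes as the opcode of &amp;lt;code&amp;gt;and al,imm8&amp;lt;/code&amp;gt;, with the following byte 20h as the immediate.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
        org  100h&lt;br /&gt;
        mov  al,0FFh         ; all bits set, to make the effect visible&lt;br /&gt;
        jmp  short hidden    ; enter one byte behind the start of the host&lt;br /&gt;
host:   db   0C5h            ; opcode of lds (together with 24h: lds sp,[si])&lt;br /&gt;
hidden: db   24h,20h         ; entered here: and al,20h&lt;br /&gt;
        ret                  ; al = 20h&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;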
[[File:Instruction selection.png|frame|center|instruction table for modbytes]]&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
lds sp,[si]                 ; C5 24: DS = 0B320h (bytes 20h, B3h), SP is not needed&lt;br /&gt;
X: db 32                    ; 20h: immediate of the hidden and al,20h at X-1&lt;br /&gt;
mov bl,7                    ; 7 iterations: character and attribute columns&lt;br /&gt;
or [si],al                  ; Add in new cell&lt;br /&gt;
cmpsw                       ; Advance SI and DI by two&lt;br /&gt;
shr byte [di],5             ; Shift previous value &lt;br /&gt;
C: xchg cx,ax               ; Alternate between cell sum and color sum&lt;br /&gt;
add al,[di+bx+94]           ; Add in this column&lt;br /&gt;
add al,[si+bx-4]&lt;br /&gt;
add al,[si+bx+156]&lt;br /&gt;
dec bx                      ; Loop back&lt;br /&gt;
jnz C&lt;br /&gt;
mov al,[si]                 ; Get center cell &lt;br /&gt;
stc                         ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
rcr al,cl                   ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
jmp short X-1               ;          +---&amp;gt; 0.00x100?0 (rcr 4)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=File:Instruction_selection.png&amp;diff=549</id>
		<title>File:Instruction selection.png</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=File:Instruction_selection.png&amp;diff=549"/>
				<updated>2020-04-30T17:40:24Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Selecting an instruction by modbyte restrictions&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=File:Modbyte_to_100h.png&amp;diff=548</id>
		<title>File:Modbyte to 100h.png</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=File:Modbyte_to_100h.png&amp;diff=548"/>
				<updated>2020-04-30T17:37:07Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Modbyte combinations pointing to &amp;quot;100h&amp;quot;, with selected choice&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=547</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=547"/>
				<updated>2020-04-30T17:14:40Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It will help to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
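Written out with plain comparisons, the reinterpreted rules look like the following sketch. It is a readable reference only, not the code used in the intro; it assumes NASM syntax, a .COM file and a 386 or later CPU (for sete), and the routine name next_state is made up for this example.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
        cpu  386&lt;br /&gt;
        org  100h&lt;br /&gt;
        mov  al,1            ; living center cell&lt;br /&gt;
        mov  cl,4            ; nine-cell sum: center plus three neighbours&lt;br /&gt;
        call next_state      ; al = 1: the cell survives&lt;br /&gt;
        mov  al,0            ; dead center cell&lt;br /&gt;
        mov  cl,3            ; nine-cell sum: three neighbours&lt;br /&gt;
        call next_state      ; al = 1: birth&lt;br /&gt;
        ret&lt;br /&gt;
&lt;br /&gt;
next_state:                  ; in: al = center (0/1), cl = nine-cell sum&lt;br /&gt;
        cmp  cl,4&lt;br /&gt;
        je   .keep           ; sum 4: keep the current state&lt;br /&gt;
        cmp  cl,3&lt;br /&gt;
        sete al              ; sum 3: alive, anything else: dead&lt;br /&gt;
.keep:  ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
A living cell with two living neighbours gives a sum of 3 and a living cell with three living neighbours gives a sum of 4, so both survival cases of the normal rules are covered.&lt;br /&gt;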
Like in other (trivial) implementations, the 2D space is parsed cell by cell, from left to right, and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly will lead to wrong results of following calculations), current cells are &amp;quot;marked&amp;quot;, and when the calculations are advanced far enough that the cell in question does not influence any calculation of the current iteration, it will be &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation is as usual, an inner loop, adding up 3 cells of one column, and the outer loop, shifting from right (+1) to the left (-1), thus adding up 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, there is already the first &amp;quot;trick&amp;quot; happening. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved over by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Even if an arbitrary number of summed cells has this one bit set, that won't hurt the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which will be rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value according to the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds the value 32 or 33 and is therefore visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. In case there is space left, parts of it could be reintegrated again. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might only work in DOSBox.) The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so in this version it is removed. That is directly punished with an additional byte, because &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, so an optimization had to be removed. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors; one row occupies the 160 bytes used as row offset in the code). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up) and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is by default set to &amp;quot;gray on black&amp;quot;. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable by one segment. The idea is to reuse parts of the code as the segment address, which is possible when the instruction is one of those shown above. If there is such an instruction, it can be placed at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, reversed). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the high byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes were the result of just one idea: how to optimize the cleanup step? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX were already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
Now there are very short instructions available to advance the registers SI and DI, some of them at the same time, and one of them is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, so that the alignment of the screen in text mode is perfectly matched.&amp;lt;br/&amp;gt;&lt;br /&gt;
Using &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since there is no simple instruction to move SI back in the opposite direction (without involving the direction flag). It was therefore changed back to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; to keep DI and SI aligned, so that the cleanup step always stays at the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 is true on almost every DOS system. As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (like in the original), which saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Combining exchange with alignment : 33 bytes ==&lt;br /&gt;
When thinking about &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; and how to skip one row to get rid of one of the two &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, my own production &amp;quot;M8trix&amp;quot; (2015) came to mind, where I did [https://www.pouet.net/prod.php?which=63126#c713378 pretty much the same as here]: pulling the &amp;lt;code&amp;gt;xchg&amp;lt;/code&amp;gt; into the loop and counting alternately, so that &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; counts the actual cells, while &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is never actually used (it &amp;quot;counts&amp;quot; the colors). To make that little dance work, &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; has to start at 7.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,7 &lt;br /&gt;
AccLoop: xchg cx,ax&lt;br /&gt;
		 add  al,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  al,[si+bx-4]&lt;br /&gt;
         add  al,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Modbyte tuning, jumping into modbytes, code path alignment : 32b==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
lds sp,[si]&lt;br /&gt;
X: db 32&lt;br /&gt;
mov bl,7                    ; O: 3 iterations&lt;br /&gt;
or [si],al                  ; O: Add in new cell&lt;br /&gt;
cmpsw&lt;br /&gt;
shr byte [di],5             ; O: Shift previous value &lt;br /&gt;
C: xchg cx,ax&lt;br /&gt;
add al,[di+bx+94]           ; O: Add in this column&lt;br /&gt;
add al,[si+bx-4]&lt;br /&gt;
add al,[si+bx+156]&lt;br /&gt;
dec bx                      ; O: Loop back&lt;br /&gt;
jnz C&lt;br /&gt;
mov al,[si]                 ; O: 3 = birth, 4 = stay (tricky): &lt;br /&gt;
stc                         ; O: 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
rcr al,cl                   ; O:          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
jmp short X-1&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=546</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=546"/>
				<updated>2020-04-30T17:11:28Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is parsed cell by cell, from left to right, and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly will lead to wrong results of following calculations), current cells are &amp;quot;marked&amp;quot;, and when the calculations are advanced far enough that the cell in question does not influence any calculation of the current iteration, it will be &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation is as usual, an inner loop, adding up 3 cells of one column, and the outer loop, shifting from right (+1) to the left (-1), thus adding up 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; already happens. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared, but at this point it can only contain 0 or 32, from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; followed by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Even if an arbitrary number of cells has this one bit set, that won't hurt the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value according to the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds a value of 32 or 33 and is thus visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
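&lt;br /&gt;
As a small worked case (not part of the intro itself), the rule can be tried on a single hand-picked cell. The sketch below assumes a dead center cell and three live cells in its 3x3 block; the label &amp;lt;code&amp;gt;Demo&amp;lt;/code&amp;gt; and the constants are made up for illustration, and the result is meant to be inspected in a debugger.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Sketch only, assemble as a .COM file (nasm -f bin)&lt;br /&gt;
org 100h&lt;br /&gt;
Demo:   mov  al,0       ; hypothetical center cell: dead&lt;br /&gt;
        mov  cl,3       ; hypothetical sum of the 3x3 block&lt;br /&gt;
        stc             ; CF = 1&lt;br /&gt;
        rcr  al,cl      ; 9-bit rotate of CF:AL by 3, AL = 00100000b&lt;br /&gt;
        and  al,20h     ; AL = 20h, the cell is marked as alive&lt;br /&gt;
        ret             ; back to DOS&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
With a sum of 4, the old cell bit lands in bit 5 instead (a live cell gives 30h before the mask, a dead one 10h), and with a sum of 2, bit 5 stays clear.&lt;br /&gt;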
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. In case there is space left, parts of it could be reintegrated. A few tiny changes make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might only work in DOSBox.) The activity there (visible on the screen) helps spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so both are removed in this version. That is directly punished with an additional byte: &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, so an optimization had to be removed. The assumption is that the computer this runs on is already in text mode (40x25 chars, colors). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable by one segment. The idea is to reuse parts of the code as the segment address, which is possible when the instruction is one of those shown above. If there is such an instruction, it can be placed at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, reversed). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the high byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
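&lt;br /&gt;
To make the byte layout concrete, here is a hedged sketch of the first four bytes of the 36-byte version. The opcode bytes in the comments are the standard encodings of these instructions; SI = 0100h at startup is assumed, and the trailing &amp;lt;code&amp;gt;ret&amp;lt;/code&amp;gt; is only added so that the fragment runs on its own.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Sketch only, assemble as a .COM file (nasm -f bin)&lt;br /&gt;
org 100h&lt;br /&gt;
        lds  bx,[si]    ; C5 1C : bytes 1 and 2, loaded into BX (1CC5h)&lt;br /&gt;
        stc             ; F9    : byte 3, low byte of DS&lt;br /&gt;
        lodsw           ; AD    : byte 4, high byte of DS, so DS = ADF9h&lt;br /&gt;
        ret             ; not part of the intro, returns to DOS&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Segment ADF9h starts at linear address ADF90h, so the text screen at B8000h begins at offset A070h inside it, well within the 64 kilobytes that DS can reach.&lt;br /&gt;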
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes were the result of just one idea: how to optimize the cleanup step? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX were already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
Now there are very short instructions available to advance the registers SI and DI, some of them at the same time, and one of them is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, so that the alignment of the screen in text mode is perfectly matched.&amp;lt;br/&amp;gt;&lt;br /&gt;
Using &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since there is no simple instruction to move SI back in the opposite direction (without involving the direction flag). It was therefore changed back to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; to keep DI and SI aligned, so that the cleanup step always stays at the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 is true on almost every DOS system. As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (like in the original), which saves one byte.&lt;br /&gt;
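&lt;br /&gt;
The saved byte comes from the displacement size. A minimal sketch, assuming DI = SI - 258 as stated above: both instructions below address the same cell (94 - 258 = -164), but only the second displacement fits into a signed byte. The encodings in the comments are the standard 16-bit forms, and the &amp;lt;code&amp;gt;ret&amp;lt;/code&amp;gt; is only there to make the fragment complete.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Sketch only, assemble as a .COM file (nasm -f bin)&lt;br /&gt;
org 100h&lt;br /&gt;
        add  cl,[si+bx-160-4]  ; 02 88 5C FF : 16-bit displacement, 4 bytes&lt;br /&gt;
        add  cl,[di+bx+94]     ; 02 49 5E    : 8-bit displacement, 3 bytes&lt;br /&gt;
        ret                    ; not part of the intro, returns to DOS&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
The same reasoning applies to the cleanup: &amp;lt;code&amp;gt;shr byte [di],5&amp;lt;/code&amp;gt; needs no displacement at all.&lt;br /&gt;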
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Combining exchange with alignment : 33 bytes ==&lt;br /&gt;
When thinking about &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; and how to skip one row to get rid of one of the two &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, my own production &amp;quot;M8trix&amp;quot; (2015) came to mind, where I did [https://www.pouet.net/prod.php?which=63126#c713378 pretty much the same as here]: pulling the &amp;lt;code&amp;gt;xchg&amp;lt;/code&amp;gt; into the loop and counting alternately, so that &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; counts the actual cells, while &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; is never actually used (it &amp;quot;counts&amp;quot; the colors). To make that little dance work, &amp;lt;code&amp;gt;bl&amp;lt;/code&amp;gt; has to start at 7.&lt;br /&gt;
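&lt;br /&gt;
The alternation can be tried in isolation. The following sketch is not taken from the intro: it sums a made-up seven-byte row (label &amp;lt;code&amp;gt;Row&amp;lt;/code&amp;gt;, with 7 standing for a color byte and 1 for a live cell) using the same exchange-then-add pattern.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Sketch only, assemble as a .COM file (nasm -f bin)&lt;br /&gt;
org 100h&lt;br /&gt;
        xor  ax,ax          ; both running sums start at zero&lt;br /&gt;
        xor  cx,cx&lt;br /&gt;
        mov  bx,7           ; seven passes, like mov bl,7 in the intro&lt;br /&gt;
Next:   xchg cx,ax          ; swap the two running sums&lt;br /&gt;
        add  al,[Row+bx-1]  ; odd BX adds a color, even BX adds a cell&lt;br /&gt;
        dec  bx&lt;br /&gt;
        jnz  Next&lt;br /&gt;
        ret                 ; here CL = 3 (cells) and AL = 28 (colors)&lt;br /&gt;
Row:    db 7,1,7,1,7,1,7    ; hypothetical data&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Because the number of passes is odd, the cell sum ends up in &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt;, where &amp;lt;code&amp;gt;rcr al,cl&amp;lt;/code&amp;gt; expects it.&lt;br /&gt;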
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,7 &lt;br /&gt;
AccLoop: xchg cx,ax&lt;br /&gt;
		 add  al,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  al,[si+bx-4]&lt;br /&gt;
         add  al,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=545</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=545"/>
				<updated>2020-04-30T16:56:41Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Synchronizing SI/DI, Improved cleanup : 34 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is parsed cell by cell, from left to right, and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly will lead to wrong results of following calculations), current cells are &amp;quot;marked&amp;quot;, and when the calculations are advanced far enough that the cell in question does not influence any calculation of the current iteration, it will be &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation is as usual, an inner loop, adding up 3 cells of one column, and the outer loop, shifting from right (+1) to the left (-1), thus adding up 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; already happens. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared, but at this point it can only contain 0 or 32, from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; followed by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Even if an arbitrary number of cells has this one bit set, that won't hurt the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value according to the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds a value of 32 or 33 and is thus visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. In case there is space left, parts of it could be reintegrated later. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might only work in DosBox.) The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so in this version it is removed. That is directly punished with an additional byte: because &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, one optimization had to be removed. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors, i.e. 160 bytes per row). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in textmode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is by default set to &amp;quot;gray on black&amp;quot;. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. In between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in textmode is just a tiny part of the 64 kilobytes addressable through one segment. The idea is now to reuse parts of the code as the segment address, which is possible when the instruction is one of those shown above. If there is such an instruction, it can be placed at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, reversed). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the first (higher) byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes were the result of just one idea: how can the clean-up step be optimized? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can just as well wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses at least four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX were already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
Now there are very short instructions available to advance the registers SI and DI, some of them both at the same time, and one of them is &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt;. Not only does it not &amp;quot;hurt&amp;quot; the intended computation (the &amp;quot;compare&amp;quot; part of the instruction can be ignored), it also advances both SI and DI by TWO, which perfectly matches the two-byte cells of the text mode screen.&amp;lt;br/&amp;gt;&lt;br /&gt;
Using &amp;lt;code&amp;gt;cmpsw&amp;lt;/code&amp;gt; requires removing &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, since SI would otherwise advance twice per cell and there is no simple instruction to move SI back in the opposite direction (without involving the direction flag). It is therefore changed back to &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which is also one of the instructions that work as the high byte of a segment address, followed by an additional &amp;lt;code&amp;gt;dec si&amp;lt;/code&amp;gt; that undoes its increment, so that SI and DI stay in step and the clean-up step always runs at the same distance &amp;quot;behind&amp;quot; the current calculation. The assumption DI = SI - 258 holds on almost every DOS system. As a byproduct, one of the memory access instructions can now be rewritten to use DI instead of SI (as in the original), saving one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=544</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=544"/>
				<updated>2020-04-30T16:47:42Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of living cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of living cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
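The reinterpreted rules can be checked against the classic ones with a short Python model. This is only a sketch for illustration and not part of the intro; the function names &amp;lt;code&amp;gt;classic_rule&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;nine_cell_rule&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;rules_agree&amp;lt;/code&amp;gt; are made up for it.&lt;br /&gt;

```python
from itertools import product

def classic_rule(alive, neighbours):
    """Conway's rule: count only the eight neighbours."""
    if alive:
        return neighbours in (2, 3)
    return neighbours == 3

def nine_cell_rule(alive, total):
    """Reinterpreted rule: total includes the center cell itself."""
    if total == 3:
        return True
    if total == 4:
        return alive
    return False

def rules_agree():
    """Compare both rules for every center state and neighbour count."""
    return all(
        classic_rule(alive, n) == nine_cell_rule(alive, n + int(alive))
        for alive, n in product((False, True), range(9))
    )

print(rules_agree())
```

A living center cell raises the total by one, so &amp;quot;2 or 3 neighbours&amp;quot; becomes &amp;quot;a total of 3 or 4&amp;quot;, while a dead cell with a total of 4 stays dead.&lt;br /&gt;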
Like in other (trivial) implementations, the 2D space is traversed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell immediately would lead to wrong results in the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation works as usual: three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions sum up the 3 cells of one column, and the loop around them steps from the right column (+1) to the left one (-1), thus adding up all 9 cells of the 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; already happens. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Even if an arbitrary number of summed cells has this one bit set, that does not hurt the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
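The effect of &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;rcr al,cl&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;and al,20h&amp;lt;/code&amp;gt; can be modelled in a few lines of Python. This is a hedged sketch, not the intro itself: it assumes 386 behaviour (count masked to 5 bits, then taken modulo 9 for a byte operand) and a cell value of 0 or 1, and the helper names &amp;lt;code&amp;gt;rcr8&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;next_state&amp;lt;/code&amp;gt; are invented for the example. Division and modulo are used in place of shift and mask operators.&lt;br /&gt;

```python
def rcr8(al, carry, count):
    """Rotate the 9-bit value carry:al right, like `rcr al,cl` on a 386."""
    count = (count % 32) % 9          # 5-bit mask, then 9-bit ring
    ring = carry * 256 + al           # carry sits left of bit 7
    for _ in range(count):
        low = ring % 2                # bit that falls off on the right
        ring = ring // 2 + low * 256  # re-enters at the carry position
    return ring % 256                 # new AL (new carry is ring // 256)

def next_state(cell, total):
    """stc / rcr al,cl / and al,20h for a cell value of 0 or 1."""
    rotated = rcr8(cell, 1, total)
    return (rotated // 32) % 2        # bit 5, i.e. the `and al,20h` mask

for cell in (0, 1):
    print(cell, [next_state(cell, total) for total in range(10)])
```

With a total of 3 the carry lands on bit 5, with a total of 4 the old cell value lands there, and adding 32 to the total (a leftover or marked bit) changes nothing because of the count masking.&lt;br /&gt;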
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds the value 32 or 33 and is therefore visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. In case there is space left, parts of it could be reintegrated later. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might only work in DosBox.) The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so in this version it is removed. That is directly punished with an additional byte: because &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, one optimization had to be removed. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors, i.e. 160 bytes per row). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in textmode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is by default set to &amp;quot;gray on black&amp;quot;. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. In between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in textmode is just a tiny part of the 64 kilobytes addressable through one segment. The idea is now to reuse parts of the code as the segment address, which is possible when the instruction is one of those shown above. If there is such an instruction, it can be placed at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, reversed). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the first (higher) byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
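The segment arithmetic behind this trick can be sketched in Python. The snippet is an illustration only: the opcode bytes (C5 1C for &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt;, F9 for &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;, AD for &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;) are the encodings assumed here for the first four bytes of the 36 byte version, and &amp;lt;code&amp;gt;lds_from_code&amp;lt;/code&amp;gt; is a made-up helper name.&lt;br /&gt;

```python
def lds_from_code(code):
    """Model `lds bx,[si]`: first word goes to BX, second word to DS."""
    bx = code[0] + 256 * code[1]      # little-endian word
    ds = code[2] + 256 * code[3]
    return bx, ds

# Assumed encodings: lds bx,[si] = C5 1C, stc = F9, lodsw = AD
code = bytes([0xC5, 0x1C, 0xF9, 0xAD])
bx, ds = lds_from_code(code)

# Offset inside DS at which the text screen (linear 0xB8000) starts
screen_offset = 0xB8000 - ds * 16
print(hex(bx), hex(ds), hex(screen_offset))
```

Under these assumptions DS ends up below &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt;, and the 4000 bytes of the visible text screen still lie inside the 64 kilobytes reachable through that segment.&lt;br /&gt;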
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Synchronizing SI/DI, Improved cleanup : 34 bytes ==&lt;br /&gt;
A lot of tiny changes were the result of just one idea: how can the clean-up step be optimized? After all, it is not really necessary to correct a marked cell as soon as possible; the correction can just as well wait for a certain number of steps. But any nontrivial version of &amp;lt;code&amp;gt;shr byte [si-160-6],5&amp;lt;/code&amp;gt; still uses at least four bytes, unless it is brought into one of the &amp;quot;pure&amp;quot; forms that only take up THREE bytes: &amp;lt;code&amp;gt;shr byte[(bp/bx)+si/di],x&amp;lt;/code&amp;gt;. Since SI and BX were already in use, and using BP would imply that the segment register SS is used instead of DS, the only remaining register is DI.&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsb&lt;br /&gt;
		 dec si&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si],al          ; Add in new cell     ^&lt;br /&gt;
		 cmpsw&lt;br /&gt;
		 shr byte [di],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[di+bx+94]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=543</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=543"/>
				<updated>2020-04-30T16:37:57Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Switching to Textmode : 38 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of living cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of living cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is traversed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation works as usual: the three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions add up the 3 cells of one column, and the surrounding loop (&amp;lt;code&amp;gt;AccLoop&amp;lt;/code&amp;gt;) steps from the right column (+1) to the left column (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
The first &amp;quot;trick&amp;quot; happens right at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not cleared properly; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Marked cells that are added up have this bit set as well, but no matter how many of them contribute, that won't hurt the calculation because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
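When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value, which wraps around from bit 0. By extracting the 6th bit (bit 5) of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value according to the rules defined above: a rotation by 3 places the carry, which is always 1, in bit 5, a rotation by 4 places the original cell value there, and any other count places a zero bit there (assuming the cell holds only 0 or 1). &amp;lt;br /&amp;gt;&lt;br /&gt;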
&amp;lt;br /&amp;gt;&lt;br /&gt;
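The rule evaluation can be sketched in isolation as follows. This fragment is an illustration only and not part of the intro: the label &amp;lt;code&amp;gt;RuleDemo&amp;lt;/code&amp;gt; and the sample values are made up, and it assumes that &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt; holds the center cell (0 or 1) and &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; the sum of all nine cells.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Illustration only, NASM syntax, builds as a .COM file (nasm -f bin)&lt;br /&gt;
        org  100h&lt;br /&gt;
RuleDemo:&lt;br /&gt;
        mov  al,1        ; assumed center cell: alive&lt;br /&gt;
        mov  cl,4+32     ; assumed sum: 4 living cells, plus 32 from one marked cell&lt;br /&gt;
        stc              ; CF:AL = 1.00000001&lt;br /&gt;
        rcr  al,cl       ; count is masked to 5 bits, so this rotates by 4: AL = 30h&lt;br /&gt;
        and  al,20h      ; bit 5 holds the old bit 0: AL = 20h, the cell stays alive&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;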
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that a marked cell temporarily holds a value of 32 or 33, which is visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it could be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location (note: this might only work in DosBox). The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; (2 bytes) and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; (1 byte) have been replaced with the single-byte &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which saves two bytes.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        lodsb                   ; Get center cell (replaces mov al,[si] and inc si) &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        jmp short LifeLoop      ; Loop forever (no key check) &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so both are removed in this version. This is directly punished with an additional byte: since &amp;lt;code&amp;gt;DI&amp;lt;/code&amp;gt; is no longer involved in the process, an optimization had to be removed. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors, matching the 160 bytes per row used in the code). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one for the character, one for color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
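&lt;br /&gt;
The two-byte cell layout can be sketched as follows. This is an illustration only, not part of the intro; it assumes the color text buffer at segment &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; and a row stride of 160 bytes, as used in the listing of this section.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Illustration only, NASM syntax, builds as a .COM file (nasm -f bin)&lt;br /&gt;
        org  100h&lt;br /&gt;
        push 0xb800             ; text mode screen segment&lt;br /&gt;
        pop  ds&lt;br /&gt;
        mov  word [0],0x0701    ; offset 0: char 01h (smiley), offset 1: attribute 07h (gray on black)&lt;br /&gt;
        mov  word [160],0x0721  ; one row below: char 21h (!), a living cell marked with bit 5&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;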
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
         push 0xb800           ; DS = text mode screen&lt;br /&gt;
         pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5 ; Shift previous value&lt;br /&gt;
         mov bl,6              ; 3 columns, 2 bytes each&lt;br /&gt;
         xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4] ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back, stepping one cell (2 bytes)&lt;br /&gt;
         dec  bx&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         lodsw                 ; Get center cell, advance to the next cell&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; available, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable by one segment. The idea is now to reuse parts of the code as the segment address, which is possible when the instruction is one of the above. If there is such an instruction, it can be placed at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, in little-endian order, so the 4th byte becomes the high byte). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the first (higher) byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
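&lt;br /&gt;
Read byte for byte, the start of the listing below works out as sketched here. The fragment is an illustration only: it assumes the usual .COM load address 100h with SI = 100h on startup, and the final &amp;lt;code&amp;gt;mov&amp;lt;/code&amp;gt; is added just to show where the screen ends up inside the new segment.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; Illustration only, NASM syntax, builds as a .COM file (nasm -f bin)&lt;br /&gt;
        org  100h&lt;br /&gt;
        lds  bx,[si]          ; 0100: C5 1C  BX = 1CC5h, DS = word at [si+2]&lt;br /&gt;
        stc                   ; 0102: F9     low byte of DS&lt;br /&gt;
        lodsw                 ; 0103: AD     high byte of DS, so DS = ADF9h&lt;br /&gt;
        ; B800:0000 is linear B8000h, segment ADF9h starts at linear ADF90h,&lt;br /&gt;
        ; so the screen begins at offset B8000h - ADF90h = A070h in DS.&lt;br /&gt;
        mov  byte [0xA070],1  ; same byte as B800:0000: smiley in the top left corner&lt;br /&gt;
        ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;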
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
         lds bx,[si]           ; DS = bytes 3 and 4 of the code (stc, lodsw)&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         lodsw                 ; Get center cell, advance to the next cell&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5 ; Shift previous value&lt;br /&gt;
         mov bl,6              ; 3 columns, 2 bytes each&lt;br /&gt;
         xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4] ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back, stepping one cell (2 bytes)&lt;br /&gt;
         dec  bx&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=542</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=542"/>
				<updated>2020-04-30T16:16:57Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Using instructions as segment adress : 36 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of living cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of living cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is traversed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation works as usual: the three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions add up the 3 cells of one column, and the surrounding loop (&amp;lt;code&amp;gt;AccLoop&amp;lt;/code&amp;gt;) steps from the right column (+1) to the left column (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
The first &amp;quot;trick&amp;quot; happens right at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not cleared properly; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Marked cells that are added up have this bit set as well, but no matter how many of them contribute, that won't hurt the calculation because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
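When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value, which wraps around from bit 0. By extracting the 6th bit (bit 5) of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value according to the rules defined above: a rotation by 3 places the carry, which is always 1, in bit 5, a rotation by 4 places the original cell value there, and any other count places a zero bit there (assuming the cell holds only 0 or 1). &amp;lt;br /&amp;gt;&lt;br /&gt;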
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that a marked cell temporarily holds a value of 32 or 33, which is visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it could be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location (note: this might only work in DosBox). The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; (2 bytes) and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; (1 byte) have been replaced with the single-byte &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which saves two bytes.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        lodsb                   ; Get center cell (replaces mov al,[si] and inc si) &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        jmp short LifeLoop      ; Loop forever (no key check) &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so both are removed in this version. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors, matching the 160 bytes per row used in the code). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one for the character, one for color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
         push 0xb800           ; DS = text mode screen&lt;br /&gt;
         pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5 ; Shift previous value&lt;br /&gt;
         mov bl,6              ; 3 columns, 2 bytes each&lt;br /&gt;
         xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4] ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back, stepping one cell (2 bytes)&lt;br /&gt;
         dec  bx&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         lodsw                 ; Get center cell, advance to the next cell&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt; available, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in text mode is just a tiny part of the 64 kilobytes addressable by one segment. The idea is now to reuse parts of the code as the segment address, which is possible when the instruction is one of the above. If there is such an instruction, it can be placed at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, in little-endian order, so the 4th byte becomes the high byte). In this case &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; can be reused as the first (higher) byte of the segment. The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
         lds bx,[si]           ; DS = bytes 3 and 4 of the code (stc, lodsw)&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         lodsw                 ; Get center cell, advance to the next cell&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5 ; Shift previous value&lt;br /&gt;
         mov bl,6              ; 3 columns, 2 bytes each&lt;br /&gt;
         xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4] ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back, stepping one cell (2 bytes)&lt;br /&gt;
         dec  bx&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=541</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=541"/>
				<updated>2020-04-30T16:15:06Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* TODO */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of living cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of living cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is traversed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation works as usual: the three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions add up the 3 cells of one column, and the surrounding loop (&amp;lt;code&amp;gt;AccLoop&amp;lt;/code&amp;gt;) steps from the right column (+1) to the left column (-1), thus adding up the 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
The first &amp;quot;trick&amp;quot; happens right at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not cleared properly; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Marked cells that are added up have this bit set as well, but no matter how many of them contribute, that won't hurt the calculation because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
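When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value, which wraps around from bit 0. By extracting the 6th bit (bit 5) of this rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value according to the rules defined above: a rotation by 3 places the carry, which is always 1, in bit 5, a rotation by 4 places the original cell value there, and any other count places a zero bit there (assuming the cell holds only 0 or 1). &amp;lt;br /&amp;gt;&lt;br /&gt;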
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds the value 32 or 33 and is therefore visible as a brighter blue pixel during the short time span between marking and correction.&lt;br /&gt;
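&lt;br /&gt;
The rotation trick can be checked outside the intro with a small Python model. This is only a sketch under stated assumptions: the helper names &amp;lt;code&amp;gt;rcr8&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;next_state&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;conway&amp;lt;/code&amp;gt; are made up for illustration, cells are assumed to hold 0 or 1 before marking, and the 8-bit &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is modelled as a 9-bit rotation through the carry whose masked count is additionally reduced modulo 9.&lt;br /&gt;

```python
# Illustrative model of "stc / rcr al,cl / and al,20h"; not part of the intro.

def rcr8(al, carry, count):
    """Rotate the 9-bit ring carry:al to the right, like an 8-bit rcr."""
    count = (count % 32) % 9          # 5-bit count mask, then modulo 9 for 8-bit operands
    ring = carry * 256 + al           # the carry acts as bit 8 above al
    for _ in range(count):
        ring = ring // 2 + (ring % 2) * 256   # bit 0 wraps around into the carry position
    return ring % 256, ring // 256    # new al, new carry


def next_state(center, total):
    """New state from the centre cell (0 or 1) and the sum of all nine cells."""
    al, _ = rcr8(center, 1, total)    # stc ; rcr al,cl
    return (al // 32) % 2             # and al,20h keeps only bit 5


def conway(center, neighbours):
    """Textbook rule on the eight neighbours, used only for comparison."""
    if neighbours == 3:
        return 1
    if center == 1 and neighbours == 2:
        return 1
    return 0


if __name__ == "__main__":
    for center in (0, 1):
        for neighbours in range(9):
            total = center + neighbours
            print(center, neighbours, next_state(center, total), conway(center, neighbours))
```

In this model, adding any multiple of 32 to &amp;lt;code&amp;gt;total&amp;lt;/code&amp;gt; leaves the result unchanged, which is why the leftover 32 in &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; and the marked cells do no harm.&lt;br /&gt;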
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it can be reintegrated. A few tiny changes are needed to make this work as intended. The segment in which all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location (note: this might work only in DosBox). The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so both are removed in this version. The assumption is that the computer this runs on is already in text mode (40x25 chars, colors). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in textmode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changes (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Using instructions as segment address : 36 bytes ==&lt;br /&gt;
[[File:LDStrick.png|frame|center|possible instructions for LDS trick in textmode]]&lt;br /&gt;
Instead of using &amp;lt;code&amp;gt;push&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;pop&amp;lt;/code&amp;gt; to get the screen address, there is also the instruction &amp;lt;code&amp;gt;lds&amp;lt;/code&amp;gt;, which reads the segment value from memory. A value &amp;quot;close&amp;quot; to &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; is sufficient, because the visible screen in textmode is just a tiny part of the 64 kilobytes addressable through one segment. The idea is now to reuse parts of the code as the segment address, which is possible when the instruction is one of the above. If there is such an instruction, it can start at the 4th byte (&amp;lt;code&amp;gt;[si]&amp;lt;/code&amp;gt; points to the start of the code, and &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; puts the first two bytes into BX and the 3rd and 4th into DS, in reversed order). The 3rd byte is only relevant for alignment, so instead of putting &amp;quot;0x00&amp;quot; there, a one-byte instruction can be used. The whole process saves two bytes.&lt;br /&gt;
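&lt;br /&gt;
The effect of &amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; on the code's own bytes can be sketched in Python. The byte values below are an assumed encoding of the first three instructions of the following listing (&amp;lt;code&amp;gt;lds bx,[si]&amp;lt;/code&amp;gt; as C5 1C, &amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt; as F9, &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt; as AD) and are not taken from the write-up. The helper names are hypothetical, and the screen size of 25 rows of 160 bytes is likewise an assumption based on the row offset used in the listing.&lt;br /&gt;

```python
# Illustrative sketch: which segment does "lds bx,[si]" load from the code itself?

TEXT_BUFFER = 0xB8000  # linear start of colour text memory (segment 0xb800)


def lds_from_code(code):
    """Return (bx, ds) as lds would load them from four little-endian bytes."""
    bx = code[0] + code[1] * 256      # bytes 1 and 2 form the offset register
    ds = code[2] + code[3] * 256      # bytes 3 and 4 form the segment register
    return bx, ds


def screen_reachable(segment, screen_bytes):
    """True if the whole text buffer lies inside the 64 KB window of the segment."""
    offset = TEXT_BUFFER - segment * 16
    return offset in range(0, 0x10000 - screen_bytes + 1)


if __name__ == "__main__":
    # Assumed encoding: lds bx,[si] = C5 1C, stc = F9, lodsw = AD
    code = bytes([0xC5, 0x1C, 0xF9, 0xAD])
    bx, ds = lds_from_code(code)
    print(hex(bx), hex(ds), screen_reachable(ds, 25 * 160))
```

Under these assumptions the loaded segment is not &amp;lt;code&amp;gt;0xb800&amp;lt;/code&amp;gt; itself, but the text buffer still falls inside the 64 kilobytes that the segment can address, which is all the main loop needs.&lt;br /&gt;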
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 lds bx,[si] &lt;br /&gt;
LifeLoop:&lt;br /&gt;
		 stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
		 lodsw&lt;br /&gt;
		 rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
		 and al,20h            ; ^carry   |         ^&lt;br /&gt;
		 or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
		 shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
		 mov bl,6 &lt;br /&gt;
		 xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=File:LDStrick.png&amp;diff=540</id>
		<title>File:LDStrick.png</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=File:LDStrick.png&amp;diff=540"/>
				<updated>2020-04-30T16:08:57Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;possible instructions for LDS trick in textmode&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=539</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=539"/>
				<updated>2020-04-30T15:49:28Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Switching to Textmode : 38 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
As in other (trivial) implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the game of life cannot be computed &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation follows the usual pattern: the 3 cells of one column are added up, and a loop moves that column from right (+1) to left (-1), so that all 9 cells of a 3x3 neighbourhood are summed. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
The first &amp;quot;trick&amp;quot; happens right at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared: after &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;. Any number of the summed cells may have this one bit set as well. Neither hurts the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before the carry flag is set (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value. Extracting the 6th bit of the rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;) yields exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds the value 32 or 33 and is therefore visible as a brighter blue pixel during the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it can be reintegrated. A few tiny changes are needed to make this work as intended. The segment in which all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location (note: this might work only in DosBox). The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:Golt38.png|thumb|game of life, text mode, 32b - 38b]]&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so both are removed in this version. The assumption is that the computer this runs on is already in text mode (40x25 chars, colors). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed value. A small downside is the alignment of cells, because in textmode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changes (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=File:Golt38.png&amp;diff=538</id>
		<title>File:Golt38.png</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=File:Golt38.png&amp;diff=538"/>
				<updated>2020-04-30T15:49:13Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;game of life, textmode, 32b - 38b&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=537</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=537"/>
				<updated>2020-04-30T15:47:17Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Switching to Textmode : 39 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of regarding only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
As in other (trivial) implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the game of life cannot be computed &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation follows the usual pattern: the 3 cells of one column are added up, and a loop moves that column from right (+1) to left (-1), so that all 9 cells of a 3x3 neighbourhood are summed. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
The first &amp;quot;trick&amp;quot; happens right at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared: after &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt; it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;. Any number of the summed cells may have this one bit set as well. Neither hurts the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before the carry flag is set (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value. Extracting the 6th bit of the rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;) yields exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds the value 32 or 33 and is therefore visible as a brighter blue pixel during the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it can be reintegrated. A few tiny changes are needed to make this work as intended. The segment in which all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location (note: this might work only in DosBox). The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which saves two bytes (three bytes of instructions become one).&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Switching to Textmode : 38 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so both are removed in this version. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors, matching the 160 bytes per row used in the code below). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
         push 0xb800            ; DS = text mode video memory&lt;br /&gt;
         pop  ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                    ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr  al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and  al,20h            ; ^carry   |         ^&lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4)&lt;br /&gt;
         or   [si-2],al         ; Add in new cell     ^&lt;br /&gt;
         shr  byte [si-160-6],5 ; Shift previous value&lt;br /&gt;
         mov  bl,6              ; 3 iterations in AccLoop, 2 bytes per cell&lt;br /&gt;
         xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]  ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx                ; Step one cell (two bytes) to the left&lt;br /&gt;
         dec  bx&lt;br /&gt;
         jnz  AccLoop           ; Loop back&lt;br /&gt;
         lodsw                  ; Get center cell (character + attribute)&lt;br /&gt;
         jmp  short LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=536</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=536"/>
				<updated>2020-04-30T15:43:06Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Switching to Textmode : 39 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; Game of Life rules], but with a twist: instead of counting only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
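The reinterpretation can be checked against the classic rules by brute force. The following Python sketch is an illustration only (the function names are made up for this example and are not part of the intro); it compares both formulations for every combination of center state and neighbour count.&lt;br /&gt;

```python
def conway_next(alive, neighbours):
    """Classic rule: count only the eight neighbours."""
    if alive:
        return 1 if neighbours in (2, 3) else 0
    return 1 if neighbours == 3 else 0


def nine_cell_next(alive, neighbours):
    """Rule from the list above: count all nine cells, centre included."""
    total = neighbours + alive
    if total == 3:
        return 1          # exactly three: the centre will be alive
    if total == 4:
        return alive      # exactly four: the centre keeps its state
    return 0              # anything else: dead


# Collect every (state, neighbour count) pair where the two rules disagree.
mismatches = [
    (alive, n)
    for alive in (0, 1)
    for n in range(9)
    if conway_next(alive, n) != nine_cell_next(alive, n)
]
print(mismatches)
```

The list of mismatches comes out empty, so counting all nine cells with the thresholds 3 and 4 gives the same result as counting eight neighbours with the usual thresholds.&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;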
As in other (trivial) implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the Game of Life cannot be computed &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation works as usual: each pass of the loop adds up the 3 cells of one column, and the loop moves from the right column (+1) to the left column (-1), thus adding up the 9 cells of a 3x3 neighbourhood.&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
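The marking idea can be modelled in a few lines of Python. This is a simplified sketch, not a translation of the intro: it assumes a wrap-around grid and uses two full passes (mark everything, then correct everything), whereas the assembly interleaves the correction a little more than one row behind the current cell. The name &amp;lt;code&amp;gt;step_in_place&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;

```python
def step_in_place(grid):
    """Advance a wrap-around grid one generation without a second buffer."""
    height, width = len(grid), len(grid[0])
    # Marking pass: bit 5 receives the new state, bit 0 keeps the old one.
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    total += grid[(y + dy) % height][(x + dx) % width]
            live = total % 32        # marks only ever add multiples of 32
            old = grid[y][x] % 2
            new = 1 if live == 3 else old if live == 4 else 0
            grid[y][x] += 32 * new   # cell is still 0 or 1 here, so this acts like "or cell,20h"
    # Correction pass: the same shift as "shr cell,5".
    for y in range(height):
        for x in range(width):
            grid[y][x] //= 32
    return grid


# A horizontal blinker in the middle of a 5x5 grid.
blinker = [[0] * 5 for _ in range(5)]
blinker[2][1] = blinker[2][2] = blinker[2][3] = 1
step_in_place(blinker)
print([row[2] for row in blinker])   # middle column after one generation
```

Because every mark adds exactly 32, taking the sum modulo 32 recovers the number of cells that were alive in the old generation, no matter how many of the neighbours have already been marked. After one step the blinker has turned vertical, so the middle column reads 0, 1, 1, 1, 0.&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;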
At the start of the loop, the first &amp;quot;trick&amp;quot; is already happening. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved into place by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Likewise, any number of the summed cells may have this one bit set. Neither hurts the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit (bit 5) of the rotated value with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;, we get exactly the value required by the rules defined above: a rotation by 3 puts the carry (always 1) there, a rotation by 4 puts the old cell state there, and any other count leaves a 0.&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
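To see why bit 5 ends up holding the new state, the rotation can be simulated. The Python sketch below models an 8-bit &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; as a rotation of the 9-bit ring formed by the carry flag and &amp;lt;code&amp;gt;al&amp;lt;/code&amp;gt;. The helper names &amp;lt;code&amp;gt;rcr8&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;next_state&amp;lt;/code&amp;gt; are invented for this illustration, and the count handling (masked to 5 bits, then taken modulo 9 for a byte operand) follows the 386 behaviour as I understand it.&lt;br /&gt;

```python
def rcr8(al, count, carry):
    """Rotate the 9-bit ring carry:AL to the right, like "rcr al,cl"."""
    n = (count % 32) % 9            # count masked to 5 bits, ring of 9 bits
    ring = carry * 256 + al         # carry sits to the left of bit 7
    ring = ring // 2 ** n + (ring % 2 ** n) * 2 ** (9 - n)
    return ring % 256, ring // 256  # new AL, new carry


def next_state(cell, total):
    """Model of: stc / rcr al,cl / and al,20h, returned as 0 or 1."""
    al, _ = rcr8(cell, total, 1)    # stc sets the carry before rotating
    return (al // 32) % 2           # and al,20h keeps only bit 5


# New state for every cell value and every possible 3x3 sum (0..9).
table = {(cell, total): next_state(cell, total)
         for cell in (0, 1) for total in range(10)}
print([total for total in range(10) if table[(1, total)]])
```

For a living center cell the script prints &amp;lt;code&amp;gt;[3, 4]&amp;lt;/code&amp;gt;, the only sums after which bit 5 is set. Adding any multiple of 32 to the count does not change the result, which is why the leftover marks in &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; are harmless.&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;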
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;. As shown before, this does not hurt the computation; the only side effect is that the cell temporarily holds a value of 32 or 33, which is visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it could be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location (note: this might only work in DosBox). The activity there, which is visible on the screen, helps to spawn actual Game of Life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which saves two bytes (three bytes of instructions become one).&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
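As a cross-check of the size in the heading, the instruction lengths can be added up. The byte counts in this Python sketch are the encodings I would expect from a 16-bit assembler (one-byte displacements where the offset fits, a two-byte displacement for +318); they are an assumption of this example and not assembler output.&lt;br /&gt;

```python
# (instruction, assumed encoded size in bytes) for the 44-byte listing
SIZES_44 = [
    ("mov al,13h", 2), ("int 10h", 2),
    ("push 1000h", 3), ("pop ds", 1),
    ("push 0A000h", 3), ("pop es", 1),
    ("xchg cx,ax", 1),
    ("add cl,[di+bx-64]", 3), ("add cl,[si+bx-2]", 3),
    ("add cl,[si+bx+318]", 4),      # +318 needs a 16-bit displacement
    ("dec bx", 1), ("jnz AccLoop", 2),
    ("lodsb", 1), ("stosb", 1), ("stc", 1),
    ("rcr al,cl", 2), ("and al,20h", 2),
    ("or [si-1],al", 3), ("shr byte [di-65],5", 4),
    ("mov bl,3", 2), ("jmp short LifeLoop", 2),
]

total = sum(size for _, size in SIZES_44)
print(total)
```

Under these assumptions the sum comes out at 44 bytes, in line with the heading.&lt;br /&gt;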
&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
Setting up the screen mode and pixel access requires quite a bit of space, so both are removed in this version. The assumption is that the computer this runs on is already in text mode (80x25 chars, colors, matching the 160 bytes per row used in the code below). This also helps with the calculation, since it now takes place directly on the screen (only one segment has to be set up), and no content has to be generated initially, since there is always at least something on the screen that works as a seed. A small downside is the alignment of cells, because in text mode one cell occupies TWO bytes (one of them for color information). Luckily, the color information is set to &amp;quot;gray on black&amp;quot; by default. An additional &amp;lt;code&amp;gt;dec bx&amp;lt;/code&amp;gt;, replacing &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;lodsw&amp;lt;/code&amp;gt;, and changing &amp;lt;code&amp;gt;mov bl,3&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mov bl,6&amp;lt;/code&amp;gt; fix the alignment issue. Additionally, the screen address changed (&amp;lt;code&amp;gt;push 0xb800&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;pop ds&amp;lt;/code&amp;gt;). Another lucky coincidence is that instead of blue pixels, we now have a &amp;quot;smiley char&amp;quot; with orthogonal borders, which is a decent representation of a living cell. Between marking and correction it briefly changes to an exclamation mark (!), which is barely visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
         push 0xb800            ; DS = text mode video memory&lt;br /&gt;
         pop  ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                    ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr  al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and  al,20h            ; ^carry   |         ^&lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4)&lt;br /&gt;
         or   [si-2],al         ; Add in new cell     ^&lt;br /&gt;
         shr  byte [si-160-6],5 ; Shift previous value&lt;br /&gt;
         mov  bl,6              ; 3 iterations in AccLoop, 2 bytes per cell&lt;br /&gt;
         xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]  ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx                ; Step one cell (two bytes) to the left&lt;br /&gt;
         dec  bx&lt;br /&gt;
         jnz  AccLoop           ; Loop back&lt;br /&gt;
         lodsw                  ; Get center cell (character + attribute)&lt;br /&gt;
         jmp  LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
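The address arithmetic of the text mode version can be tabulated quickly. The Python sketch below assumes a row stride of 160 bytes, as used by the listing (80 character cells of two bytes each), and lists the offsets relative to &amp;lt;code&amp;gt;si&amp;lt;/code&amp;gt; that &amp;lt;code&amp;gt;AccLoop&amp;lt;/code&amp;gt; adds up. All names are hypothetical.&lt;br /&gt;

```python
ROW_BYTES = 160          # assumed: 80 columns * (character + attribute)
ALIVE, MARK = 0x01, 0x20 # cell value and the bit set while a cell is marked


def summed_offsets():
    """Offsets relative to SI that AccLoop adds up for bx = 6, 4, 2."""
    offsets = []
    for bx in (6, 4, 2):
        for row in (-ROW_BYTES, 0, ROW_BYTES):
            offsets.append(bx + row - 4)
    return sorted(offsets)


print(summed_offsets())
print(repr(chr(ALIVE + MARK)), repr(chr(MARK)))  # marked live cell, marked dead cell
```

All nine offsets are even, so only character bytes are summed and the attribute bytes in between are skipped. A living cell (code 1) that has just been marked becomes code 33, the exclamation mark mentioned above, while a marked dead cell is code 32, a plain space.&lt;br /&gt;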
&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=535</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=535"/>
				<updated>2020-04-30T15:33:09Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Remove key handler and RNG : 44 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; Game of Life rules], but with a twist: instead of counting only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
As in other (trivial) implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the Game of Life cannot be computed &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations that follow), current cells are only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell in question no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation works as usual: each pass of the loop adds up the 3 cells of one column, and the loop moves from the right column (+1) to the left column (-1), thus adding up the 9 cells of a 3x3 neighbourhood.&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, the first &amp;quot;trick&amp;quot; is already happening. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared; at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved into place by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. Likewise, any number of the summed cells may have this one bit set. Neither hurts the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;). The carry is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit (bit 5) of the rotated value with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;, we get exactly the value required by the rules defined above: a rotation by 3 puts the carry (always 1) there, a rotation by 4 puts the old cell state there, and any other count leaves a 0.&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;. As shown before, this does not hurt the computation; the only side effect is that the cell temporarily holds a value of 32 or 33, which is visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left at the end, parts of it could be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location (note: this might only work in DosBox). The activity there, which is visible on the screen, helps to spawn actual Game of Life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
[[File:Gol44c.png|thumb|game of life, 44 bytes, increased contrast]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt;, which saves two bytes (three bytes of instructions become one).&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
         push 0xb800            ; DS = text mode video memory&lt;br /&gt;
         pop  ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                    ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr  al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and  al,20h            ; ^carry   |         ^&lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4)&lt;br /&gt;
         or   [si-2],al         ; Add in new cell     ^&lt;br /&gt;
         shr  byte [si-160-6],5 ; Shift previous value&lt;br /&gt;
         mov  bl,6              ; 3 iterations in AccLoop, 2 bytes per cell&lt;br /&gt;
         xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]  ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx                ; Step one cell (two bytes) to the left&lt;br /&gt;
         dec  bx&lt;br /&gt;
         jnz  AccLoop           ; Loop back&lt;br /&gt;
         lodsw                  ; Get center cell (character + attribute)&lt;br /&gt;
         jmp  LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=File:Gol44c.png&amp;diff=534</id>
		<title>File:Gol44c.png</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=File:Gol44c.png&amp;diff=534"/>
				<updated>2020-04-30T15:32:54Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;game of life, 44 bytes, increased contrast&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=533</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=533"/>
				<updated>2020-04-30T15:25:50Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Remove key handler and RNG : 44 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It will help to understand what the core algorithm does, before optimizing it. I will not go into the details of random number generation and key handling since these parts are removed in the final version anyway. Setting up screen mode and putting pixels to the screen is described in the basic sectionn of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules, but with a twist. Instead of regarding only eight neighbour cells inside a 3x3 neighbourhood , ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is parsed cell by cell, from left to right, and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell instantly will lead to wrong results of following calculations), current cells are &amp;quot;marked&amp;quot;, and when the calculations are advanced far enough that the cell in question does not influence any calculation of the current iteration, it will be &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to the target value of the next iteration. The summation is as usual, an inner loop, adding up 3 cells of one column, and the outer loop, shifting from right (+1) to the left (-1), thus adding up 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
At the start of the loop, there is already the first &amp;quot;trick&amp;quot; happening. The register of summation &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleaned, but at this point it can either contain 0 or 32 from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; after &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. If an arbitrary amount of cells has this on bit set, that won't hurt the calculation because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction. [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 leastsignificant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of the rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
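&amp;lt;br /&amp;gt;&lt;br /&gt;
The rotation can be modelled bit by bit. The sketch below is a plain Python model, not the intro's code, and its function names are hypothetical. It assumes the documented behaviour of an 8-bit rcr: the count is masked to 5 bits and then taken modulo 9, because carry and register together form a 9-bit ring.&lt;br /&gt;

```python
def rcr8(al, carry, count):
    """Rotate the 9-bit value carry:al right, as an 8-bit rcr does. Returns (al, carry)."""
    count = (count % 32) % 9            # 5-bit mask, then modulo 9 for an 8-bit operand
    value = carry * 256 + al            # the carry sits directly above bit 7
    for _ in range(count):
        low = value % 2
        value = value // 2 + low * 256  # bit 0 wraps around into the carry position
    return value % 256, value // 256


def next_state(center, total):
    """stc / rcr al,cl / and al,20h, reduced to a result of 0 or 1."""
    al, _ = rcr8(center, 1, total)
    return al // 32 % 2                 # bit 5, the one kept by and al,20h


# Every reachable pair (center, total) that leaves the cell alive.
alive = [(c, t) for c in (0, 1) for t in range(c, 9 + c) if next_state(c, t) == 1]
print(alive)
```

The list contains exactly the three cases from the rules: a total of 3 with either center state, and a total of 4 with a live center. For a count of 3 the carry lands in bit 5, for a count of 4 the old cell bit lands there, and for every other count bit 5 stays clear.&lt;br /&gt;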
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds a value of 32 or 33 and is therefore visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
In order to reach 32 bytes, all the convenient stuff has to be removed. If there is space left in the end, parts of it can be reintegrated. A few tiny changes are needed to make this work as intended. The segment where all the calculation takes place has been changed to &amp;lt;code&amp;gt;1000h&amp;lt;/code&amp;gt;, pointing to a lower memory location. (Note: this might only work in DosBox.) The activity there (visible on the screen) helps to spawn actual game of life structures.&lt;br /&gt;
[[File:Gol44.png|thumb|right|game of life, 44 bytes]]&lt;br /&gt;
&amp;lt;code&amp;gt;mov al,[si]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inc si&amp;lt;/code&amp;gt; have been replaced with &amp;lt;code&amp;gt;lodsb&amp;lt;/code&amp;gt; since that saves one byte.&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    01000h          ; DS = low memory segment&lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
         push 0xb800           ; DS = text mode video memory&lt;br /&gt;
         pop  ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5 ; Shift previous value (one row up, two cells left)&lt;br /&gt;
         mov  bl,6             ; 3 iterations in AccLoop, 2 bytes per text cell&lt;br /&gt;
         xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4] ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back,&lt;br /&gt;
         dec  bx               ; stepping one text cell (2 bytes)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
         lodsw                 ; Get center cell, advance to the next one&lt;br /&gt;
         jmp  LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=File:Gol44.png&amp;diff=532</id>
		<title>File:Gol44.png</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=File:Gol44.png&amp;diff=532"/>
				<updated>2020-04-30T15:23:44Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;game of life, 44 bytes&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=531</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=531"/>
				<updated>2020-04-30T15:05:32Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment on] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations for the following cells), the current cell is only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation works as usual: three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions add up the 3 cells of one column, and the loop around them moves from the right column (+1) to the left column (-1), thus adding up all 9 cells of the 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
The first &amp;quot;trick&amp;quot; already happens at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared: at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. In the same way, an arbitrary number of the summed cells may have this one bit set. Neither hurts the calculation because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction. [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of the rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now stored in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;, which, as shown before, does not hurt the computation. The only side effect is that the cell temporarily holds a value of 32 or 33 and is therefore visible as a brighter blue pixel in the short time span between marking and correction.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: do not clear video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=530</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=530"/>
				<updated>2020-04-30T15:00:46Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Original version : 65 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment on] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. Setting up the screen mode and putting pixels on the screen is described in the basic section of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the eight neighbour cells inside a 3x3 neighbourhood, ALL nine cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations for the following cells), the current cell is only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; by &amp;lt;code&amp;gt;shr [byte di-65],5&amp;lt;/code&amp;gt; to its target value for the next iteration. The summation works as usual: three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions add up the 3 cells of one column, and the loop around them moves from the right column (+1) to the left column (-1), thus adding up all 9 cells of the 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
The first &amp;quot;trick&amp;quot; already happens at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared: at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. In the same way, an arbitrary number of the summed cells may have this one bit set. Neither hurts the calculation because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction. [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
When the summation is complete, the aforementioned &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; is executed, but not before setting the carry flag (&amp;lt;code&amp;gt;stc&amp;lt;/code&amp;gt;), which is rotated in from the left and ends up directly to the right of the original cell value. By extracting the 6th bit of the rotated value (with &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;), we get exactly the value required by the rules defined above. &amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
This value is now set in the original cell with &amp;lt;code&amp;gt;or [si-1],al&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: do not clear video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         dec  bx               ; Loop back&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=529</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=529"/>
				<updated>2020-04-30T14:52:56Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Original version : 65 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment on] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the neighbour cells inside a 3x3 neighbourhood, ALL cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&lt;br /&gt;
Like in other (trivial) implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations for the following cells), the current cell is only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell no longer influences any calculation of the current iteration, it is &amp;quot;corrected&amp;quot; to its target value for the next iteration. The summation works as usual: three &amp;lt;code&amp;gt;add&amp;lt;/code&amp;gt; instructions add up the 3 cells of one column, and the loop around them moves from the right column (+1) to the left column (-1), thus adding up all 9 cells of the 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first &amp;quot;trick&amp;quot; already happens at the start of the loop. The summation register &amp;lt;code&amp;gt;cl&amp;lt;/code&amp;gt; is not properly cleared: at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt; and moved there by &amp;lt;code&amp;gt;xchg cx,ax&amp;lt;/code&amp;gt;. In the same way, an arbitrary number of the summed cells may have this one bit set. Neither hurts the calculation because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction. [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: keep video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Step back one cell: a text-mode cell is&lt;br /&gt;
         dec  bx               ; 2 bytes wide (character + attribute)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=528</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=528"/>
				<updated>2020-04-30T14:49:47Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Original version : 65 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the eight neighbours of a cell, ALL nine cells of the 3x3 neighbourhood, including the center cell itself, are summed, and the rules are reinterpreted as:&lt;br /&gt;
&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
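&lt;br /&gt;
That these reinterpreted rules match the classic ones can be checked exhaustively. The following Python sketch is not part of the intro, and the function names are made up for illustration. It assumes that the sum covers the center cell plus its eight neighbours.&lt;br /&gt;

```python
def conway(cell, neighbours):
    """Classic rules: birth on 3 neighbours, survival on 2 or 3."""
    if cell == 1:
        return 1 if neighbours in (2, 3) else 0
    return 1 if neighbours == 3 else 0


def sum_rule(cell, total):
    """Reinterpreted rules on the full 3x3 sum (center cell included)."""
    if total == 3:
        return 1
    if total == 4:
        return cell
    return 0


def rules_agree():
    """Compare both rule sets for every cell state and neighbour count."""
    return all(conway(c, n) == sum_rule(c, n + c)
               for c in (0, 1) for n in range(9))


if __name__ == "__main__":
    print(rules_agree())
```

Folding the center cell into the sum is what lets the assembly get away with a single counter and two special values.&lt;br /&gt;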
&lt;br /&gt;
As in other straightforward implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations that follow), the new state of each cell is at first only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell no longer influences any other calculation of the current iteration, it is &amp;quot;corrected&amp;quot; to its target value for the next iteration. The summation works as usual: the inner loop adds up the 3 cells of one column, and the outer loop shifts from the right column (+1) to the left one (-1), thus adding up all 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first &amp;quot;trick&amp;quot; already happens at the start of the loop. The summation register is not properly cleared: at this point it contains either 0 or 32, left over from the instruction &amp;lt;code&amp;gt;and al,0x20&amp;lt;/code&amp;gt;. Any number of the summed cells may have this bit set as well. That does not hurt the calculation, because of a special property of the &amp;lt;code&amp;gt;rcr&amp;lt;/code&amp;gt; instruction: [https://kernfunny.org/x86/html/file_module_x86_id_273.html &amp;quot;The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: keep video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Step back one cell: a text-mode cell is&lt;br /&gt;
         dec  bx               ; 2 bytes wide (character + attribute)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=527</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=527"/>
				<updated>2020-04-30T14:41:47Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Original version : 65 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the eight neighbours of a cell, ALL nine cells of the 3x3 neighbourhood, including the center cell itself, are summed, and the rules are reinterpreted as:&lt;br /&gt;
&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&lt;br /&gt;
As in other straightforward implementations, the 2D space is processed cell by cell, from left to right and from top to bottom. Since the game of life does not work &amp;quot;in situ&amp;quot; (updating the current cell immediately would corrupt the calculations that follow), the new state of each cell is at first only &amp;quot;marked&amp;quot;. Once the calculation has advanced far enough that the cell no longer influences any other calculation of the current iteration, it is &amp;quot;corrected&amp;quot; to its target value for the next iteration. The summation works as usual: the inner loop adds up the 3 cells of one column, and the outer loop shifts from the right column (+1) to the left one (-1), thus adding up all 9 cells of a 3x3 neighbourhood. &amp;lt;br /&amp;gt;&lt;br /&gt;
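&lt;br /&gt;
The mark-then-correct idea can be sketched in Python on a flat list of cells. This is an illustration under stated assumptions, not the routine of the intro: the names are made up, cells outside the list count as dead (the assembly simply wraps around inside its 64K segment), and the marks are stripped on every read, whereas the assembly sums them in and relies on the rotate count being masked. A plain two-buffer version is included only for comparison.&lt;br /&gt;

```python
MARK = 32  # bit 5 carries the next state, bit 0 the current one


def step_reference(cells, width):
    """Two-buffer update, used only to check the in-place version."""
    n = len(cells)
    out = []
    for i in range(n):
        total = 0
        for dy in (-width, 0, width):
            for dx in (-1, 0, 1):
                j = i + dy + dx
                if j in range(n):
                    total += cells[j]
        out.append(1 if total == 3 else cells[i] if total == 4 else 0)
    return out


def step_in_place(cells, width):
    """Single-buffer update: mark first, correct once a cell is no longer read."""
    n = len(cells)
    lag = width + 1                        # farthest offset at which a cell is still read
    for i in range(n):
        total = 0
        for dy in (-width, 0, width):
            for dx in (-1, 0, 1):
                j = i + dy + dx
                if j in range(n):
                    total += cells[j] % MARK   # strip marks, read the current state
        state = cells[i] % MARK
        new = 1 if total == 3 else state if total == 4 else 0
        cells[i] += MARK * new             # mark: the old state stays in bit 0
        if i - lag in range(n):
            cells[i - lag] //= MARK        # correct: promote the mark to the state
    for k in range(max(n - lag, 0), n):
        cells[k] //= MARK                  # flush the cells that are still only marked
    return cells
```

The correction lags one row plus one cell behind the current position, which mirrors the role of the &amp;lt;code&amp;gt;shr&amp;lt;/code&amp;gt; on a cell one row up and one to the left in the listing.&lt;br /&gt;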
&lt;br /&gt;
The first &amp;quot;trick&amp;quot; already happens at the start of the loop. The summation register is not properly cleared; at this point it contains either 0 or 32...&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: keep video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Step back one cell: a text-mode cell is&lt;br /&gt;
         dec  bx               ; 2 bytes wide (character + attribute)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=526</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=526"/>
				<updated>2020-04-30T14:32:27Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Original version : 65 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It helps to understand what the core algorithm does before optimizing it. I will not go into the details of random number generation and key handling, since these parts are removed in the final version anyway. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules], but with a twist. Instead of counting only the eight neighbours of a cell, ALL nine cells of the 3x3 neighbourhood, including the center cell itself, are summed, and the rules are reinterpreted as:&lt;br /&gt;
&lt;br /&gt;
* If the number of live cells is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: keep video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Step back one cell: a text-mode cell is&lt;br /&gt;
         dec  bx               ; 2 bytes wide (character + attribute)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=525</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=525"/>
				<updated>2020-04-30T14:32:03Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: /* Original version : 65 bytes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65 bytes version and bring it down to 32 bytes.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
It will help to understand what the core algorithm does, before optimizing it. I will not go into the details of random number generation and key handling since these parts are removed in the final version anyway. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules &amp;quot;normal&amp;quot; game of life rules, but with a twist. Instead of regarding only neighbour cells inside a 3x3 neighbourhood , ALL cells are taken into consideration, and the rules are reinterpreted as:&lt;br /&gt;
&lt;br /&gt;
* If the number of live cells in the 3x3 block is 3, the center cell will be alive.&lt;br /&gt;
* If the number of live cells in the 3x3 block is 4, the center cell keeps its state.&lt;br /&gt;
* Otherwise, the cell dies (or stays dead).&lt;br /&gt;
&lt;br /&gt;
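That these reinterpreted rules match the standard ones can be checked exhaustively. The following is a minimal Python sketch for illustration only, not part of the intro; all function names are made up here:&lt;br /&gt;
&lt;br /&gt;
```python
def next_state_standard(alive, neighbours):
    # Conway's rules: birth on 3 neighbours, survival on 2 or 3.
    if alive:
        return neighbours in (2, 3)
    return neighbours == 3


def next_state_sum9(alive, neighbours):
    # Reinterpreted rules: count the center cell together with its neighbours.
    total = neighbours + (1 if alive else 0)
    if total == 3:
        return True        # center cell will be alive
    if total == 4:
        return alive       # center cell keeps its state
    return False           # cell dies or stays dead


def rules_agree():
    # Exhaustively compare both formulations over every possible input.
    return all(
        next_state_standard(alive, n) == next_state_sum9(alive, n)
        for alive in (False, True)
        for n in range(9)
    )
```
&lt;br /&gt;
Summing all nine cells means the center cell needs no special treatment while accumulating, which is what lets the assembly loop add up three full columns.&lt;br /&gt;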
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
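&lt;br /&gt;
The three instructions stc / rcr al,cl / and al,20h in the listing above apply both rules without a single branch. The following Python sketch models that step. It assumes AL holds only the current state in bit 0 (the bit marked ? in the comments is taken as 0) and that CL holds the nine-cell total; the function names are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
def rcr8(al, carry, count):
    # Model of x86 RCR on an 8-bit operand: a 9-bit rotate through carry.
    # The count is masked to 5 bits, then reduced modulo 9 for byte operands.
    count = (count % 32) % 9
    value = carry * 256 + al              # 9-bit value: carry on top of AL
    for _ in range(count):
        low = value % 2                   # bit leaving on the right
        value = value // 2 + low * 256    # it re-enters as the new carry
    return value % 256, value // 256      # (new AL, new carry)


def new_cell_bit(state, total):
    # stc / rcr al,cl / and al,20h, with AL holding only the state in bit 0.
    al, _ = rcr8(state, 1, total)
    return (al // 32) % 2                 # bit 5, selected by the mask 20h
```
&lt;br /&gt;
For a total of 3 the carry set by stc lands in bit 5 (birth), for a total of 4 the old state lands there (stay), and for every other total bit 5 ends up clear. For example, new_cell_bit(1, 4) models a live cell with a total of 4 and returns 1. Since the rotate count is masked to 5 bits, a pending new-state bit (20h = 32) that was added into CL presumably drops out of the count.&lt;br /&gt;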
&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: do not clear video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Step back one text cell:&lt;br /&gt;
         dec  bx               ; 2 bytes (character + attribute)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=524</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=524"/>
				<updated>2020-04-30T12:36:46Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can [https://www.pouet.net/prod.php?which=85485 download and comment] the intro.&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: do not clear video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Step back one text cell:&lt;br /&gt;
         dec  bx               ; 2 bytes (character + attribute)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=523</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=523"/>
				<updated>2020-04-30T12:35:41Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This writeup is still in the works! Come back later to get information on all the mean tricks ;)&lt;br /&gt;
&lt;br /&gt;
==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: do not clear video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Step back one text cell:&lt;br /&gt;
         dec  bx               ; 2 bytes (character + attribute)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=522</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=522"/>
				<updated>2020-04-30T12:28:28Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: do not clear video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
		 push 0xb800&lt;br /&gt;
		 pop ds&lt;br /&gt;
LifeLoop:&lt;br /&gt;
         stc                   ; 3 = birth, 4 = stay (tricky):&lt;br /&gt;
         rcr al,cl             ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3)&lt;br /&gt;
         and al,20h            ; ^carry   |         ^&lt;br /&gt;
         or [si-2],al          ; Add in new cell     ^&lt;br /&gt;
         shr byte [si-160-6],5    ; Shift previous value&lt;br /&gt;
         mov bl,6 &lt;br /&gt;
		xchg cx,ax&lt;br /&gt;
AccLoop: add  cl,[si+bx-160-4]    ; Add in this column&lt;br /&gt;
         add  cl,[si+bx-4]&lt;br /&gt;
         add  cl,[si+bx+160-4]&lt;br /&gt;
         dec  bx               ; Step back one text cell:&lt;br /&gt;
         dec  bx               ; 2 bytes (character + attribute)&lt;br /&gt;
         jnz  AccLoop&lt;br /&gt;
		 lodsw&lt;br /&gt;
         jmp LifeLoop&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=521</id>
		<title>Game of Life 32b</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=Game_of_Life_32b&amp;diff=521"/>
				<updated>2020-04-30T12:26:56Z</updated>
		
		<summary type="html">&lt;p&gt;HellMood: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Original version : 65 bytes==&lt;br /&gt;
We'll start with the old 65-byte version and bring it down to 32 bytes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm&lt;br /&gt;
; Life simulator, 72 bytes  - Vladislav Kaipetsky and Tenie Remmel &lt;br /&gt;
;                 65 bytes  - Mark Andreas &lt;br /&gt;
 &lt;br /&gt;
; If no args, regs on startup are: &lt;br /&gt;
 &lt;br /&gt;
; AX = BX = 0000h &lt;br /&gt;
; SI = IP = 0100h &lt;br /&gt;
; DI = SP = FFFEh &lt;br /&gt;
 &lt;br /&gt;
IDEAL &lt;br /&gt;
MODEL TINY &lt;br /&gt;
P386 &lt;br /&gt;
CODESEG &lt;br /&gt;
ORG 100h &lt;br /&gt;
 &lt;br /&gt;
Start:  int     1ah             ; ah=00: cx=hours, dx=tic counter &lt;br /&gt;
 &lt;br /&gt;
        mov     al,13h          ; Set mode 13h &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        xchg    dx,ax &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero &lt;br /&gt;
RandLoop: &lt;br /&gt;
        rol     ax,1            ; Generate random number &lt;br /&gt;
        adc     [bx],al &lt;br /&gt;
        dec     bx &lt;br /&gt;
        jnz     RandLoop &lt;br /&gt;
 &lt;br /&gt;
; BX will not be equal to 3 the first time this loop is executed, but &lt;br /&gt;
; it will be for all other times.   As SI = 0100h and DI = FFFEh on &lt;br /&gt;
; startup, SI - DI will be equal to 258. &lt;br /&gt;
 &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        inc     si              ; Loop while not zero &lt;br /&gt;
        jnz     LifeLoop &lt;br /&gt;
 &lt;br /&gt;
        mov     ah,1            ; Check for key &lt;br /&gt;
        int     16h &lt;br /&gt;
        jz      LifeLoop        ; Loop if no key &lt;br /&gt;
 &lt;br /&gt;
        xchg    ax,bx           ; Set text mode &lt;br /&gt;
        int     10h &lt;br /&gt;
        ret                     ; Return &lt;br /&gt;
End     Start &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
==Remove key handler and RNG : 44 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
Start:  &lt;br /&gt;
 &lt;br /&gt;
        mov     al,93h          ; Set mode 13h (bit 7 set: do not clear video memory) &lt;br /&gt;
        int     10h &lt;br /&gt;
 &lt;br /&gt;
        push    09000h          ; DS = last 64K segment &lt;br /&gt;
        pop     ds &lt;br /&gt;
        push    0A000h          ; ES = video memory &lt;br /&gt;
        pop     es &lt;br /&gt;
                                ; BX is already zero  &lt;br /&gt;
LifeLoop: &lt;br /&gt;
        xchg    cx,ax &lt;br /&gt;
AccLoop: &lt;br /&gt;
        add     cl,[di+bx-64]   ; Add in this column &lt;br /&gt;
        add     cl,[si+bx-2] &lt;br /&gt;
        add     cl,[si+bx+318] &lt;br /&gt;
        dec     bx              ; Loop back &lt;br /&gt;
        jnz     AccLoop &lt;br /&gt;
 &lt;br /&gt;
        ;mov     al,[si]         ; Get center cell, set pixel &lt;br /&gt;
		lodsb&lt;br /&gt;
        stosb &lt;br /&gt;
        stc                     ; 3 = birth, 4 = stay (tricky): &lt;br /&gt;
        rcr     al,cl           ; 1.00?0000x --&amp;gt; 0.0x100?00 (rcr 3) &lt;br /&gt;
        and     al,20h          ; ^carry   |         ^ &lt;br /&gt;
                                ;          +---&amp;gt; 0.00x100?0 (rcr 4) &lt;br /&gt;
        or      [si-1],al       ; Add in new cell     ^ &lt;br /&gt;
        shr     [byte di-65],5  ; Shift previous value &lt;br /&gt;
 &lt;br /&gt;
        mov     bl,3            ; 3 iterations in AccLoop &lt;br /&gt;
        ; inc     si              ; Loop while not zero &lt;br /&gt;
        jmp short     LifeLoop &lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== Switching to Textmode : 39 bytes==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
== TODO ==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;nasm&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>HellMood</name></author>	</entry>

	</feed>