<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://www.sizecoding.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Bwl</id>
		<title>SizeCoding - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="http://www.sizecoding.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Bwl"/>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/wiki/Special:Contributions/Bwl"/>
		<updated>2026-05-03T05:53:14Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.27.0</generator>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=MELT.COM&amp;diff=789</id>
		<title>MELT.COM</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=MELT.COM&amp;diff=789"/>
				<updated>2021-01-08T11:34:52Z</updated>
		
		<summary type="html">&lt;p&gt;Bwl: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Case Study]]&lt;br /&gt;
MELT.COM was written by an unknown author in the 1980s.  Originally 49 bytes in size, it performs the following cute effect:&lt;br /&gt;
&lt;br /&gt;
{{#ev:youtube|https://www.youtube.com/watch?v=MoGKh5X3nS4}}&lt;br /&gt;
&lt;br /&gt;
(The video is simulated and shows how MELT performs on the old hardware it was written for.)  This effect is achieved by increasing or decreasing each onscreen character's value until it reaches #32, the space character.  The original source is lost to history, but here's a quick commented disassembly:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ax, 0B800h&lt;br /&gt;
                mov     es, ax          ; es now points to screen segment&lt;br /&gt;
&lt;br /&gt;
doScreen:                               &lt;br /&gt;
                mov     cx, 2000        ; Going to loop over all (80*25=)2000 characters&lt;br /&gt;
                xor     bx, bx          ; bx = 0.  Also our &amp;quot;num of altered chars&amp;quot; counter&lt;br /&gt;
                mov     di, bx          ; es:di now points at the screen (b800:0000)&lt;br /&gt;
&lt;br /&gt;
alterChars:                             &lt;br /&gt;
                mov     ah, es:[di]     ; Retrieve onscreen character&lt;br /&gt;
                cmp     ah, 32          ; Compare to a space character (#32)&lt;br /&gt;
                jz      short nextChar  ; If already a space, do nothing&lt;br /&gt;
                jl      short upToSpace ; If lower than a space, increase upward&lt;br /&gt;
                dec     ah              ; If higher than a space, decrease downward&lt;br /&gt;
                mov     es:[di], ah     ; Store altered character back to screen&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
                jmp     short nextChar  ; Keep processing characters&lt;br /&gt;
&lt;br /&gt;
upToSpace:                              &lt;br /&gt;
                inc     ah              ; Increase character upwards towards a space&lt;br /&gt;
                mov     es:[di], ah     ; Store altered character back to screen&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
&lt;br /&gt;
nextChar:                               &lt;br /&gt;
                inc     di&lt;br /&gt;
                inc     di              ; es:di now points to next character&lt;br /&gt;
                loop    alterChars      ; Continue processing characters&lt;br /&gt;
                cmp     bx, 0           ; Were any characters processed?&lt;br /&gt;
                jnz     short doScreen  ; If so (bx != 0), keep processing&lt;br /&gt;
                mov     ah, 4Ch         ; Otherwise, get ready to terminate&lt;br /&gt;
                int     21h             ; DOS - 2+ - QUIT WITH EXIT CODE (EXIT)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some very quick wins right off the bat:&lt;br /&gt;
&lt;br /&gt;
* On entry to a .COM program AX is normally 0, so AX=B800 can be set by loading only AH=B8&lt;br /&gt;
* Replace the DOS exit sequence with RET: a .COM file starts with a zero word on the stack, so RET jumps to the INT 20h at the start of the PSP&lt;br /&gt;
* The &amp;quot;CMP bx,0&amp;quot; (3 bytes) can be replaced with a shorter test like &amp;quot;OR bx,bx&amp;quot; (2 bytes)&lt;br /&gt;
* Loading and saving screen characters with explicit MOVs and advancing DI manually twice is inefficient: the 80x86 has 1-byte string instructions for that, so we'll use LODSW and STOSW.  This loads and stores the attribute byte redundantly, but we don't care about speed, we care about size.  Because we only use LODS once, we can shave a byte by dropping MOV DS,AX (which together with LODSW would cost 3 bytes) and using ES: LODSW (2 bytes) in the inner loop.&lt;br /&gt;
&lt;br /&gt;
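As a sketch of how those quick wins might combine, here is a reconstructed intermediate listing.  It is not code from the original author; the byte counts in the comments assume standard 8086 encodings, AX is assumed to be 0 and the direction flag clear on entry, and the segment override is written as a bare &amp;quot;es&amp;quot; prefix:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ah, 0B8h        ; 2 bytes: assumes AX = 0 on entry, so AX = 0B800h&lt;br /&gt;
                mov     es, ax          ; 2 bytes: es now points to screen segment&lt;br /&gt;
&lt;br /&gt;
doScreen:&lt;br /&gt;
                mov     cx, 2000        ; 3 bytes: 80*25 character cells&lt;br /&gt;
                xor     bx, bx          ; 2 bytes: bx = 0, the altered-chars counter&lt;br /&gt;
                mov     di, bx          ; 2 bytes: es:di points at b800:0000&lt;br /&gt;
                mov     si, di          ; 2 bytes: si tracks di for the es-prefixed lodsw&lt;br /&gt;
&lt;br /&gt;
alterChars:&lt;br /&gt;
                es lodsw                ; 2 bytes: al = character, ah = attribute&lt;br /&gt;
                cmp     al, 32          ; 2 bytes: compare to a space character (#32)&lt;br /&gt;
                jz      short nextChar  ; 2 bytes: already a space, store it back unchanged&lt;br /&gt;
                jl      short upToSpace ; 2 bytes: below a space (signed), count upward&lt;br /&gt;
                dec     al              ; 2 bytes: above a space, count downward&lt;br /&gt;
                inc     bx              ; 1 byte:  note that a character changed&lt;br /&gt;
                jmp     short nextChar  ; 2 bytes&lt;br /&gt;
&lt;br /&gt;
upToSpace:&lt;br /&gt;
                inc     al              ; 2 bytes&lt;br /&gt;
                inc     bx              ; 1 byte&lt;br /&gt;
&lt;br /&gt;
nextChar:&lt;br /&gt;
                stosw                   ; 1 byte:  write the cell back, di += 2&lt;br /&gt;
                loop    alterChars      ; 2 bytes&lt;br /&gt;
                or      bx, bx          ; 2 bytes: were any characters altered?&lt;br /&gt;
                jnz     short doScreen  ; 2 bytes: if so, do another pass&lt;br /&gt;
                ret                     ; 1 byte:  return to PSP:0000 (int 20h)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;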
This shaves 12 bytes, bringing the total down to 37.  At this point we can make some drastic changes that will shave more bytes, but also make the program not behave ''exactly'' as it did before.  I chose to do the following:&lt;br /&gt;
&lt;br /&gt;
* 2000 is 7D0 in hex.  Change MOV CX,2000 (decimal) to MOV CH,08 (hex) to shave a byte.  Because CL is left untouched, CX can end up anywhere in the range 0800 to 08FF, but the difference is minimal at execution time.  That is more than the 2000 visible characters, which is fine since there is extra screen RAM after the visible portion of the screen.&lt;br /&gt;
* The code contains a check for characters below #32 (space) and moves them upward.  Most characters onscreen are going to be above #32, so this isn't really necessary and I removed the check.  Even if it were necessary, removing the check will just rotate them downward until they wrap around to 255, then go downward again until hitting #32 and stopping.&lt;br /&gt;
&lt;br /&gt;
This gets the code down to 29 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ah, 0B8h&lt;br /&gt;
                mov     es, ax          ; es so we can use es: LODS and STOS to the same place&lt;br /&gt;
&lt;br /&gt;
doScreen:&lt;br /&gt;
                mov     ch, 08          ; (80 * 25 = 2000/07d0h)&lt;br /&gt;
                xor     bx, bx          ; bx is also our &amp;quot;num of altered chars&amp;quot; counter&lt;br /&gt;
                mov     di, bx          ; es:di now points at the screen (b800:0000)&lt;br /&gt;
                mov     si, di          ; ds:si = es:di, needed for lods/stos&lt;br /&gt;
&lt;br /&gt;
alterChars:&lt;br /&gt;
                es: lodsw               ; Retrieve onscreen character (and attribute)&lt;br /&gt;
                cmp     al, 32          ; Compare to a space character (#32)&lt;br /&gt;
                jz      short nextChar  ; If already a space, do nothing&lt;br /&gt;
                dec     al              ; Otherwise decrease (values below a space wrap through 255)&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
&lt;br /&gt;
nextChar:&lt;br /&gt;
                stosw&lt;br /&gt;
                loop    alterChars      ; Continue processing characters&lt;br /&gt;
                or      bx,bx           ; Were any characters processed?&lt;br /&gt;
                jnz     short doScreen  ; If so (bx != 0), keep processing&lt;br /&gt;
                ret                     ; exit&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Optimizing the above further is left as an exercise for the reader.  It may be possible to shrink it using instructions for the 80186 and later processors, but it runs far too quickly on those systems, so if doing so you may want to insert a HLT to pace the animation at 18.2 Hz (the default timer tick rate).  It may also be possible to shrink it by taking advantage of the fact that character #0 is a null and shows nothing onscreen (just like the #32 space character): instead of decrementing down to #32, you could decrement down to #0 in a more optimized form (for example, by using the zero flag).&lt;br /&gt;
&lt;br /&gt;
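As a rough sketch of the HLT pacing idea, the 29-byte version could be extended as follows.  This assumes interrupts are enabled and the timer tick (IRQ 0) is running at its default rate of about 18.2 Hz; HLT returns on any hardware interrupt, so keystrokes will also advance the animation.  The extra opcode brings the total to 30 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ah, 0B8h        ; assumes AX = 0 on entry&lt;br /&gt;
                mov     es, ax          ; es points to screen segment&lt;br /&gt;
&lt;br /&gt;
doScreen:&lt;br /&gt;
                mov     ch, 08          ; at least 0800h cells per pass&lt;br /&gt;
                xor     bx, bx          ; bx = altered-chars counter&lt;br /&gt;
                mov     di, bx          ; es:di points at b800:0000&lt;br /&gt;
                mov     si, di          ; si tracks di for the es-prefixed lodsw&lt;br /&gt;
&lt;br /&gt;
alterChars:&lt;br /&gt;
                es lodsw                ; al = character, ah = attribute&lt;br /&gt;
                cmp     al, 32          ; already a space?&lt;br /&gt;
                jz      short nextChar  ; yes, store it back unchanged&lt;br /&gt;
                dec     al              ; no, count downward&lt;br /&gt;
                inc     bx              ; note that a character changed&lt;br /&gt;
&lt;br /&gt;
nextChar:&lt;br /&gt;
                stosw                   ; write the cell back, di += 2&lt;br /&gt;
                loop    alterChars      ; finish the pass&lt;br /&gt;
                hlt                     ; 1 byte (0F4h): sleep until the next hardware interrupt&lt;br /&gt;
                or      bx, bx          ; hlt leaves registers and flags alone, so test bx here&lt;br /&gt;
                jnz     short doScreen  ; another pass if anything changed&lt;br /&gt;
                ret                     ; exit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;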
== Or you could rethink it completely ==&lt;br /&gt;
&lt;br /&gt;
If you're willing to deviate from the original target environment (namely, that it was designed for a 4.77 MHz 8088), you can eliminate range checking for all sorts of optimizations.  &lt;br /&gt;
Instead of using two pointers/segments, it's possible to use only one, and directly operate on the chars on screen. Combined with optimized segment assignment and replacing the &amp;quot;processed chars&amp;quot; count with a fixed number of iterations (set to the maximum possible, 65536 * 255), this results in a 17-byte variation:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;nop&amp;quot; = 0AF90h&lt;br /&gt;
			nop						; for alignment, maybe optimization potential here ;)&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],32	    ; is the current char on screen a &amp;quot;space&amp;quot;?&lt;br /&gt;
			jz		short skip 		; yes, don't decrement&lt;br /&gt;
			dec 	byte [di]		; nope, decrement&lt;br /&gt;
skip:&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (fewer on the first pass, which starts from DX's entry value)&lt;br /&gt;
			loop 	doChar			; outer loop, 255 times&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above performs the same function as the previous optimization, but runs much more slowly as it traverses more RAM areas than necessary.&lt;br /&gt;
&lt;br /&gt;
There might be a way of altering the &amp;lt;code&amp;gt;NOP&amp;lt;/code&amp;gt; to something useful.&lt;br /&gt;
&lt;br /&gt;
Returning to the non-printing null char, we can compare against AL (0 on entry) instead of an immediate 32 and shave one more byte, to produce this 16-byte version:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;nop&amp;quot; = 0AF90h&lt;br /&gt;
			nop						; for alignment, maybe optimization potential here ;)&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],al	    ; is the current char on screen a &amp;quot;null&amp;quot;?&lt;br /&gt;
			jz		short skip 		; yes, don't decrement&lt;br /&gt;
			dec 	byte [di]		; nope, decrement&lt;br /&gt;
skip:&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (fewer on the first pass, which starts from DX's entry value)&lt;br /&gt;
			loop 	doChar			; outer loop, 255 times&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above performs a similar function to the previous optimization, but runs even more slowly as it requires regular characters to be decremented further than before.&lt;br /&gt;
&lt;br /&gt;
Note the &amp;lt;CODE&amp;gt;CMP/JZ/DEC&amp;lt;/CODE&amp;gt; sequence.  If we can control the carry flag as a result of the comparison, then we can perform a conditional decrement without the branch, via an &amp;lt;CODE&amp;gt;ADC&amp;lt;/CODE&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Of course, if the AL register is zero, then the comparison can never produce a carry, but if we increase AL to 1, then a zero byte sets the carry and every other value clears it.  Adding 0FFh plus the carry then decrements non-zero characters and leaves nulls untouched.&lt;br /&gt;
&lt;br /&gt;
To increase the AX register would normally cost us one byte, but we happen to have one to spare in the &amp;lt;CODE&amp;gt;LDS&amp;lt;/CODE&amp;gt; alignment.&lt;br /&gt;
&lt;br /&gt;
That allows us to shave one more byte, to produce this 15-byte version:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;inc&amp;quot; = 0AF40h&lt;br /&gt;
			inc		ax				; set AL to 1, also for alignment&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],al	    ; AL = 1: carry is set only if the current char is a &amp;quot;null&amp;quot;&lt;br /&gt;
			adc 	byte [di],0ffh  ; add 0FFh + carry: decrement unless the char is a null&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (fewer on the first pass, which starts from DX's entry value)&lt;br /&gt;
			loop 	doChar			; outer loop, 255 times&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Using &amp;lt;CODE&amp;gt;scasw&amp;lt;/CODE&amp;gt; to advance DI suggests switching to &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt; and using its built-in comparison.&lt;br /&gt;
Since &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt; always compares against ES:DI, the screen segment has to go into ES, and the read-modify-write that follows has to pay for a segment prefix.&lt;br /&gt;
The &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt; comparison is reversed relative to the manual &amp;lt;CODE&amp;gt;cmp&amp;lt;/CODE&amp;gt; (it computes AL minus memory, so with AL at 0 the carry is set for every non-zero character), so we also need to change the &amp;lt;CODE&amp;gt;adc,-1&amp;lt;/CODE&amp;gt; to an &amp;lt;CODE&amp;gt;sbb,0&amp;lt;/CODE&amp;gt;. &lt;br /&gt;
We also no longer need a separate counter for the inner loop, since DI, which is already in use, wraps to zero; CX does need to be increased by one, however.&lt;br /&gt;
&lt;br /&gt;
14 bytes:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
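			les		dx,[si]				; load ES from our code &amp;quot;inc cx&amp;quot; + &amp;quot;scasb&amp;quot; = 0AE41h&lt;br /&gt;
			inc		cx					; CX = 100h (assuming 0FFh on entry), also for alignment&lt;br /&gt;
doChar:&lt;br /&gt;
			scasb						; compare AL (0) with es:[di], carry set if char is not null; DI += 1&lt;br /&gt;
			sbb 	[byte es:di-1],al	; subtract 0 + carry from the char just compared&lt;br /&gt;
			inc		di					; DI += 1 again, now at the next character&lt;br /&gt;
			jnz 	doChar				; inner loop, until DI wraps to 0&lt;br /&gt;
			loop 	doChar				; outer loop&lt;br /&gt;
			ret							; quit&lt;br /&gt;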
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If we then accept the loss of the last iteration, so that a character that started at 0xff is occasionally left behind as a &amp;quot;happy face&amp;quot; (#1), we can be even more creative with the addressing and use di+bx, where we decrement bx to -1.&lt;br /&gt;
Alternatives such as &amp;lt;CODE&amp;gt;dec si/jns&amp;lt;/CODE&amp;gt; can be used instead of &amp;lt;CODE&amp;gt;loop&amp;lt;/CODE&amp;gt; for yet another correct 14-byte version.&lt;br /&gt;
&lt;br /&gt;
13 bytes:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
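			les		dx,[si]				; load ES from our code &amp;quot;dec bx&amp;quot; + &amp;quot;scasb&amp;quot; = 0AE4Bh&lt;br /&gt;
			dec		bx					; BX = -1 (assuming 0 on entry), also for alignment&lt;br /&gt;
doChar:&lt;br /&gt;
			scasb						; compare AL (0) with es:[di], carry set if char is not null; DI += 1&lt;br /&gt;
			sbb 	[es:di+bx],al		; di+bx = di-1: subtract 0 + carry from the char just compared&lt;br /&gt;
			inc		di					; DI += 1 again, now at the next character&lt;br /&gt;
			jnz 	doChar				; inner loop, until DI wraps to 0&lt;br /&gt;
			loop 	doChar				; outer loop, one pass fewer than the 14-byte version&lt;br /&gt;
			ret							; quit&lt;br /&gt;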
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Bwl</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=MELT.COM&amp;diff=788</id>
		<title>MELT.COM</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=MELT.COM&amp;diff=788"/>
				<updated>2021-01-08T11:30:19Z</updated>
		
		<summary type="html">&lt;p&gt;Bwl: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Case Study]]&lt;br /&gt;
MELT.COM was written by an unknown author in the 1980s.  Originally 49 bytes in size, it performs the following cute effect:&lt;br /&gt;
&lt;br /&gt;
{{#ev:youtube|https://www.youtube.com/watch?v=MoGKh5X3nS4}}&lt;br /&gt;
&lt;br /&gt;
(The video is simulated and shows how MELT performs on the old hardware it was written for.)  This effect is achieved by increasing or decreasing each onscreen character's value until it reaches #32, the space character.  The original source is lost to history, but here's a quick commented disassembly:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ax, 0B800h&lt;br /&gt;
                mov     es, ax          ; es now points to screen segment&lt;br /&gt;
&lt;br /&gt;
doScreen:                               &lt;br /&gt;
                mov     cx, 2000        ; Going to loop over all (80*25=)2000 characters&lt;br /&gt;
                xor     bx, bx          ; bx = 0.  Also our &amp;quot;num of altered chars&amp;quot; counter&lt;br /&gt;
                mov     di, bx          ; es:di now points at the screen (b800:0000)&lt;br /&gt;
&lt;br /&gt;
alterChars:                             &lt;br /&gt;
                mov     ah, es:[di]     ; Retrieve onscreen character&lt;br /&gt;
                cmp     ah, 32          ; Compare to a space character (#32)&lt;br /&gt;
                jz      short nextChar  ; If already a space, do nothing&lt;br /&gt;
                jl      short upToSpace ; If lower than a space, increase upward&lt;br /&gt;
                dec     ah              ; If higher than a space, decrease downward&lt;br /&gt;
                mov     es:[di], ah     ; Store altered character back to screen&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
                jmp     short nextChar  ; Keep processing characters&lt;br /&gt;
&lt;br /&gt;
upToSpace:                              &lt;br /&gt;
                inc     ah              ; Increase character upwards towards a space&lt;br /&gt;
                mov     es:[di], ah     ; Store altered character back to screen&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
&lt;br /&gt;
nextChar:                               &lt;br /&gt;
                inc     di&lt;br /&gt;
                inc     di              ; es:di now points to next character&lt;br /&gt;
                loop    alterChars      ; Continue processing characters&lt;br /&gt;
                cmp     bx, 0           ; Were any characters processed?&lt;br /&gt;
                jnz     short doScreen  ; If so (bx != 0), keep processing&lt;br /&gt;
                mov     ah, 4Ch         ; Otherwise, get ready to terminate&lt;br /&gt;
                int     21h             ; DOS - 2+ - QUIT WITH EXIT CODE (EXIT)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some very quick wins right off the bat:&lt;br /&gt;
&lt;br /&gt;
* On entry to a .COM program AX is normally 0, so AX=B800 can be set by loading only AH=B8&lt;br /&gt;
* Replace the DOS exit sequence with RET: a .COM file starts with a zero word on the stack, so RET jumps to the INT 20h at the start of the PSP&lt;br /&gt;
* The &amp;quot;CMP bx,0&amp;quot; (3 bytes) can be replaced with a shorter test like &amp;quot;OR bx,bx&amp;quot; (2 bytes)&lt;br /&gt;
* Loading and saving screen characters with explicit MOVs and advancing DI manually twice is inefficient: the 80x86 has 1-byte string instructions for that, so we'll use LODSW and STOSW.  This loads and stores the attribute byte redundantly, but we don't care about speed, we care about size.  Because we only use LODS once, we can shave a byte by dropping MOV DS,AX (which together with LODSW would cost 3 bytes) and using ES: LODSW (2 bytes) in the inner loop.&lt;br /&gt;
&lt;br /&gt;
This shaves 12 bytes, bringing the total down to 37.  At this point we can make some drastic changes that will shave more bytes, but also make the program not behave ''exactly'' as it did before.  I chose to do the following:&lt;br /&gt;
&lt;br /&gt;
* 2000 is 7D0 in hex.  Change MOV CX,2000 (decimal) to MOV CH,08 (hex) to shave a byte.  Because CL is left untouched, CX can end up anywhere in the range 0800 to 08FF, but the difference is minimal at execution time.  That is more than the 2000 visible characters, which is fine since there is extra screen RAM after the visible portion of the screen.&lt;br /&gt;
* The code contains a check for characters below #32 (space) and moves them upward.  Most characters onscreen are going to be above #32, so this isn't really necessary and I removed the check.  Even if it were necessary, removing the check will just rotate them downward until they wrap around to 255, then go downward again until hitting #32 and stopping.&lt;br /&gt;
&lt;br /&gt;
This gets the code down to 29 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ah, 0B8h&lt;br /&gt;
                mov     es, ax          ; es so we can use es: LODS and STOS to the same place&lt;br /&gt;
&lt;br /&gt;
doScreen:&lt;br /&gt;
                mov     ch, 08          ; (80 * 25 = 2000/07d0h)&lt;br /&gt;
                xor     bx, bx          ; bx is also our &amp;quot;num of altered chars&amp;quot; counter&lt;br /&gt;
                mov     di, bx          ; es:di now points at the screen (b800:0000)&lt;br /&gt;
                mov     si, di          ; ds:si = es:di, needed for lods/stos&lt;br /&gt;
&lt;br /&gt;
alterChars:&lt;br /&gt;
                es: lodsw               ; Retrieve onscreen character (and attribute)&lt;br /&gt;
                cmp     al, 32          ; Compare to a space character (#32)&lt;br /&gt;
                jz      short nextChar  ; If already a space, do nothing&lt;br /&gt;
                dec     al              ; Otherwise decrease (values below a space wrap through 255)&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
&lt;br /&gt;
nextChar:&lt;br /&gt;
                stosw&lt;br /&gt;
                loop    alterChars      ; Continue processing characters&lt;br /&gt;
                or      bx,bx           ; Were any characters processed?&lt;br /&gt;
                jnz     short doScreen  ; If so (bx != 0), keep processing&lt;br /&gt;
                ret                     ; exit&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Optimizing the above further is left as an exercise for the reader.  It may be possible to shrink it using instructions for the 80186 and later processors, but it runs far too quickly on those systems, so if doing so you may want to insert a HLT to pace the animation at 18.2 Hz (the default timer tick rate).  It may also be possible to shrink it by taking advantage of the fact that character #0 is a null and shows nothing onscreen (just like the #32 space character): instead of decrementing down to #32, you could decrement down to #0 in a more optimized form (for example, by using the zero flag).&lt;br /&gt;
&lt;br /&gt;
== Or you could rethink it completely ==&lt;br /&gt;
&lt;br /&gt;
If you're willing to deviate from the original target environment (namely, that it was designed for a 4.77 MHz 8088), you can eliminate range checking for all sorts of optimizations.  &lt;br /&gt;
Instead of using two pointers/segments, it's possible to use only one, and directly operate on the chars on screen. Combined with optimized segment assignment and replacing the &amp;quot;processed chars&amp;quot; count with a fixed number of iterations (set to the maximum possible, 65536 * 255), this results in a 17-byte variation:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;nop&amp;quot; = 0AF90h&lt;br /&gt;
			nop						; for alignment, maybe optimization potential here ;)&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],32	    ; is the current char on screen a &amp;quot;space&amp;quot;?&lt;br /&gt;
			jz		short skip 		; yes, don't decrement&lt;br /&gt;
			dec 	byte [di]		; nope, decrement&lt;br /&gt;
skip:&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (fewer on the first pass, which starts from DX's entry value)&lt;br /&gt;
			loop 	doChar			; outer loop, 255 times&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above performs the same function as the previous optimization, but runs much more slowly as it traverses more RAM areas than necessary.&lt;br /&gt;
&lt;br /&gt;
There might be a way of altering the &amp;lt;code&amp;gt;NOP&amp;lt;/code&amp;gt; to something useful.&lt;br /&gt;
&lt;br /&gt;
Returning to the non-printing null char, we can compare against AL (0 on entry) instead of an immediate 32 and shave one more byte, to produce this 16-byte version:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;nop&amp;quot; = 0AF90h&lt;br /&gt;
			nop						; for alignment, maybe optimization potential here ;)&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],al	    ; is the current char on screen a &amp;quot;null&amp;quot;?&lt;br /&gt;
			jz		short skip 		; yes, don't decrement&lt;br /&gt;
			dec 	byte [di]		; nope, decrement&lt;br /&gt;
skip:&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (fewer on the first pass, which starts from DX's entry value)&lt;br /&gt;
			loop 	doChar			; outer loop, 255 times&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above performs a similar function to the previous optimization, but runs even more slowly as it requires regular characters to be decremented further than before.&lt;br /&gt;
&lt;br /&gt;
Note the &amp;lt;CODE&amp;gt;CMP/JZ/DEC&amp;lt;/CODE&amp;gt; sequence.  If we can control the carry flag as a result of the comparison, then we can perform a conditional decrement without the branch, via an &amp;lt;CODE&amp;gt;ADC&amp;lt;/CODE&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Of course, if the AL register is zero, then the comparison can never produce a carry, but if we increase AL to 1, then a zero byte sets the carry and every other value clears it.  Adding 0FFh plus the carry then decrements non-zero characters and leaves nulls untouched.&lt;br /&gt;
&lt;br /&gt;
To increase the AX register would normally cost us one byte, but we happen to have one to spare in the &amp;lt;CODE&amp;gt;LDS&amp;lt;/CODE&amp;gt; alignment.&lt;br /&gt;
&lt;br /&gt;
That allows us to shave one more byte, to produce this 15-byte version:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;inc&amp;quot; = 0AF40h&lt;br /&gt;
			inc		ax				; set AL to 1, also for alignment&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],al	    ; AL = 1: carry is set only if the current char is a &amp;quot;null&amp;quot;&lt;br /&gt;
			adc 	byte [di],0ffh  ; add 0FFh + carry: decrement unless the char is a null&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (fewer on the first pass, which starts from DX's entry value)&lt;br /&gt;
			loop 	doChar			; outer loop, 255 times&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Using &amp;lt;CODE&amp;gt;scasw&amp;lt;/CODE&amp;gt; to advance DI suggests switching to &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt; and using its built-in comparison.&lt;br /&gt;
Since &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt; always compares against ES:DI, the screen segment has to go into ES, and the read-modify-write that follows has to pay for a segment prefix.&lt;br /&gt;
The &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt; comparison is reversed relative to the manual &amp;lt;CODE&amp;gt;cmp&amp;lt;/CODE&amp;gt; (it computes AL minus memory, so with AL at 0 the carry is set for every non-zero character), so we also need to change the &amp;lt;CODE&amp;gt;adc,-1&amp;lt;/CODE&amp;gt; to an &amp;lt;CODE&amp;gt;sbb,0&amp;lt;/CODE&amp;gt;. &lt;br /&gt;
We also no longer need a separate counter for the inner loop, since DI, which is already in use, wraps to zero; CX does need to be increased by one, however.&lt;br /&gt;
&lt;br /&gt;
14 bytes:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
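	les	dx,[si]			; load ES from our code &amp;quot;inc cx&amp;quot; + &amp;quot;scasb&amp;quot; = 0AE41h&lt;br /&gt;
	inc	cx			; CX = 100h (assuming 0FFh on entry), also for alignment&lt;br /&gt;
doChar:&lt;br /&gt;
	scasb				; compare AL (0) with es:[di], carry set if char is not null; DI += 1&lt;br /&gt;
	sbb 	[byte es:di-1],al	; subtract 0 + carry from the char just compared&lt;br /&gt;
	inc	di			; DI += 1 again, now at the next character&lt;br /&gt;
	jnz 	doChar			; inner loop, until DI wraps to 0&lt;br /&gt;
	loop 	doChar			; outer loop&lt;br /&gt;
	ret				; quit&lt;br /&gt;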
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If we then accept the loss of the last iteration, so that a character that started at 0xff is occasionally left behind as a &amp;quot;happy face&amp;quot; (#1), we can be even more creative with the addressing and use di+bx, where we decrement bx to -1.&lt;br /&gt;
Alternatives such as &amp;lt;CODE&amp;gt;dec si/jns&amp;lt;/CODE&amp;gt; can be used instead of &amp;lt;CODE&amp;gt;loop&amp;lt;/CODE&amp;gt; for yet another correct 14-byte version.&lt;br /&gt;
&lt;br /&gt;
13 bytes:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
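	les	dx,[si]			; load ES from our code &amp;quot;dec bx&amp;quot; + &amp;quot;scasb&amp;quot; = 0AE4Bh&lt;br /&gt;
	dec	bx			; BX = -1 (assuming 0 on entry), also for alignment&lt;br /&gt;
doChar:&lt;br /&gt;
	scasb				; compare AL (0) with es:[di], carry set if char is not null; DI += 1&lt;br /&gt;
	sbb 	[es:di+bx],al		; di+bx = di-1: subtract 0 + carry from the char just compared&lt;br /&gt;
	inc	di			; DI += 1 again, now at the next character&lt;br /&gt;
	jnz 	doChar			; inner loop, until DI wraps to 0&lt;br /&gt;
	loop 	doChar			; outer loop, one pass fewer than the 14-byte version&lt;br /&gt;
	ret				; quit&lt;br /&gt;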
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Bwl</name></author>	</entry>

	<entry>
		<id>http://www.sizecoding.org/index.php?title=MELT.COM&amp;diff=787</id>
		<title>MELT.COM</title>
		<link rel="alternate" type="text/html" href="http://www.sizecoding.org/index.php?title=MELT.COM&amp;diff=787"/>
				<updated>2021-01-08T11:27:19Z</updated>
		
		<summary type="html">&lt;p&gt;Bwl: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Case Study]]&lt;br /&gt;
MELT.COM was written by an unknown author in the 1980s.  Originally 49 bytes in size, it performs the following cute effect:&lt;br /&gt;
&lt;br /&gt;
{{#ev:youtube|https://www.youtube.com/watch?v=MoGKh5X3nS4}}&lt;br /&gt;
&lt;br /&gt;
(The video is simulated and shows how MELT performs on the old hardware it was written for.)  This effect is achieved by increasing or decreasing each onscreen character's value until it reaches #32, the space character.  The original source is lost to history, but here's a quick commented disassembly:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ax, 0B800h&lt;br /&gt;
                mov     es, ax          ; es now points to screen segment&lt;br /&gt;
&lt;br /&gt;
doScreen:                               &lt;br /&gt;
                mov     cx, 2000        ; Going to loop over all (80*25=)2000 characters&lt;br /&gt;
                xor     bx, bx          ; bx = 0.  Also our &amp;quot;num of altered chars&amp;quot; counter&lt;br /&gt;
                mov     di, bx          ; es:di now points at the screen (b800:0000)&lt;br /&gt;
&lt;br /&gt;
alterChars:                             &lt;br /&gt;
                mov     ah, es:[di]     ; Retrieve onscreen character&lt;br /&gt;
                cmp     ah, 32          ; Compare to a space character (#32)&lt;br /&gt;
                jz      short nextChar  ; If already a space, do nothing&lt;br /&gt;
                jl      short upToSpace ; If lower than a space, increase upward&lt;br /&gt;
                dec     ah              ; If higher than a space, decrease downward&lt;br /&gt;
                mov     es:[di], ah     ; Store altered character back to screen&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
                jmp     short nextChar  ; Keep processing characters&lt;br /&gt;
&lt;br /&gt;
upToSpace:                              &lt;br /&gt;
                inc     ah              ; Increase character upwards towards a space&lt;br /&gt;
                mov     es:[di], ah     ; Store altered character back to screen&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
&lt;br /&gt;
nextChar:                               &lt;br /&gt;
                inc     di&lt;br /&gt;
                inc     di              ; es:di now points to next character&lt;br /&gt;
                loop    alterChars      ; Continue processing characters&lt;br /&gt;
                cmp     bx, 0           ; Were any characters processed?&lt;br /&gt;
                jnz     short doScreen  ; If so (bx != 0), keep processing&lt;br /&gt;
                mov     ah, 4Ch         ; Otherwise, get ready to terminate&lt;br /&gt;
                int     21h             ; DOS - 2+ - QUIT WITH EXIT CODE (EXIT)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some very quick wins right off the bat:&lt;br /&gt;
&lt;br /&gt;
* On entry to a .COM program AX is normally 0, so AX=B800 can be set by loading only AH=B8&lt;br /&gt;
* Replace the DOS exit sequence with RET: a .COM file starts with a zero word on the stack, so RET jumps to the INT 20h at the start of the PSP&lt;br /&gt;
* The &amp;quot;CMP bx,0&amp;quot; (3 bytes) can be replaced with a shorter test like &amp;quot;OR bx,bx&amp;quot; (2 bytes)&lt;br /&gt;
* Loading and saving screen characters with explicit MOVs and advancing DI manually twice is inefficient: the 80x86 has 1-byte string instructions for that, so we'll use LODSW and STOSW.  This loads and stores the attribute byte redundantly, but we don't care about speed, we care about size.  Because we only use LODS once, we can shave a byte by dropping MOV DS,AX (which together with LODSW would cost 3 bytes) and using ES: LODSW (2 bytes) in the inner loop.&lt;br /&gt;
&lt;br /&gt;
This shaves 12 bytes down to 37 total.  At this point we can make some drastic changes that will shave bytes, but also make the program not behave ''exactly'' as it did before.  I chose to do the following:&lt;br /&gt;
&lt;br /&gt;
* 2000 is 7D0 in hex.  Change MOV CX,2000 (decimal) to MOV CH,08 (hex) to shave a byte.  This leaves CL untouched, so CX could be anywhere in the range 0800 to 08FF, but the difference is minimal at execution time.  It's also larger than the original area, but that is fine since there is extra screen RAM after the visible portion of the screen.&lt;br /&gt;
* The code contains a check for characters below #32 (space) and moves them upward.  Most characters onscreen are going to be above #32, so this isn't really necessary and I removed the check.  Even if it were necessary, removing the check will just rotate them downward until they wrap around to 255, then go downward again until hitting #32 and stopping.&lt;br /&gt;
&lt;br /&gt;
This gets the code down to 29 bytes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ah, 0B8h&lt;br /&gt;
                mov     es, ax          ; es so we can use es: LODS and STOS to the same place&lt;br /&gt;
&lt;br /&gt;
doScreen:&lt;br /&gt;
                mov     ch, 08          ; (80 * 25 = 2000/07d0h)&lt;br /&gt;
                xor     bx, bx          ; bx is also our &amp;quot;num of altered chars&amp;quot; counter&lt;br /&gt;
                mov     di, bx          ; es:di now points at the screen (b800:0000)&lt;br /&gt;
                mov     si, di          ; si = di; with the es: override, lods reads es:si and stos writes es:di&lt;br /&gt;
&lt;br /&gt;
alterChars:&lt;br /&gt;
                es: lodsw               ; Retrieve onscreen character (and its attribute)&lt;br /&gt;
                cmp     al, 32          ; compare to a space character (#32)&lt;br /&gt;
                jz      short nextChar  ; If already a space, do nothing&lt;br /&gt;
                dec     al              ; Otherwise decrease (values below #32 wrap around via 255)&lt;br /&gt;
                inc     bx              ; increase &amp;quot;number of processed chars&amp;quot; counter&lt;br /&gt;
&lt;br /&gt;
nextChar:&lt;br /&gt;
                stosw&lt;br /&gt;
                loop    alterChars      ; Continue processing characters&lt;br /&gt;
                or      bx,bx           ; Were any characters processed?&lt;br /&gt;
                jnz     short doScreen  ; If so (bx != 0), keep processing&lt;br /&gt;
                ret                     ; exit&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Optimizing the above is left as an exercise for the reader.  It may be possible to shrink it further using instructions for the 80186 and later processors, but it runs way too quickly on those systems, so if doing so you may want to insert a HLT (which waits for the next interrupt, normally the timer tick) to pace the animation at 18.2 Hz.  It may also be possible to shrink it by taking advantage of the fact that ASCII character #0 is a null and shows nothing onscreen (just like the #32 space character), so instead of decrementing down to #32, you could decrement down to #0 in a more optimized form (like using the zero flag).&lt;br /&gt;
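&lt;br /&gt;
As a minimal sketch of the pacing idea, the 29-byte listing above could take a single HLT at the top of each screen pass, for 30 bytes.  This assumes interrupts are enabled and that the timer tick is the interrupt that usually wakes the CPU; any other interrupt (a keypress, for example) also ends the wait early, so the pacing is only approximate.  The segment override is spelled the way NASM itself expects it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
                org 100h&lt;br /&gt;
&lt;br /&gt;
                mov     ah, 0B8h        ; assumes AX = 0 on entry&lt;br /&gt;
                mov     es, ax          ; es = B800h, the text screen segment&lt;br /&gt;
&lt;br /&gt;
doScreen:&lt;br /&gt;
                hlt                     ; added byte: sleep until the next interrupt (assumed: timer tick)&lt;br /&gt;
                mov     ch, 08          ; at least 0800h characters per pass&lt;br /&gt;
                xor     bx, bx          ; bx = 0, our &amp;quot;num of altered chars&amp;quot; counter&lt;br /&gt;
                mov     di, bx&lt;br /&gt;
                mov     si, di&lt;br /&gt;
&lt;br /&gt;
alterChars:&lt;br /&gt;
                es lodsw                ; read character and attribute from es:si&lt;br /&gt;
                cmp     al, 32&lt;br /&gt;
                jz      short nextChar  ; already a space, leave it alone&lt;br /&gt;
                dec     al&lt;br /&gt;
                inc     bx&lt;br /&gt;
&lt;br /&gt;
nextChar:&lt;br /&gt;
                stosw                   ; write character and attribute back to es:di&lt;br /&gt;
                loop    alterChars&lt;br /&gt;
                or      bx, bx          ; anything changed during this pass?&lt;br /&gt;
                jnz     short doScreen  ; if so, wait for the next interrupt and go again&lt;br /&gt;
                ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;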
&lt;br /&gt;
== Or you could rethink it completely ==&lt;br /&gt;
&lt;br /&gt;
If you're willing to deviate from the original target environment (namely, that it was designed for a 4.77 MHz 8088), you can drop the range checking, which opens up all sorts of optimizations.  &lt;br /&gt;
Instead of using two pointers/segments, it's possible to use only one and operate directly on the characters onscreen. Combined with an optimized segment assignment and dropping the &amp;quot;processed chars&amp;quot; count in favor of a fixed number of iterations (set to the maximum possible, 65536 * 255), this results in a 17-byte variation:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;nop&amp;quot; = 0AF90h&lt;br /&gt;
			nop						; for alignment, maybe optimization potential here ;)&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],32	    ; is the current char on screen a &amp;quot;space&amp;quot;?&lt;br /&gt;
			jz		short skip 		; yes, don't decrement&lt;br /&gt;
			dec 	byte [di]		; nope, decrement&lt;br /&gt;
skip:&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (except the first pass, as DX does not start at 0)&lt;br /&gt;
			loop 	doChar			; outer loop: 255 passes&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above performs the same function as the previous optimization, but runs much more slowly as it traverses more RAM areas than necessary.&lt;br /&gt;
&lt;br /&gt;
There might be a way of altering the &amp;lt;code&amp;gt;NOP&amp;lt;/code&amp;gt; to something useful.&lt;br /&gt;
&lt;br /&gt;
Returning to the non-printing null char, we can shave one more byte by comparing against AL (which is already 0) instead of an immediate 32, to produce this 16-byte version:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;nop&amp;quot; = 0AF90h&lt;br /&gt;
			nop						; for alignment, maybe optimization potential here ;)&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],al	    ; is the current char on screen a &amp;quot;null&amp;quot;?&lt;br /&gt;
			jz		short skip 		; yes, don't decrement&lt;br /&gt;
			dec 	byte [di]		; nope, decrement&lt;br /&gt;
skip:&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (except the first pass, as DX does not start at 0)&lt;br /&gt;
			loop 	doChar			; outer loop: 255 passes&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above performs a similar function to the previous optimization, but runs even more slowly as it requires regular characters to be decremented further than before.&lt;br /&gt;
&lt;br /&gt;
Note the &amp;lt;CODE&amp;gt;CMP/JZ/DEC&amp;lt;/CODE&amp;gt; sequence.  If we can control the carry flag as a result of the comparison, then we can perform a conditional decrement without the branch, via an &amp;lt;CODE&amp;gt;ADC&amp;lt;/CODE&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Of course, if the AL register is zero, then there can never be a carry (borrow) as a result of the comparison, but if we increase AL to 1, then a zero byte sets the carry and every other value clears it.&lt;br /&gt;
&lt;br /&gt;
To increase the AX register would normally cost us one byte, but we happen to have one to spare in the &amp;lt;CODE&amp;gt;LDS&amp;lt;/CODE&amp;gt; alignment.&lt;br /&gt;
&lt;br /&gt;
That allows us to shave one more byte, to produce this 15-byte version:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			lds		bx,[si]			; load DS from our code &amp;quot;scasw&amp;quot; + &amp;quot;inc&amp;quot; = 0AF40h&lt;br /&gt;
			inc		ax				; set AL to 1, also for alignment&lt;br /&gt;
doChar:&lt;br /&gt;
			scasw					; used to advance DI by 2, and as DS high byte (0AFh)&lt;br /&gt;
			cmp 	byte [di],al	    ; carry is set only if the current char is a &amp;quot;null&amp;quot;&lt;br /&gt;
			adc 	byte [di],0ffh  ; add -1 plus carry: decrement unless null&lt;br /&gt;
			dec 	dx				; inner loop: 65536 iterations per pass&lt;br /&gt;
			jnz 	doChar			; (except the first pass, as DX does not start at 0)&lt;br /&gt;
			loop 	doChar			; outer loop: 255 passes&lt;br /&gt;
			ret						; quit&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
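&lt;br /&gt;
The two cases can be traced on a single byte.  This is a minimal sketch and not part of the program itself: &amp;lt;CODE&amp;gt;cell&amp;lt;/CODE&amp;gt; is a hypothetical label standing in for one screen character, and the fragment exists only to show the flag arithmetic.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
			org		100h&lt;br /&gt;
			mov		al,1				; the value the listing gets from inc ax&lt;br /&gt;
			cmp		byte [cell],al		; 0 - 1 borrows, so carry = 1&lt;br /&gt;
			adc		byte [cell],0ffh	; 0 + 0FFh + 1 = 100h, truncated to 00h: null stays null&lt;br /&gt;
			mov		byte [cell],41h		; now try a regular character, 'A'&lt;br /&gt;
			cmp		byte [cell],al		; 41h - 1 does not borrow, so carry = 0&lt;br /&gt;
			adc		byte [cell],0ffh	; 41h + 0FFh + 0 = 140h, truncated to 40h: decremented&lt;br /&gt;
			ret&lt;br /&gt;
cell:		db		0					; hypothetical stand-in for one screen byte&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;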
&lt;br /&gt;
Using &amp;lt;CODE&amp;gt;scasw&amp;lt;/CODE&amp;gt; suggests taking advantage of its built-in comparison by switching to &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt;.&lt;br /&gt;
Because &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt; always reads through the es segment, the write-back has to pay for a segment prefix.&lt;br /&gt;
Because &amp;lt;CODE&amp;gt;scasb&amp;lt;/CODE&amp;gt; compares in the opposite direction (AL minus memory) from the manual &amp;lt;CODE&amp;gt;cmp&amp;lt;/CODE&amp;gt; (memory minus AL), we also need to change the &amp;lt;CODE&amp;gt;adc,-1&amp;lt;/CODE&amp;gt; to an &amp;lt;CODE&amp;gt;sbb,0&amp;lt;/CODE&amp;gt;. &lt;br /&gt;
We also no longer need a counter apart from di, which is already in use; however, cx needs to be increased by one.&lt;br /&gt;
&lt;br /&gt;
14 bytes:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
	les	dx,[si]&lt;br /&gt;
	inc	cx&lt;br /&gt;
doChar:&lt;br /&gt;
	scasb&lt;br /&gt;
	sbb 	[byte es:di-1],al&lt;br /&gt;
	inc	di&lt;br /&gt;
	jnz 	doChar&lt;br /&gt;
	loop 	doChar&lt;br /&gt;
	ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
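&lt;br /&gt;
The reversed comparison can be traced on a single byte as well.  Again this is only a sketch: &amp;lt;CODE&amp;gt;cell&amp;lt;/CODE&amp;gt; is a hypothetical label standing in for one screen character, and the fragment assumes the usual .COM conditions of ES = CS and a clear direction flag.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
	org	100h&lt;br /&gt;
	xor	ax,ax		; AL = 0, as the listing assumes on entry&lt;br /&gt;
	mov	di,cell		; es:di points at the test byte&lt;br /&gt;
	scasb			; 0 - 41h borrows, so carry = 1; di moves one past the byte&lt;br /&gt;
	sbb 	[es:di-1],al	; 41h - 0 - 1 = 40h: decremented&lt;br /&gt;
	mov	byte [cell],0	; now try a null&lt;br /&gt;
	dec	di		; point back at the test byte&lt;br /&gt;
	scasb			; 0 - 0 does not borrow, so carry = 0&lt;br /&gt;
	sbb 	[es:di-1],al	; 0 - 0 - 0 = 0: null stays null&lt;br /&gt;
	ret&lt;br /&gt;
cell:	db	41h		; hypothetical stand-in for one screen byte, 'A'&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;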
&lt;br /&gt;
If we then accept the loss of the last iteration, so that a character that started as 0xff is left onscreen as an occasional &amp;quot;happy face&amp;quot; (#1), we can be even more creative with the addressing and use di+bx, where we decrement bx to -1.&lt;br /&gt;
Alternatives such as &amp;lt;CODE&amp;gt;dec si/jnz&amp;lt;/CODE&amp;gt; can be used instead of &amp;lt;CODE&amp;gt;loop&amp;lt;/CODE&amp;gt; for yet another correct 14-byte version.&lt;br /&gt;
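&lt;br /&gt;
One possible reading of that &amp;lt;CODE&amp;gt;dec si/jnz&amp;lt;/CODE&amp;gt; alternative, offered as a sketch rather than a confirmed listing, combines it with the di+bx addressing.  It assumes the usual DOS entry values (AX = 0, BX = 0, SI = 0100h, DI = 0FFFEh): the first outer pass then touches a single byte before DI wraps, and the remaining 255 passes should be enough to bring even 0xff down to null.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
	les	dx,[si]		; ES = 0AE4Bh, taken from the code bytes of dec bx and scasb&lt;br /&gt;
	dec	bx		; BX = -1, assuming BX = 0 on entry&lt;br /&gt;
doChar:&lt;br /&gt;
	scasb			; carry = 1 unless the char is null; di moves one past it&lt;br /&gt;
	sbb 	[es:di+bx],al	; di+bx = di-1: subtract the carry from that same char&lt;br /&gt;
	inc	di		; skip the attribute byte&lt;br /&gt;
	jnz 	doChar		; inner loop until di wraps to 0&lt;br /&gt;
	dec	si		; outer counter, assuming SI = 0100h on entry (256 passes)&lt;br /&gt;
	jnz 	doChar&lt;br /&gt;
	ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;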
&lt;br /&gt;
13 bytes:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=nasm&amp;gt;&lt;br /&gt;
	les	dx,[si]&lt;br /&gt;
	dec	bx&lt;br /&gt;
doChar:&lt;br /&gt;
	scasb&lt;br /&gt;
	sbb 	[es:di+bx],al&lt;br /&gt;
	inc	di&lt;br /&gt;
	jnz 	doChar&lt;br /&gt;
	loop 	doChar&lt;br /&gt;
	ret&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Bwl</name></author>	</entry>

	</feed>