Memories

From SizeCoding
Revision as of 07:45, 20 April 2020 by HellMood (talk | contribs) (Parallax checkerboards)

Introduction

Hello! My name is "HellMood" and this is not a usual wiki-style article. It is written from my point of view, in the first person. This article is all about a tiny MS DOS program called "Memories". This program has a size of 256 bytes and won the "PC 256 byte" competition of the demoscene event "Revision" in 2020, as well as the public choice award. You can watch a video of the output of this program here (https://www.youtube.com/watch?v=Imquk_3oFf4) and a video including live reactions of the online audience and moderators here (https://www.twitch.tv/videos/589179638?t=7h02m40s). You can download the release and comment here (https://www.pouet.net/prod.php?which=85227). This wiki article gives you an in-depth analysis of the program and points out historical references as well as development steps. In case you did not notice, you are now on the "sizecoding wiki". It will help you not only to understand the interior of "Memories", but also enable you to produce something like this yourself. Feel free to look around! If you're new to sizecoding and/or x86 assembler, it is recommended that you start with the basics of this wiki first. While the principles might be easy to grasp, the details of the implementation might not be.

Disclaimer

This article discusses the submitted DosBox version (256 bytes). The available archive contains versions for FreeDos and Windows XP DOS which, at the time of this writing, don't work on all computers. These alternative versions were included as a proof of concept, to show that this program does not only run on the emulator "DosBox". For the "PC 256 bytes" competition of the event "Revision" in 2020, it was allowed to specify either "FreeDos" or "DosBox" as the platform, the latter with a specific configuration. It is in fact possible to modify the DosBox version to work on FreeDos, MS DOS, WinXP and Win98, as the released alternative versions showcase, but this is not the focus of this article. Anyway, safe versions for all platforms and all computers are in the works. Regarding possible further optimizations, I will discuss and explain only the submitted version, although I have already found several spots in the code that can be optimized. As the history of tiny effects shows, almost no code is perfect, and most of it can be brought down further in size. To avoid confusion between different versions, only the submitted version will be explained.

History of tiny effects

Size categories on http://www.pouet.net

Sizecoders think in size categories. For MS DOS, these categories are 256b, 128b, 64b and 32b. These are the standards of one of the biggest demoscene archives, http://www.pouet.net. There is no 16b category, although many tiny effects can be implemented in 16 bytes. Almost all of the effects of "Memories" have been coded and optimized by me before, and most of those implementations were attempts to reduce an already existing effect in size, or to do something similar in less space. Reducing the size in this case means reducing it to the next lower (or one of the next lower) 2^N categories. For example, I might have seen an effect done in 33 to 64 bytes, and then reduced the size to 32 bytes or less. Almost every time I moved an effect to a lower category, I submitted the resulting tiny program to a demoscene party which allowed remote entries in the 256b category and/or published the result to http://www.pouet.net. In this section I will introduce the effects and mention the origins and authors.

Array of chessboards

kasparov, 16 bytes

The origin of this effect is my own "Kasparov 16b" from 2018 (https://www.pouet.net/prod.php?which=75912). Apparently I "had this piece lying around, stuck at 17 bytes and not that beautiful, until i found a trick ;)" (quote from the release notes). There was a similar effect in 32 bytes before: "ew" by "headcrash" from 2003 (https://www.pouet.net/prod.php?which=17567). In this case, my design goal was to show "real" chessboards of 8x8 fields with recognizable dark and light squares, as well as the correct orientation of the single chessboards, meaning that the bottom right corner (h1) has to be a white square. For "Memories", the effect had to be reimplemented with another approach of writing to the screen to work with the framework. The scrolling direction was also changed to be different from the "scrolling tilted plane" effect.

Zooming circles

The zooming circles were supposed to be entered into a demoparty as a 32 bytes intro, but I never actually did it. There is no real predecessor in 64 bytes, because in the 64b category much more complex effects are possible. The zooming circles were the result of desperately trying to reach the 32b category for a circular "tunnel" effect, for which my own record is still 52 bytes ("Neontube", 2016, https://www.pouet.net/prod.php?which=66808), which in turn was an optimization of the all-time-classic 64 byte effect "constant evolution" from ryg/Farbrausch (2003, https://www.pouet.net/prod.php?which=8697). In the zooming circles routine, the distance and the angle are removed/ignored, so that the 32b category could be reached.

Scrolling tilted plane

Floorcast, 32 bytes version, variation

The scrolling tilted plane is one of my own releases, "floorcast 32b" from 2018. Floorcasts have their own history in sizecoding: the effect was brought down stepwise from 256 bytes to 32 bytes. The versions differ in the number of planes; some show two planes, others only one. For the "floorcast 32b" release I specifically decided against the "XOR" texture, whereas in "Memories" I used it again, but disguised with another final touch of "AND".

Parallax checkerboards

Projektbeschreibung, 32 bytes

The parallax checkerboards were released by me as the 32 bytes effect "Projektbeschreibung" in 2018. It was a direct attempt to bring "Follow the light" from "Digimind" (2006) or my own "Lucy" (2014) down to 32 bytes. Very helpful inspirations came from "Rrrola"'s "Paralaxa" (32 bytes, 2007) and "Sensenstahl"'s "Byteropolis" (32 bytes, 2013). In fact, Rrrola's rendering approach was already close to the final solution; the code was modified to correct the arrangement of the planes, change the shape from triangles to checkerboards and improve on the colors. In "Memories", the coloring of "Digimind"'s version was used. Also, the effect has been modified to reduce shearing as much as possible.

Sierpinski rotozoomer

colpinski, 16 bytes
rotastic, 32 bytes

This consists of two effects, the rotozoomer and the sierpinski effect as a texture. The sierpinski effect is based on my own "Colpinski 16b" from 2013 (https://www.pouet.net/prod.php?which=62079), where "frag" of "fsqrt" helped push the limits. There is no real predecessor to that, because it is known that this effect can be done directly by combining X and Y, instead of using an Iterated Function System or Cellular Automaton approach. The rotozoomer was released by me as the 32b intro "rotastic" in 2017 (https://www.pouet.net/prod.php?which=70112). It is based on the ideas of "ryg" of "Farbrausch" (51 bytes, 2002, https://www.pouet.net/prod.php?which=5015) and "Gargaj" of "Conspiracy" (49 bytes, 2002, https://www.pouet.net/prod.php?which=6088).

Raycast bent tunnel

Into a new era, 64 bytes version

The raycast bent tunnel is a tweaked version of my own 64 bytes "Into a new era" release (2018, https://www.pouet.net/prod.php?which=78044). The custom colors were exchanged for colors of the standard palette, the geometry was tweaked a bit and changed accordingly, so that no depth effect was used for the calculation of texture values. The standalone version of this effect has a size of 50 bytes. The two main inspirations for the 64 bytes version were two 128 byte intros, "Spongy" by "TBC" (2009, https://www.pouet.net/prod.php?which=53871) and "Wolf128" from "Baudsurfer" (2014, https://www.pouet.net/prod.php?which=63518), while the particular algorithm was developed independently.

Ocean night to day

Ocean, 64b version

The ocean effect is based on my own 64 bytes release "Ocean" from 2016 (https://www.pouet.net/prod.php?which=66857). The custom color generation and music generation have been stripped from the original, both were not compatible with the main framework of "Memories" without consuming a LOT of extra space. The special effect of "dawn" happens as a byproduct of the general framework approach. This will be explained in the next chapter.

Fading effect

The transition between two effects is an effect itself, which has no real predecessor. It is rather an idea that evolved over the years, and might have been done by many others in similar ways. In a nutshell, while calculating a frame, each pixel position is randomized, and the time, which determines the effect to use, is offset by this random value after it has been scaled down. This allows the usage of the standard VGA palette (picture, source) rather than creating custom colors for real crossfading, and thus saves space.

Tiny megademo framework

To put many tiny effects together into a "megademo", they all have to use the same approach and they have to be mostly free of assumptions (about the contents of memory and registers). They also have to use the same timing values and be tuned to a shared timing in general. Preparing the single effects to be includable in a framework like this took quite some time and required a lot of extra space initially. It has to be noted that, sadly, some of my most impressive effects (judged by audience reactions and views on social media) could not be included at all, because the approaches were not combinable without a HUGE overhead. Once all effects were streamlined, I could think about "factoring out" common calculations which occur repeatedly, resulting in a few saved bytes. The framework does the following:

  • Setting up screen mode to 320 x 200 pixels in 256 colors
  • Initializing a pointer to the screen
  • Installing a callback for timing and music
  • Mainloop
    • Calculation of X and Y from the current screen location
    • P = Pseudorandom value from screen location
    • Offset current time T by a scaled value P'
    • Selection of effect number N according to T'
    • Execution of effect N (X,Y) for current pixel
      • On entrance, XY is in register DX (DL,DH)
      • On entrance, the current time is in register BP
      • On exit, the pixel color is expected in AL
    • Advance to next pixel
    • Triple diagonal interlacing for smoothing
    • Repeat until frame is complete
  • Set timer to a tempo of ~35 FPS
  • Check keyboard for ESC
    • Quit on ESC, otherwise continue

Code of framework

org 100h
s:
	mov al,0x13				; set AL to mode 320*200 in 256 colors
	int 0x10	 			; call BIOS to set mode
	xchg bp,ax				; set timing value to 0x13 
	push 0xa000-10			; write the screen address to register ES
	pop es					; works in conjunction with Rrrola trick
	mov ax,0x251c			; parameter for changing timer interrupt
	mov dl,timer			; address of timer routine, assume DH=1
	int 0x21				; install timer routine
top:
	mov ax,0xcccd			; load magic Rrrola constant
	mul di					; transform screen pointer to X, Y
	add al,ah				; use transformation garbage as
	xor ah,ah				; pseudorandom value and clear AH
	add ax,bp				; add time value to random value
	shr ax,9				; divide by 512 (basically the speed)
	and al,15				; filter effect number
	xchg bx,ax				; move effect number to BX
	mov bh,1				; reset BH to align with start of code
	mov bl,[byte bx+table]	; read the effect address from the table
	call bx					; call the effect
	stosb					; write the return value and advance
	inc di					; triple interlace trick for after
	inc di					; effect and smoothing the animation
	jnz top					; repeat until the frame is complete
	mov al,tempo			; set AL to divider for timer
	out 40h,al				; set timing (dual pass)
	in al,0x60				; read keyboard
	dec al					; quit on ESC
	jnz top					; otherwise repeat loop
sounds: db 0xc3, 11, 0x93; 0xc3 is MIDI/RET; fx2-s is used as volume
table: 	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s
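The coordinate step of the framework (`mov ax,0xcccd` / `mul di`) can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not part of the intro; the function name `rrrola_xy` is made up for illustration. It relies on 0xCCCD being roughly 0.8 * 65536, so that the high word of the product is about `di * 256 / 320`, which puts the row into DH and a scaled column into DL. As far as I can tell, the segment value `0xa000-10` additionally shifts all offsets by 160 bytes (half a row), which is what centers DL; that shift is left out here.

```python
RRROLA = 0xCCCD  # about 0.8 * 65536

def rrrola_xy(di):
    """Model of `mul di` with AX=0xCCCD: return (DL, DH) of the 32-bit product."""
    product = (di & 0xFFFF) * RRROLA   # unsigned 16x16 -> 32 bit, like MUL
    dx = (product >> 16) & 0xFFFF      # the high word lands in DX
    return dx & 0xFF, dx >> 8          # DL = column * 4 // 5, DH = row

# Offset of row 100, column 160 in a 320 pixel wide grid
print(rrrola_xy(100 * 320 + 160))
```

One multiplication replaces a division by 320 plus a remainder, which is why this constant shows up in so many tiny intros.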

Explanation of effects

Array of chessboards

The simplest effect and a good one to start with. After offsetting the row by the time, the classic XOR pattern is applied. To achieve the impression of a grid of chessboards, all bits but two are set in the color. The real trick lies in shifting to a "good" palette spot. The portion of the frame which is perceived as black isn't actually black, it's the dark part of the standard VGA palette. Shifting that way also makes the dark and light squares look like they are from an ancient chessboard.

 
	xchg dx,ax		; get XY into AX
	sub ax,bp		; subtract time from row
	xor al,ah		; XOR pattern (x xor y)
	or al,0xDB		; pattern for array of boards
	add al,13h		; shift to good palette spot
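To see which palette entries this routine can produce at all, here is a small Python model of the five instructions. It is a sketch under the assumption that DL holds the column and DH the row on entry, as stated for the framework; the function name `chessboards` is hypothetical.

```python
def chessboards(dl, dh, t):
    """Return the palette index for one pixel, mirroring the five instructions."""
    ax = ((dh << 8) | dl) & 0xFFFF     # xchg dx,ax : AL = X, AH = Y
    ax = (ax - t) & 0xFFFF             # sub ax,bp  : scroll by time
    al = (ax & 0xFF) ^ (ax >> 8)       # xor al,ah  : classic XOR pattern
    al |= 0xDB                         # or al,0xDB : only bits 2 and 5 stay variable
    return (al + 0x13) & 0xFF          # add al,13h : shift to the palette spot

colours = sorted({chessboards(x, y, 0) for x in range(256) for y in range(200)})
print([hex(c) for c in colours])
```

Only four palette indices remain. Bit 2 of the XOR value toggles every 4 pixels (the squares) and bit 5 every 32 pixels (a block of 8 squares), which is where the 8x8 boards come from.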

Zooming circles

The distance D of a point (X,Y) to the center (0,0) is sqrt(X²+Y²). The framework already makes sure that DL contains a centered X coordinate; DH=Y has to be centered though. Performing a square root on x86 requires quite a bit of code, but it can actually be omitted. With a clever selection of colors, the impression of zooming circles without a square root is quite decent.

 
	mov al,dh		; get Y in AL
	sub al,100		; align Y vertically
	imul al			; AX = Y²
	xchg dx,ax		; Y²/256 in DH, X in AL
	imul al			; AX = X²
	add dh,ah		; DH = (X² + Y²)/256
	mov al,dh		; AL = (X² + Y²)/256
	add ax,bp		; offset color by time
	and al,8+16		; select special rings

Scrolling tilted plane

This effect is achieved as follows. First, a large constant is divided by the row number Y, to simulate distance. Then the resulting value is used twice: a) scaled by a centered X value and b) offset by the current time. These results are then combined with the XOR pattern, of which a special pattern is selected.

 
	mov ax,0x1329	; initialize with constant
	add dh,al		; preventing divide overflow
	div dh			; reverse divide AL = C/Y'
	xchg dx,ax		; DL = C/Y', AL = X
	imul dl			; AH = CX/Y'
	sub dx,bp		; DL = C/Y'-T 	
	xor ah,dl		; AH = (CX/Y') ^ (C/Y'-T)
	mov al,ah		; move to AL
	and al,4+8+16	; select special pattern
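A Python sketch of the routine, assuming rows 0..199 in DH and a signed column in DL; `tilted_plane` is a hypothetical name. It also shows why 0x29 is added to the row first: the quotient of the 8-bit DIV has to fit into a byte.

```python
def tilted_plane(dl, dh, t):
    dh = (dh + 0x29) & 0xFF              # add dh,al : keep the divisor large enough
    q = 0x1329 // dh                     # div dh    : AL = C / Y'
    assert 0 < q < 128                   # a real DIV faults if the quotient exceeds 255
    x = dl - 256 if dl >= 128 else dl    # column as signed byte
    ax = (x * q) & 0xFFFF                # imul dl   : AX = X * C/Y'
    ah = ax >> 8                         # AH = (C*X/Y') / 256
    depth = (q - t) & 0xFF               # sub dx,bp : DL = C/Y' - T
    return (ah ^ depth) & (4 + 8 + 16)   # xor ah,dl / and al,4+8+16

print(tilted_plane(0, 0, 0))
```

The assertion documents the range for screen rows only; other row values are outside of what this sketch covers.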

Parallax checkerboards

This is kind of raycasting with dynamic geometry. Each object plane is split horizontally because of the signed 16 bit multiplication and vertically because of an implicit logic column number operation. Additionally, implicit shearing is applied, so that the edges of the resulting grid (4 "solid" patches, 4 "transparent" patches, alternating) connect. If the ray hits one of the solid patches, the color becomes the iteration number (+palette offset into grayscale), if not, the plane is shifted by the screenpointer and the process is repeated until the maximum iteration number is reached.

	mov cx,bp		; set initial point to time
	mov bx,-16		; limit to 16 iterations
fx3L:
	add cx,di		; offset point by screenpointer
	mov ax,819		; magic, related to Rrrola constant
	imul cx			; get X',Y' in DX
	ror dx,1		; set carry flag on "hit"
	inc bx			; increment iteration count
	ja fx3L			; loop until "hit" or "iter=max"
	lea ax,[bx+31]	; map value to standard gray scale
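The loop can be traced in Python like this. It is a sketch of the flag logic only; `s16` and `parallax` are illustrative names. `ja` continues while both the carry flag (from `ror`) and the zero flag (from `inc`) are clear, so the loop ends on a "hit" or after 16 iterations.

```python
def s16(v):
    """Interpret the low 16 bits of v as a signed word."""
    v &= 0xFFFF
    return v - 0x10000 if v >= 0x8000 else v

def parallax(di, t):
    cx = t & 0xFFFF                      # mov cx,bp
    bx = -16                             # mov bx,-16
    while True:
        cx = (cx + di) & 0xFFFF          # add cx,di : shift plane by screen pointer
        dx = ((819 * s16(cx)) >> 16) & 0xFFFF  # imul cx : high word of the product
        hit = dx & 1                     # ror dx,1  : bit 0 goes into the carry flag
        bx += 1                          # inc bx    : zero flag once bx reaches 0
        if hit or bx == 0:               # ja fx3L loops only if CF=0 and ZF=0
            break
    return (bx + 31) & 0xFFFF            # lea ax,[bx+31] : 16..31 are the grays

print(parallax(0, 0), parallax(12345, 77))
```

The iteration count directly becomes the brightness, so planes hit later are rendered in a lighter gray.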

Sierpinski rotozoomer

1/cos(atan(x)), made with www.google.com

Rotozooming normally requires trigonometric functions, or good estimations of them. Let's look at the usual 2D rotation equation and interpret it in a special way to get rid of the trigonometry in the calculation:

x' = x*cos(a) - y*sin(a)
y' = x*sin(a) + y*cos(a)

If we add zooming to it, it looks like this:

x' = z * (x*cos(a) - y*sin(a))
y' = z * (x*sin(a) + y*cos(a))

Now let's assume we don't specify z ourselves and instead let it be z = 1/cos(a), so that the cosine cancels:

x' = 1/cos(a) * (x*cos(a) - y*sin(a)) = x - y*tan(a)
y' = 1/cos(a) * (x*sin(a) + y*cos(a)) = x*tan(a) + y

Finally, let's substitute tan(a) with our time variable T, since the tangent function rises to infinity the closer we get to 90°:

x' = x - y*T
y' = x*T + y

This is a rotation by a = atan(T) combined with a zoom factor of 1/cos(atan(T)).

As long as we don't care about specifying the zoom factor and don't directly control the angle, we can now rotate between -90° and +90° without using trigonometric functions. As a byproduct, the zoom factor is now tied to the time T. The function for the zoom factor is shown in the image; it results in zooming from infinitely small to one (original) and back to infinitely small. Quite a few bytes have been spent to beautify the effect, correct the timing offset, speed up the animation, enlarge the sierpinski pixels and get decent colors, but I thought it was worth the space.
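The identity behind this trick can be checked numerically. The following Python sketch compares the cheap expressions `x - y*T` and `x*T + y` with a regular rotation by atan(T) that is scaled by 1/cos(atan(T)). Both function names are made up for this check.

```python
import math

def cheap_rotozoom(x, y, t):
    """What the intro computes: no trigonometry at all."""
    return x - y * t, x * t + y

def reference_rotozoom(x, y, t):
    """Rotation by a = atan(t), zoomed by z = 1/cos(a)."""
    a = math.atan(t)
    z = 1.0 / math.cos(a)
    return (z * (x * math.cos(a) - y * math.sin(a)),
            z * (x * math.sin(a) + y * math.cos(a)))

for t in (-3.0, -0.5, 0.0, 0.25, 7.0):
    print(t, cheap_rotozoom(10, 4, t), reference_rotozoom(10, 4, t))
```

Both columns of the output agree up to floating point rounding, for any T.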

	lea cx,[bp-2048]; center time to pass zero
	sal cx,3		; speed up by factor 8!
	movzx ax,dh		; get X into AL
	movsx dx,dl		; get Y into DL
	mov bx,ax		; save X in BX
	imul bx,cx		; BX = X*T
	add bh,dl		; BH = X*T/256+Y
	imul dx,cx		; DX = Y*T
	sub al,dh		; AL = X-Y*T/256
	and al,bh		; AL = (X-Y*T/256)&(X*T/256+Y)
	and al,252		; thicker sierpinski
	salc			; set pixel value to black
	jnz fx4q		; leave black if not sierpinski
	mov al,0x2A		; otherwise: a nice orange
	fx4q:
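The texture itself is the well-known "X AND Y" form of the sierpinski triangle: a pixel belongs to the pattern if both coordinates share no set bit. A minimal Python sketch, leaving out rotation, thickening and colors (`sierpinski_rows` is an illustrative name):

```python
def sierpinski_rows(size=16):
    """Return text rows; '#' marks pixels where x AND y is zero."""
    return ["".join("#" if (x & y) == 0 else "." for x in range(size))
            for y in range(size)]

print("\n".join(sierpinski_rows()))
```

In the intro, the two rotated coordinates take the place of x and y, and `and al,252` ignores the two lowest bits, which makes every sierpinski pixel 4 units wide.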

Raycast bent tunnel

This is a variation of the raycast tunnel "into a new era" (see also above). A detailed explanation has been posted to reddit for the similar production "Essence". The custom colors have been removed, the bend direction is different, and the geometry is a bit more closed, to increase the performance on slower computers and DosBox.

	mov cl,-9		; start with depth 9 (moves backwards)
	fx5L: 
	push dx			; save DX, destroyed inside the loop
		mov al,dh	; Get Y into AL
		sub al,100	; Centering Y has to be done "manually".
		imul cl		; Multiply AL=Y by the current distance, to get a projection(1)
		xchg ax,dx	; Get X into AL, while saving the result in DX (DH)
		add al,cl	; add distance to projection, (bend to the right)
		imul cl		; Multiply AL=X by the current distance, to get a projection(2)
		mov al,dh	; Get projection(1) in AL
		xor al,ah	; combine with projection(2)
		add al,4	; center the walls around 0
		test al,-8	; check if the wall is hit
	pop dx			; restore DX
	loopz fx5L		; repeat until "hit" or "iter=max"
	sub cx,bp		; offset depth by time
	xor al,cl		; XOR pattern for texture 
	aam 6			; irregular pattern with MOD 6
	add al,20		; offset into grayscale palette

Ocean night to day

The ocean effect is a wonderful example of a "happy accident". If you load a value into the FPU as an integer, store it as a floating point value, and interpret it as an integer again, you get a cool pattern. Combine that with a reverse divide and you get a neat wavy effect. This works in tight conjunction with the register DX, which has a sign flip at the position we want, so we can easily separate sky and sea. The final touch is the coloring though. By design of the global framework, the value of AL is determined on function entrance: it contains the effect address. With a bit of code shuffling, the color of the sky came "for free", without using instructions, as well as the "fading to day" color, which is also an effect address. And that is no happy accident. In versions other than the original submitted DosBox version, the sky color can vary for that reason.

	sub dh,120			; check if pixel is in the sky
	js fx6q				; quit if that's the case
	mov [bx+si],dx		; move XY to a memory location
	fild word [bx+si]	; read memory location as integer
	fidivr dword [bx+si]; reverse divide by constant
	fstp dword [bx+si-1]; store result as floating point
	mov ax,[bx+si]		; get the result into AX
	add ax,bp			; modify color by time
	and al,128			; threshold into two bands
	dec ax				; beautify colors to blue/black
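The "integer in, float out, integer again" step can be tried in Python with the `struct` module. This sketch covers only that reinterpretation, not the reverse divide or the coloring, and the helper names are made up. The second helper reflects my reading of the listing: the float is stored one byte below the address that is read back, so AX receives bytes 1 and 2 of the 4-byte float.

```python
import struct

def float_bits(n):
    """Store n as a 32-bit float and read the same bytes back as an integer."""
    return struct.unpack("<I", struct.pack("<f", float(n)))[0]

def middle_word(bits):
    """Bytes 1 and 2 of the float: most of the mantissa plus one exponent bit."""
    return (bits >> 8) & 0xFFFF

for n in (1, 2, 3, 100):
    print(n, hex(float_bits(n)), hex(middle_word(float_bits(n))))
```

Because the exponent and mantissa end up in different bit ranges, a smooth input value turns into a sawtooth-like sequence, which is one way to explain the wavy bands.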

Fading effect

For convenience, here is the fading effect from the global framework, isolated. Basically, it generates a pseudorandom value from the screen pointer, offsets the time by a scaled version of it, and then selects the effect to call.

	mov ax,0xcccd			; load magic Rrrola constant
	mul di					; transform screen pointer to X, Y
	add al,ah				; use transformation garbage as
	xor ah,ah				; pseudorandom value and clear AH
	add ax,bp				; add time value to random value
	shr ax,9				; divide by 512 (basically the speed)
	and al,15				; filter effect number
	xchg bx,ax				; move effect number to BX
	mov bh,1				; reset BH to align with start of code
	mov bl,[byte bx+table]	; read the effect address from the table
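The selection logic as a Python sketch, with `effect_index` as a hypothetical name. It returns the table index for a given screen pointer and time value. Since the pseudorandom part is in 0..255 and the time is divided by 512, the pixels of a frame can only ever be spread over two neighbouring effects.

```python
def effect_index(di, t):
    product = (di & 0xFFFF) * 0xCCCD        # mul di
    ax = product & 0xFFFF                   # low word: the "transformation garbage"
    p = ((ax & 0xFF) + (ax >> 8)) & 0xFF    # add al,ah / xor ah,ah : 0..255
    ax = (p + t) & 0xFFFF                   # add ax,bp
    return (ax >> 9) & 15                   # shr ax,9 / and al,15

stable = {effect_index(di, 512) for di in range(0, 65536, 7)}
fading = {effect_index(di, 512 + 400) for di in range(0, 65536, 7)}
print(stable, fading)
```

At a time value of 512 every pixel selects index 1, while 400 ticks later some pixels may already have switched to index 2; that mix is the fade.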

MIDI music part

This code section increments the timing value and produces sound. By choosing channel 3 we can reuse the "change channel instrument" instruction as "RET". By shifting around the code of the effects, a reasonable value for the volume could be created, saving another byte. Be aware that this code sequence only works if the MIDI device is already in UART mode, otherwise three more bytes have to be spent. Quite a few people from the audience and sceners told me afterwards that the melody sounds a bit like "Incantations" by Mike Oldfield, but it is rather created by a simple approach. Starting with "Hypnoteye" in 2015, I experimented with procedural MIDI and eventually released a small MIDI framework (64 bytes). The basic principle is to jump through tonal space with fixed steps, and map high values back to lower ones (by modulo). With simple combinations of stepwidth and modvalue, interesting effects can be produced. For example, stepwidth=3 gives diminished chords which are never resolved, likewise stepwidth=4, and stepwidth=6 gives tritonal effects. With a well chosen modvalue, these patterns can create progressions. I have not done an appropriate theoretical analysis yet, but rather explored the tonal space and noted when something sounded interesting.

sounds: db 0xc3, 11, 0x93, fx2-s
...
		inc bp				; increment timing value
		test bp, 7			; play a note every 8th step
		jnz nomuse			; quit if in between
		mov dx,0x330		; port number for MIDI
		mov si,sounds		; address for sound data
		outsb				; change instrument of channel 3
		outsb				; to vibraphone
		outsb				; play a note on channel 3
		imul ax,bp,-19*32*4	; the magic melody constant
		shr ax,10			; scale down and implicit "and 63"
		add al,22			; pitch base is 22
		out dx,al			; play THIS note on channel 3
		outsb				; play it with THIS volume
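The note calculation can be reproduced in Python. This is a sketch of the three arithmetic instructions only; `melody_note` is an illustrative name. Since a note is played on every 8th tick, the constant -19*32*4 works out to a step of -19 in a tonal space of 64 values.

```python
def melody_note(bp):
    ax = (bp * (-19 * 32 * 4)) & 0xFFFF   # imul ax,bp,-19*32*4 : keep the low 16 bits
    ax >>= 10                             # shr ax,10 : leaves a value in 0..63
    return (ax + 22) & 0xFF               # add al,22 : pitch base

print([melody_note(step * 8) for step in range(8)])
```

In the terms used above, this corresponds to a stepwidth of -19 (or +45) with a modvalue of 64.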

Original release code (complete)

 
; "memories" by HellMood/DESiRE
; the tiny megademo, 256 byte msdos intro
; shown in April 2020 @ REVISION
;
;   (= WILL BE COMMENTED IN DETAIL LATER =)
;
; create : nasm.exe memories.asm -fbin -o memories.com
; CHOOSE YOUR TARGET PLATFORM (compo version is dosbox)
; be sure to use the dosbox.conf from this archive!
; only ONE of the defines should be active!
%define dosbox			; size : 256 bytes
;%define freedos		; size : 230 bytes
;%define winxpdos		; size : 263 bytes

; DON'T TOUCH THESE UNLESS YOU KNOW WHAT YOU'RE DOING
%ifdef winxpdos
	%define music
	%define switch_uart
	%define safe_dx
	%define safe_segment
%endif
%ifdef freedos
	%define safe_dx
%endif
%ifdef dosbox
	%define music
	;%define safe_dx ; sometimes needed
%endif

; GLOBAL PARAMETERS, TUNE WITH CARE!
%define volume 127	; not used on dosbox (optimization)
%define instrument 11
%define scale_mod -19*32*4; 
%define time_mask 7
%define targetFPS 35
%define tempo 1193182/256/targetFPS		
%define sierp_color 0x2A
%define tunnel_base_color 20
%define tunnel_pattern 6
%define tilt_plate_pattern 4+8+16
%define circles_pattern 8+16

org 100h
s:
%ifdef freedos
	mov fs,ax
	mov [fs:0x46c],ax
%endif
	mov al,0x13
	int 0x10	 
	xchg bp,ax
	push 0xa000-10
	pop es
%ifndef freedos
	mov ax,0x251c
	%ifdef safe_dx	
		mov dx,timer	
	%else ; assume DH=1, mostly true on DosBox
		mov dl,timer
	%endif
	int 0x21
%endif
top:
%ifdef freedos
	mov bp,[fs:0x46c]
%endif	
	mov ax,0xcccd
	mul di
	add al,ah
	xor ah,ah
	add ax,bp
	shr ax,9
	and al,15
	xchg bx,ax
	mov bh,1
	mov bl,[byte bx+table]
	call bx
	stosb
	inc di
	inc di
	jnz top
	mov al,tempo
	out 40h,al
	in al,0x60
	dec al
	jnz top
sounds:
	db 0xc3	; is MIDI/RET
%ifdef music
	db instrument,0x93
	%ifdef switch_uart
		db volume		; without switch, volume is in table
		db 0x3f 
	%endif
%endif
table: ; first index is volume, change order with care!		    					
	db fx2-s,fx1-s,fx0-s,fx3-s,fx4-s,fx5-s,fx6-s,sounds-s,stop-s
stop:
	pop ax
	ret
timer:
%ifndef freedos
	%ifdef safe_segment
		push cs
		pop ds
	%endif
		inc bp
	%ifdef music	
		test bp, time_mask
		jnz nomuse
		mov dx,0x330
		mov si,sounds
		outsb
		outsb
		outsb
		imul ax,bp,scale_mod
		shr ax,10
		add al,22
		out dx,al
		outsb
		%ifdef switch_uart
			inc dx
			outsb
		%endif
	%endif
nomuse:
	iret
%endif	
fx0: ; tilted plane, scrolling
	mov ax,0x1329
	add dh,al
	div dh
	xchg dx,ax
	imul dl
	sub dx,bp
	xor ah,dl
	mov al,ah
	and al,tilt_plate_pattern
ret
fx2: ; board of chessboards
	xchg dx,ax
	sub ax,bp
	xor al,ah
	or al,0xDB
	add al,13h
ret
fx1: ; circles, zooming
	mov al,dh
	sub al,100
	imul al
	xchg dx,ax
	imul al
	add dh,ah
	mov al,dh
	add ax,bp
	and al,circles_pattern
ret
fx3: ; parallax checkerboards
	mov cx,bp
	mov bx,-16
fx3L:
	add cx,di
	mov ax,819
	imul cx	 
	ror dx,1	 
	inc bx	 
	ja fx3L
	lea ax,[bx+31]	 
ret
fx4: ; sierpinski rotozoomer	
	lea cx,[bp-2048]
	sal cx,3
	movzx ax,dh
	movsx dx,dl
	mov bx,ax
	imul bx,cx
	add bh,dl
	imul dx,cx
	sub al,dh
	and al,bh
	and al,0b11111100
	salc				; VERY slow on dosbox, but ok
	jnz fx4q
	mov al,sierp_color
	fx4q:
ret
fx5: ; raycast bent tunnel
	mov cl,-9
	fx5L: 
	push dx
		mov al,dh
		sub al,100
		imul cl
		xchg ax,dx	
		add al,cl
		imul cl
		mov al,dh
		xor al,ah
		add al,4
		test al,-8
	pop dx
	loopz fx5L
	sub cx,bp
	xor al,cl
	aam tunnel_pattern; VERY slow on dosbox, but ok
	add al,tunnel_base_color
ret
fx6: ; ocean night / to day sky
	sub dh,120
	js fx6q
	mov [bx+si],dx
	fild word [bx+si]
	fidivr dword [bx+si]
	fstp dword [bx+si-1]
	mov ax,[bx+si]
	add ax,bp
	and al,128
	dec ax
fx6q:
ret

Further tweaks

To be done ;)

Differences between dos versions

To be done ;)

Bonus - NFO/ASCII

 
                                                             art : hammerfist
         ∂#MW%e                              _d$Ng,
         'B,  b                   _jM@$QZb,cQ"  )@
  ,edRB$b,l@   Wk,yGR$KM&$b,     ,dP"     Wl ]bsd%UR8BG6&$@DSyG#ZKM&$b,
,dP      "T%L  'MGF      "*R_   Tg    "*4Zk,#I  YP   W"    7P      "*R
4M   gd@    ^   ∂@   d@b   dQ$#@Z@R3L_    "*GMj  'W      ,gd$   d@b   9Q$#%b
W#,  `M          Wb  `*  _4P   `Qk  *#N8L   `H5   @b   'QR7YK   `*  _4F"   Qk
`6@L             dML            '@          ,BK   'M    B  *b,            '#L
  ^QBb,_     _,4&M∞∂@=,_       _dGL       _gQKM    GL    @k  'Mg,_         _dG,
    "*BN5W$2#MNP"   "*G3WRM8&B5P"`Y@QNW3Z5P" ∂#$W8BRM3XZN87    "*GW38M%EBDW5P"`


                              p r e s e n t s

            4
           d@,
         _& `Wl
      _,aP   "#baedM$#@@K JP*"?ML
 ,ad@$#P"         ,d@NEWVB"     X,aQPYb,_
V@Mm,_          ,d@MW#BW'      EMP"   'R ,ngBP^fML
 M@N@y        Y#BNW#M"       J9"      `MQ9"      "MgRBq  ,QBMg,
  VN#P` ,d@@    `WM@^                   7f         F` 7kY"   ^G  _.eQNE1.
   ]B _G@MWN$,   `P                     '     4b       QP      w@F*^  ^Qb
   ]O@NRM#W@MNB,         ;                    ^`      j        JP^       Yl
  J#NRNWM@#BcT"^        ,A  _J                     _q@                   `X
 WM#B@WdY`,7        _G#YN#PM                 _,gG"                      M,
  *BN#WP"  dK       ,Q@NRMB"]9       ,      _,M@Q*                        #A
   "U^      V@h,   iNBW#NT  J'      J9     s@QN"         _;               'D,
             RMBv&NMQR@9  .W      .K'     "9`         ,6BA   _JL          ]l
              Y#NE@W#NRP   #[      8               _d@MW#B_jW#W          BN
               "GQ@MR#W    QL_      *B            _,p#NBW#NQMG@WY          3Q
                  "Y@F     ,XW@M%im,_Yb_     _,g5@#MW@QMNE@E@NRMB         ,WM
                    `  _,gP*"#REM#GB@N#MQbnd@N#M@MW#R8QSB^'WQERM@        ;4NB,
                     ,GYKL    E#B8R8QSB@M@#BM#W@MNB"`_  ,  "^` N       ,dW@Ql
                   _Q`'W`*t    '¶@GS#MBQ#E@W#NQBW[     'LvQ_   ,K    _dNABGM#N
                  ,F   '          `^WAB@QGE9*"9^*@L    jP7FY,  h,_.jWM#BR#GBM,
                 J;    ,   _                  '       '   "LL  YxE#B8R8QSBNW@W;
                AP   _,Ag6^          _   J                  A  `"Q#M@MW#R8E#P
               j@   `"XQW[            'LvK,_      'L_,/      @t    Y#NE@WNR"
              :M/     9^*@L           jP7F"       _PYKL     _,A;     ¶RSNQ"
              dKL     '     `        '   "L      "`'W`*t   `"XQb      `W^
              Q`8t            'L_,/         ,   _   '        9^Q
             ,W               _PYKL       _,Ag6^             ' W,     _ ,#N&
             !N  _   J       "`'W`*t     `"XQW[       _  J     N!_JG9^RwQ' *t
             `W,  'LvK,_        '        _gGB8@L   _   'LvK,_ ,WgB'    V    7L
         _.,gm&@B&wBZF"                j@'`  "WL _gML  jZd7Yb lN"          dBWl
      ,g&QB*"^`    `"*G@g, .gR&k,_   ,N"      '@QF  k;gMF  *QvQ     jQ, ,@N@B#,
   .eQF*`              `Yb@"  "*6Qg,gF     ,   7     XMN"    'MNB,    ^QWSER@N;
 ,gP"           qy,      W'       ^Q'     &L      ,g@W'       `QMEL     `"WBNWP
g7              9      ,X         M?     9"   _q8MSK           EMt       *@K
Vh   _,m#L             _AH        le         ,GBDNE9^A,          *@F        NMg
 L,qQ@ND           _.m@Bl        We      ,gM@B8#Q'   h_                   lWE,
  W9NHW@`          JWM#B@]        @e     4WR@NGF^      'QL                  dRWl
   VMd*            "@BE@PM        'N      *UP"           VW,               JRSB;
  ,@F       j       WK W,        t                     XNt            _A@E#N
_JP       ,6&         "GLdM         XD,               _.g8NMA@k,_    _,gG#NMGR;
"Z      .JRER           'VMi     _jNB#W&_         _,j@E@W#Nl MBGMNQGNQMG@QBW9
 h   ,G@NRMBl            `"   ,d#R@M$F Mg,_.gp&@@NEWVBWBMG  *QMN8R8SBN$E@WF
  Vb dW#R8QSRb,                 *YM@EQ,_ 'MENBW#NQMG#B@R@MW#l   "BM@QNENRQG'
   *WGS#MBMNEYL                    `^"*8M@Q@NRM#W@BWSNW@QBF"`     `^*@QBF^ [HFT]
    ^M@MW#Q9 ^Wt                           `^¶RQ@W8NQGP*`
     Q#@P     Vk                            lA `"^`
      Y"       `MA                           J#,
                *R@,                        ,MQl
                 Y#Wk,                      GWM8L
                  W8RQSt,_                 AQ@MR#,
                  `@M@#SB@Mbm.,_          QNBW#NW
                    QB8R8SBN$WNRM@#GNtwg@NMQR@B'
                     *MBQ#8R8QS@NE@WNBW#NQMG@NR;
                      `WGS#MBQ#R8QSB@NE@W#NQBW9
                        *OMW@QMNE@E@NRMW@QMB@*
                          `^"YQW@Q#SB#NE@EGP
                               `^"*8R@GBQF`