# Difference between revisions of "Game of Life 32b"

(→Original version : 65 bytes) |
(→Original version : 65 bytes) |
||

Line 5: | Line 5: | ||

<br /> | <br /> | ||

− | It will help to understand what the core algorithm does, before optimizing it. I will not go into the details of random number generation and key handling since these parts are removed in the final version anyway. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules "normal" game of life rules, but with a twist. Instead of regarding only neighbour cells inside a 3x3 neighbourhood , ALL cells are taken into consideration, and the rules are reinterpreted as: | + | It will help to understand what the core algorithm does, before optimizing it. I will not go into the details of random number generation and key handling since these parts are removed in the final version anyway. Setting up screen mode and putting pixels to the screen is described in the basic sectionn of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules "normal" game of life rules, but with a twist. Instead of regarding only eight neighbour cells inside a 3x3 neighbourhood , ALL nine cells are taken into consideration, and the rules are reinterpreted as: |

− | + | <br /> | |

* If the number of cells is 3, the center cell will be alive. | * If the number of cells is 3, the center cell will be alive. | ||

* If the number of cells is 4, the center cell keeps its state. | * If the number of cells is 4, the center cell keeps its state. | ||

* Otherwise, the cell dies (or stays dead). | * Otherwise, the cell dies (or stays dead). | ||

+ | <br /> | ||

+ | Like in other (trivial) implementations, the 2D space is parsed cell by cell, from left to right, and from top to bottom. Since the game of life does not work "in situ" (updating the current cell instantly will lead to wrong results of following calculations), current cells are "marked", and when the calculations are advanced far enough that the cell in question does not influence any calculation of the current iteration, it will be "corrected" by <code>shr [byte di-65],5</code> to the target value of the next iteration. The summation is as usual, an inner loop, adding up 3 cells of one column, and the outer loop, shifting from right (+1) to the left (-1), thus adding up 9 cells of a 3x3 neighbourhood. <br /> | ||

+ | <br /> | ||

+ | At the start of the loop, there is already the first "trick" happening. The register of summation <code>cl</code> is not properly cleaned, but at this point it can either contain 0 or 32 from the instruction <code>and al,0x20</code> after <code>xchg cx,ax</code>. If an arbitrary amount of cells has this on bit set, that won't hurt the calculation because of a special property of the <code>rcr</code> instruction. [https://kernfunny.org/x86/html/file_module_x86_id_273.html "The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 leastsignificant bits."]<br /> | ||

+ | <br /> | ||

+ | When the summation is complete, the aforementioned <code>rcr</code> is executed, but not before setting the carry flag (<code>stc</code>) which will be rotated in from the left, and directly right of the original cell value. By extracting the 6th bit of this rotated value (with <code>and al,0x20</code> we get exactly the value according to the rules defined above. <br /> | ||

+ | <br /> | ||

+ | This value is now set in the original cell with <code>or [si-1],al</code>. | ||

− | |||

− | |||

<syntaxhighlight lang="nasm"> | <syntaxhighlight lang="nasm"> |

## Revision as of 10:00, 30 April 2020

This writeup is still in the works! Come back later to get information on all the mean tricks ;) Meanwhile you can download and comment the intro.

## Contents

## Original version : 65 bytes

We'll start with the old 65 bytes version and bring it down to 32 bytes.

It will help to understand what the core algorithm does, before optimizing it. I will not go into the details of random number generation and key handling since these parts are removed in the final version anyway. Setting up screen mode and putting pixels to the screen is described in the basic sectionn of this Wiki. The core routine computes the [https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules "normal" game of life rules, but with a twist. Instead of regarding only eight neighbour cells inside a 3x3 neighbourhood , ALL nine cells are taken into consideration, and the rules are reinterpreted as:

- If the number of cells is 3, the center cell will be alive.
- If the number of cells is 4, the center cell keeps its state.
- Otherwise, the cell dies (or stays dead).

Like in other (trivial) implementations, the 2D space is parsed cell by cell, from left to right, and from top to bottom. Since the game of life does not work "in situ" (updating the current cell instantly will lead to wrong results of following calculations), current cells are "marked", and when the calculations are advanced far enough that the cell in question does not influence any calculation of the current iteration, it will be "corrected" by `shr [byte di-65],5`

to the target value of the next iteration. The summation is as usual, an inner loop, adding up 3 cells of one column, and the outer loop, shifting from right (+1) to the left (-1), thus adding up 9 cells of a 3x3 neighbourhood.

At the start of the loop, there is already the first "trick" happening. The register of summation `cl`

is not properly cleaned, but at this point it can either contain 0 or 32 from the instruction `and al,0x20`

after `xchg cx,ax`

. If an arbitrary amount of cells has this on bit set, that won't hurt the calculation because of a special property of the `rcr`

instruction. "The processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 leastsignificant bits."

When the summation is complete, the aforementioned `rcr`

is executed, but not before setting the carry flag (`stc`

) which will be rotated in from the left, and directly right of the original cell value. By extracting the 6th bit of this rotated value (with `and al,0x20`

we get exactly the value according to the rules defined above.

This value is now set in the original cell with `or [si-1],al`

.

```
; http://read.pudn.com/downloads208/sourcecode/asm/981812/LIFE65.ASM__.htm
; Life simulator, 72 bytes - Vladislav Kaipetsky and Tenie Remmel
; 65 bytes - Mark Andreas
; If no args, regs on startup are:
; AX = BX = 0000h
; SI = IP = 0100h
; DI = SP = FFFEh
IDEAL
MODEL TINY
P386
CODESEG
ORG 100h
Start: int 1ah ; ah=00: cx=hours, dx=tic counter
mov al,13h ; Set mode 13h
int 10h
xchg dx,ax
push 09000h ; DS = last 64K segment
pop ds
push 0A000h ; ES = video memory
pop es
; BX is already zero
RandLoop:
rol ax,1 ; Generate random number
adc [bx],al
dec bx
jnz RandLoop
; BX will not be equal to 3 the first time this loop is executed, but
; it will be for all other times. As SI = 0100h and DI = FFFEh on
; startup, SI - DI will be equal to 258.
LifeLoop:
xchg cx,ax
AccLoop:
add cl,[di+bx-64] ; Add in this column
add cl,[si+bx-2]
add cl,[si+bx+318]
dec bx ; Loop back
jnz AccLoop
mov al,[si] ; Get center cell, set pixel
stosb
stc ; 3 = birth, 4 = stay (tricky):
rcr al,cl ; 1.00?0000x --> 0.0x100?00 (rcr 3)
and al,20h ; ^carry | ^
; +---> 0.00x100?0 (rcr 4)
or [si-1],al ; Add in new cell ^
shr [byte di-65],5 ; Shift previous value
mov bl,3 ; 3 iterations in AccLoop
inc si ; Loop while not zero
jnz LifeLoop
mov ah,1 ; Check for key
int 16h
jz LifeLoop ; Loop if no key
xchg ax,bx ; Set text mode
int 10h
ret ; Return
End Start
```

## Remove key handler and RNG : 44 bytes

```
Start:
mov al,93h ; Set mode 13h
int 10h
push 09000h ; DS = last 64K segment
pop ds
push 0A000h ; ES = video memory
pop es
; BX is already zero
LifeLoop:
xchg cx,ax
AccLoop:
add cl,[di+bx-64] ; Add in this column
add cl,[si+bx-2]
add cl,[si+bx+318]
dec bx ; Loop back
jnz AccLoop
;mov al,[si] ; Get center cell, set pixel
lodsb
stosb
stc ; 3 = birth, 4 = stay (tricky):
rcr al,cl ; 1.00?0000x --> 0.0x100?00 (rcr 3)
and al,20h ; ^carry | ^
; +---> 0.00x100?0 (rcr 4)
or [si-1],al ; Add in new cell ^
shr [byte di-65],5 ; Shift previous value
mov bl,3 ; 3 iterations in AccLoop
; inc si ; Loop while not zero
jmp short LifeLoop
```

## Switching to Textmode : 39 bytes

```
push 0xb800
pop ds
LifeLoop:
stc ; 3 = birth, 4 = stay (tricky):
rcr al,cl ; 1.00?0000x --> 0.0x100?00 (rcr 3)
and al,20h ; ^carry | ^
or [si-2],al ; Add in new cell ^
shr byte [si-160-6],5 ; Shift previous value
mov bl,6
xchg cx,ax
AccLoop: add cl,[si+bx-160-4] ; Add in this column
add cl,[si+bx-4]
add cl,[si+bx+160-4]
dec bx ; Loop back
dec bx ; Loop back
jnz AccLoop
lodsw
jmp LifeLoop
```