Free space and code optimisation in "JSW"

Spider · July 2, 2018

Would using undocumented instructions save a few more bytes, at the potential risk of issues on real machines? :unsure:

Emulators (at least as far as I'm aware) will either cope with undocumented instructions or, in some cases, offer an option to break into the debugger upon encountering one.

http://www.z80.info/z80undoc.htm perhaps

IRF · November 13, 2018

At #8828-34 and #88B8-C4, the LDIR method of editing the attributes of a character row can be replaced with a simple loop which uses the LD (HL), xx command. The latter approach consumes fewer bytes, employs fewer registers ~~and is probably marginally faster~~. EDIT: I removed that latter point, in light of Norman Sword's comments below.

e.g. the original code at #8828-34 requires 13 bytes:

LD HL, #5A60

LD DE, #5A61

LD BC, #001F

LD (HL), #46

LDIR

But the same thing could be achieved in 10 bytes:

LD HL, #5A60

LD B, #20

Loop:

LD (HL), #46

INC HL

DJNZ Loop
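The two byte counts can be checked by adding up the encoded length of each instruction. The following is only a Python sketch of that arithmetic, not Z80 code; the table `SIZES` and the helper `code_size` are names made up for illustration.

```python
# Encoded length in bytes of each Z80 instruction form used in the two listings.
SIZES = {
    "LD rr,nn": 3,    # 16-bit immediate load: opcode plus two data bytes
    "LD r,n": 2,      # 8-bit immediate load: opcode plus one data byte
    "LD (HL),n": 2,   # opcode plus one data byte
    "INC HL": 1,
    "LDIR": 2,        # ED-prefixed instruction
    "DJNZ e": 2,      # opcode plus displacement
}

def code_size(instructions):
    """Total length in bytes of a list of instruction forms."""
    return sum(SIZES[name] for name in instructions)

# LD HL / LD DE / LD BC / LD (HL),#46 / LDIR
ldir_version = ["LD rr,nn", "LD rr,nn", "LD rr,nn", "LD (HL),n", "LDIR"]
# LD HL / LD B / LD (HL),#46 / INC HL / DJNZ
loop_version = ["LD rr,nn", "LD r,n", "LD (HL),n", "INC HL", "DJNZ e"]

print(code_size(ldir_version))  # 13
print(code_size(loop_version))  # 10
```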

****

The LDIR method for copying the same value across multiple addresses is only necessary when updating more than #100 (256) bytes, which is beyond what a single 8-bit counter can keep track of (e.g. see #8813-27). A DJNZ loop entered with B = #00 runs 256 times.

When updating #100 (256) bytes or fewer, LDIR is only necessary if different values are being copied across to the individual addresses (e.g. #88FC-#8906).

Edited April 30, 2019 by IRF

Norman Sword · November 14, 2018

> At #8828-34 and #88B8-C4, the LDIR method of editing the attributes of a character row can be replaced with a simple loop which uses the LD (HL), xx command. The latter approach consumes fewer bytes, employs fewer registers and is probably marginally faster.

The Z80 mnemonic LDIR stands for Load, Increment and Repeat.

LDIR moves each byte in memory in 21 clock cycles (21 T-states).

Your code is slower, writing each byte in around 29 T-states.

LD HL, #5A60

LD B, #20

Loop:

LD (HL), #46 ;10T

INC HL ;6T

DJNZ Loop ;13/8T 29 T-states compared to LDIR 21 T-states

You can speed this up from 29 T-states to around 24 T-states.

1) Hold the value to be written in the "C" register and use LD (HL),C in place of LD (HL),#46. This saves 3 T-states per byte.

2) Use INC L instead of INC HL. This saves 2 T-states per byte. In the code indicated, HL does not cross a page boundary.

But even this is still slower than LDIR.

LD HL, #5A60

LD BC, #2046

Loop:

LD (HL),C ;7T

INC L ;4T

DJNZ Loop ;13/8T 24 T-states compared to LDIR 21 T-states

The T-state figures are ideal timings; writing to contended memory will slow this down.

However, on very short writes, when B is small, the slowing of the loop compared to LDIR will be counteracted by not having to load several registers.

The change in code does save memory, which is the purpose of the code change.
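The per-byte figures above can be combined with the set-up costs to give whole-routine totals. This is only a Python sketch of the arithmetic, assuming ideal (uncontended) timings; the function names are made up for illustration.

```python
# Ideal (uncontended) T-state costs of the instructions involved.
T = {"LD rr,nn": 10, "LD r,n": 7, "LD (HL),n": 10, "LD (HL),r": 7,
     "INC rr": 6, "INC r": 4}
LDIR_REPEAT, LDIR_LAST = 21, 16   # per byte while BC != 0 / for the final byte
DJNZ_TAKEN, DJNZ_FALL = 13, 8     # jump taken / falling through at the end

def ldir_fill(n):
    """Fill n >= 2 bytes: seed the first byte, then LDIR copies n - 1 bytes."""
    setup = 3 * T["LD rr,nn"] + T["LD (HL),n"]
    return setup + (n - 2) * LDIR_REPEAT + LDIR_LAST

def djnz_fill(n, setup, body):
    """Fill n >= 1 bytes with a DJNZ loop whose body costs `body` T-states."""
    return setup + n * body + (n - 1) * DJNZ_TAKEN + DJNZ_FALL

def first_loop(n):
    """LD HL,nn / LD B,n, then LD (HL),n / INC HL / DJNZ."""
    return djnz_fill(n, T["LD rr,nn"] + T["LD r,n"], T["LD (HL),n"] + T["INC rr"])

def second_loop(n):
    """LD HL,nn / LD BC,nn, then LD (HL),C / INC L / DJNZ."""
    return djnz_fill(n, 2 * T["LD rr,nn"], T["LD (HL),r"] + T["INC r"])

for n in (2, 32):
    print(n, ldir_fill(n), first_loop(n), second_loop(n))
```

Under these assumptions the 32-byte row costs 686 T-states with LDIR, 940 with the first loop and 783 with the second. For a 2-byte fill the gap narrows to 56, 70 and 63.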

Edited November 14, 2018 by Norman Sword

IRF · November 14, 2018

Thanks Norman.

I would just add that the T-state overheads are higher for LDIR, because a greater number of parameters have to be set up prior to the loop. But of course that is outweighed by the faster loop (unless the loop is very short, e.g. the LDIR method at #8684-90 in Manic Miner wouldn't be much faster overall).

Norman Sword · November 14, 2018

Sorry if my editing of post #243 seems to indicate that post #244 is reiterating the same issue. What actually happened was that I looked at what I had written and edited it, went away and made a cup of tea, then came back and posted the changes without seeing the response in post #244.

IRF · November 19, 2018

Optimised code to check for any keypress:

loop:
XOR A          ; AF
IN A, (#FE)    ; DB FE
OR #E0         ; F6 E0
INC A          ; 3C
JR Z, loop     ; 28 F8

Optimised code to check for no keypresses (i.e. to ensure that the player has let go of all keys before proceeding - this is useful to stop 'accidental selections' if a key is pressed for too long):

loop:
XOR A          ; AF
IN A, (#FE)    ; DB FE
OR #E0         ; F6 E0
INC A          ; 3C
JR NZ, loop    ; 20 F8
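The test in both loops relies on the five key bits reading as 1 when nothing is pressed. XOR A also zeroes the high byte of the port address, so all eight half-rows are read at once. Here is a Python sketch of the OR #E0 / INC A step, purely as a model of the bit logic; the function name `any_key_pressed` is made up for illustration.

```python
def any_key_pressed(port_value):
    """Model of OR #E0 / INC A: True when the Z80 would leave the Zero flag reset."""
    a = port_value & 0xFF
    a |= 0xE0             # OR #E0: force the three non-key bits (5-7) high
    a = (a + 1) & 0xFF    # INC A: #FF wraps to #00 only if no key bit was low
    return a != 0

print(any_key_pressed(0b11111111))  # no key bit low
print(any_key_pressed(0b10111101))  # bit 1 low, so a key is down
```

JR Z, loop therefore keeps looping while this model returns False, and JR NZ, loop keeps looping while it returns True.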

Norman Sword · November 21, 2018

The method I use is different, and takes exactly the same number of bytes and T-states.

It is slightly different, but only because I prefer to have bits set for keys pressed, and the Spectrum port read gives the bit reset for keys pressed.

;WAIT FOR ANY KEY press
loop:
XOR A #AF
IN A,(#FE) #DB #FE
CPL #2F
AND #1F #E6 #1F
JR Z,loop #28 #F8

;WAIT FOR KEY RELEASE , Keyboard debounce
loop:
XOR A #AF
IN A,(#FE) #DB #FE
CPL #2F
AND #1F #E6 #1F
JR NZ,loop #20 #F8

Post #246 and post #247 (this post) each give two routines.

In both posts, the two routines differ only in the condition at the end: JR Z when waiting for a key, and JR NZ when waiting for the key to be released.

I will stick with using the CPL instruction.
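The claim that the two methods agree can be checked by brute force over every possible port value, and the cost by summing standard instruction lengths and ideal timings. This is a Python sketch only; the function names and the cost lists are written out for illustration.

```python
def or_inc_test(value):
    """OR #E0 / INC A: nonzero result means a key is pressed."""
    return (((value | 0xE0) + 1) & 0xFF) != 0

def cpl_and_test(value):
    """CPL / AND #1F: nonzero result means a key is pressed."""
    return ((~value & 0xFF) & 0x1F) != 0

same_answer = all(or_inc_test(v) == cpl_and_test(v) for v in range(256))

# (bytes, T-states) per instruction, with the final JR counted as taken.
or_inc_cost = [(1, 4), (2, 11), (2, 7), (1, 4), (2, 12)]    # XOR A, IN A,(n), OR n, INC A, JR
cpl_and_cost = [(1, 4), (2, 11), (1, 4), (2, 7), (2, 12)]   # XOR A, IN A,(n), CPL, AND n, JR

def totals(routine):
    """Return (total bytes, total T-states) for one pass of the loop."""
    return sum(b for b, _ in routine), sum(t for _, t in routine)

print(same_answer)
print(totals(or_inc_cost), totals(cpl_and_cost))
```

The difference is in what is left in A afterwards: the CPL version leaves a set bit for each key held down.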

IRF · November 21, 2018

I've recently been working on an experimental project where a keypress is detected, and used to determine the starting room of the game (there are forty rooms in the game, and forty Spectrum keys). A partial disassembly is included below.

I could adapt it to a similar method to Norman Sword's, by reversing the conditionality of the test of the Carry Flag from a JR NC to a JR C instruction. However, that would also require the insertion of a CPL command - I don't think it could be done without requiring that additional byte? Also, this method necessitates the use of the BC register-pair to read the ports via IN A, (C) - the Accumulator A is too 'busy' to use the IN A, (#FE) here, I believe.

start_again:
LD BC, #FEFE ; BC starts off pointing at the half-row of keys SHIFT-V (Bit 0 of B is reset)
LD DE, #0805 ; D counts the eight half-rows; E counts the five keys in each half-row
LD H, #C0 ; H keeps track of the room number
keypress_loop_1:
IN A, (C) ; If a key in the half-row currently being interrogated is depressed, then one of bits 0-4 of A will be reset
keypress_loop_2:
INC H
RRCA ; A is rotated rightwards by the inner loop;
JR NC, room_selected ; if a reset bit moves past Bit 0 then the Carry Flag is reset, indicating that a room has been chosen
DEC E
JR NZ, keypress_loop_2
LD E, #05
RLC B ; B is rotated leftwards by the outer loop, so that Bits 0-7 are reset in turn; this allows each half-row in turn to be interrogated by the inner loop
DEC D
JR NZ, keypress_loop_1
JR start_again ; None of the forty keys were pressed, so go back to the start to check again

room_selected:

If we reach here then a keypress has been detected, and the H register is now pointing at the corresponding page of memory (#C1 to #E8) where the chosen room's definition is stored.

(N.B. Room 00 is not selected, corresponding to page #C0 of memory; that is a 'Cheat' screen.)
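The mapping from key to memory page can be traced with a small model of the two loops. This is a Python sketch of one pass of the scan, not the routine itself. `select_room`, `press` and the fake keyboard are made up for illustration, and a pass that finds no key returns None where the real code jumps back to start_again.

```python
def select_room(read_half_row):
    """One pass of the scan. read_half_row(b) models IN A,(C) with B = b."""
    b = 0xFE                  # LD BC,#FEFE: bit 0 of B reset selects SHIFT-V
    h = 0xC0                  # LD H,#C0
    for _ in range(8):        # D counts the eight half-rows
        a = read_half_row(b)
        for _ in range(5):    # E counts the five keys in the half-row
            h += 1            # INC H
            carry = a & 1     # RRCA: bit 0 moves into Carry and bit 7
            a = (a >> 1) | (carry << 7)
            if not carry:     # JR NC,room_selected
                return h
        b = ((b << 1) | (b >> 7)) & 0xFF   # RLC B: select the next half-row
    return None

def press(row, bit):
    """Fake keyboard with exactly one key held: half-row 0-7, key bit 0-4."""
    select = 0xFF ^ (1 << row)
    return lambda b: (0xFF ^ (1 << bit)) if b == select else 0xFF

print(hex(select_room(press(0, 0))))   # first key of the first half-row
print(hex(select_room(press(7, 4))))   # last key of the last half-row
```

In this model the first key scanned gives page #C1 and the fortieth gives page #E8, matching the range quoted above.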

Edited January 22, 2019 by IRF

Norman Sword · November 21, 2018

> I've recently been working on an experimental project where a keypress is detected, and used to determine the starting room of the game (there are forty rooms in the game, and forty Spectrum keys). A partial disassembly is included below.
>
> I could adapt it to a similar method to Norman Sword's, by reversing the conditionality of the test of the Carry Flag from a JR NC to a JR C instruction. However, that would also require the insertion of a CPL command - I don't think it could be done without requiring that additional byte? Also, this method necessitates the use of the BC register-pair to read the ports via IN A, (C) - the Accumulator A is too 'busy' to use the IN A, (#FE) here, I believe.

start_again:
LD BC, #FEFE ; BC starts off pointing at the half-row of keys SHIFT-V (Bit 0 of B is reset)
~~LD DE, #0805~~ ; D counts the eight half-rows; E counts the five keys in each half-row
LD H, #C0 ; H keeps track of the room number
keypress_loop_1:
IN A, (C) ; If a key in the half-row currently being interrogated is depressed, then one of bits 0-4 of A will be reset
ld e,5 ; <<<<<< we check five bits for each scanned row
keypress_loop_2:
INC H
RRCA ; A is rotated rightwards by the inner loop;
JR NC, room_selected ; if a reset bit moves past Bit 0 then the Carry Flag is reset, indicating that a room has been chosen
DEC E
JR NZ, keypress_loop_2
~~LD E, #05~~
RLC B ; B is rotated leftwards by the outer loop, so that Bits 0-7 are reset in turn; this allows each half-row in turn to be interrogated by the inner loop
~~DEC D~~
~~JR NZ, keypress_loop_1~~
jr c,keypress_loop_1 ; <<<<< when the cleared bit is rotated out, we have scanned the eight rows
JR start_again ; None of the forty keys were pressed, so go back to the start to check again

room_selected:

If we reach here then a keypress has been detected, and the H register is now pointing at the corresponding page of memory (#C1 to #E8) where the chosen room's definition is stored.

(N.B. Room 00 is not selected, corresponding to page #C0 of memory; that is a 'Cheat' screen.)

Changes are minimal. The result is only 4 bytes smaller.

I would be tempted to use "L" in place of "E", this leaves the register pair "DE" unused.
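The reason the D counter can be dropped is that RLC B itself signals the end of the scan: the single reset bit in #FE reaches bit 7 after seven rotations and is rotated out into Carry on the eighth. Here is a Python sketch of that, plus the byte saving; the helper `rlc` is made up for illustration.

```python
def rlc(b):
    """RLC on an 8-bit value: returns (result, carry), where carry is the old bit 7."""
    carry = b >> 7
    return ((b << 1) & 0xFF) | carry, carry

b = 0xFE
carries = []
for _ in range(8):
    b, carry = rlc(b)
    carries.append(carry)

print(carries)   # Carry is set after the first seven rotations and reset after the eighth
print(hex(b))    # B is back at its starting value

# Byte saving: instructions removed versus instructions added.
removed = 3 + 2 + 1 + 2   # LD DE,nn / LD E,n / DEC D / JR NZ
added = 2 + 2             # LD E,n (moved inside the outer loop) / JR C
print(removed - added)
```

So jr c,keypress_loop_1 loops back seven times and falls through after the eighth half-row, with no separate counter.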

IRF · November 21, 2018

Nice one Norman. :) That's a logical extension of what I did.

Incidentally, in the experimental project I referred to, L is used to point at the previously-unused Offsets #EE-#EF in the definition for the chosen room (H), wherein a suitable starting location is stored (Willy's initial x- and y- coordinates, and initial frame of animation and facing direction, are all compressed into two bytes). The LD A, (HL) command is used to pick up the data.

But that wouldn't preclude me from using L instead of E to count down the inner loop, in the way that you suggest.
