
source code for JSW

members only source code jsw

69 replies to this topic

#51 Norman Sword

Norman Sword

    Advanced Member

  • Member
  • 237 posts

Posted 17 July 2019 - 12:43 PM

Yes, BLOCKX_MOVE32 will clear the P/V flag (so the "PE" condition is false) if and only if the last LDI counts the register pair "BC" down to zero.

Which is why the line that loads BC with 128 carries a comment saying that "BC" must be set to a multiple of 32. If this were changed to a value which is not divisible by 32, the attributes would never be drawn, because the ( "JP PE,n_raster" ) would always branch to ( "n_raster" ).
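
A minimal Python sketch of this flag behaviour, for anyone who wants to check the claim away from an assembler. It assumes only the documented behaviour of LDI (BC is decremented, and P/V is set while BC is non-zero afterwards); the function names are made up for illustration and are not part of the game source.

```python
def ldi_block(bc, count=32):
    """Model `count` consecutive LDI instructions.

    Returns the new 16-bit BC and the P/V flag left by the final LDI.
    LDI decrements BC and sets P/V when BC is non-zero afterwards,
    so JP PE is taken while bytes remain to be counted.
    """
    pv = False
    for _ in range(count):
        bc = (bc - 1) & 0xFFFF   # BC is a 16-bit register pair and wraps
        pv = bc != 0             # P/V set (PE) unless BC just reached zero
    return bc, pv


def calls_until_pe_clear(start_bc, max_calls=128):
    """Return the 1-based call on which JP PE is first NOT taken, or None."""
    bc = start_bc
    for call in range(1, max_calls + 1):
        bc, pv = ldi_block(bc)
        if not pv:
            return call
    return None


print(calls_until_pe_clear(128))  # BC is a multiple of 32
print(calls_until_pe_clear(100))  # BC is not a multiple of 32
```

With BC = 128 the flag clears on the fourth call. With BC = 100 it never clears within the 128 raster lines, because BC passes through zero in the middle of a block rather than on the last LDI.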


Edited by Norman Sword, 17 July 2019 - 09:03 PM.

  • IRF likes this

#52 IRF

IRF

    Advanced Member

  • Contributor
  • 4,277 posts

Posted 17 July 2019 - 12:52 PM


N.B. My method may complicate things in cases where a chunk of code is being overwritten with a single value, where the first byte is overwritten directly and then the number of bytes to which the same value is to be copied in a loop is [size of chunk of code] minus one. e.g. for an attribute update with a single value (such as for a screen flash effect), use #01FF instead of #0200 to define the size of the loop.

Norman's code deals with such cases by CALLing a late entry point into his subroutine, coinciding with the second LDI command in the subroutine. (But the JR NZ at the end of the subroutine jumps back to the first LDI in the subroutine.)

In such cases, I think my method would unavoidably end up 'overshooting', and overwriting one more byte than it should. (But in the aforementioned project, I didn't actually use an LDI method for 'block fill' purposes, only for 'block move'.)

 

On reflection, if I did want to use the same subroutine in the context of overwriting a block of code with a single value (e.g. to implement the Screen Flash effect), then it could be done like this (at the cost of five more bytes):

 

; 16 character rows to colour, so use A=#0F (15 in decimal) for the later loop.

 

LD HL, xxxx

LD DE, xxxx +1

; No need to define BC; it's not used now

LD (HL), colour_value           ; Either copied from A, or a fixed value specified here

 

LD A, #01

CALL subroutine_late_entry

LD A, #0F

loop:
CALL subroutine
DEC A
JR NZ, loop

 

subroutine:

LDI

subroutine_late_entry:

rept 31                                      ; I presume "rept y" means 'repeat the following code (up to the end marker) y times'?

   LDI

   endm                                     ; I presume this means 'end marker'?

   RET
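
A rough Python model of the sequence above, to check that it fills exactly 16 rows and does not overshoot. It is only a sketch: memory is a plain bytearray, the start address 100 and the colour value are arbitrary, and the helper names are hypothetical.

```python
ROW = 32    # attribute bytes per character row
ROWS = 16   # rows to colour, as in the comment at the top of the listing


def ldi_chain(mem, hl, de, count):
    """Model `count` LDIs: copy (HL) to (DE), then step both pointers."""
    for _ in range(count):
        mem[de] = mem[hl]
        hl += 1
        de += 1
    return hl, de


def block_fill(mem, start, value):
    mem[start] = value                          # LD (HL),colour_value
    hl, de = start, start + 1                   # LD HL,xxxx / LD DE,xxxx+1
    hl, de = ldi_chain(mem, hl, de, ROW - 1)    # CALL subroutine_late_entry
    for _ in range(ROWS - 1):                   # LD A,#0F ... DEC A / JR NZ
        hl, de = ldi_chain(mem, hl, de, ROW)    # CALL subroutine
    return de                                   # first address NOT written


mem = bytearray(1024)
end = block_fill(mem, 100, 0x46)
print(end - 100)  # number of bytes filled
```

The late entry handles the first row as 1 + 31 bytes, and the remaining 15 calls copy 32 bytes each, giving 512 bytes in total with the byte after the block left untouched.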


Edited by IRF, 17 July 2019 - 01:04 PM.


#53 Norman Sword

Norman Sword

    Advanced Member

  • Member
  • 237 posts

Posted 17 July 2019 - 01:45 PM

Standard assembler directives:-

One of the biggest helps is the macro language, which lets you wrap often-used pieces of standard code into macros.

 

The standard macro is as such:

 

silly:    Macro      param1,param2,param3

 

       ld hl,param1

       ld de,param2

       ld bc,param3

       ldir

 

      endm

 

The macro here is defined by the label silly. Every time the word silly is seen in the source code, the assembler replaces it with all the code between the MACRO and endm.

The word endm is shorthand for "end macro"

 

So How do I use silly in a piece of code

 

( don't forget this example is called silly, because it is a silly example)

 

I simply write into the assembly code

 

         silly     ATT0,ATT1,32

 

when the code is assembled the above will be substituted with

 

       ld hl,ATT0

       ld de,ATT1

       ld bc,32

       LDIR

 

which is the macro expanded, with each parameter name replaced by the value passed to the macro,
e.g.
       ld hl,ATT0       ; here param1 is replaced with ATT0
       ld de,ATT1      ; here param2 is replaced with ATT1
       ld bc,32          ; here param3 is replaced with 32
       LDIR
 

 

Macros are very helpful to expand code out that is repetitive.

 

Another form of macro is the rept directive (rept = repeat). So, getting back to the original query:

 

        rept 32                       ;repeat 32 times

        ldi

       endm                          ; end macro

 

means repeat the line of code between the rept directive and the endm directive 32 times.

 

 

we can create big blocks of code by repeating and nesting : (these examples are just examples not taken from any code)

 

 

so lets look at

move        macro      count

 

                  rept  count

 

                  ld a,(hl)

                  ld (de),a

                  inc hl

                  inc d

                 endm

 

             endm

we can in the assembler now write

 

         move 8

 

and this will generate the inline code 

 

                  ld a,(hl)
                  ld (de),a
                  inc hl
                  inc d

                  ld a,(hl)
                  ld (de),a
                  inc hl
                  inc d

                  ld a,(hl)
                  ld (de),a
                  inc hl
                  inc d

                  ld a,(hl)
                  ld (de),a
                  inc hl
                  inc d

                  ld a,(hl)
                  ld (de),a
                  inc hl
                  inc d

                  ld a,(hl)
                  ld (de),a
                  inc hl
                  inc d

                  ld a,(hl)
                  ld (de),a
                  inc hl
                  inc d

                  ld a,(hl)
                  ld (de),a
                  inc hl
                  inc d

 

; which is the quickest way of doing this operation possible. we have no loop counter.
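
To put a rough number on the saving, here is a small Python tally of T-states for the unrolled block against a looped version. The looped version (a `ld b,n` followed by the same four instructions and a `djnz`) is an assumption made purely for comparison; it does not appear in the code above. The instruction timings are the standard documented Z80 values and ignore any Spectrum memory contention.

```python
# Documented Z80 T-state counts for the four instructions in the macro body.
T_STATES = {
    "ld a,(hl)": 7,
    "ld (de),a": 7,
    "inc hl": 6,
    "inc d": 4,
}
LD_B_N = 7        # ld b,n to load the (hypothetical) loop counter
DJNZ_TAKEN = 13   # djnz when it jumps back
DJNZ_EXIT = 8     # djnz when B reaches zero and it falls through


def unrolled(count):
    """T-states for `count` inline copies of the macro body."""
    return count * sum(T_STATES.values())


def looped(count):
    """T-states for the same work done with a djnz loop."""
    body = sum(T_STATES.values())
    return LD_B_N + count * body + (count - 1) * DJNZ_TAKEN + DJNZ_EXIT


print(unrolled(8), looped(8))
```

For `move 8` this gives 192 T-states inline against 298 with the loop, at the cost of 32 bytes of code in place of the handful a loop would need.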

 

The above probably seems pointless, but consider a piece of code I have mentioned several times and written out an example of once: the stack copy. To speed up a stack copy we set up a nested macro similar to the example above. When the macro is expanded we end up with a big block of inline code (the expansion can end up with 500 or more lines of code).

 

we can also pass counters that can be used and acted upon.

Macros are also literally expanded and can cause no end of problems when the expansion does not seem to do what is wanted.


;-------------------------- TOO MUCH INFORMATION -------------------------------------

 

 

short answer

 

REPT is short for REPeaT

 

ENDM  is short for END Macro






 


Edited by Norman Sword, 17 July 2019 - 01:51 PM.

  • IRF likes this

#54 IRF

IRF

    Advanced Member

  • Contributor
  • 4,277 posts

Posted 17 July 2019 - 01:58 PM

Thanks for that explanation, Norman!

 

Going back to your version of the LDI method, did you manage to get it to work in conjunction with the Jagged Finger fix (with the rows of attributes being updated alongside the associated pixel-rows)?

 

Because if your subroutine is in the format (as you explained previously):

 

BLOCK_MOVE32:

    rept 32

   LDI

   endm

   DEC A

   JR NZ, BLOCK_MOVE32

   RET

 

...then the DEC A would affect the Overflow Flag, and therefore the JP PE,n_raster in the main routine wouldn't be responding to BC having counted down to zero, but to the decrement of A to zero (and thus the Overflow Flag would always have the same status when the code RETurns back to the main routine).

 

How do you get around that?


Edited by IRF, 17 July 2019 - 01:58 PM.


#55 Norman Sword

Norman Sword

    Advanced Member

  • Member
  • 237 posts

Posted 17 July 2019 - 02:19 PM

Each example I write is self contained unless otherwise stated.

The last example I wrote uses no check on the block of 32 ldi's and just returns. The above code "as used in post 54" is an old example which does use "a" as a counter.



Give me five minutes and I will test the example as I wrote it and get back to you. But I think it will work as written

 

 

;-----------------------------------------------------------------------------------------------------------------------

 

 

Works exactly as I said it would.


Edited by Norman Sword, 17 July 2019 - 02:23 PM.

  • IRF likes this

#56 IRF

IRF

    Advanced Member

  • Contributor
  • 4,277 posts

Posted 17 July 2019 - 02:34 PM

I was just thinking in terms of the code which copies the 'master copies' to the 'working copies' [for pixels and for attributes], which would run faster* as you originally wrote it, with the countdown of raster lines embedded within the subroutine containing the chain of LDIs.

 

(*For the reason which I outlined earlier today - despite my initial thoughts last night - that you only have to CALL the subroutine once for the pixels and once for the attributes.  Whereas putting the loop counter commands in the Main Loop, as I suggested in post 54 44, means that you have to execute multiple CALLs and RETs for copying each raster line in turn.)

 

I assume that your 32-LDI subroutine is available as common code for both purposes [master buffers -> working buffers, and then working buffers -> physical screen]?  It would seem wasteful to have 32 x LDI / RET in one subroutine, and a separate subroutine which goes 32 x LDI / DEC A / JR NZ / RET.


Edited by IRF, 17 July 2019 - 02:51 PM.


#57 IRF

IRF

    Advanced Member

  • Contributor
  • 4,277 posts

Posted 17 July 2019 - 02:53 PM


as I suggested in post 54 44

 

Sorry if the above typo caused any confusion! :wacko:

 

BTW, thanks for checking your code in post 49 works okay.  :)


Edited by IRF, 17 July 2019 - 03:16 PM.


#58 Norman Sword

Norman Sword

    Advanced Member

  • Member
  • 237 posts

Posted 17 July 2019 - 04:14 PM

The original with the "a" register passing the amount of copy was in response to the limited space available in the source code.

 

ld hl,source

ld de,destin

ld bc,count

ldir

 

being 11 bytes in size.. modified to be

 

ld hl,source

ld de,destin

ld a,count/32

call BLOCK_MOVE32      ; which uses "a" as a counter

 

which is also 11 bytes in size and can be easily fitted into the same size space.
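
The byte counts can be checked with a small Python tally. The sizes below are the standard Z80 encodings of each instruction; the table and function names are only for illustration.

```python
# Encoded size in bytes of each Z80 instruction used in the two sequences.
SIZES = {
    "ld hl,nn": 3,
    "ld de,nn": 3,
    "ld bc,nn": 3,
    "ldir": 2,
    "ld a,n": 2,
    "call nn": 3,
}

original = ["ld hl,nn", "ld de,nn", "ld bc,nn", "ldir"]
replacement = ["ld hl,nn", "ld de,nn", "ld a,n", "call nn"]


def size(sequence):
    """Total encoded size of a list of instructions."""
    return sum(SIZES[instruction] for instruction in sequence)


print(size(original), size(replacement))
```

The one-byte-shorter `ld a,n` pays for the one-byte-longer `call`, which is why the replacement drops into the same space.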

 

 

IGNORING the raster copy routine for now. Lets look at the format of the original BLOCK_MOVE32 which was

 

 

BLOCK_MOVE32:

       ldi

BLOCK_MOVE31:

     ldi

     rept 30

      ldi

    endm

     dec a

     jr nz,BLOCK_MOVE32

    ret 

 

This has two labels BLOCK_MOVE32 and BLOCK_MOVE31. It could be written out to have every label from BLOCK_MOVE32 down to BLOCK_MOVE01. It doesn't because the other labels would be used infrequently and I am lazy. I can not be bothered writing them all out.

Problems using the above routine to move odd amounts........

 

The bulk of block moves in the game are multiples of 32 and a call to BLOCK_MOVE32 does the job. However, we also have cases where a block is filled with a single value by an overlapping block copy, such as:

 

; this code will clear the screen attributes to black on black

 

ld hl,ATT0

ld de,ATT0+1

ld bc,$2ff

ld (hl),0

ldir

 

This might not seem to be of the same format: we are copying $2ff bytes of data, which is not a multiple of 32. But look again at the amount. $2ff can be rewritten as $300-1, which is one short of a multiple of 32, so this is very similar to the format being used in the standard block move. So let's adapt that code:

 

ld hl,ATT0

ld de,ATT0+1

ld (hl),0

ld a,$300/32                    ; MOVE $300 bytes

call BLOCK_MOVE31      ;MOVE $300 bytes -1    e.g. move $2ff bytes <<<<<<< NOTE BLOCK_MOVE31
 

this has moved $2ff bytes which is what we wanted.... In jsw the vast majority of block moves want to move either a multiple of 32 bytes or one short of a multiple of 32 bytes. Which is why I only expanded BLOCK_MOVE32 and BLOCK_MOVE31 out in the routine to move the data.
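
The choice of entry point and counter can be sketched in Python as below. This is a hypothetical helper, not anything in the game source: it takes the BC value of an existing LDIR and works out which label to call and what to load into "a", assuming the routine is 32 LDIs followed by `dec a / jr nz`.

```python
def plan_block_move(bc):
    """Return (entry_label, a) to replace an LDIR that copies `bc` bytes."""
    if bc % 32 == 0:
        entry, a = "BLOCK_MOVE32", bc // 32
    elif (bc + 1) % 32 == 0:
        entry, a = "BLOCK_MOVE31", (bc + 1) // 32
    else:
        raise ValueError("count is neither a multiple of 32 nor one short")
    if not 1 <= a <= 255:
        raise ValueError("counter does not fit the 8-bit a register")
    return entry, a


def bytes_moved(entry, a):
    """Bytes copied: the late entry skips one LDI on the first pass only."""
    return 32 * a - (1 if entry == "BLOCK_MOVE31" else 0)


entry, a = plan_block_move(0x2FF)
print(entry, a, hex(bytes_moved(entry, a)))
```

For $2ff this picks BLOCK_MOVE31 with a = 24 ($300/32), and the total comes back to $2ff bytes.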

 

Going into jsw we have ... And this is just a quick grab from the source code. For all the big block moves

 

LD HL,CHAR0  ;L869F ;$4000  
LD DE,CHAR0+1  ; 86A2 ;$4001
LD BC,$1AFF  ; 86A5 ;                           ld a,$1b00/32
LD (HL),$00  ; 86A8 ;
LDIR   ; 86AA ;                                      call BLOCK_MOVE31

 

LD HL,code_att   ; 86E8 ;L9B80
LD DE,ATT8  ; 86EB ;$5900 

LD BC,$0080  ; 86EE ;                           ld a,$80/32
LDIR   ; 86F1 ;                                       call BLOCK_MOVE32

 

LD HL,CHAR0  ;L8813 ;$4000  
LD DE,CHAR0+1  ; 8816 ;$4001
LD BC,$17FF  ; 8819 ;                          ld a,$1800/32
LD (HL),$00  ; 881C ;
LDIR                                                    call BLOCK_MOVE31

 

LD HL,logo_att   ; 8820 ;L9800 
LD BC,$0300  ; 8823 ;                           ld a,$300/32
LDIR   ; 8826 ;                                      call BLOCK_MOVE32
;recolour line 19

 

LD HL,ATT19  ; 8828 ;$5A60  
LD DE,ATT19+1  ; 882B ;$5A61
LD BC,$001F  ; 882E ;                         ld a,$20/32
LD (HL),$46  ; 8831 ;
LDIR                                                   call BLOCK_MOVE31

 

LD HL,ATT19  ; 88B8 ;$5A60  
LD DE,ATT19+1  ; 88BB ;$5A61
LD BC,$001F  ; 88BE ;                         ld a,$20/32
LD (HL),$4F  ; 88C1 ;
LDIR   ; 88C3 ;                                     call BLOCK_MOVE31

 

LD HL,bottom_att ; 8907 ;L9A00  
LD DE,ATT16  ; 890A ;$5A00
LD BC,$0100  ; 890D ;                         ld a,$100/32
LDIR                                                   call BLOCK_MOVE32

 

LD DE,room_layout ; 891A ;L8000 
LD BC,$0100  ; 891D ;                          ld a,$100/32
LDIR                                                   call BLOCK_MOVE32

 

LD HL,CHAR16  ; 8958 ;$5000 
LD DE,CHAR16+1   ; 895B ;$5001
LD BC,$07FF  ; 895E ;                          ld a,$800/32
LD (HL),$00  ; 8961 ; 
LDIR                                                    call BLOCK_MOVE31 

 

LD HL,att_master ; 89B0 ;$5E00  
LD DE,att_work   ; 89B3 ;$5C00

LD BC,$0200  ; 89B6 ;                           ld a,$200/32
LDIR                                                    call BLOCK_MOVE32

 

LD HL,char_master ; 89BB ;$7000  
LD DE,char_work  ; 89BE ;$6000
LD BC,$1000  ; 89C1 ;                          ld a,$1000/32
LDIR                                                   call BLOCK_MOVE32

 

LD HL,char_work  ;L89F5 ;6000 

LD DE,CHAR0  ; 89F8 ;$4000
LD BC,$1000  ; 89FB ;                          ld a,$1000/32  
LDIR   ; 89FE ;                                     call BLOCK_MOVE32

 

LD HL,att_work   ; 8A1A ;$5C00  
LD DE,att_work+1 ; 8A1D ;$5C01
LD BC,$01FF  ; 8A20 ;                         ld a,$200/32
LD (HL),A  ; 8A23 ;                             <<<<<< a reg problem
LDIR                                                   call BLOCK_MOVE31 

 

LD HL,att_work   ;L8A26 ;$5C00  
LD DE,ATT0  ; 8A29 ;$5800
LD BC,$0200  ; 8A2C ;                          ld a,$200/32
LDIR   ; 8A2F ;                                     call BLOCK_MOVE32 

 

LD HL,bottom_att ;L8B07 ;L9A00  
LD DE,ATT16  ; 8B0A ;$5A00
LD BC,$0100  ; 8B0D ;                         ld a,$100/32
LDIR                                                   call BLOCK_MOVE32

 

LD HL,ATT0  ;L8C03 ;$5800
LD DE,ATT0+1  ; 8C06 ;$5801
LD BC,$01FF  ; 8C09 ;                          ld a,$200/32 
LD (HL),A  ; 8C0C ;                               <<<<<< problems here with the a register
LDIR                                                    call BLOCK_MOVE31

 

LD HL,CHAR0  ;L8C4A ;$4000  
LD DE,CHAR0+1  ; 8C4D ;$4001
LD BC,$0FFF  ; 8C50 ;                         ld a,$1000/32
LD (HL),$00  ; 8C53 ;      
LDIR                                                   call BLOCK_MOVE31

 

LD HL,att_master ; 96F4 ;$5E00  
LD DE,ATT0  ; 96F7 ;$5800
LD BC,$0200  ; 96FA ;                          ld a,$200/32
LDIR   ; 96FD ;                                     call BLOCK_MOVE32

 

LD HL,CHAR0  ; 96FF ;$4000 
LD DE,CHAR0+1  ; 9702 ;$4001
LD BC,$0FFF  ; 9705 ;                          ld a,$1000/32
LD (HL),$18  ; 9708 ; 
LDIR                                                    call BLOCK_MOVE31

 

 

 

The above illustrates why we have BLOCK_MOVE32 and BLOCK_MOVE31. In a game like JSW we are moving multiples of 32 in the vast majority of cases. Only two of the above cases cause a pause and a need to work out how to preserve the "a" register. (The game code might contain more instances where the "a" register needs to be preserved.)

 

 

 

very easy to change and very easy to work out.

 

 

The problem then comes with how we manage the raster copy, which in the recent posting uses a different BLOCK_MOVE.


Edited by Norman Sword, 17 July 2019 - 05:16 PM.

  • IRF likes this

#59 Norman Sword

Norman Sword

    Advanced Member

  • Member
  • 237 posts

Posted 17 July 2019 - 05:47 PM

Since the last posted raster copy is a lot faster without using the "a" register, and we don't want to supply two lots of LDI routines, let's just chop up the old LDI routine and make it return without modifying the "a" register. Then we have only one LDI routine.

 

The chopping up of the routine takes 5 bytes to modify and five bytes to put it back. Since these two modifications are executed only once, it is a lot faster than having to re-use the "a" register inside the loop. 

 

The Block move from post #58

 

BLOCK_MOVE32:
       ldi
BLOCK_MOVE31:
     ldi
     rept 30
      ldi
    endm

S_M_C_fast: equ $

     dec a                                  ; change this opcode
     jr nz,BLOCK_MOVE32
    ret

 

;-----------------------------------------------

The raster copy is expanded to become

 

;modify

 ld hl,S_M_C_fast   ;position to modify

 ld (hl),$C9              ;opcode  value for ret

 

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

;copy work and attribute screens
    ld hl,att_work
    ld de,ATT0
    ;;;;;; ld b,0                         ; this was set for usage in a different routine
    exx
    ld hl,ytable
    ld bc,128   ; must be a multiple of 32  ; this is 4*32 ;- that is 4 raster lines before the attributes are written in
;loop executed 128 times on each game loop
raster:
  ld e,(hl)
  inc l
  push hl
  ld h,(hl)
  ld l,e
  ld d,h
  res 5,d
  call BLOCK_MOVE32        ;executed 128 times on each game loop
  jp pe,n_raster
  exx                                  ; this code is executed 16 times on each game loop
  ;;;;ld c,32                             ; this was set for usage in a different routine
  call BLOCK_MOVE32       
  exx
  inc b
n_raster:
  pop hl
  inc l
  jr nz,raster

 

>>>>>>>>>>>>>>>>>>>>>>>>>>>>

;restore back to old

 

  ld hl,S_M_C_fast

  ld (hl),$3d              ;opcode value for dec a

 

; rest of code

 

 

which leaves only one ldi routine.
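
As a sanity check on the opcode values and the patch cost, here is a Python sketch that hand-assembles the routine into a byte array and toggles the byte at S_M_C_fast. It assumes the routine is assembled exactly as listed (32 LDIs, then `dec a`, `jr nz`, `ret`); the Python names are illustrative only.

```python
LDI = bytes([0xED, 0xA0])   # ldi is a two-byte opcode
DEC_A = 0x3D                # dec a
RET = 0xC9                  # ret
JR_NZ = 0x20                # jr nz,e (followed by a signed displacement)


def build_block_move32():
    """Assemble 32 x ldi / dec a / jr nz,start / ret by hand."""
    code = bytearray(LDI * 32)
    smc_offset = len(code)              # position labelled S_M_C_fast
    code.append(DEC_A)
    # The displacement is measured from the byte after the 2-byte jr.
    displacement = (0 - (len(code) + 2)) & 0xFF
    code += bytes([JR_NZ, displacement, RET])
    return code, smc_offset


code, smc_offset = build_block_move32()

code[smc_offset] = RET      # ld hl,S_M_C_fast / ld (hl),$C9
patched = bytes(code)

code[smc_offset] = DEC_A    # ld hl,S_M_C_fast / ld (hl),$3d
restored = bytes(code)

PATCH_COST = 3 + 2          # ld hl,nn is 3 bytes, ld (hl),n is 2 bytes
print(smc_offset, hex(patched[smc_offset]), hex(restored[smc_offset]), PATCH_COST)
```

While patched, the routine is just 32 LDIs followed by a `ret`, so the flags left by the last LDI survive back to the `jp pe` in the caller. The two bytes after the patched position are never reached.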

 

 

That's today's version..... 


Edited by Norman Sword, 18 July 2019 - 12:47 PM.

  • IRF likes this

#60 Norman Sword

Norman Sword

    Advanced Member

  • Member
  • 237 posts

Posted 17 July 2019 - 07:10 PM

Since the source code is, for me, just a blank canvas to edit, I am not restricted to worrying about whether a piece of code will fit in here or there. If I delete a byte in the source code, I immediately have that byte and it can be used anywhere I want within the limits of the memory I am editing. The restrictions are removed.

Block move

 

ld hl,source

ld de,destination

ld bc,size

ldir

 

looking at this yet again, but from the no restrictions point of view. The best solution is slightly different

 

Shift_block32:                              ; new label to distinguish from all the other code

        ldi

Shift_block31:

      rept 31

      ldi

      endm

     jp    pe,Shift_block32

    ret

 

We write the code out exactly as specified in JSW, but instead of LDIR (two bytes) we call Shift_block32 or Shift_block31 (three bytes), depending on whether the value is divisible by 32 or one short. That costs one extra byte on each instance.

This also does not use the "a" register, just like LDIR on its own. This method is also as close to a replacement LDIR as we can generate in code, returning with the same register values as would be set with LDIR.

 

 

so 

LD HL,CHAR0  ;L869F ;$4000 
LD DE,CHAR0+1  ; 86A2 ;$4001
LD BC,$1AFF  ; 86A5 ;    
LD (HL),$00  ; 86A8 ;
LDIR   ; 86AA ;                                      call Shift_block31

 

 

This is faster again (marginally) but uses the same registers as the original 
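
A back-of-envelope comparison in Python of the three approaches for a block that is a multiple of 32 bytes. The figures use the standard documented Z80 T-state counts and ignore Spectrum memory contention, so real timings on screen memory will differ. The `ld hl` / `ld de` setup is common to all three and is left out.

```python
def ldir_total(n):
    """ld bc,nn (10) + ldir: 21 T-states per byte, 16 for the last one."""
    return 10 + 21 * n - 5


def dec_a_total(n):
    """ld a,n / call / 32 x ldi / dec a / jr nz loop / ret."""
    blocks = n // 32
    setup = 7 + 17                              # ld a,n + call
    loop = 16 * n + 4 * blocks                  # ldi per byte, dec a per block
    branches = 12 * (blocks - 1) + 7            # jr nz taken, then not taken
    return setup + loop + branches + 10         # + ret


def jp_pe_total(n):
    """ld bc,nn / call / 32 x ldi / jp pe loop / ret."""
    blocks = n // 32
    setup = 10 + 17                             # ld bc,nn + call
    loop = 16 * n + 10 * blocks                 # jp cc costs 10 taken or not
    return setup + loop + 10                    # + ret


n = 0x1000
print(ldir_total(n), dec_a_total(n), jp_pe_total(n))
```

For a $1000-byte move this gives 86021 T-states for LDIR, 67613 for the `dec a` version and 66853 for the `jp pe` version, which matches the "marginally faster" description.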

 

I suppose you could modify this routine  to use a similar raster copy routine as the one posted above. e.g.

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------
;copy work and attribute screens

 

   ld hl, S_M_C_New_Mod          ;place to modify 

   ld (hl),$c9                              ; modify in a "ret"

 

    ld hl,att_work
    ld de,ATT0
    ;;;;ld b,0                         ; this was set for usage in a different routine
    exx
    ld hl,ytable
    ld bc,128   ; must be a multiple of 32  ; this is 4*32 ;- that is 4 raster lines before the attributes are written in
;loop executed 128 times on each game loop
raster:
  ld e,(hl)
  inc l
  push hl
  ld h,(hl)
  ld l,e
  ld d,h
  res 5,d
  call Shift_block32        ;executed 128 times on each game loop
  jp pe,n_raster
  exx                                  ; this code is executed 16 times on each game loop
  ;;;;ld c,32                             ; this was set for usage in a different routine
  call Shift_block32       
  exx
  inc b
n_raster:
  pop hl
  inc l
  jr nz,raster

 

 ld hl,S_M_C_New_Mod     ;place to modify

 Ld (hl),$ea                       ;opcode for Jp PE,xx

 

; rest of code

-----------------------------------------------------------------------------------------
Shift_block32:                              ; new label to distinguish from all the other code
        ldi
Shift_block31:
      rept 31
      ldi
      endm

S_M_C_New_Mod Equ $
     jp    pe,Shift_block32               ;opcode toggled between ( "ret" ) and ( "jp pe,xx" )
   ret


Edited by Norman Sword, 18 July 2019 - 12:48 PM.

  • IRF likes this




