Jet Set Willy & Manic Miner Community

[File] JSW jagged finger effect (demo)


Norman Sword


I can't download the files right now; I'll have a look at them later :).

 

Danny, I've removed the files for now, so I can carry out further tweaking, and then re-upload a (hopefully) new and improved version.  That'll save you from looking at various iterations, with slight improvements each time - it would be better for you to see the final article, and observe the greatest 'Before vs After' contrast in one hit!


This should do it - with this 'regime' in place, the maximum delay between a graphic byte being drawn, and its associated attribute byte being copied to the screen, should be the length of time it takes to draw 128 graphic bytes (equivalent to four whole pixel-rows).  So it should minimise the 'Delayed [or Premature] Attribute Effect' (as well as the 'Jagged Finger Effect').

 

It's a bit more efficient than my previous attempt so it also restores the three bytes spare for a CALL to the Screen Flash routine at the start.

 

I'll try it out later:

 

(Norman

Edited by IRF

When the subject of updating the attributes at the same time was mentioned, I jotted down the code below.

The lines marked + are the code added to copy the attributes at the same time.

 

org 35317

        ld      hl,05C00H       ;  +3   attribute source
        ld      de,05800H       ;  +6   attribute destination on screen
        exx                     ;  +7
        xor     a               ;  +8   pixel-row counter within the cell-row
        ld      hl,08200H       ;3      table of pixel-row addresses
loop
        ld      e,(hl)          ;4
        inc     l               ;5
        push    hl              ;6
        ld      d,(hl)          ;7
        ld      l,e             ;8
        ld      h,d             ;9
        res     5,d             ;11
        ld      bc,32           ;14
        ldir                    ;16     copy one pixel-row of graphics
        pop     hl              ;17
        inc     l               ;18
        inc     a               ;  +9   ** extra overhead 1
        and     7               ;  +11  ** extra overhead 2
        jr      nz,loop         ;  +13
        exx                     ;  +14
        ld      bc,32           ;  +17
        ldir                    ;  +19  copy one cell-row of attributes
        exx                     ;  +20
        dec     l               ;  +21
        inc     l               ;  +22  sets the zero flag if L has wrapped to 0
        jr      nz,loop         ;20
;----------------------------
        ld      a,(34271)       ;23     this code is moved down in memory
        and     2               ;25
        rrca                    ;26
        ld      hl,34258        ;29
        or      (hl)            ;30
        ld      (hl),a          ;31
        jr      35377           ;33

 

 

My main concern was that this loop is executed 128 times per game loop, so any checking that is done must not introduce too much overhead.

On (128-16) = 112 of those passes, the extra code is executed without anything extra being done, and that overhead is a mere two instructions (indicated by **).

The other 16 passes through the loop handle copying the attributes.
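
As a rough tally of the figures above, here is a small Python sketch. The T-state values for INC A and AND n are the commonly documented Z80 timings rather than figures from this thread, and the function name is purely illustrative.

```python
# Rough tally of the loop passes described above (a sketch, not measured timings).
PIXEL_ROWS = 128   # passes through the drawing loop per game loop
CELL_ROWS = 16     # one attribute copy per cell-row

# Assumed standard Z80 timings for the two instructions marked ** in the listing.
T_INC_A = 4
T_AND_N = 7


def overhead_tally():
    """Return (passes that copy no attributes, extra T-states spent on them)."""
    plain_passes = PIXEL_ROWS - CELL_ROWS
    extra_t_states = plain_passes * (T_INC_A + T_AND_N)
    return plain_passes, extra_t_states


if __name__ == "__main__":
    passes, t_states = overhead_tally()
    print(passes, "passes pay only the two-instruction overhead")
    print(t_states, "extra T-states per game loop on those passes")
```

The JR NZ,loop that follows is not counted, since the loop needed a taken jump back on those passes anyway.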

This should do it - with this 'regime' in place, the maximum delay between a graphic byte being drawn, and its associated attribute byte being copied to the screen, should be the length of time it takes to draw 128 graphic bytes (equivalent to four whole pixel-rows).  So it should minimise the 'Delayed [or Premature] Attribute Effect' (as well as the 'Jagged Finger Effect').

 

In contrast, in Matthew's original regime, there were considerable delays involved, which varied according to the position on the screen.  Considering two extreme examples:

 

- after the final graphic byte was drawn at the bottom-right corner of the playing area, there was a relatively short delay whilst the 512 attribute bytes were copied (as well as the time taken to enact the Toilet Dash double-speed effect and, if applicable, the 'Screen Flash' routine);

 

- after the first graphic byte was drawn at the top-left corner of the screen, its associated attribute wouldn't get copied until every other graphic byte had been drawn - that's 32*16*8=4096!
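
The two extremes can be put side by side with a few lines of Python. This is only a simplified model that counts graphic bytes drawn, as the post does, and the function names are made up for illustration.

```python
# Worst-case wait, counted in graphic bytes drawn, between a byte being drawn
# and its attribute reaching the screen (a simplified model of the text above).
BYTES_PER_ROW = 32
CELL_ROWS = 16
ROWS_PER_CELL = 8


def original_regime_max_wait():
    # All graphics are drawn first and the attributes are copied afterwards,
    # so the top-left byte waits for the whole playing area to be drawn.
    return BYTES_PER_ROW * CELL_ROWS * ROWS_PER_CELL


def interleaved_regime_max_wait(rows_before_attributes=4):
    # Attributes for a cell-row are copied after this many of its pixel-rows.
    return BYTES_PER_ROW * rows_before_attributes


if __name__ == "__main__":
    print(original_regime_max_wait(), "bytes in the original regime")
    print(interleaved_regime_max_wait(), "bytes with a mid-cell-row copy")
```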

Edited by IRF

Norman, thanks for your latest post (which crossed over with one of mine).

 

Looking at my latest code, I believe there are four instructions that you would term 'extra overhead' (#89FB-#89FF in my latest disassembly, in post no 32).  Would that slow down the game considerably?  (I haven't tried it out yet.)

 

Balanced against that, my understanding of your latest code which copies the attributes at the same time, is that all eight rows of graphic-bytes would be copied to a particular cell-row before its attributes are distributed.  That means that there would be a delay caused by up to 256 graphic bytes being copied, before a bitmap in the top pixel-row of its cell-row is united with its associated attribute.

 

Of course, the situation would be much better for the bitmaps in the bottom pixel-row of a cell-row, because only 32 bytes would have to be copied in the intervening period.  But my latest effort is an attempt to 'evenly spread' the delays, because last night I came up with a similar solution to yours [EDIT: similar in terms of the sequencing, not the execution], but I noticed that there was still a residual (albeit improved) 'Delayed Attribute Effect' - particularly for (fast-moving) Arrows located in the top pixel-rows of their host cell-rows.

 

EDIT: Would replacing your XOR A (at 35317+7) with a LD A, #04 achieve what I'm after?  i.e. copying the attributes after the top four pixel-rows but before the bottom four pixel-rows, for each cell-row in turn?
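
To make the comparison concrete, the sketch below models only the A counter from the listing (incremented after each pixel-row, with the attributes copied when A AND 7 reaches zero). It ignores the loop's exit test and everything else, and `attribute_offsets` is a hypothetical name.

```python
BYTES_PER_ROW = 32


def attribute_offsets(start_a):
    """For each of the 8 pixel-rows in a cell-row, return how many graphic
    bytes are drawn between the start of that row and the attribute copy.

    Positive values mean the attribute arrives late; zero or negative values
    mean it has already been copied when the row starts to be drawn.
    """
    # The copy follows the pixel-row on which (A + 1) & 7 becomes zero.
    rows_before_copy = 8 - (start_a & 7)
    return [(rows_before_copy - row) * BYTES_PER_ROW for row in range(8)]


if __name__ == "__main__":
    print("A starts at 0:", attribute_offsets(0))
    print("A starts at 4:", attribute_offsets(4))
```

Under this model a starting value of 0 leaves the top pixel-row waiting 256 bytes, while a starting value of 4 puts the copy after the fourth pixel-row, so no row is more than 128 bytes away from its attributes in either direction.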

 

******

 

Looking at the length of your code, I believe it pips mine to the post by one byte in terms of efficiency. :)  EDIT: Or it would be exactly the same length if we were to set the initial value of A to #04.

Edited by IRF

The delay before the attribute update is a swings-and-roundabouts affair.  Update before the graphics are updated and the arrows will be in their old position with their colours shifted; update after and the reverse happens.  Update in the middle and you might introduce another effect.

When I wrote the code I was aware that the original "xor a" could be changed to allow the attribute copying to be placed wherever needed.

You have included the JR at the bottom in the size of my code, which is missing from yours.

The speed of execution will be faster overall for mine, because the attribute calculations are not needed in each attribute copy. The overhead then is

 

exx                 4 T-states    4
- copy attribute (ld bc,32 / ldir)
exx                 4 T-states    8
dec l               4 T-states   12
inc l               4 T-states   16

which is 4 sets of 4 T-state operations ... 16 in total

yours

PUSH HL            10 T-states   10
LD D, #58           7            17
SLA E               8            25
LD L, E             4            29
JR NC, #01          7/12         36   41
INC D               4            40   45
LD H, D             4            44   49
SET 2, H            7            51   56
- copy attribute (ld bc,32 / ldir)
POP HL             10            61   66

the extra overheads are 61+ T-states.

 

So mine is faster and smaller. (but this is not a contest)
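
The totals above can be cross-checked with a short Python sketch. The timings in the dictionary are the standard Z80 figures as usually documented, not values taken from this thread. On those figures PUSH HL is 11 and SET 2,H is 8, slightly more than the table above uses, and a taken JR NC, #01 skips the INC D. The function names are illustrative only.

```python
# Assumed standard Z80 instruction timings in T-states (not taken from the thread).
T = {
    "EXX": 4, "DEC L": 4, "INC L": 4, "PUSH HL": 11, "POP HL": 10,
    "LD D,n": 7, "SLA E": 8, "LD L,E": 4, "INC D": 4, "LD H,D": 4,
    "SET 2,H": 8, "JR NC taken": 12, "JR NC not taken": 7,
}


def exx_overhead():
    """Overhead around the attribute copy when the pointers live in the
    alternate register set."""
    return sum(T[op] for op in ("EXX", "EXX", "DEC L", "INC L"))


def recalc_overhead(carry):
    """Overhead when the attribute address is recalculated for every copy.
    `carry` is the carry flag left by SLA E."""
    ops = ["PUSH HL", "LD D,n", "SLA E", "LD L,E"]
    if carry:
        ops += ["JR NC not taken", "INC D"]
    else:
        ops += ["JR NC taken"]          # the jump skips INC D
    ops += ["LD H,D", "SET 2,H", "POP HL"]
    return sum(T[op] for op in ops)


if __name__ == "__main__":
    print("EXX version:", exx_overhead())
    print("Recalculating version:", recalc_overhead(True), "or", recalc_overhead(False))
```

Either way the recalculating version lands in the low sixties, so the "61+" estimate and the conclusion drawn from it stand.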

Edited by Norman Sword

My replies (in red in the original post) follow each passage quoted from Norman, which is marked with >:

> The delay before the attribute update is a swings-and-roundabouts affair.  Update before the graphics are updated and the arrows will be in their old position with their colours shifted; update after and the reverse happens.  Update in the middle and you might introduce another effect.

By "another effect", are you referring to something specific, or do you mean in an 'unintended consequences' way?

> When I wrote the code I was aware that the original "xor a" could be changed to allow the attribute copying to be placed wherever needed.

I'll try a starting value of #04 because, going by my recent experience, it should almost entirely eliminate the arrows flickering in a colour other than the one intended.  Barring any unforeseen problems, I think that would be the optimum solution.

> You have included the JR at the bottom in the size of my code, which is missing from yours.

I omitted the JR because it would no longer be needed if the unused 'Screen Flash' routine and its preceding check of the Screen Flash variable are removed from the Main Loop (with scope for it to be CALLed from elsewhere, with the CALL command being inserted just after the CALL to the item-drawing routine).

However, you're right that in my quick comparison, I omitted to deduct two bytes from your total as well (for the same reason).

> The speed of execution will be faster overall for mine, because the attribute calculations are not needed in each attribute copy. The overhead then is
>
> exx                 4 T-states    4
> - copy attribute (ld bc,32 / ldir)
> exx                 4 T-states    8
> dec l               4 T-states   12
> inc l               4 T-states   16
>
> which is 4 sets of 4 T-state operations ... 16 in total
>
> yours
>
> PUSH HL            10 T-states   10
> LD D, #58           7            17
> SLA E               8            25
> LD L, E             4            29
> JR NC, #01          7/12         36   41
> INC D               4            40   45
> LD H, D             4            44   49
> SET 2, H            7            51   56
> - copy attribute (ld bc,32 / ldir)
> POP HL             10            61   66
>
> the extra overheads are 61+ T-states.

I haven't got my head fully around the T-States stuff, but I can see in general terms why yours would be faster.

> So mine is faster and smaller. (but this is not a contest)

Indeed!  But any spare bytes in this tight spot in the Main Loop could come in handy for other purposes, so I would probably opt for the most byte-efficient solution (i.e. yours).

Edited by IRF

Okay, please see the three attached test files.  (Ignore the fact that Willy jumps upon start-up, and walks backwards; those are relics from some earlier testing.)

 

The 'Ian' fixed file has my latest iteration of a patch for the 'Delayed Attribute Effect', woven into Norman's original fix for the 'Jagged Finger Effect' as per my disassembly in post no. 32.  I've also attached an unfixed file to allow for a Before vs. After contrast.  The difference is striking, and the 'Delayed Attribute' is almost entirely eliminated.

 

N.B. I tried to create an additional file, based on Norman's latest code (post no. 33), but that is still work in progress.  Following Norman's code exactly (with the XOR A at the start) it worked okay, but the suppression of the 'Delayed Attribute' wasn't as effective as it is in my version (see previous discussion).  So I tried to swap the XOR A for a LD A, #04.  Alas, that caused the file to crash soon after the game started.  So I need to look into what's going wrong (I know that it arose out of my minor departure from Norman's code, but I haven't yet figured out exactly why it doesn't work).

 

EDIT: The screen is actually drawn prior to the crash, and I think that the problem is arising because the drawing loop doesn't know when to come to an end.  That in turn is because the copying of attributes to the bottom cell-row doesn't follow on immediately after the final pixel-row has been drawn (i.e. the point at which L, the index to the table at #8200, has wrapped back round to zero).  So the end of the outer loop might need to be tweaked a bit...
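
That diagnosis can be checked with a small Python model of just the two counters in the post no. 33 loop (A for the pixel-row within a cell-row, L as the index into the table at #8200). It is a sketch under those assumptions, with a made-up function name and an arbitrary cap on the number of passes. It says nothing about what the real machine does once the loop overruns.

```python
def run_original_loop(start_a, max_rows=1024):
    """Model only the counters of the post no. 33 loop.

    Returns (rows_drawn, attribute_copies, finished).
    """
    a, l = start_a & 7, 0
    rows = copies = 0
    while rows < max_rows:
        l = (l + 2) & 0xFF      # two INC L per pass (each table entry is 2 bytes)
        rows += 1
        a = (a + 1) & 7         # INC A / AND 7
        if a != 0:
            continue            # JR NZ,loop: no attribute copy on this pass
        copies += 1             # EXX / LD BC,32 / LDIR / EXX
        if l == 0:              # DEC L / INC L / JR NZ,loop falls through
            return rows, copies, True
    return rows, copies, False  # gave up: the exit test never saw L = 0


if __name__ == "__main__":
    print("XOR A    :", run_original_loop(0))
    print("LD A,#04 :", run_original_loop(4))
```

With A starting at 0 the model stops after 128 pixel-rows and 16 attribute copies. With A starting at 4, L is never zero at the only point where it is tested, so the loop does not terminate within the cap.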

 

UPDATE: I've also attached a 'Norman' file in which I've implemented Norman's fix from post no. 33, only I've modified it slightly so that, as with my fix, the attributes are copied for each cell-row mid-way through the process of copying the graphic bytes for that cell-row (i.e. after the first four pixel-rows have been drawn).

 

Norman's fix required three (UPDATE: five) fewer bytes than mine, but I get the impression that the improvement in terms of the Delayed Attributes is slightly greater with my fix in place (EDIT: or maybe not?)  However, there's not much in it, and both fixes offer a vast improvement in comparison with the 'unfixed' file.  :)

Jagged Finger & Delayed Attributes Ian Fix.z80

Jagged Finger & Delayed Attributes Bugs.z80

Jagged Finger & Delayed Attributes Norman Fix.z80

Edited by IRF

For the record, this is how I tweaked Norman's latest code (one additional byte compared with post no. 33; the changes, shown in bold in the original post, are noted in the comments):

EDIT: Note that this method leaves six spare bytes very usefully located in the spare loop, just prior to the screen-drawing code (at #89F5-#89FA).  That's enough to insert a CALL to the Screen Flash Routine (which would have to be reinstated elsewhere) AND a CALL to the Main Loop Patch Vector.  :)

org #89FB

        ld      hl,05C00H
        ld      de,05800H
        exx
        ld      a,#04           ; changed: was XOR A
        ld      hl,08200H
loop
        ld      e,(hl)
        inc     l
        push    hl
        ld      d,(hl)
        ld      l,e
        ld      h,d
        res     5,d
        ld      bc,32
        ldir
        pop     hl
        inc     l
        jr      z,#0E           ; changed: if L has reached zero, jump forward to the code
                                ; which doubles Willy's speed during the Toilet Dash
        inc     a
        and     7
        jr      nz,loop
        exx
        ld      bc,32
        ldir
        exx
                                ; the DEC L / INC L pair from post no. 33 is not needed here,
                                ; as the JR Z above already tests L (this matches the #0E offset)
        jr      loop            ; changed: jump back to draw the next pixel-row (always necessary
                                ; at this point, as the attributes are copied midway through
                                ; the drawing of each cell-row)
;----------------------------
        ld      a,(34271)       ; this code is moved down in memory
        and     2
        rrca
        ld      hl,34258
        or      (hl)
        ld      (hl),a
        jr      35377
Edited by IRF

I got a spectacular crash when I tried modifying the XOR A. The code below is shorter and fixes the problem. I have to admit I do not like the additional JR inserted into the main loop to accommodate this small change (XOR A changed to LD A,4).  In loops of this nature, each change in flow is repeated over and over, so just this one small change is the equivalent of inserting 255 one-byte opcodes. In itself that is not a visible factor in JSW's speed, but slowdown is cumulative.

 

 

org #89FB

        ld      hl,05C00H
        ld      de,05800H
        exx
        ld      a,#04
        ld      hl,08200H
loop
        ld      e,(hl)
        inc     l
        push    hl
        ld      d,(hl)
        ld      l,e
        ld      h,d
        res     5,d
        ld      bc,32
        ldir
        pop     hl
        inc     a
        and     7
        jr      nz,not_attrib
        exx
        ld      bc,32
        ldir
        exx
not_attrib
        inc     l
        jr      nz,loop
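
As a sanity check on the listing above, this Python sketch models only its two counters (A and L) and records when the attribute copy would happen. The function name is hypothetical and the model ignores the actual data being copied.

```python
def run_revised_loop(start_a=4):
    """Model the counters of the listing above.

    Returns (copy_points, rows_drawn), where copy_points lists the number of
    pixel-rows drawn so far at each attribute copy.
    """
    a, l = start_a & 7, 0
    rows = 0
    copy_points = []
    while True:
        l = (l + 1) & 0xFF              # INC L between the two table bytes
        rows += 1                       # one pixel-row of graphics copied
        a = (a + 1) & 7                 # INC A / AND 7
        if a == 0:
            copy_points.append(rows)    # attribute copy for this cell-row
        l = (l + 1) & 0xFF              # not_attrib: INC L
        if l == 0:                      # JR NZ,loop falls through
            return copy_points, rows


if __name__ == "__main__":
    points, rows = run_revised_loop(4)
    print(rows, "pixel-rows drawn,", len(points), "attribute copies")
    print("first copies after rows:", points[:3])
```

Because the exit test now sits on the INC L that every pass executes, the loop ends after 128 pixel-rows whatever the starting value of A. With A starting at 4, each of the 16 attribute copies falls after the fourth pixel-row of its cell-row.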
