Terrario, a Terraria rewrite for the calculator

Posted by Kbd2 on 10/07/2020 16:05

Hi. I noticed a while ago there weren't any games like Terraria or Minecraft available for Casio calculators. For the past while I've been working on rewriting Terraria in C for the SH4 calculators using gint. I'm not sure when if ever I'll finish it, since it is a fairly big project, so I've decided to put it here for now.

Here are a few screenshots of the progress so far (some may be out-of-date):
[Screenshots: main menu, gameplay, inventory, crafting, equipment, and a visualisation of a generated world]

The game runs at 30FPS. Worlds are 1000x250 tiles large (640x250 on the 35+E II / GIII).

The control scheme and a crafting guide can be found in the game's About menu.

This forum page is updated regularly with the latest release of the game, as well as a changelog in the comments.

If you aren't sure what an item does, feel free to search it up on the official Terraria wiki.

Most recent update:
NPCs.

Up next:
Money and shop NPCs.

The attached file contains the latest build of the game, as well as instructions and a screenshot compiling script and map tool.

The source code repository as well as early builds of the game can be found at this GitHub repo and its Gitea mirror. Obviously, expect bugs in these early builds, though I take care to remove the major ones I find before releasing.

Due to the very large world, the save files for this game are big. Make sure you have at least 450kB of storage space before installing the addin (300kB on Graph 35+E II), and try to keep at least 300kB free afterwards. Tampering with the files in the TERRARIO folder will corrupt the save, so don't do that. The game will warn you if you have low storage space available, so that you can optimise your storage.

NOTE: You must have a Graph 35+E, Graph 35+E II, fx-9860GII, or fx-9750GIII model calculator to run this game.


Posted by Dark storm on 20/07/2020 10:58

Did you consider some compression or lazy loading from storage memory? It might allow a somewhat larger world, even if it adds some latency when moving to another part of the world.

Posted by Kbd2 on 20/07/2020 11:10

Lephenixnoir and I were considering lazy loading, but the downsides outweigh the benefits - BFile is very slow, and I can't display gray while it's working, so I'd have to have a "Loading..." screen.

Compression is an interesting idea, though... it wouldn't really be feasible to constantly compress and decompress data, but I could definitely use it to decrease region file sizes.

Added on 20/07/2020 at 13:15:
Another change - to improve the collisions, the game will now run at 30FPS/60UPS (or as close to that as possible)
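
One way to picture the 30FPS/60UPS split is a single 60 Hz tick that always runs an update and only renders on every second tick. The sketch below is plain C with hypothetical names (`loop_stats`, `loop_tick`, `loop_simulate`); it is not Terrario's code and does not use any gint timer call.

```c
#include <stdint.h>

/* Hypothetical scheduler state: not Terrario's actual code. */
struct loop_stats {
    uint32_t ticks;    /* 60 Hz ticks seen so far */
    uint32_t updates;  /* physics steps executed */
    uint32_t frames;   /* frames rendered */
};

/* Call once per 60 Hz tick (for example from a timer callback).
   Physics runs on every tick, rendering on every second tick. */
void loop_tick(struct loop_stats *s)
{
    s->updates++;              /* 60 updates per second */
    if ((s->ticks & 1) == 0)
        s->frames++;           /* 30 frames per second */
    s->ticks++;
}

/* Simulate `tick_count` ticks and return the resulting counters. */
struct loop_stats loop_simulate(uint32_t tick_count)
{
    struct loop_stats s = {0, 0, 0};
    for (uint32_t i = 0; i < tick_count; i++)
        loop_tick(&s);
    return s;
}
```

Simulating one second (60 ticks) yields 60 updates and 30 frames. If a frame overruns its budget, a real loop would have to decide whether to skip rendering or let updates lag, which this sketch does not model.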

Posted by Lephenixnoir on 20/07/2020 15:42

Maybe you could keep the active region uncompressed and then compress the rest to fit in RAM? On fx-9860G you don't have a lack of computational power. (If we manage to control the various DSPs there might even be more to use.)
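
As a rough illustration of compressing inactive regions, here is a minimal run-length encoding sketch. The thread does not say which scheme would be used; RLE is only an assumption that suits long runs of identical tiles (air, stone), and the one-byte-per-tile layout and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Run-length encode `n` tile bytes into (count, value) pairs.
   Returns the number of bytes written, or 0 if `out` is too small
   (or if there was nothing to encode). */
size_t rle_compress(const uint8_t *in, size_t n, uint8_t *out, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i < n; ) {
        uint8_t value = in[i];
        size_t run = 1;
        /* Extend the run while the value repeats, up to 255 tiles. */
        while (i + run < n && in[i + run] == value && run < 255)
            run++;
        if (o + 2 > cap)
            return 0;
        out[o++] = (uint8_t)run;
        out[o++] = value;
        i += run;
    }
    return o;
}

/* Decode (count, value) pairs. Returns the number of tiles written,
   or 0 if `out` is too small or the input length is odd. */
size_t rle_decompress(const uint8_t *in, size_t n, uint8_t *out, size_t cap)
{
    if (n % 2 != 0)
        return 0;
    size_t o = 0;
    for (size_t i = 0; i < n; i += 2) {
        size_t run = in[i];
        if (o + run > cap)
            return 0;
        for (size_t k = 0; k < run; k++)
            out[o++] = in[i + 1];
    }
    return o;
}
```

The active region would stay as a plain array; only regions being swapped out would go through `rle_compress`. Noisy data (for example ore-rich layers) can grow under RLE, so a real implementation would want a fallback to raw storage.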

Kbd2 wrote: "Another change - to improve the collisions, the game will now run at 30FPS/60UPS (or as close to that as possible)"

Nice idea! Raw dclear(); dupdate() runs at > 900 FPS without overclock, so I'm confident you will have absolutely no issue in maintaining this specification.

Posted by Kbd2 on 20/07/2020 22:19

Rendering takes a surprising amount of time - around 4 RTC ticks (32FPS) when it's drawing a full screen of 180 8x8 tiles. I've verified that it's dsubimage bottlenecking the process, but that's probably been optimised as much as possible.
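
The "4 ticks = 32FPS" figure can be reproduced with a little arithmetic. The sketch assumes the RTC tick counter runs at 128 Hz, which is what that figure implies; the function names are made up for illustration.

```c
/* Assumption: the RTC tick counter runs at 128 Hz, which is what the
   "4 ticks = 32 FPS" figure implies. */
#define RTC_TICKS_PER_SECOND 128

/* Duration of `ticks` RTC ticks in microseconds (truncated). */
long rtc_ticks_to_us(long ticks)
{
    return ticks * 1000000L / RTC_TICKS_PER_SECOND;
}

/* Frame rate if one frame takes `ticks` RTC ticks (ticks > 0). */
long rtc_ticks_to_fps(long ticks)
{
    return RTC_TICKS_PER_SECOND / ticks;
}
```

Under that assumption, 4 ticks are 31250 µs, i.e. 32 frames per second. A single tick is about 7.8 ms, so the measurement is coarse compared with a 33 ms frame budget.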

Posted by Lephenixnoir on 20/07/2020 23:37

Hmm, this is a problem. The gray engine puts a lot of pressure, ~50% of the time is usually spent flipping the screen. But that performance is still not up to the platform. The first version of bopti could display a fullscreen scrolling gray background image at more than 500 FPS, and this was before the improved layering method that I built in the current version of gint.

I made some basic tests and measured 10 ms to render a screenful of 8x8 tiles. The added cost of gray rendering plus the pressure from the engine makes a solid 30 ms plausible, but I am utterly dissatisfied there. More basic tests show that rendering 4 sets of 128x16 "tiles" takes only 2.8 ms, so the internal logic might be responsible. Given the time I spent on this, I definitely won't live with such disappointing results. x)

On a less technical but more personal note, please don't use the RTC for performance measurements. It's unreliable, it has very low precision, and it's not made for that. It's the kind of hack that we used back with fxlib, which gint tries hard to avoid o(T_T)o

Posted by Kbd2 on 21/07/2020 00:51

Noted, I was planning on moving update and render functions to separate gint timers instead of one RTC-governed loop ASAP anyway.

Posted by Lephenixnoir on 21/07/2020 09:13

I have investigated your rendering issue a bit further; the problem is basically that bopti scales terribly with subdivision.

In essence, bopti is a glorious wrapper around the fastest critical loop I could come up with for image rendering. With satisfying results: rendering a 128x64 image takes about 550 µs even at odd positions, which you can compare to the bare minimum of 70 µs needed just to fill 1024 bytes of RAM. (dupdate() takes an incompressible 1 ms or so, so grinding anything below that is not a priority.)

In practice, it's a trade-off of a large setup cost against a very optimized rendering loop. Which, as you can guess, does not cope well with the small size and large number of 8x8 tiles you're rendering (which I can 100% reproduce).

Now I don't want to end up as MonochromeLib did before me, with a number of repeated, hard-to-maintain functions (it had functions for 8x8, 16x16, and general bitmaps; OR, AND, and XOR modes for each of these sizes; and clip and noclip versions for some but IIRC not all of the combinations of size and mode). I've come up with a few specialization ideas to recover efficiency in small/aligned situations without compromising the code. I'll try some of them today and keep you up to date.

Posted by Kbd2 on 21/07/2020 11:09

Awesome, faster rendering means a bigger frame budget to put cool stuff in. I've migrated the update and rendering to timers and it works really well.

Posted by Lephenixnoir on 23/07/2020 10:21

I've started optimizing out the stuff. My benchmark is filling the screen with 8x8 tiles, just like you, though without the gray (because the gray engine would affect performance measurements). The special case that I implemented is called single-column single-position, it applies when the source tile is in a single 32-pixel column of the source image and the destination is in a single 32-bit column of the VRAM. When using 8x8 tiles, this is always the case.
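
The single-column condition can be written down as a small predicate. This is only a sketch of the stated condition, not bopti's actual detection code; the function names are hypothetical, and columns are taken to be 32 pixels wide on both sides as described above.

```c
#include <stdbool.h>

/* True if the pixel span [x, x + w) lies inside one 32-pixel column,
   i.e. inside a single 32-bit word of a 1-bit-per-pixel row.
   Assumes x >= 0 and w >= 1. */
bool in_single_column(int x, int w)
{
    return (x / 32) == ((x + w - 1) / 32);
}

/* Hypothetical predicate for the special case: both the source
   rectangle and the destination must fit in one column. */
bool is_single_column_single_position(int src_x, int dst_x, int w)
{
    return in_single_column(src_x, w) && in_single_column(dst_x, w);
}
```

With 8-pixel-wide tiles whose x positions are multiples of 8, the predicate always holds. Whether bopti treats other destination offsets (for example x = 28, which straddles two words) as the same special case is not something this sketch settles.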

The original time without the optimization and with clipping was 8400 µs to fill the screen. With the optimization and with DIMAGE_NOCLIP, it is now about 2900 µs. This is still way too much, but I have other leads. I'll see how low I can get this.

Please pull from the dev branch to benefit from this optimization. There is nothing you need to do specifically to enable it, the image format did not change and dsubimage() will automatically detect the special case. Please use DIMAGE_NOCLIP to render the tiles, which saves up to 16% time here. Let me know if you have noticeable improvements.

Since this is a large change, it is possible that a special case escaped my test set (sigh). If something's not displaying as expected and you think gint might be responsible, please provide me with the original image and the corresponding dsubimage() call, I will debug it.

Posted by Kbd2 on 23/07/2020 10:58

Instantly seeing a massive improvement. Thank you so much, you may have just saved the 30FPS framerate.

From my experience with MonochromeLib I was using as little clipping handling as possible anyway, but unfortunately I can't stop using DIMAGE_NONE completely, as I have to render partial tiles around the edges of the screen.
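
A per-tile choice between the clipped and unclipped paths could look like the sketch below. The 128x64 screen size matches the fullscreen image size mentioned earlier in the thread; `FLAG_CLIP` and `FLAG_NOCLIP` are stand-ins for gint's DIMAGE_NONE and DIMAGE_NOCLIP, with placeholder values, and the helper names are hypothetical.

```c
#include <stdbool.h>

#define SCREEN_W  128
#define SCREEN_H  64
#define TILE_SIZE 8

/* Stand-ins for gint's DIMAGE_NONE / DIMAGE_NOCLIP flags; the numeric
   values here are placeholders, not taken from gint's headers. */
enum { FLAG_CLIP = 0, FLAG_NOCLIP = 1 };

/* True if an 8x8 tile whose top-left corner is (x, y) is entirely
   on-screen, so clipping can be skipped safely. */
bool tile_fully_visible(int x, int y)
{
    return x >= 0 && y >= 0 &&
           x + TILE_SIZE <= SCREEN_W && y + TILE_SIZE <= SCREEN_H;
}

/* Pick the cheap path for interior tiles, the clipped path for edges. */
int tile_render_flag(int x, int y)
{
    return tile_fully_visible(x, y) ? FLAG_NOCLIP : FLAG_CLIP;
}
```

Only the ring of partial tiles around the screen border then pays for clipping; every interior tile takes the faster path.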

Posted by Lephenixnoir on 23/07/2020 14:07

I've shaved off roughly another 25%, bringing the benchmark time from 2900 µs to about 2150 µs. I'm not sure I can do much better without writing more assembler, as the code produced by GCC seems kind of sloppy at times.

Well, at least this optimized situation is about 4 times faster than before (8400 µs down to 2150 µs), so I'm happy with it for now.

Though if you don't animate horizontally (ie. your 8x8 tiles are always drawn on a multiple of 8 on the x axis) I'm pretty sure you can divide that again by at least 4 with some custom code.
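
To illustrate why byte-aligned tiles are cheap: when x is a multiple of 8, each row of an 8x8 1-bit tile is exactly one byte of a monochrome framebuffer. The sketch below assumes a row-major, 1-bit-per-pixel, 128x64 buffer addressed as bytes (16 bytes per row, 1024 bytes in total) and handles an opaque black-and-white copy only; it ignores the gray engine, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VRAM_STRIDE 16   /* 128 pixels / 8 pixels per byte */
#define VRAM_ROWS   64

/* Copy an 8x8 1-bit tile (one byte per row) into a monochrome buffer.
   Assumptions: row-major buffer, 1 bit per pixel, x a multiple of 8,
   and the tile fully on-screen. Opaque copy; no gray, no transparency. */
void blit_tile_aligned(uint8_t *vram, int x, int y, const uint8_t tile[8])
{
    assert(x % 8 == 0 && x >= 0 && x <= 120);
    assert(y >= 0 && y <= VRAM_ROWS - 8);

    uint8_t *dst = vram + y * VRAM_STRIDE + x / 8;
    for (int row = 0; row < 8; row++) {
        *dst = tile[row];       /* one store per tile row, no shifting */
        dst += VRAM_STRIDE;     /* move down one screen row */
    }
}
```

There is no shifting or masking at all, just eight byte stores per tile, which is where the expected gain over a general-purpose routine would come from.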

Posted by Kbd2 on 27/07/2020 08:11

Update: After a fair bit of work, items and the inventory are finally here! You now have a 3-slot hotbar, and [SHIFT] brings up a 24-slot inventory to store stuff in. I implemented both full-stack and single-item picking, for precise amounts.

Rendering has also undergone massive optimisation (thanks to Lephenixnoir), world size has increased to 1000x250, physics is calculated at 60UPS, generating a world is now extremely fast, and exiting the game no longer requires a reboot! All this comes alongside numerous bugfixes and optimisations.

Next up, I plan on adding tile variations - applicable tiles will choose from one of 3 possible sprites for their state, meaning the world will look a lot less repetitive.

Posted by Lephenixnoir on 27/07/2020 09:17

Awesome update! Lots of good stuff there, especially world size.

Kbd2 wrote: "Next up, I plan on adding tile variations - applicable tiles will choose from one of 3 possible sprites for their state, meaning the world will look a lot less repetitive"

I assume you considered ways to distribute tile alternatives in deterministic ways to avoid storing it in the world file?
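
One deterministic option is to derive the variant from the tile's coordinates with a small integer hash, so the same tile always gets the same sprite and nothing needs to be saved. The hash below is an arbitrary mixing function chosen for illustration, not something proposed in the thread; `seed` is a hypothetical per-world value.

```c
#include <stdint.h>

/* Hypothetical: mix tile coordinates and a world seed into 32 bits.
   The constants are arbitrary odd multipliers, not a vetted hash. */
uint32_t tile_hash(uint32_t x, uint32_t y, uint32_t seed)
{
    uint32_t h = seed;
    h ^= x * 0x9E3779B1u;
    h ^= y * 0x85EBCA77u;
    h ^= h >> 15;
    h *= 0x2C1B3C6Du;
    h ^= h >> 12;
    return h;
}

/* Variant index in 0..2, fully determined by position and seed. */
unsigned tile_variant(uint32_t x, uint32_t y, uint32_t seed)
{
    return tile_hash(x, y, seed) % 3;
}
```

Because the result depends only on (x, y, seed), a tile keeps its look when it scrolls off-screen and back, at the cost of a few multiplications per drawn tile.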

Posted by Kbd2 on 27/07/2020 09:27

I have 2 bits free in each Tile struct, for now I'm planning on using them to store the variant.
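
For illustration, storing a 2-bit variant next to a tile ID could look like this. The one-byte layout (6 bits of ID, 2 bits of variant) is purely hypothetical; the thread does not describe how Terrario's Tile struct is actually laid out.

```c
#include <stdint.h>

/* Hypothetical one-byte tile: 6 bits of tile ID, 2 bits of variant.
   Terrario's real Tile struct may be laid out differently. */
typedef uint8_t packed_tile;

#define VARIANT_SHIFT 6
#define ID_MASK       0x3Fu

/* Combine an ID (0..63) and a variant (0..3) into one byte. */
packed_tile tile_pack(unsigned id, unsigned variant)
{
    return (packed_tile)((id & ID_MASK) | ((variant & 3u) << VARIANT_SHIFT));
}

/* Extract the 6-bit tile ID. */
unsigned tile_id(packed_tile t)
{
    return t & ID_MASK;
}

/* Extract the 2-bit variant. */
unsigned tile_variant_bits(packed_tile t)
{
    return (unsigned)t >> VARIANT_SHIFT;
}
```

Two bits hold four values, so three sprite variants fit with one value to spare.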

If I ever need to use those bits for a more important thing, I can construct a scrolling buffer the size of the screen to store variations in - they'll be randomised when they go off and back on-screen, but that won't really be noticeable.

Added on 29/07/2020 at 10:08:
I'm attempting to add the optimisation, but I'm still getting assembler errors.

mov.l #0x1D4, r5
and
mov.l #0x049A, r0
give me "Error: invalid operands for opcode".

Calling a syscall with your method:

.section ".pretext"

#define syscall(id)                ;\
    mov.l    syscall_table, r2    ;\
    mov.l    1f, r0                ;\
    jmp    @r2                        ;\
    nop                            ;\
1:    .long    id

...

        bra     .try
        add     #1, r14
      
.exit:  
      
        syscall(0x046b)
        nop

        add     r0, r14
      
        mov.l #0x049A, r0

gives :
"Error: misaligned data"
"Error: pcrel too far"
"Error: offset to unaligned destination"
for each syscall.

Posted by Lephenixnoir on 29/07/2020 10:18

Due to the SuperH 16-bit instruction size, there is no mov instruction with an immediate operand of more than 8 bits. The normal way to move a large value into a register is to load it from a PC-relative address:

/* Load the value at PC+16 into r5 */
mov.l  @(16,pc), r5

Since keeping track of the value of PC would be a nightmare, you can use a label instead.

mov.l    .value, r5
/* Later, after the function ends */
.value: .long 0x1d4

Renesas's SuperH assembler does this more transparently, it allows you to write mov #0x1d4, r5 as long as you specify a .pool nearby, and then moves the constant to the pool location and replaces the immediate value with a suitable PC-relative address. I don't think GNU as supports this.
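
The encoding constraints behind this can be checked with a few lines of host-side C. This is a sketch based on the instruction formats as documented in the SuperH software manual (sign-extended 8-bit immediate for mov #imm; unsigned 8-bit word displacement for the PC-relative mov.l), and it should be verified against the manual before being relied on. The function names are hypothetical, and `disp_field` is the encoded field, not the byte offset written in assembly source.

```c
#include <stdbool.h>
#include <stdint.h>

/* mov #imm, Rn encodes a sign-extended 8-bit immediate. */
bool fits_mov_imm8(int32_t value)
{
    return value >= -128 && value <= 127;
}

/* Address read by a PC-relative mov.l located at `insn_addr`, assuming
   the documented formula (PC & ~3) + 4 + disp_field * 4. */
uint32_t movl_pcrel_target(uint32_t insn_addr, uint8_t disp_field)
{
    return (insn_addr & ~(uint32_t)3) + 4 + (uint32_t)disp_field * 4;
}

/* True if a literal at `literal_addr` can be loaded from `insn_addr`:
   it must be 4-byte aligned and at most 255 words past the base. */
bool movl_pcrel_reachable(uint32_t insn_addr, uint32_t literal_addr)
{
    uint32_t base = (insn_addr & ~(uint32_t)3) + 4;
    if (literal_addr % 4 != 0 || literal_addr < base)
        return false;
    return (literal_addr - base) / 4 <= 255;
}
```

Neither 0x1D4 nor 0x049A fits in the 8-bit immediate, which is why both need a literal stored nearby, and that literal has to sit on a 4-byte boundary within roughly 1 kB after the instruction.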

There are a few possible problems with your code.
• First, the syscall() macro contains a .long holding the syscall number passed as parameter (0x046b in your case); it is likely this long that is unaligned in the first error message.
• Second, the syscall() macro references a syscall_table symbol, which is originally defined at the end of kernel/syscalls.S in gint's source code and shared between all syscall() invocations. The "pcrel too far" error is likely a consequence of this symbol not being defined in your file.
• Finally, and most importantly, you can't start other applications (or generally call non-trivial syscalls) from within gint, if there is any way to make it work it has to be through a switch. Even then I never tried it myself so expect trial-and-error.

Edit: Also, the syscall() macro performs a terminal call (a jmp that never returns to the caller), so the code below it is never going to be executed. Use jsr and save pr if you need a subroutine call.

Posted by Kbd2 on 29/07/2020 10:40

I've managed to fix everything but the alignment issues, how would I deal with those?

I'm going to call the optimization with gint_switch, it seemed to work before with the raw syscall (albeit with the crash).

Posted by Lephenixnoir on 29/07/2020 10:45

Kbd2 wrote: "I've managed to fix everything but the alignment issues, how would I deal with those?"

Add .align 4 before the .long.

Well, good luck; it'd be really nice if you made it work. I don't know yet whether the TLB is invalidated during the process, so please check this if you can. If you can manage to do the optimize+restart method, I might be able to run SMEM optimization without restarting (no promises though, OS behavior is full of surprises). Also remember to put the code calling the syscall in RAM for obvious reasons.

Posted by Kbd2 on 29/07/2020 11:08

It doesn't seem like the code can run from the stack, I get a system error with TARGET=0009001D and PC=88023E78 (in the stack?) even if the code just has a rts.

I guess I'll just settle for the normal syscall and have the system crash/reboot at the end.

Posted by Lephenixnoir on 29/07/2020 11:17

This is correct. The stack is virtualised without execution permissions. Put your function in ILRAM with .section .ilram.

Edit: Sorry, I mean static RAM is mapped without execution permissions. You should be able to place your code in normal RAM, either the stack or the physical address for static RAM, but it's more tedious than just using ILRAM.

Posted by Kbd2 on 29/07/2020 11:22

Alright, now it hangs for about half a minute then goes to a blank screen; the only way to recover is a manual reset. I'm guessing the optimization works, but something goes wrong when trying to re-enter the Terrario addin.

Posted by Lephenixnoir on 29/07/2020 11:23

Sounds about right. Is the storage memory optimized after the reset?