Casio Forum - Community on-calc programming tools, by Lephenixnoir

Lephenixnoir Offline Administrator Points: 25159 Challenges: 174

Community on-calc programming tools

Posted on 10/01/2025 13:58

In the topic "Les projets de Planète Casio pour 2025", Sabercat revived the idea of having good on-calc programming tools (in addition to 35+E II compatibility, but that will perhaps go in another topic).

I'm listing the messages from that discussion here, with a summary.

Main messages: #198728, #198735, #198761, #198763, #198767, #198773, #198774, #198775, #198777, #198786, #198788

Languages we could aim for:

Python: OK, PythonExtra
LuaFX: to be ported
Malical: to be ported; is there any demand?
Something for coding add-ins (C? Something else?)
Basic: C.Basic already exists (integration probably impossible)

Editors we could aim for:

A priori, a separate editor rather than one embedded in each app
Kiwi Text: but copyrighted, no sources, apparently not completely stable
Micropy: already exists and works, but it is based on the PrizmSDK and language support remains to be coded
A new program based on gint + JustUI, like PythonExtra or text-viewer

Sabercat mentioned that it would be nice to be able to code add-ins on the calculator. I agree. However, having a compiler + linker on the calculator is veeery ambitious, and porting the GNU tools isn't possible. Personally, I think it would be smarter to code add-ins on the calculator in a language other than C. I don't know what you all think...

Druzyek Offline Member Points: 62 Challenges: 0

Posted on 30/05/2025 20:10

Mb88 wrote:
As an editor there's also fxpyedit.

Good idea! That could work.

Loieducode wrote:
Couldn't one use the UBC as a makeshift error handling system? I don't actually know much about it but it feels like you could just use that and not have to make that sacrifice.

What does UBC mean? If that's what handles memory access exceptions, it might work. Does the OS store anything in RAM that could accidentally be corrupted? I know the part of RAM that add-ins access is safe but not sure about the rest of RAM. On the other hand, if the program can write anywhere, it could write to its own executable memory and generate machine code that you might not be able to recover from.

Mb88 Offline Writer Points: 1251 Challenges: 3

Posted on 30/05/2025 20:13

Add-ins run in privileged mode, so if they write somewhere they shouldn't it can break stuff (in the worst case brick the calculator).

BUILDER : https://gitea.planet-casio.com/mibi88/Builder !

libMicrofx : https://www.planet-casio.com/Fr/forums/topic17259-2-libmicrofx-remplacez-fxlib-pour-faire-des-add-ins-tres-legers.html !

Racer3D : https://www.planet-casio.com/Fr/programmes/programme4444-1-racer3d-mb88-jeux-add-ins.html

Lephenixnoir Offline Administrator Points: 25159 Challenges: 174

Posted on 30/05/2025 20:20

I'm not convinced it's either possible or straightforward to sanitize the user code in a way that prevents crashes or malfunctions. There are just so many ways things can go wrong... which is not to say the assembler isn't worth making, but I'm doubtful the "sandbox" approach is tenable. For one thing, valid addresses wouldn't be a clean interval. Then many addresses are computed on-the-fly with non-trivial addressing modes, so the assembler would have to rewrite these, which requires an extra register, so computations using that register would also have to be rewritten. Then you need to worry about what code gets called, because being able to jump anywhere won't do. Oh, and if you have a heap you could also break it, and masking won't safeguard things properly the way e.g. AddressSanitizer would. Right now you can even just misalign the stack to cause a crash (gint doesn't realign it in interrupts and I don't believe the OS does either).

Basically I feel like it'd be easier to interpret the assembly than actually run it.

My graph (28 January): (MPM ; serial gint ; (Rogue Life || HH2) ; PythonExtra ; ? ; Boson X ; passe gint 3 ; ...) || (shoutbox v5 ; v5)

Druzyek Offline Member Points: 62 Challenges: 0

Posted on 31/05/2025 08:04

Lephenixnoir wrote:
I'm not convinced it's either possible or straightforward to sanitize the user code in a way that prevents crashes or malfunctions.
...
Basically I feel like it'd be easier to interpret the assembly than actually run it.

You may be right about that. I think interpreting is appealing especially for things like single-stepping in a debugger which I think an on-calculator assembler would definitely need. It certainly solves a lot of problems. On the other hand, I think I have come up with a way to sanitize the code that might work. I put all of the instructions that seem relevant in a spreadsheet: SH4 instructions. Here's what I would do:

Code memory
- The assembler writes machine code to a part of RAM that the assembly code can't read or write.
- Branch instructions to fixed targets work without modification but must take a named label as an argument rather than a number to ensure they jump to a valid instruction: BF, BF/S, BT, BT/S, BRA, BSR
For the remaining branches (subroutine returns and register-indirect jumps) there are two possible options:

Option 1
- There is no way for the user to access the PR register directly.
- A macro like PUSH_PR pushes PR onto a stack that the user can't access. POP_PR does the opposite and checks for stack underflow.
- Instructions that load an address from a register are excluded: BRAF, BSRF, JMP, and JSR.
- JMP and JSR can be replaced with an alternate JMP and JSR that take a label name instead of a register if the +/-4K range of BRA and BSR isn't big enough.
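
As a rough illustration of the PUSH_PR / POP_PR idea in Option 1, here is a minimal C model of a return-address stack kept outside the user's data memory. The names `shadow_stack_t`, `push_pr`, `pop_pr` and the depth of 64 are hypothetical, chosen only for this sketch. The real macros would expand to SH4 instructions, and the post only mentions an underflow check, so the overflow check is an added assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64  /* hypothetical capacity, not from the thread */

/* Return-address stack stored where the assembled program cannot reach it. */
typedef struct {
    uint32_t slots[SHADOW_DEPTH];
    size_t   count;
} shadow_stack_t;

/* Model of PUSH_PR: save the current PR value; fails when the stack is full. */
static bool push_pr(shadow_stack_t *s, uint32_t pr)
{
    if (s->count == SHADOW_DEPTH)
        return false;               /* overflow: refuse instead of corrupting */
    s->slots[s->count++] = pr;
    return true;
}

/* Model of POP_PR: restore PR; fails on underflow so RTS never sees garbage. */
static bool pop_pr(shadow_stack_t *s, uint32_t *pr)
{
    if (s->count == 0)
        return false;               /* underflow: nothing was pushed */
    *pr = s->slots[--s->count];
    return true;
}
```

On failure the runtime would presumably stop the program with an error rather than continue.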

Option 2
- PR returns an ID number instead of an address so no problem if the user accesses and manipulates the ID. The ID is an index into a table of target addresses created by the assembler.
- Loading a label into a register loads the label's ID instead of its real address.
- BRAF, BSRF, JMP, JSR, and RTS all expect an ID number and check that it's a valid table index before jumping. Supplying the wrong ID number jumps to somewhere unexpected but all ID numbers are valid targets so no crashes.
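
The ID table from Option 2 could be modeled in C roughly as follows. This is only a sketch: `jump_table_t` and `resolve_id` are hypothetical names, and the addresses used with them are made up. The point is that the user-visible value is an index, and the only check needed before jumping is a bounds check on that index.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Table built by the assembler: one real address per label or return point. */
typedef struct {
    const uint32_t *targets;
    size_t          count;
} jump_table_t;

/* Translate a user-visible ID into a real address.
 * Any ID below count is a legitimate target, so a bounds check is enough. */
static bool resolve_id(const jump_table_t *t, uint32_t id, uint32_t *addr)
{
    if (id >= t->count)
        return false;               /* not a table index: refuse to jump */
    *addr = t->targets[id];
    return true;
}
```

A wrong but in-range ID still lands on a label, which matches the "unexpected but not crashing" behavior described in the list.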

Data memory
- The assembled program is allocated a block of memory for data that it can read and write, aligned to its own size, i.e. a 64K block starting at 0x....0000.
- All constant data like tables and strings are copied into this block at startup. No constant data is stored in code memory since the assembled program has no way to access it.
- Addressing for the data memory starts at 0, so the address used by each instruction that accesses memory is masked (0xFFFF if data is 64K) and then added to the base address (0x....0000). There are a couple of ways to do the masking:

Option 1
- Two registers (R14 and R15 for example) are off limits. The assembler will error if either one is used in an instruction.
- One holds the mask and the other holds the base address. Address sanitizing is AND R14, Rm then ADD R15, Rm, so just two instructions in most cases and four instructions for @(R0,Rn) and @Rm+,@Rn+ addressing.

Option 2
- All registers are free for the user to use including R14 and R15.
- The masking code stashes R15 somewhere temporarily to free the register up. Is DBR free for this?
- Another possibility is GBR. The assembler can make a copy of GBR and restore it before calling any gint functions but not sure if GBR is also needed in interrupts.
- Another possibility is PR as long as it's free before BSR or JSR.
- After R15 is stashed somewhere, the mask and base are loaded from a constant pool. 6 instructions: store R15, load mask, and with address, load base, add to address, restore R15.
- (Anywhere after an unconditional jump (BRA, BSR, JMP, JSR, RTS) is ok for the constant pool since all branches have to be to named labels so no way to jump between an unconditional jump and the next label and accidentally execute constants as code. In the worst case, output the constant pool with a jump over it.)
- Since branches only go to named labels, an easy optimization is to skip storing and restoring R15 until it's needed or there's a jump which brings the masking code from 6 to 4 instructions.

Option 3
- Like option 2, all registers including R14 and R15 are free.
- Use LDC/STC Rm,Rn_BANK to hold mask and base. Does gint use the other set of R0-R7 for exception handling?
- This should be faster since there's nothing to fetch from the constant pool. Also, running out of room and putting the constant pool in a random place will work but is annoying.

Instructions
- Instructions with Rm,@-Rn addressing decrement the address by up to 4 after masking, so 4 extra bytes before the data memory need to be available.
- Instructions with @(disp,Rm),Rn addressing increment the address by up to 60 bytes after masking, so 60 extra bytes after the data memory need to be available.
- The only logical place for GBR to point is the beginning of data memory which is offset zero, so GBR can be ignored even if gint doesn't need it in interrupts. @(R0,GBR) just becomes the equivalent of @(R0) and @(disp,GBR) just becomes @(disp).
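
The 4-byte and 60-byte margins can be checked with a little arithmetic. The sketch below assumes naturally aligned accesses of 1, 2 or 4 bytes and a 4-bit displacement scaled by the access size, as in the SH4 `@(disp,Rm)` forms. `guard_t` and `guard_bytes` are hypothetical names used only here.

```c
#include <stdint.h>

/* Extra bytes needed before and after the data block. */
typedef struct {
    uint32_t before;
    uint32_t after;
} guard_t;

/* Worst case over byte, word and long accesses:
 * - pre-decrement moves the masked address down by the access width;
 * - @(disp,Rm) moves it up by at most 15 * width (4-bit scaled disp). */
static guard_t guard_bytes(void)
{
    guard_t g = {0, 0};
    for (uint32_t width = 1; width <= 4; width *= 2) {
        uint32_t down = width;        /* Rm,@-Rn from masked offset 0 */
        uint32_t up   = 15u * width;  /* largest displacement in bytes */
        if (down > g.before) g.before = down;
        if (up   > g.after)  g.after  = up;
    }
    return g;
}
```

Under these assumptions the function returns 4 bytes before and 60 bytes after, in line with the figures in the list.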

Lephenixnoir Offline Administrator Points: 25159 Challenges: 174

Posted on 31/05/2025 10:21

Just as a quick reaction... I think you're getting near "safe" territory but at what cost? If you end up with a reduced programming model and significant expansions for many instructions it's not quite the original assembler. I'm not saying it's not fine, just not quite the same result.

Anyway, on a technical level... for masks you can't just modify the target register. If I do mov r0, r4 to back up a string pointer then go @r4+ a bunch of times I can measure the length of my run as r4-r0, but not if you modify r4. You need another register and you need to rewrite the access to use that register. (You can save on r14 if the mask is 0xffff as you can do extu.w, but that's only for 64 KiB specifically.) For @(r0,rm) and related addressing modes you don't need to AND twice. Just ADD, AND, ADD, which also doesn't modify r0 (still modifies rm but that's a start).
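
To make the ADD, AND, ADD sequence concrete, here is a small C model. It combines the two suggestions above by doing the computation in a scratch copy, so neither r0 nor rm changes. `sanitize_indexed` and `mask_64k` are hypothetical names, and the base address in the test values is made up.

```c
#include <stdint.h>

/* Effective address for @(r0,rm) inside the sandboxed data block.
 * Works on a scratch copy so the user's r0 and rm keep their values. */
static uint32_t sanitize_indexed(uint32_t r0, uint32_t rm,
                                 uint32_t mask, uint32_t base)
{
    uint32_t scratch = rm;  /* copy of rm in the spare register */
    scratch += r0;          /* ADD: form the user's offset */
    scratch &= mask;        /* AND: wrap it into the block */
    scratch += base;        /* ADD: relocate to the real address */
    return scratch;
}

/* 64 KiB special case: zero-extending the low 16 bits (what extu.w does)
 * is the same as ANDing with 0xFFFF, so no mask register is needed. */
static uint32_t mask_64k(uint32_t x)
{
    return (uint16_t)x;
}
```

The masking happens once on the sum, which is why a second AND is unnecessary.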

As for stashing registers, using other registers creates risks for your own program to fail. Using the stack can work. A better approach IMO is to stash in a small permanent struct lying around pointed to at all times by e.g. r12. You can keep r14/r15 stashed (spilled, basically) at all times and just get them out for computations. You can even statically analyze whether there are multiple memory accesses, or multiple computations involving r14/r15 in a row, and optimize the stores and reloads, if performance is a concern. I believe this is similar to your Option 2 optimization.

I'd advise against using "rare" registers randomly. GBR can point anywhere, limiting it to 0 seems silly to me, in fact having it point to after the constants would probably be useful. DBR could be used by gint. The alternate bank is currently not used but I would like to use it to improve interrupt performance in the future.

As far as jumps are concerned I find the limitations a bit dizzying. Why not just replace potentially-arbitrary jumps (jmp, jsr, rts, braf, bsrf) with a short call that validates that the target address is in the correct range? Remember that control flow doesn't necessarily follow functions (longjmp). braf is also used for switch. I'd rather take a small performance hit than constrain the programming model this much.
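
A minimal sketch of what such a validation call might check, written in C. `target_in_range` is a hypothetical name and the code bounds are assumptions. It only tests that the target lies inside the assembled code and is 2-byte aligned (SH instructions are 16 bits wide). It does not know whether the target is the start of an intended instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range check applied before an indirect jump (jmp, jsr, rts, braf, bsrf).
 * code_start is inclusive, code_end is exclusive. */
static bool target_in_range(uint32_t target,
                            uint32_t code_start, uint32_t code_end)
{
    if (target & 1u)
        return false;   /* odd address: cannot be an instruction boundary */
    return target >= code_start && target < code_end;
}
```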

(FYI half the things you're discussing, and most of my response, are in the scope of a thought experiment I attempted about making a g1a emulator on CG.)

My graph (28 January): (MPM ; serial gint ; (Rogue Life || HH2) ; PythonExtra ; ? ; Boson X ; passe gint 3 ; ...) || (shoutbox v5 ; v5)

Druzyek Offline Member Points: 62 Challenges: 0

Posted on 31/05/2025 16:53

Lephenixnoir wrote:
Just as a quick reaction... I think you're getting near "safe" territory but at what cost? If you end up with a reduced programming model and significant expansions for many instructions it's not quite the original assembler. I'm not saying it's not fine, just not quite the same result.

Yes, good point. I think this is the most performance you could get programming on the calculator with a language that is crash proof. I would expect any sort of interpretation to be 20x slower or more. The fastest way of interpreting is to convert the instructions to native machine code like a JIT with a few safety rails. This is a similar idea but doing it at assembly time instead of run time. An interpreter would have to do all the checks I'm doing here but would just be much slower. I also think an interpreter would leave out all the system instructions to modify VBR and do cache invalidation and so on, so I don't think any of the reductions there are a real loss except maybe the modified way jumps work. The non-jump instructions that do address masking would assemble and work identically on the PC without the masks as long as the programmer didn't make an address miscalculation that is hidden by the address masking.

Lephenixnoir wrote:
Anyway, on a technical level... for masks you can't just modify the target register. If I do mov r0, r4 to backup a string pointer then go @r4+ a bunch of times I can measure the length of my run as r4-r0, but not if you modify r4. You need another register and you need to rewrite the access to use that register. (You can save on r14 if the size is 0xffff as you can do extu.w, but that's only for 64 kiB specifically.) For @(r0,rm) and related addressing mode you don't need to AND twice. Just ADD, AND, ADD, which also doesn't modify r0 (still modifies rm but that's a start).

I think you could still use R4 to measure the length of your run. If R0 and R4 both point to valid addresses in the 64K block, then the distance between them will always be correct and the same as if the addresses were unmasked. The masking actually has no effect at all unless your @R4+ runs off the end of the 64K block and the masking causes it to wrap around to the beginning of the block instead of corrupting something higher in memory. Wrapping is just the fastest behavior that prevents crashing. An option could also insert longer and slower code to check bounds and stop executing. I agree about @(R0,Rm). It's not clear which might be an address and which is an index so it doesn't make sense to convert both to addresses. It would need another register as you say.

Hmm, I don't get what you mean about extu.w. The 0xFFFF mask is just an example. As long as the base is aligned to the size of the data memory, the data memory can be any size that is a power of 2. The mask is just size-1. For a 32K block, the base might be 0x12348000 (note that it is aligned to a 32K boundary) and the mask would be 0x7FFF. ANDing and ADDing means any input address will always fall in the correct 32K range between 0x12348000 and 0x1234FFFF.
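
The 32K example can be worked through in C. This is a sketch only: `is_pow2` and `sanitize` are hypothetical names, and the base and size are the example values from the paragraph above.

```c
#include <stdbool.h>
#include <stdint.h>

/* The scheme only works when the block size is a power of two. */
static bool is_pow2(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* AND with (size - 1), then ADD the base.
 * Assumes size is a power of two and base is a multiple of size. */
static uint32_t sanitize(uint32_t addr, uint32_t base, uint32_t size)
{
    uint32_t mask = size - 1;       /* 0x7FFF for a 32K block */
    return base + (addr & mask);
}
```

With base 0x12348000 and size 0x8000, offset 0 maps to 0x12348000, offset 0x7FFF maps to 0x1234FFFF, and offset 0x8000 wraps back to 0x12348000. An arbitrary value such as 0xDEADBEEF lands at 0x1234BEEF, still inside the block.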

Lephenixnoir wrote:
For stashing registers using other registers creates risks for your own program to fail. Using the stack can work. A better approach IMO is to stash in a small permanent struct lying around pointed to at all times by e.g. r12.

Yes, a permanent place to put them is a better idea. I could use a global variable addressed with GBR (assuming it's not used for something else) but then it has to go through R0 so maybe R0 should stay stashed. That's a bit of a bummer since R0 is used a lot.

Lephenixnoir wrote:
I'd advise against using "rare" registers randomly. GBR can point anywhere, limiting it to 0 seems silly to me, in fact having it point to after the constants would probably be useful.

Yes, pointing to the constants is a good idea. Otherwise, it has no meaning since the assembled program has no knowledge of any of the global symbols the C compiler uses the GBR to access. Setting it to zero would waste the @(R0,GBR) addressing mode but @(disp,GBR) would let you access the first 1K of memory in a single instruction, which might be useful. Edit: when I say set GBR to zero, I mean set it to the base of the data memory, so offset zero from the beginning of data memory.

Lephenixnoir wrote:
As far as jumps are concerned I find the limitations a bit dizzying. Why not just replace potentially-arbitrary jumps (jmp, jsr, rts, braf, bsrf) with a short call that validates that the target address is in the correct range? Remember that control flow doesn't necessarily follow functions (longjmp). braf is also used for switch. I'd rather take a small performance hit than constrain the programming model this much.

Since the top goal is preventing a crash, it's not enough to make sure the address is in the correct range, since it could still jump into a constant pool or the middle of masking code. The jump targets don't need to be functions; they just need to be labels, which is what you would generate for a switch statement anyway. For option 2 above, return addresses fed to PR would also get an ID number like labels. The programmer generally doesn't know or care what JSR puts in PR or what gets fed to RTS as long as execution flows as they expect, so it shouldn't matter whether the value is a real address or an ID number. The only times I can think of that it might matter are doing address calculations to load data out of code memory (which is already disallowed) or calculating a jump into an unrolled loop like a Duff's device, which I can live without.
