holey-bytes/spec.md
2023-06-12 00:40:24 +02:00

253 lines
7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# HoleyBytes ISA Specification
# Bytecode format
- All numbers are encoded little-endian
- There is 60 registers (0 59), they are represented by a byte
- Immediate values are 64 bit
### Instruction encoding
- Instruction parameters are packed (no alignment)
- [opcode, …parameters…]
### Instruction parameter types
- R = Register
- I = Immediate
| Name | Size |
|:----:|:--------|
| RRRR | 32 bits |
| RRR | 24 bits |
| RRI | 80 bits |
| RR | 16 bits |
| RI | 72 bits |
| I | 64 bits |
| N | 0 bits |
# Instructions
- `#n`: register in parameter *n*
- `imm #n`: for immediate in parameter *n*
- `P ← V`: Set register P to value V
- `[x]`: Address x
## No-op
- N type
| Opcode | Name | Action |
|:------:|:----:|:----------:|
| 0 | NOP | Do nothing |
## Integer binary ops.
- RRR type
- `#0 ← #1 <op> #2`
| Opcode | Name | Action |
|:------:|:----:|:-----------------------:|
| 1 | ADD | Wrapping addition |
| 2 | SUB | Wrapping subtraction |
| 3 | MUL | Wrapping multiplication |
| 4 | AND | Bitand |
| 5 | OR | Bitor |
| 6 | XOR | Bitxor |
| 7 | SL | Unsigned left bitshift |
| 8 | SR | Unsigned right bitshift |
| 9 | SRS | Signed right bitshift |
### Comparsion
| Opcode | Name | Action |
|:------:|:----:|:-------------------:|
| 10 | CMP | Signed comparsion |
| 11 | CMPU | Unsigned comparsion |
#### Comparsion table
| #1 *op* #2 | Result |
|:----------:|:------:|
| < | -1 |
| = | 0 |
| > | 1 |
### Division-remainder
- Type RRRR
- In case of `#3` is zero, the resulting value is all-ones
- `#0 ← #2 ÷ #3`
- `#1 ← #2 % #3`
| Opcode | Name | Action |
|:------:|:----:|:-------------------------------:|
| 12 | DIR | Divide and remainder combinated |
### Negations
- Type RR
- `#0 ← #1 <op> #2`
| Opcode | Name | Action |
|:------:|:----:|:----------------:|
| 13 | NEG | Bit negation |
| 14 | NOT | Logical negation |
## Integer immediate binary ops.
- Type RRI
- `#0 ← #1 <op> imm #2`
| Opcode | Name | Action |
|:------:|:----:|:-----------------------:|
| 18 | ADDI | Wrapping addition |
| 19 | MULI | Wrapping subtraction |
| 20 | ANDI | Bitand |
| 21 | ORI | Bitor |
| 22 | XORI | Bitxor |
| 23 | SLI | Unsigned left bitshift |
| 24 | SRI | Unsigned right bitshift |
| 25 | SRSI | Signed right bitshift |
### Comparsion
- Comparsion is the same as when RRR type
| Opcode | Name | Action |
|:------:|:-----:|:-------------------:|
| 26 | CMPI | Signed comparsion |
| 27 | CMPUI | Unsigned comparsion |
## Register value set / copy
### Copy
- Type RR
- `#0 ← #1`
| Opcode | Name | Action |
|:------:|:----:|:------:|
| 28 | CP | Copy |
### Load immediate
- Type RI
- `#0 ← #1`
| Opcode | Name | Action |
|:------:|:----:|:--------------:|
| 29 | LI | Load immediate |
## Memory operations
- Type RRI
### Load
- `#0 ← [#1 + imm #2]`
| Opcode | Name | Action |
|:------:|:----:|:----------------------:|
| 30 | LB | Load byte (8 bits) |
| 31 | LD | Load doublet (16 bits) |
| 32 | LQ | Load quadlet (32 bits) |
| 33 | LO | Load octlet (64 bits) |
### Store
- `[#1 + imm #2] ← #0`
| Opcode | Name | Action |
|:------:|:----:|:-----------------------:|
| 34 | SB | Store byte (8 bits) |
| 35 | SD | Store doublet (16 bits) |
| 36 | SQ | Store quadlet (32 bits) |
| 37 | SO | Store octlet (64 bits) |
## Control flow
### Unconditional jump
- Type RI
| Opcode | Name | Action |
|:------:|:----:|:---------------------:|
| 38 | JMP | Jump at `#0 + imm #1` |
### Conditional jumps
- Type RRI
- Jump at `imm #2` if `#0 <op> #1`
| Opcode | Name | Comparsion |
|:------:|:----:|:------------:|
| 39 | JEQ | = |
| 40 | JNE | ≠ |
| 41 | JLT | < (signed) |
| 42 | JGT | > (signed) |
| 43 | JLTU | < (unsigned) |
| 44 | JGTU | > (unsigned) |
### Environment call
- Type N
| Opcode | Name | Action |
|:------:|:-----:|:-------------------------------------:|
| 45 | ECALL | Cause an trap to the host environment |
## Floating point operations
- Type RRR
- `#0 ← #1 <op> #2`
| Opcode | Name | Action |
|:------:|:----:|:--------------:|
| 46 | ADDF | Addition |
| 47 | MULF | Multiplication |
### Division-remainder
- Type RRRR
| Opcode | Name | Action |
|:------:|:----:|:--------------------------------------:|
| 48 | DIRF | Same flow applies as for integer `DIR` |
## Floating point immediate operations
- Type RRI
- `#0 ← #1 <op> imm #2`
| Opcode | Name | Action |
|:------:|:-----:|:--------------:|
| 49 | ADDFI | Addition |
| 50 | MULFI | Multiplication |
# Registers
- There is 59 registers + one zero register (with index 0)
- Reading from zero register yields zero
- Writing to zero register is a no-op
# Memory
- Addresses are 64 bit
- Memory implementation is arbitrary
- In case of accessing invalid address:
- Program shall trap (LoadAccessEx, StoreAccessEx) with parameter of accessed address
- Value of register when trapped is undefined
## Recommendations
- Leave address `0x0` as invalid
- If paging used:
- Leave first page invalid
- Pages should be at least 4 KiB
# Program execution
- The way of program execution is implementation defined
- The order of instruction is arbitrary, as long all observable
effects are applied in the program's order
# Program validation
- Invalid program should cause runtime error:
- The form of error is arbitrary. Can be a trap or an interpreter-specified error
- It shall not be handleable from within the program
- Executing invalid opcode should trap
- Program can be validaded either before execution or when executing
# Traps
| Name | Parameters | Cause |
|:-------------:|:----------------:|:--------------------------:|
| LoadAccessEx | Accessed address | Loading invalid address |
| StoreAccessEx | Accessed address | Storing to invalid address |
| InvalidOpcode | Loaded opcode | Executing invalid opcode |
| Ecall | None | Ecall instruction |
# Assembly
HoleyBytes assembly format is not defined, this is just a weak description
of `hbasm` syntax.
- Opcode names correspond to specified opcode names, lowercase (`nop`)
- Parameters are separated by comma (`addi r0, r0, 1`)
- Instructions are separated by either line feed or semicolon
- Registers are represented by `r` followed by the number (`r10`)
- Labels are defined by label name followed with colon (`loop:`)
- Labels are references simply by their name (`print`)
- Immediates are entered plainly. Negative numbers supported.