WIP opcodes
parent
e71d687db8
commit
39d088bd9c
|
|
@ -0,0 +1,134 @@
|
||||||
|
#+TITLE Project Specification
|
||||||
|
|
||||||
|
* Binary interface
|
||||||
|
|
||||||
|
The VM does not use floating-point numbers; it uses fixed-point numbers instead.
|
||||||
|
|
||||||
|
This is for portability reasons, as some devices might not have an FPU,
|
||||||
|
|
||||||
|
especially microcontrollers and some retro game systems like the PS1.
|
||||||
|
|
||||||
|
** Numbers
|
||||||
|
| type | size (bytes) | description |
|
||||||
|
|------+--------------+---------------------------------------|
|
||||||
|
| u8 | 1 | unsigned 8bit, alias =char= and =byte= |
|
||||||
|
| bool | 1 | unsigned 8bit, =false= or =true= |
|
||||||
|
| i8 | 1 | signed 8bit for interop |
|
||||||
|
| u16 | 2 | unsigned 16bit for interop |
|
||||||
|
| i16 | 2 | signed 16bit for interop |
|
||||||
|
| u32 | 4 | unsigned 32bit, alias =nat= |
|
||||||
|
| i32 | 4 | signed 32bit, alias =int= |
|
||||||
|
| f32 | 4 | signed 32bit fixed number, alias =real= |
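As a rough illustration of how the =real= type could behave without an FPU, here is a minimal sketch. It assumes a Q16.16 layout (16 integer bits, 16 fractional bits); the spec only says "signed 32bit fixed number", so the split and all the =real_*= names below are hypothetical.

```c
#include <stdint.h>

/* Assumption: Q16.16 fixed point. The spec does not state the split. */
typedef int32_t real;

#define REAL_FRAC_BITS 16
#define REAL_ONE ((real)1 << REAL_FRAC_BITS)

/* Convert a small integer to fixed point (overflows if |n| >= 32768). */
static real real_from_int(int32_t n) { return (real)(n * REAL_ONE); }

/* Drop the fractional part, truncating toward zero. */
static int32_t real_to_int(real x) { return x / REAL_ONE; }

/* Addition needs no rescaling because both operands share a scale. */
static real real_add(real a, real b) { return a + b; }

/* Widen to 64 bits so the intermediate product cannot overflow,
   then divide out the extra scale factor. */
static real real_mul(real a, real b) {
    return (real)(((int64_t)a * (int64_t)b) / REAL_ONE);
}

/* Pre-scale the dividend to keep the fractional bits.
   The caller must ensure b != 0. */
static real real_div(real a, real b) {
    return (real)(((int64_t)a * REAL_ONE) / b);
}
```

Only integer instructions are used, which is the property that matters on targets without an FPU.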
|
||||||
|
|
||||||
|
* Memory
|
||||||
|
|
||||||
|
Uses a Harvard-style architecture, meaning the code and RAM memory
|
||||||
|
are split up into two separate blocks.
|
||||||
|
|
||||||
|
In the C version you can see these are two separate arrays, 'code' and 'mem'.
|
||||||
|
|
||||||
|
During compilation, constants and local variables are put onto 'mem'.
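A minimal sketch of the split described above might look like the following. The array names 'code' and 'mem' and the 'mp' memory pointer come from this document; the sizes, the =VM= struct, the little-endian byte order, and the helper names are assumptions made for illustration.

```c
#include <stdint.h>

#define CODE_SIZE 1024 /* hypothetical capacity, in 32-bit instructions */
#define MEM_SIZE  4096 /* hypothetical capacity, in bytes */

typedef struct {
    uint32_t code[CODE_SIZE]; /* instructions only */
    uint8_t  mem[MEM_SIZE];   /* constants, locals, heap values */
    uint32_t pc;              /* index of the next instruction in code */
    uint32_t mp;              /* first free byte in mem */
} VM;

/* Reserve n bytes in mem, as a compiler might for a constant or local.
   Returns the offset of the reservation, or UINT32_MAX if mem is full. */
static uint32_t vm_alloc(VM *vm, uint32_t n) {
    if (n > MEM_SIZE - vm->mp) return UINT32_MAX;
    uint32_t off = vm->mp;
    vm->mp += n;
    return off;
}

/* Store a u32 into mem byte by byte (little-endian is an assumption). */
static void vm_store_u32(VM *vm, uint32_t off, uint32_t v) {
    vm->mem[off + 0] = (uint8_t)(v);
    vm->mem[off + 1] = (uint8_t)(v >> 8);
    vm->mem[off + 2] = (uint8_t)(v >> 16);
    vm->mem[off + 3] = (uint8_t)(v >> 24);
}

/* Load a u32 back out of mem using the same byte order. */
static uint32_t vm_load_u32(const VM *vm, uint32_t off) {
    return (uint32_t)vm->mem[off]
         | ((uint32_t)vm->mem[off + 1] << 8)
         | ((uint32_t)vm->mem[off + 2] << 16)
         | ((uint32_t)vm->mem[off + 3] << 24);
}
```

Because 'code' and 'mem' are distinct arrays, a store through =vm_store_u32= can never overwrite an instruction.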
|
||||||
|
|
||||||
|
* Opcodes
|
||||||
|
|
||||||
|
| 2^n | count |
|
||||||
|
|----+-------|
|
||||||
|
| 2^3 | 8 |
|
||||||
|
| 2^4 | 16 |
|
||||||
|
| 2^5 | 32 |
|
||||||
|
| 2^6 | 64 |
|
||||||
|
| 2^7 | 128 |
|
||||||
|
|
||||||
|
** could be encoded
|
||||||
|
- op type [this is to maximize jump immediate and load immediate size]
|
||||||
|
- memory location
|
||||||
|
- local value / register
|
||||||
|
- local value type
|
||||||
|
|
||||||
|
*** Simplest
|
||||||
|
[opcode][dest][src1][src2]
|
||||||
|
[8][8][8][8]
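The "simplest" layout above could be packed and unpacked with plain shifts, roughly as sketched below. Placing the opcode in the most significant byte is an assumption (the diagram only gives the left-to-right field order), and =Instr=, =instr_encode=, and =instr_decode= are hypothetical names.

```c
#include <stdint.h>

/* One field per byte, in the order shown in the diagram. */
typedef struct {
    uint8_t opcode;
    uint8_t dest;
    uint8_t src1;
    uint8_t src2;
} Instr;

/* Pack the four 8-bit fields into one 32-bit word.
   Assumption: opcode occupies the most significant byte. */
static uint32_t instr_encode(Instr i) {
    return ((uint32_t)i.opcode << 24)
         | ((uint32_t)i.dest   << 16)
         | ((uint32_t)i.src1   << 8)
         |  (uint32_t)i.src2;
}

/* Unpack a 32-bit word back into its four fields. */
static Instr instr_decode(uint32_t w) {
    Instr i;
    i.opcode = (uint8_t)(w >> 24);
    i.dest   = (uint8_t)(w >> 16);
    i.src1   = (uint8_t)(w >> 8);
    i.src2   = (uint8_t)(w);
    return i;
}
```

With this layout every operand is limited to 8 bits, which is the limitation the wider jump and load-immediate encodings try to avoid.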
|
||||||
|
|
||||||
|
*** Maximize inline jump and load-immediate
|
||||||
|
[0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0]
|
||||||
|
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 | 0 0 ] noop
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 1 | 0 0 ] call
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 1 0 | 0 0 ] return
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 1 1 | 0 0 ] syscall?
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 1][0 0 0 | 1 0 0 | 0 0 ] exit
|
||||||
|
|
||||||
|
[0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 | 0 1 ] jump ~1GB range
|
||||||
|
[0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 | 1 0 ] load-immediate 2^30 max
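One way to read the jump and load-immediate rows above is as a 2-bit tag in the lowest bits with the remaining 30 bits left for the operand. The sketch below follows that reading. The tag values 01 and 10 are taken from the diagrams; treating 00 as the system group and 11 as the register group, and the names =op_tag=, =op_imm30=, and =op_make=, are assumptions.

```c
#include <stdint.h>

/* Low two bits select the instruction group. */
enum {
    TAG_SYSTEM  = 0, /* noop, call, return, ... (assumed grouping) */
    TAG_JUMP    = 1, /* 30-bit jump target */
    TAG_LOADIMM = 2, /* 30-bit immediate */
    TAG_REG     = 3  /* register-style ops (assumed grouping) */
};

#define IMM30_MASK 0x3FFFFFFFu /* 2^30 - 1 */

/* Extract the 2-bit tag from an instruction word. */
static uint32_t op_tag(uint32_t w) { return w & 0x3u; }

/* Extract the 30-bit operand stored above the tag. */
static uint32_t op_imm30(uint32_t w) { return w >> 2; }

/* Build a word from a tag and a 30-bit operand.
   Operand bits above bit 29 are discarded. */
static uint32_t op_make(uint32_t tag, uint32_t imm30) {
    return ((imm30 & IMM30_MASK) << 2) | (tag & 0x3u);
}
```

A 30-bit operand covers 2^30 distinct values, which matches the "~1GB range" and "2^30 max" notes in the diagrams.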
|
||||||
|
|
||||||
|
*** multibyte ops
|
||||||
|
- ones that would be easier if they were multibyte
|
||||||
|
- jump
|
||||||
|
- load immediate
|
||||||
|
- syscall
|
||||||
|
- call
|
||||||
|
|
||||||
|
|
||||||
|
0 0 - system, lowest because lower opcodes are faster
|
||||||
|
0 1 - memory
|
||||||
|
1 0 - math
|
||||||
|
1 1 - jump
|
||||||
|
|
||||||
|
[0 0][0 0 0 0 0 0] = [system][no op]
|
||||||
|
[0 0][1 0 0 0 0 0] = [system][loadimm]
|
||||||
|
[0 0][0 0 0 0 0 0] = [system][return]
|
||||||
|
|
||||||
|
|
||||||
|
[0 0 0 | r r r r r][0 0 0 | r r r r r][0 0 0 | r r r r r][b b | t | o o o | 1 1] [math][add][f][8]
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 | 1 | 0 0 1 | 1 1] [math][add][f][8]
|
||||||
|
|
||||||
|
[0 1][0 0 1][0][0 0] = [math][sub][f][16]
|
||||||
|
[0 1][0 1 0][0][0 0] = [math][mul][f][32]
|
||||||
|
[0 1][0 1 1][0][0 0] = [math][div][f][?]
|
||||||
|
[0 1][1 0 0][0][0 0] = [math][and][f][?]
|
||||||
|
|
||||||
|
[1 1][0][0 0][0 0] = [jmp][u][eq]
|
||||||
|
[1 1][0][0 0][0 0] = [jmp][u][ne]
|
||||||
|
[1 1][0][0 0][0 0] = [jmp][u][lt]
|
||||||
|
[1 1][0][0 0][0 0] = [jmp][u][gt]
|
||||||
|
|
||||||
|
[1 1][1][0 0 0][0 0] = [jmp][s][le]
|
||||||
|
[1 1][1][0 0 0][0 0] = [jmp][s][ge]
|
||||||
|
[1 1][1][0 0 0][0 0] = [jmp][s][]
|
||||||
|
[1 1][1][0 0 0][0 0] = [jmp][s][]
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
3 2 2 1
|
||||||
|
[...] i 32 0
|
||||||
|
|
||||||
|
3 1 3 1
|
||||||
|
[...] u rl j
|
||||||
|
[...] u ab j
|
||||||
|
[...] u eq j
|
||||||
|
[...] u nq j
|
||||||
|
[...] u lt j
|
||||||
|
[...] u le j
|
||||||
|
[...] u gt j
|
||||||
|
[...] u ge j
|
||||||
|
|
||||||
|
[jmp][dest][18]
|
||||||
|
[lli][dest][2]
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
int 3
|
||||||
|
|
||||||
|
add
|
||||||
|
sub
|
||||||
|
mul
|
||||||
|
div
|
||||||
|
|
||||||
|
jeq
|
||||||
|
jne
|
||||||
|
|
||||||
|
jlt
|
||||||
|
jle
|
||||||
|
jgt
|
||||||
|
jge
|
||||||
|
|
@ -1,4 +1,4 @@
|
||||||
#+TITLE Project Specification
|
#TITLE Project Specification
|
||||||
|
|
||||||
* Binary interface
|
* Binary interface
|
||||||
|
|
||||||
|
|
@ -31,15 +31,156 @@ During compilation constants and local variables are put onto 'mem'
|
||||||
|
|
||||||
* Opcodes
|
* Opcodes
|
||||||
|
|
||||||
Most opcodes are 4 bytes
|
| 2^n | count |
|
||||||
|
|----+-------|
|
||||||
|
| 2^3 | 8 |
|
||||||
|
| 2^4 | 16 |
|
||||||
|
| 2^5 | 32 |
|
||||||
|
| 2^6 | 64 |
|
||||||
|
| 2^7 | 128 |
|
||||||
|
|
||||||
|
** could be encoded
|
||||||
|
- op type [this is to maximize jump immediate and load immediate size]
|
||||||
|
- memory location
|
||||||
|
- local value / register
|
||||||
|
- local value type
|
||||||
|
|
||||||
|
*** Simplest
|
||||||
[opcode][dest][src1][src2]
|
[opcode][dest][src1][src2]
|
||||||
|
[8][8][8][8]
|
||||||
|
|
||||||
A small number are multibyte, like load-long-immediate.
|
*** Maximize inline jump and load-immediate
|
||||||
|
[0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0]
|
||||||
|
|
||||||
[multibyte-opcode][0u8][0u8][opcode] . [u32]
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 | 0 0 ] noop
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 1 | 0 0 ] call
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 1 0 | 0 0 ] return
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 1 1 | 0 0 ] syscall?
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 1][0 0 0 | 1 0 0 | 0 0 ] exit
|
||||||
|
|
||||||
These are decoded during runtime and selected.
|
[0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 | 0 1 ] jump ~1GB range
|
||||||
|
[0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 | 1 0 ] load-immediate 2^30 max
|
||||||
|
|
||||||
In theory, bitshift decoding is faster than
|
*** multibyte ops
|
||||||
accessing an unknown n bytes of memory in the 'code' array.
|
- ones that would be easier if they were multibyte
|
||||||
|
- jump
|
||||||
|
- load immediate
|
||||||
|
- syscall
|
||||||
|
- call
|
||||||
|
|
||||||
|
|
||||||
|
0 0 - system, lowest because lower opcodes are faster
|
||||||
|
0 1 - memory
|
||||||
|
1 0 - math
|
||||||
|
1 1 - jump
|
||||||
|
|
||||||
|
[0 0][0 0 0 0 0 0] = [system][no op]
|
||||||
|
[0 0][1 0 0 0 0 0] = [system][loadimm]
|
||||||
|
[0 0][0 0 0 0 0 0] = [system][return]
|
||||||
|
|
||||||
|
|
||||||
|
J [0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 | 0 1]
|
||||||
|
L [0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 0 0 0 0 0 0 0][0 | o o o o o | 1 0]
|
||||||
|
|
||||||
|
R [r r r r r r | r r][r r r r | r r r r][r r | b b | 0 0 0 0][o o o o o o | 1 1]
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
[0 0 0 0 0 | r r r][r r r r | r r r r][r r | r r r r r r][b b | t | o o o | 1 1]
|
||||||
|
|
||||||
|
[0 0 0 | r r r r r][0 0 0 | r r r r r][0 0 0 | r r r r r][b b | t | o o o | 1 1]
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 | 1 | 0 0 1 | 1 1] [math][add][f][8]
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 | 1 | 0 1 0 | 1 1] [math][sub][f][8]
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 | 1 | 0 1 1 | 1 1] [math][mul][f][8]
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 | 1 | 1 0 0 | 1 1] [math][div][f][8]
|
||||||
|
[0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 0 | 0 0 0 0 0][0 0 | 1 | 1 0 1 | 1 1] [math][mod][f][8]
|
||||||
|
|
||||||
|
[0 1][0 0 1][0][0 0] = [math][sub][f][16]
|
||||||
|
[0 1][0 1 0][0][0 0] = [math][mul][f][32]
|
||||||
|
[0 1][0 1 1][0][0 0] = [math][div][f][?]
|
||||||
|
[0 1][1 0 0][0][0 0] = [math][and][f][?]
|
||||||
|
|
||||||
|
[1 1][0][0 0][0 0] = [jmp][u][eq]
|
||||||
|
[1 1][0][0 0][0 0] = [jmp][u][ne]
|
||||||
|
[1 1][0][0 0][0 0] = [jmp][u][lt]
|
||||||
|
[1 1][0][0 0][0 0] = [jmp][u][gt]
|
||||||
|
|
||||||
|
[1 1][1][0 0 0][0 0] = [jmp][s][le]
|
||||||
|
[1 1][1][0 0 0][0 0] = [jmp][s][ge]
|
||||||
|
[1 1][1][0 0 0][0 0] = [jmp][s][]
|
||||||
|
[1 1][1][0 0 0][0 0] = [jmp][s][]
|
||||||
|
|
||||||
|
3 2 2 1
|
||||||
|
[...] i 32 0
|
||||||
|
|
||||||
|
3 1 3 1
|
||||||
|
[...] u rl j
|
||||||
|
[...] u ab j
|
||||||
|
[...] u eq j
|
||||||
|
[...] u nq j
|
||||||
|
[...] u lt j
|
||||||
|
[...] u le j
|
||||||
|
[...] u gt j
|
||||||
|
[...] u ge j
|
||||||
|
|
||||||
|
[jmp][dest][18]
|
||||||
|
[lli][dest][2]
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
int 3
|
||||||
|
|
||||||
|
add
|
||||||
|
sub
|
||||||
|
mul
|
||||||
|
div
|
||||||
|
|
||||||
|
jeq
|
||||||
|
jne
|
||||||
|
|
||||||
|
jlt
|
||||||
|
jle
|
||||||
|
jgt
|
||||||
|
jge
|
||||||
|
|
||||||
|
Maybe opcodes could be Huffman encoded? That way the shorter opcodes leave more room for operand data within the u32.
|
||||||
|
|
||||||
|
At compile time each function gets N locals (up to 255). These are allocated in memory along with everything else, but they come before the heap values.
|
||||||
|
|
||||||
|
Maybe instead of pushing onto a stack, push could write into a child frame? Pops would just mark which locals need to be replaced by the child function? Maybe we can get rid of call/return and just use jumps?
|
||||||
|
|
||||||
|
u8 u8 u8 u8
|
||||||
|
Push/pop parent_local child_local metadata
|
||||||
|
|
||||||
|
** Maybe more flexible calling convention?
|
||||||
|
|
||||||
|
Memory-to-memory with register characteristics?
|
||||||
|
|
||||||
|
Passed in values
|
||||||
|
|
||||||
|
Copy each argument from the caller's local to the callee's local. This includes pointers.
|
||||||
|
|
||||||
|
child modifies the heap
|
||||||
|
|
||||||
|
If a child modifies a value in the parent's heap, do nothing; this is expected behavior.
|
||||||
|
|
||||||
|
If a child changes the size of a parent's heap, then copy the heap value to the child’s frame.
|
||||||
|
|
||||||
|
Returned values
|
||||||
|
|
||||||
|
If it is a primitive value, just copy from the child local to the parent local.
|
||||||
|
|
||||||
|
If a heap value is returned but placed in a new local in the parent, then copy the child to the parent and update the frame's memory pointer.
|
||||||
|
|
||||||
|
If a heap value is replaced (i.e. the return sets a heap value with its modified version) then:
|
||||||
|
Sort each returned value by its pointer's location in memory, lowest first.
|
||||||
|
Move to the lowest pointer position among the returned values.
|
||||||
|
Read fat ptr size of the earliest value.
|
||||||
|
Take the current size of heap.
|
||||||
|
Move to just after the end of the size + ptr.
|
||||||
|
Copy all values from that location through current end of heap to the old start location of that value.
|
||||||
|
Subtract the old size of the value from the mp.
|
||||||
|
Copy the new sized value and put it at the current end of the heap.
|
||||||
|
Update the new pointer’s local position.
|
||||||
|
Add the new size to the mp.
|
||||||
|
Repeat for each returned value that is a replaced heap value.
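The replacement steps above can be sketched for a single returned value as follows. This is only one reading of the notes: the fat pointer is assumed to be an offset plus a byte size, the new value is assumed to live outside the heap being compacted, and =Heap=, =FatPtr=, and =heap_replace= are hypothetical names. Other pointers that referred to data after the old value would also need to be shifted down by the old size, which this sketch does not do.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: a fat pointer is a heap offset plus a size in bytes. */
typedef struct {
    uint32_t ptr;
    uint32_t size;
} FatPtr;

typedef struct {
    uint8_t *heap; /* backing storage */
    uint32_t mp;   /* end of the used part of the heap */
    uint32_t cap;  /* total capacity in bytes */
} Heap;

/* Replace the value referenced by fp with new_data (new_size bytes).
   new_data must not point into h->heap.
   Returns 0 on success, -1 if the result would not fit. */
static int heap_replace(Heap *h, FatPtr *fp,
                        const uint8_t *new_data, uint32_t new_size) {
    uint32_t old_end = fp->ptr + fp->size;
    if (h->mp - fp->size + new_size > h->cap) return -1;

    /* Copy everything after the old value down to its old start. */
    memmove(h->heap + fp->ptr, h->heap + old_end, h->mp - old_end);

    /* Subtract the old size from the mp. */
    h->mp -= fp->size;

    /* Put the resized value at the current end of the heap. */
    memcpy(h->heap + h->mp, new_data, new_size);

    /* Update the pointer, then add the new size to the mp. */
    fp->ptr = h->mp;
    fp->size = new_size;
    h->mp += new_size;
    return 0;
}
```

Processing the returned values lowest pointer first, as the notes suggest, means each call only moves data that sits above the value being replaced.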
|
||||||
|
|
|
||||||