#+TITLE: Project Specification

Binary interface

The VM does not use floating-point numbers; it uses fixed-point numbers instead.

This is for portability: some devices do not have an FPU, especially microcontrollers and some retro game systems such as the PS1.

Numbers

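| type | size (bytes) | description                               |
|------+--------------+-------------------------------------------|
| bool | 1            | unsigned 8-bit, false or true             |
| u8   | 1            | unsigned 8-bit, alias char and byte       |
| i8   | 1            | signed 8-bit, for interop                 |
| u16  | 2            | unsigned 16-bit, for interop              |
| i16  | 2            | signed 16-bit, for interop                |
| u32  | 4            | unsigned 32-bit, alias nat                |
| i32  | 4            | signed 32-bit, alias int                  |
| f32  | 4            | signed 32-bit fixed-point number, alias real |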
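As a rough illustration of how the `real` type can work without an FPU, here is a minimal sketch of fixed-point arithmetic on a signed 32-bit value. The specification does not say how the 32 bits are split between integer and fractional parts, so the Q16.16 layout below is an assumption, and the names `real_t`, `REAL_FRAC_BITS`, `REAL_ONE`, and the `real_*` helpers are hypothetical, not taken from the project.

```c
#include <stdint.h>

/* ASSUMPTION: Q16.16 layout (16 integer bits, 16 fractional bits).
   The specification only says "signed 32bit fixed number". */
#define REAL_FRAC_BITS 16
#define REAL_ONE ((int32_t)1 << REAL_FRAC_BITS)

typedef int32_t real_t; /* stands in for the spec's f32 / real */

/* Whole number to fixed point. Valid for -32768 <= n <= 32767. */
real_t real_from_int(int32_t n) { return (real_t)(n * REAL_ONE); }

/* Fixed point to whole number, truncating toward zero. */
int32_t real_to_int(real_t r) { return r / REAL_ONE; }

/* Addition works directly on the raw values. */
real_t real_add(real_t a, real_t b) { return a + b; }

/* Multiply in 64 bits, then rescale so the result has
   REAL_FRAC_BITS fractional bits again. */
real_t real_mul(real_t a, real_t b) {
    return (real_t)(((int64_t)a * (int64_t)b) / REAL_ONE);
}

/* Pre-scale the dividend so the quotient keeps its fractional bits.
   The caller must ensure b != 0. */
real_t real_div(real_t a, real_t b) {
    return (real_t)(((int64_t)a * REAL_ONE) / (int64_t)b);
}
```

Only integer instructions are used, which is the property the portability argument relies on. The 64-bit intermediate keeps the product from overflowing before it is rescaled.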

Memory

The VM uses a Harvard-style architecture, meaning that code and RAM are split into two separate blocks.

In the C version these are two separate arrays, 'code' and 'mem'.

During compilation, constants and local variables are placed in 'mem'.

Opcodes

All instructions are 32 bits wide (registers all hold 32-bit values).

Type A: [8:opcode][8:dest][8:src1][8:src2]
Type B: [8:opcode][8:dest][16:immediate]
Type C: [8:opcode][24:immediate]
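The three layouts can be packed and unpacked with shifts and masks. The sketch below assumes the fields are stored in the order listed, with the opcode in the most significant byte; the specification does not state the bit order, so treat that as an assumption. The function names are hypothetical.

```c
#include <stdint.h>

/* ASSUMPTION: fields are packed in the order listed,
   with the opcode in bits 31..24. */

/* Type A: [8:opcode][8:dest][8:src1][8:src2] */
uint32_t encode_a(uint8_t op, uint8_t dest, uint8_t src1, uint8_t src2) {
    return ((uint32_t)op << 24) | ((uint32_t)dest << 16) |
           ((uint32_t)src1 << 8) | (uint32_t)src2;
}

/* Type B: [8:opcode][8:dest][16:immediate] */
uint32_t encode_b(uint8_t op, uint8_t dest, uint16_t imm) {
    return ((uint32_t)op << 24) | ((uint32_t)dest << 16) | (uint32_t)imm;
}

/* Type C: [8:opcode][24:immediate]; bits above 24 in imm24 are dropped. */
uint32_t encode_c(uint8_t op, uint32_t imm24) {
    return ((uint32_t)op << 24) | (imm24 & 0xFFFFFFu);
}

/* Field extraction; each cast keeps only the low bits of the shifted word. */
uint8_t  get_opcode(uint32_t ins) { return (uint8_t)(ins >> 24); }
uint8_t  get_dest(uint32_t ins)   { return (uint8_t)(ins >> 16); }
uint8_t  get_src1(uint32_t ins)   { return (uint8_t)(ins >> 8); }
uint8_t  get_src2(uint32_t ins)   { return (uint8_t)ins; }
uint16_t get_imm16(uint32_t ins)  { return (uint16_t)ins; }
uint32_t get_imm24(uint32_t ins)  { return ins & 0xFFFFFFu; }
```

With this layout, `encode_a(0x01, 0x02, 0x03, 0x04)` yields `0x01020304`, and an all-zero word decodes as opcode 0 with all-zero arguments.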
| instruction | opcode type | opcode arguments | notes |
|-------------+-------------+------------------+-------|
| halt | A | all zeros | halt execution |
| call | A | dest args return | creates a new frame |
| return | B | dest return_flags | returns from a frame to the parent frame |
| syscall | A | id args mem_ptr | does a system call based on id with args |
| load_immediate | B | | locals[dest] = const as u16 |
| load_upper_immediate | B | | locals[dest] = const as u32 << 16 & 16 |
| load_indirect_8 | A | | locals[dest] = memory[locals[src1]] as u8 |
| load_indirect_16 | A | | locals[dest] = memory[locals[src1]] as u16 |
| load_indirect_32 | A | | locals[dest] = memory[locals[src1]] as u32 |
| load_absolute_8 | A | | locals[dest] = memory[src1] as u8 |
| load_absolute_16 | A | | locals[dest] = memory[src1] as u16 |
| load_absolute_32 | A | | locals[dest] = memory[src1] as u32 |
| load_offset_8 | A | | locals[dest] = memory[locals[src1] + src2] as u8 |
| load_offset_16 | A | | locals[dest] = memory[locals[src1] + src2] as u16 |
| load_offset_32 | A | | locals[dest] = memory[locals[src1] + src2] as u32 |
| store_absolute_8 | A | | memory[dest] = src1 & 0xFF |
| store_absolute_16 | A | | memory[dest] = src1 & 0xFFFF |
| store_absolute_32 | A | | memory[dest] = src1 |
| store_indirect_8 | A | | memory[dest] = locals[src1] & 0xFF |
| store_indirect_16 | A | | memory[dest] = locals[src1] & 0xFFFF |
| store_indirect_32 | A | | memory[dest] = locals[src1] |
| store_offset_8 | A | | memory[locals[dest] + src2] = locals[src1] & 0xFF |
| store_offset_16 | A | | memory[locals[dest] + src2] = locals[src1] & 0xFFFF |
| store_offset_32 | A | | memory[locals[dest] + src2] = locals[src1] |
| alloc | A | | memory[dest] = [locals[src1] as size + 4] |
| memcpy_8 | A | | memory[src1..src1+src2] = memory[dest..dest+src2] |
| memcpy_16 | A | | memory[src1..src1+src2] = memory[dest..dest+src2] |
| memcpy_32 | A | | memory[src1..src1+src2] = memory[dest..dest+src2] |
| memset_8 | A | | memory[dest..dest+src2] = locals[src1] as u8 |
| memset_16 | A | | memory[dest..dest+src2] = locals[src1] as u16 |
| memset_32 | A | | memory[dest..dest+src2] = locals[src1] as u32 |
| mov | A | | locals[dest] = locals[src1] |
| add_int | A | | locals[dest] = locals[src1] + locals[src2] |
| sub_int | A | | locals[dest] = locals[src1] - locals[src2] |
| mul_int | A | | locals[dest] = locals[src1] * locals[src2] |
| div_int | A | | locals[dest] = locals[src1] / locals[src2] |
| add_nat | A | | locals[dest] = locals[src1] + locals[src2] |
| sub_nat | A | | locals[dest] = locals[src1] - locals[src2] |
| mul_nat | A | | locals[dest] = locals[src1] * locals[src2] |
| div_nat | A | | locals[dest] = locals[src1] / locals[src2] |
| add_real | A | | locals[dest] = locals[src1] + locals[src2] |
| sub_real | A | | locals[dest] = locals[src1] - locals[src2] |
| mul_real | A | | locals[dest] = locals[src1] * locals[src2] |
| div_real | A | | locals[dest] = locals[src1] / locals[src2] |
| int_to_real | A | | locals[dest] = locals[src1] as real |
| int_to_nat | A | | locals[dest] = locals[src1] as nat |
| nat_to_real | A | | locals[dest] = locals[src1] as real |
| nat_to_int | A | | locals[dest] = locals[src1] as int |
| real_to_int | A | | locals[dest] = locals[src1] as int |
| real_to_nat | A | | locals[dest] = locals[src1] as nat |
| bit_shift_left | A | | locals[dest] = locals[src1] << locals[src2] |
| bit_shift_right | A | | locals[dest] = locals[src1] >> locals[src2] |
| bit_shift_r_ext | A | | locals[dest] as i32 = locals[src1] >> locals[src2] |
| bit_and | A | | locals[dest] = locals[src1] & locals[src2] |
| bit_or | A | | locals[dest] = locals[src1] \vert locals[src2] |
| bit_xor | A | | locals[dest] = locals[src1] ^ locals[src2] |
| jump_immediate | C | | jump to imm unconditionally |
| jump_absolute | A | | jump to locals[dest] unconditionally |
| jump_offset | A | | jump to locals[dest] + locals[src1] unconditionally |
| jump_if_flag | A | | jump to locals[dest] if flag > 0 |
| jump_eq_int | A | | jump to locals[dest] if locals[src1] as int == locals[src2] as int |
| jump_neq_int | A | | jump to locals[dest] if locals[src1] as int != locals[src2] as int |
| jump_gt_int | A | | jump to locals[dest] if locals[src1] as int > locals[src2] as int |
| jump_lt_int | A | | jump to locals[dest] if locals[src1] as int < locals[src2] as int |
| jump_le_int | A | | jump to locals[dest] if locals[src1] as int <= locals[src2] as int |
| jump_ge_int | A | | jump to locals[dest] if locals[src1] as int >= locals[src2] as int |
| jump_eq_nat | A | | jump to locals[dest] if locals[src1] as nat == locals[src2] as nat |
| jump_neq_nat | A | | jump to locals[dest] if locals[src1] as nat != locals[src2] as nat |
| jump_gt_nat | A | | jump to locals[dest] if locals[src1] as nat > locals[src2] as nat |
| jump_lt_nat | A | | jump to locals[dest] if locals[src1] as nat < locals[src2] as nat |
| jump_le_nat | A | | jump to locals[dest] if locals[src1] as nat <= locals[src2] as nat |
| jump_ge_nat | A | | jump to locals[dest] if locals[src1] as nat >= locals[src2] as nat |
| jump_eq_real | A | | jump to locals[dest] if locals[src1] as real == locals[src2] as real |
| jump_neq_real | A | | jump to locals[dest] if locals[src1] as real != locals[src2] as real |
| jump_ge_real | A | | jump to locals[dest] if locals[src1] as real >= locals[src2] as real |
| jump_gt_real | A | | jump to locals[dest] if locals[src1] as real > locals[src2] as real |
| jump_lt_real | A | | jump to locals[dest] if locals[src1] as real < locals[src2] as real |
| jump_le_real | A | | jump to locals[dest] if locals[src1] as real <= locals[src2] as real |
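To show how a handful of these instructions might be dispatched, here is a minimal interpreter-loop sketch. Apart from halt being the all-zero word, the opcode numbers are placeholders chosen for the example. The sketch also assumes the opcode sits in the top byte, that jump targets are indices into the code array, and that locals live in a plain array; the `run` function and `INS` macro are hypothetical names.

```c
#include <stdint.h>

/* ASSUMPTION: every opcode number except halt (0) is a placeholder. */
enum {
    OP_HALT = 0,
    OP_LOAD_IMMEDIATE = 1,
    OP_ADD_INT = 2,
    OP_JUMP_LT_INT = 3,
    OP_JUMP_LT_NAT = 4
};

/* Packs four 8-bit fields, opcode in the top byte (assumed bit order). */
#define INS(op, a, b, c) \
    (((uint32_t)(op) << 24) | ((uint32_t)(a) << 16) | \
     ((uint32_t)(b) << 8) | (uint32_t)(c))

/* Executes until halt, an unknown opcode, or the end of the program.
   Returns the number of instructions executed. */
uint32_t run(const uint32_t *code, uint32_t code_len, uint32_t *locals) {
    uint32_t pc = 0, steps = 0;
    while (pc < code_len) {
        uint32_t ins = code[pc++];
        uint8_t op = (uint8_t)(ins >> 24);
        uint8_t dest = (uint8_t)(ins >> 16);
        uint8_t src1 = (uint8_t)(ins >> 8);
        uint8_t src2 = (uint8_t)ins;
        uint16_t imm = (uint16_t)ins; /* Type B immediate */
        steps++;
        switch (op) {
        case OP_HALT:
            return steps;
        case OP_LOAD_IMMEDIATE:
            locals[dest] = imm;
            break;
        case OP_ADD_INT:
            /* Unsigned wraparound gives the two's complement result. */
            locals[dest] = locals[src1] + locals[src2];
            break;
        case OP_JUMP_LT_INT:
            /* Same bits as the nat variant, compared as signed. */
            if ((int32_t)locals[src1] < (int32_t)locals[src2]) pc = locals[dest];
            break;
        case OP_JUMP_LT_NAT:
            if (locals[src1] < locals[src2]) pc = locals[dest];
            break;
        default:
            return steps; /* unknown opcode: stop */
        }
    }
    return steps;
}
```

The int and nat jump variants differ only in how the same 32-bit value is interpreted: `0xFFFFFFFF` is less than 1 as an int but greater than 1 as a nat.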

Maybe more flexible calling convention?

At compile time each function gets N locals (up to 255). These are allocated in memory along with everything else, but they come before the heap values.

Memory-to-memory with register characteristics?

Passed in values

Copy each argument from the caller's local to the callee's local. This includes pointers.
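A minimal sketch of that copy step follows. The `Frame` struct, the `pass_arguments` function, and the choice to place arguments in the callee's locals 0, 1, 2, ... are all assumptions made for illustration. The specification allocates locals in memory rather than in a separate struct and does not say which callee slots receive the arguments.

```c
#include <stdint.h>

/* Hypothetical frame: 256 slots so that any 8-bit index is in range. */
typedef struct {
    uint32_t locals[256];
} Frame;

/* Copies argc arguments from the caller's locals into the callee's
   locals 0..argc-1 (ASSUMED slot order). arg_regs[i] names the caller
   local holding argument i. Pointers are copied as plain 32-bit
   addresses, so both frames then refer to the same heap value. */
void pass_arguments(const Frame *caller, Frame *callee,
                    const uint8_t *arg_regs, uint8_t argc) {
    for (uint8_t i = 0; i < argc; i++) {
        callee->locals[i] = caller->locals[arg_regs[i]];
    }
}
```

Because a pointer argument is copied by address rather than by value, a write through it in the child is visible to the parent.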

Child modifies the heap

If a child modifies a value in the parent's heap, do nothing; this is expected behavior.

If a child changes the size of a value in the parent's heap, then copy the heap value to the child's frame.

Returned values

If the returned value is a primitive, just copy it from the child's local to the parent's local.

If a heap value is returned but placed in a new local in the parent, then copy the value from the child to the parent and update the frame's memory pointer.

If a heap value is replaced (i.e. the return sets a heap value with its modified version), then:

1. Sort the returned values by their pointer's location in memory, lowest first.
2. Move to the position of the returned value with the lowest pointer.
3. Read the fat pointer size of the earliest value.
4. Take the current size of the heap.
5. Move to just after the end of the size + ptr.
6. Copy all values from that location through the current end of the heap to the old start location of that value.
7. Subtract the old size of the value from the mp.
8. Copy the newly sized value and put it at the current end of the heap.
9. Update the new pointer's local position.
10. Add the new size to the mp.
11. Repeat for each returned value that is a replaced heap value.
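The replacement procedure above can be sketched for a single value as "remove the old block, close the gap, append the new block". This is one possible reading, not the project's implementation. It assumes each heap value is a 4-byte size header followed by its payload (suggested by alloc's "size + 4"), that the mp moves by header plus payload, and that the new payload lives outside the heap region being compacted. The names `replace_heap_value`, `read_u32`, `write_u32`, and `HEADER` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: a heap value is a 4-byte size header plus its payload. */
#define HEADER 4u

static uint32_t read_u32(const uint8_t *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static void write_u32(uint8_t *p, uint32_t v) {
    memcpy(p, &v, sizeof v);
}

/* Replaces the value at old_ptr with new_data and returns its new
   pointer. mem must have room for the new block, and new_data must
   not point into mem. The caller stores the result in the local that
   used to hold old_ptr. */
uint32_t replace_heap_value(uint8_t *mem, uint32_t *mp, uint32_t old_ptr,
                            const uint8_t *new_data, uint32_t new_size) {
    /* Read the old size and find the byte just after the old block. */
    uint32_t old_total = HEADER + read_u32(mem + old_ptr);
    uint32_t after_old = old_ptr + old_total;

    /* Slide everything behind the old block down over it. The ranges
       overlap, so memmove is required rather than memcpy. */
    memmove(mem + old_ptr, mem + after_old, *mp - after_old);
    *mp -= old_total;

    /* Append the resized value at the current end of the heap. */
    uint32_t new_ptr = *mp;
    write_u32(mem + new_ptr, new_size);
    memcpy(mem + new_ptr + HEADER, new_data, new_size);
    *mp += HEADER + new_size;
    return new_ptr;
}
```

Closing the gap also moves every value stored after the old block down by the size of that block, so any local still pointing at one of them has to be adjusted. The steps above do not spell this out, and the sketch leaves it to the caller. Processing replaced values in ascending pointer order, as the first step requires, keeps those adjustments predictable.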