13 releases
0.1.12 | Sep 22, 2024 |
---|---|
0.1.11 | Sep 22, 2024 |
#335 in Development tools
218 downloads per month
1MB
3K
SLoC
Virtual Computer, Assembler, and Compiler
This project is a virtual 8 bit computer that takes a vector of bytes and runs it as instructions It includes the virtual machine, assembler, and compiler for custom assembly and high level language.
I created a blog post about this project and you can find that on Hashnode. It goes deeper into explaining the different parts of this project.
Add to Your Project
vc_8bit can be added to your Rust project with the terminal command:
cargo add vc_8bit
The code shown below assumes you are using the following modules:
use vc_8bit::{assembly, c_lang, vc_8bit::Computer};
Virtual Computer
The virtual computer works by instantiating the Computer and inserting the program into the ram as bytes.
let bytes = vec![Byte::from_string("11010100"), Byte::from_string("1001100")];
let mut computer: Computer = Computer::new();
computer.ram.insert_bytes(bytes);
computer.run();
The VC (Virtual Computer) is basically a big function that will take an array of bytes and run the instructions associated with the bytes. I have emulated components like a Binary Decoder, RAM, ALU, and CPU.
What the VC does is take the binary and figure out the instructions that go with it. It uses Binary Decoders in a match statement to decide what instruction it is. Every Byte is an instruction. Some examples of instructions are moving from memory to registers, adding the values from one register to another using the ALU. This is why assembly is basically binary. Maybe the binary sequence 01001110
is the MOV
instruction. All an assembler does is convert the instructions like CPY
, LDR
, and ADD
to there corrosponding bytes. It gets a little more complex than this. For this VC, the first 6 bits are for the instruction, and the last 2 bits are for the register. If this sounds interesting to you, I highly recommend watching Core Dumpped and his videos.
When working with the VC, remember the RAM has 256 byte limit because the VC is only an 8 bit computer compared to modern 64 bit computers.
Assembler
The assembler works by first assembling the code to binary. It will then turn the binary into an array of bytes.
let value = "MOV R0 50";
// assemble code
let contents = assembly::compile_assembly_to_binary(value.to_string());
let bytes = assembly::string_to_bytes(contents.as_str());
// run on VC
let mut computer: Computer = Computer::new();
computer.ram.insert_bytes(bytes);
computer.run();
The assembler will go line by line the code into binary. I created a custom assembly language to work with the VC.
HALT
: Stops the programSTR R0 #0000000
: Stores the value in the register to the address in memoryLDR R0 #0000000
: Loads the value in the memory address to the registerMOV R0 #0000000
: Moves a byte value into a registerCPY R0 R1
: Copys the value of 1 register to anotherSHR R0 #0000000
: Shifts a register value by the left many times the number in the byte isSHL R0 #0000000
: Shifts a register value by the right however many times the number in the byte isOUT R0
: Outputs the value in the register to the console as a byte valueMSG R0
: Outputs the value in the register to the console as an ASCII characterINC R0
: Increments the value in the registerDEC R0
: Decrements the value in the registerJMP #0000000
: Moves the RAM indexJMP_NEG #0000000
: Moves the RAM index if the ALU Negative flag is onJMP_ZRO #0000000
: Moves the RAM index if the ALU Zero flag is onJMP_ABV #0000000
: Moves the RAM index if neither the ALU negative flag or Zero flag is onCMP_NEG R0 #0000000
: Moves11111111
to the register if the ALU negative flag is on. If not, it moves00000000
to the registerCMP_ZRO R0 #0000000
: Moves11111111
to the register if the ALU zero flag is on. If not, it moves00000000
to the registerCMP_ABV R0 #0000000
: Moves11111111
to the register if neither the ALU negative flag or Zero flag is on. If not, it moves00000000
to the registerADD R0 R1
: Adds the byte value from registers 0 and 1 and moves the result to the first registerSUB R0 R1
: Subtracts the byte value from registers 0 and 1 and moves the result to the first registerMUL R0 R1
: Multiplies the byte value from registers 0 and 1 and moves the result to the first registerDIV R0 R1
: Divides the byte value from registers 0 and 1 and moves the result to the first registerAND R0 R1
: Does an and operation on the bytes from registers 0 and 1 and moves the result to the first registerOR R0 R1
: Does an or operation on the bytes from registers 0 and 1 and moves the result to the first registerXOR R0 R1
: Does an exclusive or operation on the bytes from registers 0 and 1 and moves the result to the first registerNOT R0
: Does a not operation on the byte in register 1 and moves result to itRPRT R0 #0000000
: Reads the value in the port at the address and writes the value to register 0. The address needs to be 0 through 7.WPRT R0 #0000000
: Writes the value in register 0 to the port at the address. The address needs to be 0 through 7.
The assembler will identify integers, bytes, and hexadecimals:
MOV R0 5 ; moves 5 into R0
MOV R1 0x3A ; moves the hexadecimal 3A (integer 58) to R1
MOV R2 #00110100 ; moves the byte value 00110100 (integer 52) to R2
There is also the %ASSIGN
feature which can be used as constants inside the assembly
%ASSIGN VARIABLE_ADDRESS #11110100
MOV R0 37
STR R0 VARIABLE_ADDRESS
Compiler
The compiler works by compiling the code into assembly.
let value = "print('a');";
// compile code
let asm = c_lang::compile(value.to_string());
// assemble code
let contents = assembly::compile_assembly_to_binary(&asm);
let bytes = assembly::string_to_bytes(contents.as_str());
// run on VC
let mut computer: Computer = Computer::new();
computer.ram.insert_bytes(bytes);
computer.run();
Language
The language looks like C with a few key distinctions.
There are 3 types:
uint8
an unsigned 8 bit integer. Any whole number between 0 and 255.bool
a true of false value. Is either00000000
or11111111
in binarychar
a character value. Uses the ASCII codes to convert to binary
There are 5 functions:
to_char(uint8)
turns a number into a character. This works for numbers 0 - 9. Any number above that will get the ASCII character of that number plus 48. This is because 48 is the ASCII character for 0. Here is an example of th function:char c = to_char(7)
out(any)
will output the byte value to the console of any type. For example, the value 5 will output 00000101.print(char)
will output the character to the console.write_port(uint8, any)
will write the byte value to the port number 0 - 7. The address needs to be a constant. For examplewrite_port(2, 'a')
is fine butwrite_port(99 - 3, 'a')
andwrite_port(variable, 'a')
will not work.read_port(uint8)
will read the byte value from the port address 0 - 7. The address needs to be a constant. For exampleread_port(2)
is fine butread_port(99 - 3)
andread_port(variable)
will not work.
There is only the if
and while
statements. The code inside of the statement needs to be seperated by commas and not semicolons. The last line inside the statement can not have a comma. The ending bracket of the statement needs to end with a semicolon. Here is what it looks like:
if (true && 0 == 0) {
char c = 'a',
print(c)
};
All operators work except for the +=
type of operators. Just use A = A + B
instead. For the <<
and >>
operators, the right operand needs to be constant. For example, 'a' >> 3
works but 'a' >> 3 + 1
or 'a' >> variable
do not work.
++
,--
increment and decrement. Works withuint8
.+
,-
,*
,/
arithmatic operators. Works withuint8
.&&
,||
,==
,!=
logic operators. Works withbool
.<<
,>>
bit shift operators. Shifting amount needs to be constant. Works with any type.&
,|
,^
,!
and, or, excluseive or, and not operators. Works with any type.=
set variable.
There is no way to define functions in this language. Also remember there are only 256 bytes in memory to work with in the VC. This means your program must be less than 255 bytes, and any variables that you use will take up one of those bytes.
Example
This simple program took me 27 bytes to write it in assembly. The compiler was able to use 38 bytes. It's not the smartest compiler it works.
uint8 a = 0;
while (a < 5) {
a = a + 1,
char c = to_char(a),
print(c)
}
The assembly output for this code is as follows:
MOV R0 #00000000
STR R0 #11111110 ; store created variable ; BYTE ADDRESS 4
LDR R0 #11111110 ; load variable ; value left
MOV R1 #00000101 ; value right
SUB R0 R1
CMP_NEG R0 ; compare
CPY R3 R0 ; copy value right ; get value for statement
MOV R2 0 ; set R2 to 0
SUB R2 R3 ; check if statement is true
JMP_ZRO 37 ; jump if false
LDR R0 #11111110 ; load variable ; value left
MOV R1 #00000001 ; value right
ADD R0 R1 ; math
STR R0 #11111110 ; store variable
LDR R3 #11111110 ; load variable
MOV R2 48 ; 48 is ascii '0'
ADD R3 R2 ; get ascii
CPY R0 R3 ; move value to correct register
STR R0 #11111101 ; store created variable
LDR R2 #11111101 ; load variable
MSG R2 ; print value
JMP 4 ; jump back to start ; BYTE ADDRESS 38
HALT
The binary code that was assembled from this:
110010000000000011000000111111101100010011111110110010010000010100010001111100001100111100000000110010100000000000011011111010100010010111000100111111101100100100000001000000011100000011111110110001111111111011001010001100000000111011001100110000001100000011111101110001101111110111011110111010000000010011111111
Dependencies
~2–3MB
~53K SLoC