
# NimVM - A stack-based bytecode virtual machine written in Nim
A basic programming language written in Nim

## Project structure

The project is split into several directories and submodules:
- `build.py` -> The build script (TODO, not pushed yet)
- `docs` -> Contains Markdown files with the various specifications for NimVM (bytecode, grammar, etc.)
- `src` -> Contains source files
  - `src/backend` -> Contains the backend of the language such as the parser, compiler and optimizer
    - `src/meta` -> Contains meta-structures used during compilation and parsing
  - `src/frontend` -> Contains the runtime environment of NimVM
    - `src/frontend/types` -> Contains the type system
  - `src/memory` -> Contains NimVM's allocator and memory manager


## Language design

NimVM is a generic stack-based bytecode VM implementation: source files are compiled into an
imaginary instruction set, and a virtual machine implements every operation that instruction set requires.
NimVM uses a triple-pass compiler in which the input is first tokenized and parsed into an AST, then optimized
and finally translated to bytecode. Note that each module (but NOT its submodules, like `ast.nim` for the parser
or `token.nim` for the tokenizer) is entirely independent: NimVM has been designed to be modular, and
the only module that relies on the others is the one that runs the REPL and executes files.
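
To make the idea of a stack-based bytecode VM concrete, here is a minimal sketch in Python. It is not NimVM's implementation: the opcode names (`CONST`, `ADD`, `MUL`, `RETURN`), the flat integer encoding and the `run` function are all assumptions chosen for illustration, and the real instruction set is the one specified under `docs`.

```python
from enum import IntEnum


class Op(IntEnum):
    # Hypothetical opcodes for illustration; not NimVM's real instruction set.
    CONST = 0   # push constants[operand] onto the stack
    ADD = 1     # pop two values, push their sum
    MUL = 2     # pop two values, push their product
    RETURN = 3  # stop and return the value on top of the stack


def run(code, constants):
    """Execute a flat list of integers as bytecode and return the result."""
    stack = []
    ip = 0  # instruction pointer: index of the next opcode to execute
    while ip < len(code):
        op = Op(code[ip])
        ip += 1
        if op is Op.CONST:
            stack.append(constants[code[ip]])
            ip += 1  # skip the operand that followed the opcode
        elif op is Op.ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op is Op.MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op is Op.RETURN:
            return stack.pop()
    raise RuntimeError("bytecode ended without RETURN")


# Bytecode for (1 + 2) * 4, with the literals stored in a constant table.
program = [Op.CONST, 0, Op.CONST, 1, Op.ADD, Op.CONST, 2, Op.MUL, Op.RETURN]
result = run(program, constants=[1, 2, 4])
```

Operands never name registers: every instruction takes its inputs from the top of the stack and pushes its result back, which is what keeps both the compiler and the interpreter loop small.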

The compilation toolchain has been designed as follows:
- First, the input is tokenized. This process breaks the source input down into a sequence of tokens that
  are easier to process in the next step. The lexer (or tokenizer) detects basic syntax errors like unterminated
  string literals, invalid usage of unknown tokens (for example UTF-8 runes) and incorrect number literals.
- Then, the tokens are fed into a parser. The parser recursively traverses the list of tokens coming from the lexer
  and builds a higher-level structure called an Abstract Syntax Tree (AST for short). It also catches the remaining
  static or syntax errors, such as illegal expressions or precedence errors.
- After the AST has been built, it goes through the optimizer. As the name suggests, this step aims to perform a few optimizations,
  namely:
  - Constant folding (meaning `1 + 2` will be replaced with `3` instead of producing 2 constant opcodes and 1 addition opcode)
  - Name resolution checks. This is possible because NimVM's syntax only allows variables to be defined in a way that
    is statically inferable, so "name error" exceptions can be caught before any code is run or even compiled. This means
    that NimVM, like many other languages, enforces block scoping
  - Warnings for things like unreachable code after return statements (optional).

    The optimization step is entirely optional, but it is enabled by default.
- Once the optimizer is done, the compiler takes the AST and compiles it to bytecode to be interpreted later
  by the virtual machine. More specifically, the compiler writes a file containing some metadata and the produced bytecode
  to disk; the tool orchestrating compilation and execution then deserializes it and passes everything to the VM.
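
The constant folding step described above can be sketched as follows. This is a Python illustration under stated assumptions, not NimVM's optimizer: the node classes (`Num`, `Name`, `BinOp`), the `fold` and `compile_expr` functions and the `("CONST", ...)`, `("LOAD", ...)` and `("BINARY", ...)` instruction tuples are hypothetical names invented for this example.

```python
from dataclasses import dataclass
import operator


# Hypothetical AST node types; NimVM's real AST is defined by its parser.
@dataclass
class Num:
    value: float


@dataclass
class Name:
    ident: str


@dataclass
class BinOp:
    op: str
    left: object
    right: object


# Division is left out on purpose so folding can never fail on a zero divisor.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Return a tree in which constant subexpressions are pre-evaluated."""
    if isinstance(node, BinOp):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Num) and isinstance(right, Num) and node.op in FOLDABLE:
            return Num(FOLDABLE[node.op](left.value, right.value))
        return BinOp(node.op, left, right)
    return node  # literals and names are already as simple as they get


def compile_expr(node, out):
    """Append postfix (opcode, operand) pairs for the expression to out."""
    if isinstance(node, Num):
        out.append(("CONST", node.value))
    elif isinstance(node, Name):
        out.append(("LOAD", node.ident))
    else:
        compile_expr(node.left, out)
        compile_expr(node.right, out)
        out.append(("BINARY", node.op))
    return out


tree = BinOp("*", BinOp("+", Num(1), Num(2)), Name("x"))  # (1 + 2) * x
unoptimized = compile_expr(tree, [])
optimized = compile_expr(fold(tree), [])
```

Without folding, `(1 + 2) * x` compiles to five instructions (two constants, an addition, a load and a multiplication). With folding, the addition happens at compile time and only three instructions remain: one constant, the load and the multiplication.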


## Language syntax

NimVM uses a syntax mostly inspired by C and Java, although some influences come from Python as well.

## Credits

NimVM was inspired by Bob Nystrom's amazing [Crafting Interpreters](https://craftinginterpreters.com) book.