Compare commits
13 Commits
32ae21d143
...
879fec20fe
Author | SHA1 | Date |
---|---|---|
Mattia Giambirtone | 879fec20fe | |
Mattia Giambirtone | 40d0f23135 | |
Mattia Giambirtone | 20da594116 | |
Mattia Giambirtone | 7bae3ad249 | |
Mattia Giambirtone | 42ba738620 | |
Mattia Giambirtone | de9b51152e | |
Mattia Giambirtone | dc195409c9 | |
Mattia Giambirtone | cd853bb140 | |
Mattia Giambirtone | a706fdad7a | |
Mattia Giambirtone | a7899e8473 | |
Mattia Giambirtone | 3e9b84fb4f | |
Mattia Giambirtone | 2d9a6b9a8d | |
Mattia Giambirtone | 8277472819 | |
22
README.md
|
@ -14,7 +14,8 @@ Peon is a multi-paradigm, statically-typed programming language inspired by C, N
|
|||
features such as automatic type inference, parametrically polymorphic generic types, pure functions, closures, interfaces, single inheritance,
|
||||
reference types, templates, coroutines, raw pointers and exceptions.
|
||||
|
||||
The memory management model is rather simple: a Mark and Sweep garbage collector is employed to reclaim unused memory.
|
||||
The memory management model is rather simple: a Mark and Sweep garbage collector is employed to reclaim unused memory, although more garbage
|
||||
collection strategies (such as generational GC or deferred reference counting) are planned to be added in the future.
|
||||
|
||||
Peon features a native cooperative concurrency model designed to take advantage of the inherent waiting of typical I/O workloads, without the use of more than one OS thread (wherever possible), allowing for much greater efficiency and a smaller memory footprint. The asynchronous model used forces developers to write code that is both easy to reason about, thanks to the [Structured concurrency](https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/) model that is core to peon's async event loop implementation, and works as expected every time (without dropping signals, exceptions, or task return values).
|
||||
|
||||
|
@ -27,21 +28,13 @@ In peon, all objects are first-class (this includes functions, iterators, closur
|
|||
**Disclaimer 1**: The project is still in its very early days: lots of stuff is not implemented, a work in progress or
|
||||
otherwise outright broken. Feel free to report bugs!
|
||||
|
||||
|
||||
**Disclaimer 2**: Currently the REPL is very basic (it adds your code to previous input plus a newline, as if it was compiling a new file every time),
|
||||
because incremental compilation is designed for modules and it doesn't play well with the interactive nature of a REPL session. To show the current state
|
||||
of the REPL, type `#show` (this will print all the code that has been typed so far), while to reset everything, type `#reset`. You can also type
|
||||
`#clear` if you want a clean slate to type in, but note that it won't reset the REPL state. If adding a new piece of code causes compilation to fail, the REPL will not add the last piece of code to the input so you can type it again and recompile without having to exit the program and start from scratch. You can move through the code using left/right arrows and go to a new line by pressing Ctrl+Enter. Using the up/down keys on your keyboard
|
||||
will move through the input history (which is never reset). Also note that UTF-8 is currently unsupported in the REPL (it will be soon though!)
|
||||
|
||||
|
||||
**Disclaimer 3**: Currently, the `std` module has to be _always_ imported explicitly for even the most basic snippets to work. This is because intrinsic types and builtin operators are defined within it: if it is not imported, peon won't even know how to parse `2 + 2` (and even if it could, it would have no idea what the type of the expression would be). You can have a look at the [peon standard library](src/peon/stdlib) to see how the builtins are defined (be aware that they heavily rely on compiler black magic to work) and can even provide your own implementation if you're so inclined.
|
||||
**Disclaimer 2**: Currently, the `std` module has to be _always_ imported explicitly for even the most basic snippets to work. This is because intrinsic types and builtin operators are defined within it: if it is not imported, peon won't even know how to parse `2 + 2` (and even if it could, it would have no idea what the type of the expression would be). You can have a look at the [peon standard library](src/peon/stdlib) to see how the builtins are defined (be aware that they heavily rely on compiler black magic to work) and can even provide your own implementation if you're so inclined.
|
||||
|
||||
|
||||
### TODO List
|
||||
|
||||
In no particular order, here's a list of stuff that's done/to do (might be incomplete/out of date):
|
||||
- User-defined types
|
||||
- User-defined types
|
||||
- Function calls ✅
|
||||
- Control flow (if-then-else, switch) ✅
|
||||
- Looping (while) ✅
|
||||
|
@ -57,7 +50,6 @@ In no particular order, here's a list of stuff that's done/to do (might be incom
|
|||
- Named scopes/blocks ✅
|
||||
- Inheritance
|
||||
- Interfaces
|
||||
- Indexing operator
|
||||
- Generics ✅
|
||||
- Automatic types ✅
|
||||
- Iterators/Generators
|
||||
|
@ -76,12 +68,14 @@ In no particular order, here's a list of stuff that's done/to do (might be incom
|
|||
Here's a random list of high-level features I would like peon to have and that I think are kinda neat (some may
|
||||
have been implemented already):
|
||||
- Reference types are not nullable by default (must use `#pragma[nullable]`)
|
||||
- The `commutative` pragma, which allows to define just one implementation of an operator
|
||||
and have it become commutative
|
||||
- Easy C/Nim interop via FFI
|
||||
- C/C++ backend
|
||||
- Nim backend
|
||||
- [Structured concurrency](https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/) (must-have!)
|
||||
- Simple OOP (with multiple dispatch!)
|
||||
- RTTI, with methods that dispatch at runtime based on the true type of a value
|
||||
- RTTI, with methods that dispatch at runtime based on the true (aka runtime) type of a value
|
||||
- Limited compile-time evaluation (embed the Peon VM in the C/C++/Nim backend and use that to execute peon code at compile time)
|
||||
|
||||
|
||||
|
@ -134,5 +128,7 @@ out for yourself. Fortunately, the process is quite straightforward:
|
|||
automate this soon, but as of right now the work is all manual (and it's part of the fun, IMHO ;))
|
||||
|
||||
|
||||
__Note__: On Linux, peon will also look into `~/.local/peon/stdlib`
|
||||
|
||||
If you've done everything right, you should be able to run `peon` in your terminal and have it drop you into the REPL. Good
|
||||
luck and have fun!
|
|
@ -1,7 +1,8 @@
|
|||
# Peon - Bytecode Specification
|
||||
|
||||
This document aims to document peon's bytecode as well as how it is (de-)serialized to/from files and
|
||||
other file-like objects.
|
||||
other file-like objects. Note that the segments in a bytecode dump appear in the order they are listed
|
||||
in this document.
|
||||
|
||||
## Code Structure
|
||||
|
||||
|
@ -9,12 +10,12 @@ A peon program is compiled into a tightly packed sequence of bytes that contain
|
|||
the VM needs to execute said program. There is no dependence between the frontend and the backend outside of the
|
||||
bytecode format (which is implemented in a separate serializer module) to allow for maximum modularity.
|
||||
|
||||
A peon bytecode dump contains:
|
||||
A peon bytecode file contains the following:
|
||||
|
||||
- Constants
|
||||
- The bytecode itself
|
||||
- Debugging information
|
||||
- File and version metadata
|
||||
- The program's code
|
||||
- Debugging information (file and version metadata, module info. Optional)
|
||||
|
||||
|
||||
## File Headers
|
||||
|
||||
|
@ -34,7 +35,7 @@ in release builds.
|
|||
### Line data segment
|
||||
|
||||
The line data segment contains information about each instruction in the code segment and associates them
|
||||
1:1 with a line number in the original source file for easier debugging using run-length encoding. The section's
|
||||
1:1 with a line number in the original source file for easier debugging using run-length encoding. The segment's
|
||||
size is fixed and is encoded at the beginning as a sequence of 4 bytes (i.e. a single 32 bit integer). The data
|
||||
in this segment can be decoded as explained in [this file](../src/frontend/compiler/targgets/bytecode/opcodes.nim#L29), which is quoted
|
||||
below:
|
||||
|
@ -57,7 +58,7 @@ below:
|
|||
|
||||
This segment contains details about each function in the original file. The segment's size is fixed and is encoded at the
|
||||
beginning as a sequence of 4 bytes (i.e. a single 32 bit integer). The data in this segment can be decoded as explained
|
||||
in [this file](../src/frontend/compiler/targgets/bytecode/opcodes.nim#L39), which is quoted below:
|
||||
in [this file](../src/frontend/compiler/targets/bytecode/opcodes.nim#L39), which is quoted below:
|
||||
|
||||
```
|
||||
[...]
|
||||
|
@ -74,6 +75,26 @@ in [this file](../src/frontend/compiler/targgets/bytecode/opcodes.nim#L39), whic
|
|||
[...]
|
||||
```
|
||||
|
||||
### Modules segment
|
||||
|
||||
This segment contains details about the modules that make up the original source code which produced a given bytecode dump.
|
||||
The data in this segment can be decoded as explained in [this file](../src/frontend/compiler/targets/bytecode/opcodes.nim#L49), which is quoted below:
|
||||
```
|
||||
[...]
|
||||
## modules contains information about all the peon modules that the compiler has encountered,
|
||||
## along with their start/end offset in the code. Unlike other bytecode-compiled languages like
|
||||
## Python, peon does not produce a bytecode file for each separate module it compiles: everything
|
||||
## is contained within a single binary blob. While this simplifies the implementation and makes
|
||||
## bytecode files entirely "self-hosted", it also means that the original module information is
|
||||
## lost: this segment serves to fix that. The segment's size is encoded at the beginning as a 4-byte
|
||||
## sequence (i.e. a single 32-bit integer) and its encoding is similar to that of the functions segment:
|
||||
## - First, the position into the bytecode where the module begins is encoded (as a 3 byte integer)
|
||||
## - Second, the position into the bytecode where the module ends is encoded (as a 3 byte integer)
|
||||
## - Lastly, the module's name is encoded in ASCII, prepended with its size as a 2-byte integer
|
||||
[...]
|
||||
```
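Review note: the layout described in the quoted comment can be sketched as a small decoder. This is only an illustration, not the serializer's actual code. `ModuleInfo`, `readUInt` and `decodeModules` are hypothetical names, and two things are assumed because the comment does not state them: integers are big-endian, and the 4-byte size counts only the entries that follow it.

```nim
type
  ModuleInfo = object
    start, stop: int   # offsets into the bytecode where the module begins/ends
    name: string

func readUInt(data: seq[byte], pos: var int, width: int): int =
  ## Reads `width` bytes as an unsigned integer and advances `pos`.
  ## Big-endian byte order is an assumption of this sketch.
  for _ in 0 ..< width:
    result = (result shl 8) or int(data[pos])
    inc pos

func decodeModules(data: seq[byte]): seq[ModuleInfo] =
  ## Decodes a modules segment: 4-byte size, then one entry per module
  ## (3-byte start, 3-byte end, 2-byte name length, ASCII name).
  var pos = 0
  let size = readUInt(data, pos, 4)
  let segmentEnd = pos + size   # assumes the size excludes its own 4 bytes
  while pos < segmentEnd:
    var module: ModuleInfo
    module.start = readUInt(data, pos, 3)
    module.stop = readUInt(data, pos, 3)
    let nameLen = readUInt(data, pos, 2)
    for _ in 0 ..< nameLen:
      module.name.add(char(data[pos]))
      inc pos
    result.add(module)

# One module named "std" spanning offsets 0..42 (made-up values)
let blob = @[0'u8, 0, 0, 11, 0, 0, 0, 0, 0, 42, 0, 3,
             byte('s'), byte('t'), byte('d')]
echo decodeModules(blob)
```

The fixed-width start and end fields come first and the name length is explicit, so a reader can walk the segment without any delimiter bytes.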
|
||||
|
||||
|
||||
## Constant segment
|
||||
|
||||
The constant segment contains all the read-only values that the code will need at runtime, such as hardcoded
|
||||
|
@ -87,6 +108,6 @@ real-world scenarios it likely won't be.
|
|||
|
||||
## Code segment
|
||||
|
||||
The code segment contains the linear sequence of bytecode instructions of a peon program. It is to be read directly
|
||||
and without modifications. The segment's size is fixed and is encoded at the beginning as a sequence of 3 bytes
|
||||
The code segment contains the linear sequence of bytecode instructions of a peon program to be fed directly to
|
||||
peon's virtual machine. The segment's size is fixed and is encoded at the beginning as a sequence of 3 bytes
|
||||
(i.e. a single 24 bit integer). All the instructions are documented [here](../src/frontend/compiler/targgets/bytecode/opcodes.nim)
|
|
@ -68,7 +68,8 @@ type
|
|||
## this system and is not handled
|
||||
## manually by the VM
|
||||
bytesAllocated: tuple[total, current: int]
|
||||
cycles: int
|
||||
when debugGC or debugAlloc:
|
||||
cycles: int
|
||||
nextGC: int
|
||||
pointers: HashSet[uint64]
|
||||
PeonVM* = object
|
||||
|
@ -93,9 +94,10 @@ type
|
|||
frames: seq[uint64] # Stores the bottom of stack frames
|
||||
results: seq[uint64] # Stores function return values
|
||||
gc: PeonGC # A reference to the VM's garbage collector
|
||||
breakpoints: seq[uint64] # Breakpoints where we call our debugger
|
||||
debugNext: bool # Whether to debug the next instruction
|
||||
lastDebugCommand: string # The last debugging command input by the user
|
||||
when debugVM:
|
||||
breakpoints: seq[uint64] # Breakpoints where we call our debugger
|
||||
debugNext: bool # Whether to debug the next instruction
|
||||
lastDebugCommand: string # The last debugging command input by the user
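Review note: moving the debugger fields under `when debugVM:` means they only exist in builds where that flag is enabled. A minimal standalone sketch of the pattern follows. `MiniVM` and `newMiniVM` are hypothetical stand-ins, not the real `PeonVM`.

```nim
const debugVM {.booldefine.} = false   # compile-time switch, off by default

type
  MiniVM = object
    ip: uint64
    when debugVM:
      # These fields are compiled in only when debugVM is true
      breakpoints: seq[uint64]
      debugNext: bool

proc newMiniVM(): MiniVM =
  result.ip = 0
  when debugVM:
    # Every access to the conditional fields needs the same guard
    result.breakpoints = @[]
    result.debugNext = false

let vm = newMiniVM()
echo "debug fields present: ", compiles(vm.breakpoints)
```

The trade-off is that every use of those fields must sit behind the same `when` guard, which is what the later changes to `run` in this diff do.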
|
||||
|
||||
|
||||
# Implementation of peon's memory manager
|
||||
|
@ -105,25 +107,17 @@ proc newPeonGC*: PeonGC =
|
|||
## garbage collector
|
||||
result.bytesAllocated = (0, 0)
|
||||
result.nextGC = FirstGC
|
||||
result.cycles = 0
|
||||
when debugGC or debugAlloc:
|
||||
result.cycles = 0
|
||||
|
||||
|
||||
proc collect*(self: var PeonVM)
|
||||
|
||||
|
||||
# Our pointer tagging routines
|
||||
template tag(p: untyped): untyped = cast[pointer](cast[uint64](p) or (1'u64 shl 63'u64))
|
||||
template untag(p: untyped): untyped = cast[pointer](cast[uint64](p) and 0x7fffffffffffffff'u64)
|
||||
template getTag(p: untyped): untyped = (p and (1'u64 shl 63'u64)) == 0
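Review note: for context, the tagging scheme removed here marks a value by setting bit 63. The sketch below works on plain `uint64` values and does no pointer casts. The names `TagBit`, `tag`, `untag` and `isTagged` are hypothetical. `isTagged` is written to return true when the bit is set, which is an assumption about the intended meaning and differs from the removed `getTag`, which compares against zero.

```nim
const TagBit = 1'u64 shl 63   # the highest bit of a 64-bit word

func tag(value: uint64): uint64 = value or TagBit
  ## Sets bit 63, leaving the other bits untouched

func untag(value: uint64): uint64 = value and not TagBit
  ## Clears bit 63, recovering the original value

func isTagged(value: uint64): bool = (value and TagBit) != 0
  ## True when bit 63 is set (assumed meaning, see the note above)

let address = 0x00007f3a1c2b4000'u64   # made-up user-space address
echo isTagged(address)                 # the raw address has bit 63 clear
echo isTagged(tag(address))
echo untag(tag(address)) == address
```

The scheme works only if no valid address ever has bit 63 set, and callers must remember to untag before dereferencing.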
|
||||
|
||||
|
||||
proc reallocate*(self: var PeonVM, p: pointer, oldSize: int, newSize: int): pointer =
|
||||
## Simple wrapper around realloc with
|
||||
## built-in garbage collection. Callers
|
||||
## should keep in mind that the returned
|
||||
## pointer is tagged (bit 63 is set to 1)
|
||||
## and should be passed to untag() before
|
||||
## being dereferenced or otherwise used
|
||||
## built-in garbage collection
|
||||
self.gc.bytesAllocated.current += newSize - oldSize
|
||||
try:
|
||||
when debugMem:
|
||||
|
@ -147,7 +141,7 @@ proc reallocate*(self: var PeonVM, p: pointer, oldSize: int, newSize: int): poin
|
|||
else:
|
||||
if self.gc.bytesAllocated.current >= self.gc.nextGC:
|
||||
self.collect()
|
||||
result = tag(realloc(untag(p), newSize))
|
||||
result = realloc(p, newSize)
|
||||
except NilAccessDefect:
|
||||
stderr.writeLine("Peon: could not manage memory, segmentation fault")
|
||||
quit(139) # For now, there's not much we can do if we can't get the memory we need, so we exit
|
||||
|
@ -178,12 +172,12 @@ proc allocate(self: var PeonVM, kind: ObjectKind, size: typedesc, count: int): p
|
|||
## Allocates an object on the heap and adds its
|
||||
## location to the internal pointer list of the
|
||||
## garbage collector
|
||||
result = cast[ptr HeapObject](untag(self.reallocate(nil, 0, sizeof(HeapObject))))
|
||||
result = cast[ptr HeapObject](self.reallocate(nil, 0, sizeof(HeapObject)))
|
||||
setkind(result[], kind, kind)
|
||||
result.marked = false
|
||||
case kind:
|
||||
of String:
|
||||
result.str = cast[ptr UncheckedArray[char]](untag(self.reallocate(nil, 0, sizeof(size) * count)))
|
||||
result.str = cast[ptr UncheckedArray[char]](self.reallocate(nil, 0, sizeof(size) * count))
|
||||
result.len = count
|
||||
else:
|
||||
discard # TODO
|
||||
|
@ -213,30 +207,33 @@ proc markRoots(self: var PeonVM): HashSet[ptr HeapObject] =
|
|||
# Unlike what Bob does in his book, we keep track
|
||||
# of objects another way, mainly due to the difference
|
||||
# of our respective designs. Specifically, our VM only
|
||||
# handles a single type (uint64) while Lox stores all objects
|
||||
# in heap-allocated structs (which is convenient, but slow).
|
||||
# What we do is store the pointers to the objects we allocated in
|
||||
# a hash set and then, at collection time, do a set difference
|
||||
# between the reachable objects and the whole set and discard
|
||||
# whatever is left; Unfortunately, this means that if a primitive
|
||||
# object's value happens to collide with an active pointer the GC
|
||||
# will mistakenly assume the object to be reachable, potentially
|
||||
# leading to a nasty memory leak. Let's just hope a 48+ bit address
|
||||
# space makes this occurrence rare enough not to be a problem
|
||||
# handles a single type (uint64), while Lox has a stack
|
||||
# of heap-allocated structs (which is convenient, but slow).
|
||||
# The previous implementation would just store all pointers
|
||||
# allocated by us in a hash set and then check if any source
|
||||
# of roots contained any of the integer values that it was
|
||||
# keeping track of, but this meant that if a primitive object's
|
||||
# value happened to collide with an active pointer the GC would
|
||||
# mistakenly assume the object was reachable, potentially leading
|
||||
# to a nasty memory leak. The current implementation uses pointer
|
||||
# tagging: we know that modern CPUs never use bit 63 in addresses,
|
||||
# so if it set we know it cannot be a pointer, and if it is set we
|
||||
# just need to check if it's in our list of active addresses or not.
|
||||
# This should resolve the potential memory leak (hopefully)
|
||||
# What we do instead is store all pointers allocated by us
|
||||
# in a hash set and then check if any source of roots contained
|
||||
# any of the integer values that we're keeping track of. Note
|
||||
# that this means that if a primitive object's value happens to
|
||||
# collide with an active pointer, the GC will mistakenly assume
|
||||
# the object to be reachable (potentially leading to a nasty
|
||||
# memory leak). Hopefully, in a 64-bit address space, this
|
||||
# occurrence is rare enough for us to ignore
|
||||
var result = initHashSet[uint64](self.gc.pointers.len())
|
||||
for obj in self.calls:
|
||||
if not obj.getTag():
|
||||
continue
|
||||
if obj in self.gc.pointers:
|
||||
result.incl(obj)
|
||||
for obj in self.operands:
|
||||
if not obj.getTag():
|
||||
continue
|
||||
if obj in self.gc.pointers:
|
||||
result.incl(obj)
|
||||
result.incl(obj)
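Review note: the hash-set approach described in the comment above can be shown in isolation. This is a simplified sketch with hypothetical names (`findRoots`, `tracked`), not the VM's `markRoots`. It also shows the false positive the comment warns about: a plain integer that happens to equal a tracked address is treated as a root.

```nim
import std/sets

proc findRoots(tracked: HashSet[uint64],
               stacks: openArray[seq[uint64]]): HashSet[uint64] =
  ## Returns every stack value that matches a tracked allocation address
  result = initHashSet[uint64]()
  for stack in stacks:
    for value in stack:
      if value in tracked:
        result.incl(value)

# Two tracked allocations (made-up addresses)
let tracked = toHashSet([0x1000'u64, 0x2000'u64])
let calls = @[0x1000'u64, 7]      # holds a genuine pointer and an integer
let operands = @[8192'u64, 3]     # 8192 == 0x2000: an integer, not a pointer

let roots = findRoots(tracked, [calls, operands])
echo roots.len   # both addresses are kept alive, one of them by coincidence
```

The scan is conservative: it never frees a live object, but it may keep a dead one alive for as long as the colliding integer stays on a stack.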
|
||||
var obj: ptr HeapObject
|
||||
for p in result:
|
||||
obj = cast[ptr HeapObject](p)
|
||||
|
@ -301,7 +298,6 @@ proc sweep(self: var PeonVM) =
|
|||
## during the mark phase.
|
||||
when debugGC:
|
||||
echo "DEBUG - GC: Beginning sweeping phase"
|
||||
when debugGC:
|
||||
var count = 0
|
||||
var current: ptr HeapObject
|
||||
var freed: HashSet[uint64]
|
||||
|
@ -380,19 +376,19 @@ proc newPeonVM*: PeonVM =
|
|||
# Getters for singleton types
|
||||
{.push inline.}
|
||||
|
||||
proc getNil*(self: var PeonVM): uint64 = self.cache[2]
|
||||
func getNil*(self: var PeonVM): uint64 = self.cache[2]
|
||||
|
||||
proc getBool*(self: var PeonVM, value: bool): uint64 =
|
||||
func getBool*(self: var PeonVM, value: bool): uint64 =
|
||||
if value:
|
||||
return self.cache[1]
|
||||
return self.cache[0]
|
||||
|
||||
proc getInf*(self: var PeonVM, positive: bool): uint64 =
|
||||
func getInf*(self: var PeonVM, positive: bool): uint64 =
|
||||
if positive:
|
||||
return self.cache[3]
|
||||
return self.cache[4]
|
||||
|
||||
proc getNan*(self: var PeonVM): uint64 = self.cache[5]
|
||||
func getNan*(self: var PeonVM): uint64 = self.cache[5]
|
||||
|
||||
|
||||
# Thanks to nim's *genius* idea of making x > y a template
|
||||
|
@ -402,11 +398,11 @@ proc getNan*(self: var PeonVM): uint64 = self.cache[5]
|
|||
# and https://github.com/nim-lang/Nim/issues/10425 and try not to
|
||||
# bang your head against the nearest wall), we need a custom operator
|
||||
# that preserves the natural order of evaluation
|
||||
proc `!>`[T](a, b: T): auto {.inline.} =
|
||||
func `!>`[T](a, b: T): auto =
|
||||
b < a
|
||||
|
||||
|
||||
proc `!>=`[T](a, b: T): auto {.inline, used.} =
|
||||
proc `!>=`[T](a, b: T): auto {.used.} =
|
||||
b <= a
|
||||
|
||||
|
||||
|
@ -414,26 +410,26 @@ proc `!>=`[T](a, b: T): auto {.inline, used.} =
|
|||
# that go through the (get|set|peek)c wrappers are frame-relative,
|
||||
# meaning that the given index is added to the current stack frame's
|
||||
# bottom to obtain an absolute stack index
|
||||
proc push(self: var PeonVM, obj: uint64) =
|
||||
func push(self: var PeonVM, obj: uint64) =
|
||||
## Pushes a value object onto the
|
||||
## operand stack
|
||||
self.operands.add(obj)
|
||||
|
||||
|
||||
proc pop(self: var PeonVM): uint64 =
|
||||
func pop(self: var PeonVM): uint64 =
|
||||
## Pops a value off the operand
|
||||
## stack and returns it
|
||||
return self.operands.pop()
|
||||
|
||||
|
||||
proc peekb(self: PeonVM, distance: BackwardsIndex = ^1): uint64 =
|
||||
func peekb(self: PeonVM, distance: BackwardsIndex = ^1): uint64 =
|
||||
## Returns the value at the given (backwards)
|
||||
## distance from the top of the operand stack
|
||||
## without consuming it
|
||||
return self.operands[distance]
|
||||
|
||||
|
||||
proc peek(self: PeonVM, distance: int = 0): uint64 =
|
||||
func peek(self: PeonVM, distance: int = 0): uint64 =
|
||||
## Returns the value at the given
|
||||
## distance from the top of the
|
||||
## operand stack without consuming it
|
||||
|
@ -442,33 +438,33 @@ proc peek(self: PeonVM, distance: int = 0): uint64 =
|
|||
return self.operands[self.operands.high() + distance]
|
||||
|
||||
|
||||
proc pushc(self: var PeonVM, val: uint64) =
|
||||
func pushc(self: var PeonVM, val: uint64) =
|
||||
## Pushes a value onto the
|
||||
## call stack
|
||||
self.calls.add(val)
|
||||
|
||||
|
||||
proc popc(self: var PeonVM): uint64 =
|
||||
func popc(self: var PeonVM): uint64 =
|
||||
## Pops a value off the call
|
||||
## stack and returns it
|
||||
return self.calls.pop()
|
||||
|
||||
|
||||
proc peekc(self: PeonVM, distance: int = 0): uint64 {.used.} =
|
||||
func peekc(self: PeonVM, distance: int = 0): uint64 {.used.} =
|
||||
## Returns the value at the given
|
||||
## distance from the top of the
|
||||
## call stack without consuming it
|
||||
return self.calls[self.calls.high() + distance]
|
||||
|
||||
|
||||
proc getc(self: PeonVM, idx: int): uint64 =
|
||||
func getc(self: PeonVM, idx: int): uint64 =
|
||||
## Getter method that abstracts
|
||||
## indexing our call stack through
|
||||
## stack frames
|
||||
return self.calls[idx.uint64 + self.frames[^1]]
|
||||
|
||||
|
||||
proc setc(self: var PeonVM, idx: int, val: uint64) =
|
||||
func setc(self: var PeonVM, idx: int, val: uint64) =
|
||||
## Setter method that abstracts
|
||||
## indexing our call stack through
|
||||
## stack frames
|
||||
|
@ -700,7 +696,7 @@ proc dispatch*(self: var PeonVM) =
|
|||
while true:
|
||||
{.computedgoto.} # https://nim-lang.org/docs/manual.html#pragmas-computedgoto-pragma
|
||||
when debugVM:
|
||||
if self.ip in self.breakpoints or self.breakpoints.len() == 0 or self.debugNext:
|
||||
if self.ip in self.breakpoints or self.debugNext:
|
||||
self.debug()
|
||||
instruction = OpCode(self.readByte())
|
||||
case instruction:
|
||||
|
@ -768,6 +764,10 @@ proc dispatch*(self: var PeonVM) =
|
|||
# not needed there anymore
|
||||
discard self.pop()
|
||||
discard self.pop()
|
||||
of ReplExit:
|
||||
# Preserves the VM's state for the next
|
||||
# execution. Used in the REPL
|
||||
return
|
||||
of Return:
|
||||
# Returns from a function.
|
||||
# Every peon program is wrapped
|
||||
|
@ -829,9 +829,13 @@ proc dispatch*(self: var PeonVM) =
|
|||
# not a great idea)
|
||||
self.pushc(self.pop())
|
||||
of LoadVar:
|
||||
# Pushes a variable from the call stack
|
||||
# Pushes a local variable from the call stack
|
||||
# onto the operand stack
|
||||
self.push(self.getc(self.readLong().int))
|
||||
of LoadGlobal:
|
||||
# Pushes a global variable from the call stack
|
||||
# onto the operand stack
|
||||
self.push(self.calls[self.readLong().int])
|
||||
of NoOp:
|
||||
# Does nothing
|
||||
continue
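Review note: `LoadVar` reads relative to the current frame and the new `LoadGlobal` reads an absolute slot, and the difference is easiest to see with a toy call stack. `MiniVM` and `getGlobal` below are hypothetical. `getc` mirrors the name of the VM's frame-relative getter but is reimplemented here for illustration.

```nim
type
  MiniVM = object
    calls: seq[uint64]    # the call stack
    frames: seq[uint64]   # bottom index of each stack frame

func getc(self: MiniVM, idx: int): uint64 =
  ## Frame-relative read: idx is added to the current frame's bottom
  self.calls[idx + int(self.frames[^1])]

func getGlobal(self: MiniVM, idx: int): uint64 =
  ## Absolute read: idx indexes the call stack directly
  self.calls[idx]

# Slots 0..1 belong to the outermost frame, slots 2..3 to the current one
let vm = MiniVM(calls: @[10'u64, 20, 30, 40], frames: @[0'u64, 2])
echo vm.getc(1)        # slot 2 + 1 of the call stack
echo vm.getGlobal(1)   # slot 1 of the call stack
```

The same index therefore names different slots depending on the opcode, which is presumably why globals need their own instruction once calls can nest.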
|
||||
|
@ -1002,6 +1006,8 @@ proc dispatch*(self: var PeonVM) =
|
|||
self.push(self.getBool(cast[float32](self.pop()) !>= cast[float32](self.pop())))
|
||||
of Float32LessOrEqual:
|
||||
self.push(self.getBool(cast[float32](self.pop()) <= cast[float32](self.pop())))
|
||||
of Identity:
|
||||
self.push(cast[uint64](self.pop() == self.pop()))
|
||||
# Print opcodes
|
||||
of PrintInt64:
|
||||
echo cast[int64](self.pop())
|
||||
|
@ -1050,23 +1056,41 @@ proc dispatch*(self: var PeonVM) =
|
|||
discard
|
||||
|
||||
|
||||
proc run*(self: var PeonVM, chunk: Chunk, breakpoints: seq[uint64] = @[]) =
|
||||
proc run*(self: var PeonVM, chunk: Chunk, breakpoints: seq[uint64] = @[], repl: bool = false) =
|
||||
## Executes a piece of Peon bytecode
|
||||
self.chunk = chunk
|
||||
self.frames = @[]
|
||||
self.calls = @[]
|
||||
self.operands = @[]
|
||||
self.breakpoints = breakpoints
|
||||
self.results = @[]
|
||||
self.ip = 0
|
||||
self.lastDebugCommand = ""
|
||||
when debugVM:
|
||||
self.breakpoints = breakpoints
|
||||
self.lastDebugCommand = ""
|
||||
try:
|
||||
self.dispatch()
|
||||
except NilAccessDefect:
|
||||
stderr.writeLine("Memory Access Violation: SIGSEGV")
|
||||
quit(1)
|
||||
if not repl:
|
||||
# We clean up after ourselves!
|
||||
self.collect()
|
||||
|
||||
|
||||
proc resume*(self: var PeonVM, chunk: Chunk) =
|
||||
## Resumes execution of the given chunk (which
|
||||
## may have changed since the last call to run()).
|
||||
## No other state mutation occurs and all stacks as
|
||||
## well as other metadata are left intact. This should
|
||||
## not be used directly unless you know what you're
|
||||
## doing, as incremental compilation support is very
|
||||
## experimental and highly unstable
|
||||
self.chunk = chunk
|
||||
try:
|
||||
self.dispatch()
|
||||
except NilAccessDefect:
|
||||
stderr.writeLine("Memory Access Violation: SIGSEGV")
|
||||
quit(1)
|
||||
# We clean up after ourselves!
|
||||
self.collect()
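Review note: the split between `run` (reset everything) and `resume` (swap the chunk, keep the rest) can be illustrated with a toy interpreter whose only instruction pushes a value. Everything here is a hypothetical simplification: the real VM has opcodes, frames and a garbage collector, and it returns early on `ReplExit`.

```nim
type
  MiniVM = object
    code: seq[uint64]
    operands: seq[uint64]
    ip: int

proc dispatch(self: var MiniVM) =
  ## Executes from the current instruction pointer to the end of the code
  while self.ip < self.code.len:
    self.operands.add(self.code[self.ip])
    inc self.ip

proc run(self: var MiniVM, code: seq[uint64]) =
  ## Fresh execution: all state is reset first
  self.code = code
  self.operands = @[]
  self.ip = 0
  self.dispatch()

proc resume(self: var MiniVM, code: seq[uint64]) =
  ## Continues with a (possibly longer) chunk, keeping stacks and ip intact
  self.code = code
  self.dispatch()

var vm: MiniVM
vm.run(@[1'u64, 2])
vm.resume(@[1'u64, 2, 3])   # only the newly appended instruction runs
echo vm.operands
```

In this toy version `resume` is only safe when the new chunk extends the old one without changing what has already been executed, which matches the docstring's warning that the feature is experimental.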
|
||||
|
||||
|
||||
{.pop.}
|
||||
|
|
|
@ -15,14 +15,14 @@
|
|||
import strformat
|
||||
|
||||
# These variables can be tweaked to debug and test various components of the toolchain
|
||||
const debugLexer* {.booldefine.} = false # Print the tokenizer's output
|
||||
const debugParser* {.booldefine.} = false # Print the AST generated by the parser
|
||||
const debugCompiler* {.booldefine.} = false # Disassemble and/or print the code generated by the compiler
|
||||
var debugLexer* = false # Print the tokenizer's output
|
||||
var debugParser* = false # Print the AST generated by the parser
|
||||
var debugCompiler* = false # Disassemble and/or print the code generated by the compiler
|
||||
const debugVM* {.booldefine.} = false # Enable the runtime debugger in the bytecode VM
|
||||
const debugGC* {.booldefine.} = false # Debug the Garbage Collector (extremely verbose)
|
||||
const debugAlloc* {.booldefine.} = false # Trace object allocation (extremely verbose)
|
||||
const debugMem* {.booldefine.} = false # Debug the memory allocator (extremely verbose)
|
||||
const debugSerializer* {.booldefine.} = false # Validate the bytecode serializer's output
|
||||
var debugSerializer* = false # Validate the bytecode serializer's output
|
||||
const debugStressGC* {.booldefine.} = false # Make the GC run a collection at every allocation (VERY SLOW!)
|
||||
const debugMarkGC* {.booldefine.} = false # Trace the marking phase object by object (extremely verbose)
|
||||
const PeonBytecodeMarker* = "PEON_BYTECODE" # Magic value at the beginning of bytecode files
|
||||
|
@ -70,8 +70,11 @@ Options
|
|||
yes/on and no/off
|
||||
--noWarn Disable a specific warning (for example, --noWarn:unusedVariable)
|
||||
--showMismatches Show all mismatches when function dispatching fails (output is really verbose)
|
||||
--backend Select the compilation backend (valid values are: 'c', 'cpp' and 'bytecode'). Note
|
||||
--backend Select the compilation backend (valid values are: 'c' and 'bytecode'). Note
|
||||
that the REPL always uses the bytecode target. Defaults to 'bytecode'
|
||||
-o, --output Rename the output file with this value (with --backend:bytecode, a '.pbc' extension
|
||||
is added if not already present)
|
||||
--debug-dump Debug the bytecode serializer. Only makes sense with --backend:bytecode
|
||||
--debug-lexer Debug the peon lexer
|
||||
--debug-parser Debug the peon parser
|
||||
"""
|
||||
|
|
|
@ -12,19 +12,7 @@
|
|||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
# Copyright 2022 Mattia Giambirtone & All Contributors
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
import std/tables
|
||||
import std/strformat
|
||||
import std/algorithm
|
||||
|
@ -52,7 +40,7 @@ export ast, token, symbols, config, errors
|
|||
type
|
||||
PeonBackend* = enum
|
||||
## An enumeration of the peon backends
|
||||
Bytecode, NativeC, NativeCpp
|
||||
Bytecode, NativeC
|
||||
|
||||
PragmaKind* = enum
|
||||
## An enumeration of pragma types
|
||||
|
@ -146,7 +134,7 @@ type
|
|||
node*: Declaration
|
||||
# Who is this name exported to? (Only makes sense if isPrivate
|
||||
# equals false)
|
||||
exportedTo*: HashSet[Name]
|
||||
exportedTo*: HashSet[string]
|
||||
# Has the compiler generated this name internally or
|
||||
# does it come from user code?
|
||||
isReal*: bool
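Review note: the diff does not state why `exportedTo` now stores strings instead of `Name` objects. One plausible reading is that membership is now decided by module path rather than by object identity. The sketch below shows the difference with a hypothetical `Module` type and a made-up path.

```nim
import std/sets

type
  Module = ref object
    path: string

# Two distinct objects describing the same module (hypothetical path)
let first = Module(path: "src/peon/stdlib/std.pn")
let second = Module(path: "src/peon/stdlib/std.pn")

# `==` on refs compares identity, so these are different values
echo first == second

# A set keyed by path treats them as the same module
var exportedTo = initHashSet[string]()
exportedTo.incl(first.path)
echo second.path in exportedTo
```

Under that reading, a lookup no longer depends on holding the exact same object, at the cost of assuming that paths identify modules uniquely.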
|
||||
|
@ -224,7 +212,7 @@ type
|
|||
# The module importing us, if any
|
||||
parentModule*: Name
|
||||
# Currently imported modules
|
||||
modules*: HashSet[Name]
|
||||
modules*: HashSet[string]
|
||||
|
||||
TypedNode* = ref object
|
||||
## A wrapper for AST nodes
|
||||
|
@ -353,11 +341,9 @@ proc step*(self: Compiler): ASTNode {.inline.} =
|
|||
# and can be reused across multiple compilation backends
|
||||
|
||||
proc resolve*(self: Compiler, name: string): Name =
|
||||
## Traverses all existing namespaces and returns
|
||||
## the first object with the given name. Returns
|
||||
## nil when the name can't be found. Note that
|
||||
## when a type or function declaration is first
|
||||
## resolved, it is also compiled on-the-fly
|
||||
## Traverses all existing namespaces in reverse order
|
||||
## and returns the first object with the given name.
|
||||
## Returns nil when the name can't be found
|
||||
for obj in reversed(self.names):
|
||||
if obj.ident.token.lexeme == name:
|
||||
if obj.owner.path != self.currentModule.path:
|
||||
|
@ -368,11 +354,12 @@ proc resolve*(self: Compiler, name: string): Name =
|
|||
# module, so we definitely can't
|
||||
# use it
|
||||
continue
|
||||
elif self.currentModule in obj.exportedTo:
|
||||
elif self.currentModule.path in obj.exportedTo:
|
||||
# The name is public in its owner
|
||||
# module and said module has explicitly
|
||||
# exported it to us: we can use it
|
||||
result = obj
|
||||
result.resolved = true
|
||||
break
|
||||
# If the name is public but not exported in
|
||||
# its owner module, then we act as if it's
|
||||
|
@ -382,6 +369,7 @@ proc resolve*(self: Compiler, name: string): Name =
|
|||
# might not want to also have access to C's and D's
|
||||
# names as they might clash with its own stuff)
|
||||
continue
|
||||
# We own this name, so we can definitely access it
|
||||
result = obj
|
||||
result.resolved = true
|
||||
break
|
||||
|
@ -725,7 +713,7 @@ method findByName*(self: Compiler, name: string): seq[Name] =
|
|||
for obj in reversed(self.names):
|
||||
if obj.ident.token.lexeme == name:
|
||||
if obj.owner.path != self.currentModule.path:
|
||||
if obj.isPrivate or self.currentModule notin obj.exportedTo:
|
||||
if obj.isPrivate or self.currentModule.path notin obj.exportedTo:
|
||||
continue
|
||||
result.add(obj)
|
||||
|
||||
|
@ -739,11 +727,13 @@ method findInModule*(self: Compiler, name: string, module: Name): seq[Name] =
|
|||
## the current one or not
|
||||
if name == "":
|
||||
for obj in reversed(self.names):
|
||||
if not obj.isPrivate and obj.owner == module:
|
||||
if obj.owner.isNil():
|
||||
continue
|
||||
if not obj.isPrivate and obj.owner.path == module.path:
|
||||
result.add(obj)
|
||||
else:
|
||||
for obj in self.findInModule("", module):
|
||||
if obj.ident.token.lexeme == name and self.currentModule in obj.exportedTo:
|
||||
if obj.ident.token.lexeme == name and self.currentModule.path in obj.exportedTo:
|
||||
result.add(obj)
|
||||
|
||||
|
||||
|
@ -1046,7 +1036,7 @@ proc declare*(self: Compiler, node: ASTNode): Name {.discardable.} =
|
|||
break
|
||||
if name.ident.token.lexeme != declaredName:
|
||||
continue
|
||||
if name.owner != n.owner and (name.isPrivate or n.owner notin name.exportedTo):
|
||||
if name.owner != n.owner and (name.isPrivate or n.owner.path notin name.exportedTo):
|
||||
continue
|
||||
if name.kind in [NameKind.Var, NameKind.Module, NameKind.CustomType, NameKind.Enum]:
|
||||
if name.depth < n.depth:
|
||||
|
|
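The hunks above switch `exportedTo` membership tests from `Name` objects to module paths. A minimal sketch of that visibility rule follows, assuming `exportedTo` is a `HashSet[string]` of module paths. The `Name` type here is a simplified, hypothetical stand-in (the real one stores an identifier node and more fields), and `canAccess` is an illustrative helper, not a procedure from the peon codebase.

```nim
import std/sets

type
  Name = ref object
    ident: string                # simplified: the real Name holds an AST identifier
    path: string                 # only meaningful for module names
    owner: Name                  # module that declared this name
    isPrivate: bool
    exportedTo: HashSet[string]  # assumption: paths of modules allowed to see this name

proc canAccess(name: Name, currentModule: Name): bool =
  ## Hypothetical helper mirroring the checks in resolve():
  ## returns true if currentModule may use the given name
  if name.owner.isNil():
    return false
  if name.owner.path == currentModule.path:
    # We own this name, so we can always access it
    return true
  if name.isPrivate:
    # Private names are never visible outside their owner
    return false
  # Public names are visible only if explicitly exported to us
  result = currentModule.path in name.exportedTo

when isMainModule:
  let moduleA = Name(ident: "a", path: "src/a.pn")
  let moduleB = Name(ident: "b", path: "src/b.pn")
  let fun = Name(ident: "helper", owner: moduleA)
  echo canAccess(fun, moduleB)   # not exported yet
  fun.exportedTo.incl(moduleB.path)
  echo canAccess(fun, moduleB)   # exported by path
```

Keying the set by path means two distinct `Name` objects that describe the same module file are treated as the same module, which is presumably the motivation for the change.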
File diff suppressed because it is too large
|
@ -46,10 +46,21 @@ type
|
|||
## - After that follows the argument count as a 1 byte integer
|
||||
## - Lastly, the function's name (optional) is encoded in ASCII, prepended with
|
||||
## its size as a 2-byte integer
|
||||
## modules contains information about all the peon modules that the compiler has encountered,
|
||||
## along with their start/end offset in the code. Unlike other bytecode-compiled languages like
|
||||
## Python, peon does not produce a bytecode file for each separate module it compiles: everything
|
||||
## is contained within a single binary blob. While this simplifies the implementation and makes
|
||||
## bytecode files entirely self-contained, it also means that the original module information is
|
||||
## lost: this segment serves to fix that. The segment's size is encoded at the beginning as a 4-byte
|
||||
## sequence (i.e. a single 32-bit integer) and its encoding is similar to that of the functions segment:
|
||||
## - First, the position into the bytecode where the module begins is encoded (as a 3-byte integer)
|
||||
## - Second, the position into the bytecode where the module ends is encoded (as a 3-byte integer)
|
||||
## - Lastly, the module's name is encoded in ASCII, prepended with its size as a 2-byte integer
|
||||
consts*: seq[uint8]
|
||||
code*: seq[uint8]
|
||||
lines*: seq[int]
|
||||
functions*: seq[uint8]
|
||||
modules*: seq[uint8]
|
||||
|
||||
OpCode* {.pure.} = enum
|
||||
## Enum of Peon's bytecode opcodes
|
||||
|
@ -136,6 +147,7 @@ type
|
|||
Float32GreaterOrEqual,
|
||||
Float32LessOrEqual,
|
||||
LogicalNot,
|
||||
Identity, # Pointer equality
|
||||
## Print opcodes
|
||||
PrintInt64,
|
||||
PrintUInt64,
|
||||
|
@ -188,7 +200,9 @@ type
|
|||
PushC, # Pop off the operand stack onto the call stack
|
||||
SysClock64, # Pushes the output of a monotonic clock on the stack
|
||||
LoadTOS, # Pushes the top of the call stack onto the operand stack
|
||||
DupTop # Duplicates the top of the operand stack onto the operand stack
|
||||
DupTop, # Duplicates the top of the operand stack onto the operand stack
|
||||
ReplExit, # Exits the VM immediately, leaving its state intact. Used in the REPL
|
||||
LoadGlobal # Loads a global variable
|
||||
|
||||
|
||||
# We group instructions by their operation/operand types for easier handling when debugging
|
||||
|
@ -267,7 +281,9 @@ const simpleInstructions* = {Return, LoadNil,
|
|||
Float32LessThan,
|
||||
Float32GreaterOrEqual,
|
||||
Float32LessOrEqual,
|
||||
DupTop
|
||||
DupTop,
|
||||
ReplExit,
|
||||
Identity
|
||||
}
|
||||
|
||||
# Constant instructions are instructions that operate on the bytecode constant table
|
||||
|
@ -280,7 +296,7 @@ const constantInstructions* = {LoadInt64, LoadUInt64,
|
|||
|
||||
# Stack triple instructions operate on the stack at arbitrary offsets and pop arguments off of it in the form
|
||||
# of 24 bit integers
|
||||
const stackTripleInstructions* = {StoreVar, LoadVar, }
|
||||
const stackTripleInstructions* = {StoreVar, LoadVar, LoadGlobal}
|
||||
|
||||
# Stack double instructions operate on the stack at arbitrary offsets and pop arguments off of it in the form
|
||||
# of 16 bit integers
|
||||
|
|
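The doc comment for the `modules` segment describes each entry as a 3-byte start offset, a 3-byte end offset, and a 2-byte length followed by the ASCII name. A rough sketch of that layout is shown below. The helpers `toTriple` and `toDouble` are local stand-ins for the ones peon uses, and their byte order (little-endian here) is an assumption, since the diff does not show it. `encodeModule` is a hypothetical name.

```nim
# Stand-in helpers: byte order is assumed to be little-endian
proc toTriple(n: int): array[3, uint8] =
  [uint8(n and 0xFF), uint8((n shr 8) and 0xFF), uint8((n shr 16) and 0xFF)]

proc toDouble(n: int): array[2, uint8] =
  [uint8(n and 0xFF), uint8((n shr 8) and 0xFF)]

proc encodeModule(start, stop: int, name: string): seq[uint8] =
  ## Encodes one module entry: start (3 bytes), stop (3 bytes),
  ## name length (2 bytes), then the name's bytes
  doAssert name.len < 65536, "module name does not fit in a 2-byte length"
  result.add(toTriple(start))
  result.add(toTriple(stop))
  result.add(toDouble(name.len))
  for c in name:
    result.add(uint8(ord(c)))

when isMainModule:
  # A module named "ab" spanning bytecode offsets 0 to 10
  echo encodeModule(0, 10, "ab")
```

Because offsets are 3 bytes wide, this layout can only address the first 2^24 bytes of bytecode, the same limit implied by the 24-bit operands used elsewhere in the instruction set.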
|
@ -461,7 +461,8 @@ proc handleBuiltinFunction(self: BytecodeCompiler, fn: Type, args: seq[Expressio
|
|||
"PrintString": PrintString,
|
||||
"SysClock64": SysClock64,
|
||||
"LogicalNot": LogicalNot,
|
||||
"NegInf": LoadNInf
|
||||
"NegInf": LoadNInf,
|
||||
"Identity": Identity
|
||||
}.to_table()
|
||||
if fn.builtinOp == "print":
|
||||
let typ = self.inferOrError(args[0])
|
||||
|
@ -565,6 +566,8 @@ proc endScope(self: BytecodeCompiler) =
|
|||
var names: seq[Name] = @[]
|
||||
var popCount = 0
|
||||
for name in self.names:
|
||||
if self.replMode and name.depth == 0:
|
||||
continue
|
||||
# We only pop names in scopes deeper than ours
|
||||
if name.depth > self.depth:
|
||||
if name.depth == 0 and not self.isMainModule:
|
||||
|
@ -999,9 +1002,12 @@ proc terminateProgram(self: BytecodeCompiler, pos: int) =
|
|||
## Utility to terminate a peon program
|
||||
self.patchForwardDeclarations()
|
||||
self.endScope()
|
||||
self.emitByte(OpCode.Return, self.peek().token.line)
|
||||
self.emitByte(0, self.peek().token.line) # Entry point has no return value (TODO: Add easter eggs, cuz why not)
|
||||
self.patchReturnAddress(pos)
|
||||
if self.replMode:
|
||||
self.emitByte(ReplExit, self.peek().token.line)
|
||||
else:
|
||||
self.emitByte(OpCode.Return, self.peek().token.line)
|
||||
self.emitByte(0, self.peek().token.line) # Entry point has no return value
|
||||
self.patchReturnAddress(pos)
|
||||
|
||||
|
||||
proc beginProgram(self: BytecodeCompiler): int =
|
||||
|
@ -1228,10 +1234,14 @@ method identifier(self: BytecodeCompiler, node: IdentExpr, name: Name = nil, com
|
|||
if not s.belongsTo.isNil() and s.belongsTo.valueType.fun.kind == funDecl and FunDecl(s.belongsTo.valueType.fun).isTemplate:
|
||||
discard
|
||||
else:
|
||||
# Loads a regular variable from the current frame
|
||||
self.emitByte(LoadVar, s.ident.token.line)
|
||||
# No need to check for -1 here: we already did a nil check above!
|
||||
self.emitBytes(s.position.toTriple(), s.ident.token.line)
|
||||
if s.depth > 0:
|
||||
# Loads a regular variable from the current frame
|
||||
self.emitByte(LoadVar, s.ident.token.line)
|
||||
# No need to check for -1 here: we already did a nil check above!
|
||||
self.emitBytes(s.position.toTriple(), s.ident.token.line)
|
||||
else:
|
||||
self.emitByte(LoadGlobal, s.ident.token.line)
|
||||
self.emitBytes(s.position.toTriple(), s.ident.token.line)
|
||||
|
||||
|
||||
method assignment(self: BytecodeCompiler, node: ASTNode, compile: bool = true): Type {.discardable.} =
|
||||
|
@ -1468,8 +1478,9 @@ method lambdaExpr(self: BytecodeCompiler, node: LambdaExpr, compile: bool = true
|
|||
line: node.token.line,
|
||||
kind: NameKind.Function,
|
||||
belongsTo: function,
|
||||
isReal: true)
|
||||
if compile and node notin self.lambdas:
|
||||
isReal: true,
|
||||
)
|
||||
if compile and node notin self.lambdas and not node.body.isNil():
|
||||
self.lambdas.add(node)
|
||||
let jmp = self.emitJump(JumpForwards, node.token.line)
|
||||
if BlockStmt(node.body).code.len() == 0:
|
||||
|
@ -1677,7 +1688,7 @@ proc importStmt(self: BytecodeCompiler, node: ImportStmt, compile: bool = true)
|
|||
# Importing a module automatically exports
|
||||
# its public names to us
|
||||
for name in self.findInModule("", module):
|
||||
name.exportedTo.incl(self.currentModule)
|
||||
name.exportedTo.incl(self.currentModule.path)
|
||||
except IOError:
|
||||
self.error(&"could not import '{module.ident.token.lexeme}': {getCurrentExceptionMsg()}")
|
||||
except OSError:
|
||||
|
@ -1695,22 +1706,22 @@ proc exportStmt(self: BytecodeCompiler, node: ExportStmt, compile: bool = true)
|
|||
var name = self.resolveOrError(node.name)
|
||||
if name.isPrivate:
|
||||
self.error("cannot export private names")
|
||||
name.exportedTo.incl(self.parentModule)
|
||||
name.exportedTo.incl(self.parentModule.path)
|
||||
case name.kind:
|
||||
of NameKind.Module:
|
||||
# We need to export everything
|
||||
# this module defines!
|
||||
for name in self.findInModule("", name):
|
||||
name.exportedTo.incl(self.parentModule)
|
||||
name.exportedTo.incl(self.parentModule.path)
|
||||
of NameKind.Function:
|
||||
# Only exporting a single function (or, well,
|
||||
# all of its implementations)
|
||||
for name in self.findByName(name.ident.token.lexeme):
|
||||
if name.kind != NameKind.Function:
|
||||
continue
|
||||
name.exportedTo.incl(self.parentModule)
|
||||
name.exportedTo.incl(self.parentModule.path)
|
||||
else:
|
||||
discard
|
||||
self.error("unsupported export type")
|
||||
|
||||
|
||||
proc breakStmt(self: BytecodeCompiler, node: BreakStmt) =
|
||||
|
@ -1972,12 +1983,12 @@ proc funDecl(self: BytecodeCompiler, node: FunDecl, name: Name) =
|
|||
self.patchJump(jump)
|
||||
self.endScope()
|
||||
# Terminates the function's context
|
||||
let stop = self.chunk.code.len().toTriple()
|
||||
self.emitByte(OpCode.Return, self.peek().token.line)
|
||||
if hasVal:
|
||||
self.emitByte(1, self.peek().token.line)
|
||||
else:
|
||||
self.emitByte(0, self.peek().token.line)
|
||||
let stop = self.chunk.code.len().toTriple()
|
||||
self.chunk.functions[idx] = stop[0]
|
||||
self.chunk.functions[idx + 1] = stop[1]
|
||||
self.chunk.functions[idx + 2] = stop[2]
|
||||
|
@ -2046,26 +2057,32 @@ proc compile*(self: BytecodeCompiler, ast: seq[Declaration], file: string, lines
|
|||
self.chunk = newChunk()
|
||||
else:
|
||||
self.chunk = chunk
|
||||
self.ast = ast
|
||||
self.file = file
|
||||
self.depth = 0
|
||||
self.currentFunction = nil
|
||||
self.current = 0
|
||||
self.lines = lines
|
||||
self.source = source
|
||||
if self.replMode:
|
||||
self.ast &= ast
|
||||
self.source &= "\n" & source
|
||||
self.lines &= lines
|
||||
else:
|
||||
self.ast = ast
|
||||
self.current = 0
|
||||
self.stackIndex = 1
|
||||
self.lines = lines
|
||||
self.source = source
|
||||
self.isMainModule = isMainModule
|
||||
self.disabledWarnings = disabledWarnings
|
||||
self.showMismatches = showMismatches
|
||||
self.mode = mode
|
||||
self.stackIndex = 1
|
||||
let start = self.chunk.code.len()
|
||||
if not incremental:
|
||||
self.jumps = @[]
|
||||
let pos = self.beginProgram()
|
||||
let idx = self.stackIndex
|
||||
self.stackIndex = idx
|
||||
while not self.done():
|
||||
self.declaration(Declaration(self.step()))
|
||||
self.terminateProgram(pos)
|
||||
# TODO: REPL is broken, we need a new way to make
|
||||
# incremental compilation resume from where it stopped!
|
||||
result = self.chunk
|
||||
|
||||
|
||||
|
@ -2083,7 +2100,7 @@ proc compileModule(self: BytecodeCompiler, module: Name) =
|
|||
break
|
||||
elif i == searchPath.high():
|
||||
self.error(&"""could not import '{path}': module not found""")
|
||||
if self.modules.contains(module):
|
||||
if self.modules.contains(module.path):
|
||||
return
|
||||
let source = readFile(path)
|
||||
let current = self.current
|
||||
|
@ -2094,13 +2111,23 @@ proc compileModule(self: BytecodeCompiler, module: Name) =
|
|||
let currentModule = self.currentModule
|
||||
let mainModule = self.isMainModule
|
||||
let parentModule = self.parentModule
|
||||
let replMode = self.replMode
|
||||
self.replMode = false
|
||||
self.parentModule = currentModule
|
||||
self.currentModule = module
|
||||
let start = self.chunk.code.len()
|
||||
discard self.compile(self.parser.parse(self.lexer.lex(source, path),
|
||||
path, self.lexer.getLines(),
|
||||
self.lexer.getSource(), persist=true),
|
||||
path, self.lexer.getLines(), self.lexer.getSource(), chunk=self.chunk, incremental=true,
|
||||
isMainModule=false, self.disabledWarnings, self.showMismatches, self.mode)
|
||||
# Mark the end of a new module
|
||||
self.chunk.modules.extend(start.toTriple())
|
||||
self.chunk.modules.extend(self.chunk.code.high().toTriple())
|
||||
# I swear to god if someone ever creates a peon module with a name that's
|
||||
# longer than 2^16 bytes I will hit them with a metal pipe. Mark my words
|
||||
self.chunk.modules.extend(self.currentModule.ident.token.lexeme.len().toDouble())
|
||||
self.chunk.modules.extend(self.currentModule.ident.token.lexeme.toBytes())
|
||||
module.file = path
|
||||
# No need to save the old scope depth: import statements are
|
||||
# only allowed at the top level!
|
||||
|
@ -2111,6 +2138,7 @@ proc compileModule(self: BytecodeCompiler, module: Name) =
|
|||
self.currentModule = currentModule
|
||||
self.isMainModule = mainModule
|
||||
self.parentModule = parentModule
|
||||
self.replMode = replMode
|
||||
self.lines = lines
|
||||
self.source = src
|
||||
self.modules.incl(module)
|
||||
self.modules.incl(module.path)
|
||||
|
|
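The `identifier` hunk above picks between `LoadVar` and the new `LoadGlobal` opcode based on the scope depth of the resolved name. The sketch below isolates that decision. The two-member `OpCode` enum, its numeric values, `emitLoad`, and the little-endian `toTriple` are all simplified assumptions for illustration, not peon's actual definitions.

```nim
type
  OpCode = enum
    # Assumption: only the two opcodes relevant here, with arbitrary values
    LoadVar, LoadGlobal

proc toTriple(n: int): array[3, uint8] =
  ## Stand-in helper, byte order assumed little-endian
  [uint8(n and 0xFF), uint8((n shr 8) and 0xFF), uint8((n shr 16) and 0xFF)]

proc emitLoad(code: var seq[uint8], depth, position: int) =
  ## Names declared at depth 0 are globals; anything deeper lives in
  ## the current call frame. Both take a 24-bit stack position operand
  let op = if depth > 0: LoadVar else: LoadGlobal
  code.add(uint8(ord(op)))
  code.add(toTriple(position))

when isMainModule:
  var code: seq[uint8]
  emitLoad(code, depth = 0, position = 2)   # global
  emitLoad(code, depth = 1, position = 5)   # local
  echo code
```

Since both opcodes carry the same operand shape, adding `LoadGlobal` to `stackTripleInstructions` (as the opcode hunk does) is enough for the disassembler to print it.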
|
@ -22,12 +22,15 @@ import std/terminal
|
|||
|
||||
|
||||
type
|
||||
Function = ref object
|
||||
start, stop, bottom, argc: int
|
||||
Function = object
|
||||
start, stop, argc: int
|
||||
name: string
|
||||
Module = object
|
||||
start, stop: int
|
||||
name: string
|
||||
started, stopped: bool
|
||||
Debugger* = ref object
|
||||
chunk: Chunk
|
||||
modules: seq[Module]
|
||||
functions: seq[Function]
|
||||
current: int
|
||||
|
||||
|
@ -66,21 +69,38 @@ proc checkFunctionStart(self: Debugger, n: int) =
|
|||
## Checks if a function begins at the given
|
||||
## bytecode offset
|
||||
for i, e in self.functions:
|
||||
if n == e.start and not (e.started or e.stopped):
|
||||
e.started = true
|
||||
# Avoids duplicate output
|
||||
if n == e.start:
|
||||
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Function Start ", fgYellow, &"'{e.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
||||
styledEcho fgGreen, "\t- Start offset: ", fgYellow, $e.start
|
||||
styledEcho fgGreen, "\t- End offset: ", fgYellow, $e.stop
|
||||
styledEcho fgGreen, "\t- Argument count: ", fgYellow, $e.argc
|
||||
styledEcho fgGreen, "\t- Argument count: ", fgYellow, $e.argc, "\n"
|
||||
|
||||
|
||||
proc checkFunctionEnd(self: Debugger, n: int) =
|
||||
## Checks if a function ends at the given
|
||||
## bytecode offset
|
||||
for i, e in self.functions:
|
||||
if n == e.stop and e.started and not e.stopped:
|
||||
e.stopped = true
|
||||
if n == e.stop:
|
||||
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Function End ", fgYellow, &"'{e.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
||||
|
||||
|
||||
proc checkModuleStart(self: Debugger, n: int) =
|
||||
## Checks if a module begins at the given
|
||||
## bytecode offset
|
||||
for i, m in self.modules:
|
||||
if m.start == n:
|
||||
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Module Start ", fgYellow, &"'{m.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
||||
styledEcho fgGreen, "\t- Start offset: ", fgYellow, $m.start
|
||||
styledEcho fgGreen, "\t- End offset: ", fgYellow, $m.stop, "\n"
|
||||
|
||||
|
||||
proc checkModuleEnd(self: Debugger, n: int) =
|
||||
## Checks if a module ends at the given
|
||||
## bytecode offset
|
||||
for i, m in self.modules:
|
||||
if m.stop == n:
|
||||
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Module End ", fgYellow, &"'{m.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
||||
|
||||
|
||||
proc simpleInstruction(self: Debugger, instruction: OpCode) =
|
||||
|
@ -94,9 +114,6 @@ proc simpleInstruction(self: Debugger, instruction: OpCode) =
|
|||
else:
|
||||
stdout.styledWriteLine(fgYellow, "No")
|
||||
self.current += 1
|
||||
self.checkFunctionEnd(self.current - 2)
|
||||
self.checkFunctionEnd(self.current - 1)
|
||||
self.checkFunctionEnd(self.current)
|
||||
|
||||
|
||||
proc stackTripleInstruction(self: Debugger, instruction: OpCode) =
|
||||
|
@ -168,20 +185,27 @@ proc jumpInstruction(self: Debugger, instruction: OpCode) =
|
|||
self.current += 4
|
||||
while self.chunk.code[self.current] == NoOp.uint8:
|
||||
inc(self.current)
|
||||
for i in countup(orig, self.current + 1):
|
||||
self.checkFunctionStart(i)
|
||||
|
||||
|
||||
proc disassembleInstruction*(self: Debugger) =
|
||||
## Takes one bytecode instruction and prints it
|
||||
let opcode = OpCode(self.chunk.code[self.current])
|
||||
self.checkModuleStart(self.current)
|
||||
self.checkFunctionStart(self.current)
|
||||
printDebug("Offset: ")
|
||||
stdout.styledWriteLine(fgYellow, $(self.current))
|
||||
printDebug("Line: ")
|
||||
stdout.styledWriteLine(fgYellow, &"{self.chunk.getLine(self.current)}")
|
||||
var opcode = OpCode(self.chunk.code[self.current])
|
||||
case opcode:
|
||||
of simpleInstructions:
|
||||
self.simpleInstruction(opcode)
|
||||
# Functions (and modules) only have a single return statement at the
|
||||
# end of their body, so we never execute this more than once per module/function
|
||||
if opcode == Return:
|
||||
# -2 to skip the hardcoded argument to return
|
||||
# and the increment by simpleInstruction()
|
||||
self.checkFunctionEnd(self.current - 2)
|
||||
self.checkModuleEnd(self.current - 1)
|
||||
of constantInstructions:
|
||||
self.constantInstruction(opcode)
|
||||
of stackDoubleInstructions:
|
||||
|
@ -197,7 +221,9 @@ proc disassembleInstruction*(self: Debugger) =
|
|||
else:
|
||||
echo &"DEBUG - Unknown opcode {opcode} at index {self.current}"
|
||||
self.current += 1
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
proc parseFunctions(self: Debugger) =
|
||||
## Parses function information in the chunk
|
||||
|
@ -206,7 +232,7 @@ proc parseFunctions(self: Debugger) =
|
|||
name: string
|
||||
idx = 0
|
||||
size = 0
|
||||
while idx < len(self.chunk.functions) - 1:
|
||||
while idx < self.chunk.functions.high():
|
||||
start = int([self.chunk.functions[idx], self.chunk.functions[idx + 1], self.chunk.functions[idx + 2]].fromTriple())
|
||||
idx += 3
|
||||
stop = int([self.chunk.functions[idx], self.chunk.functions[idx + 1], self.chunk.functions[idx + 2]].fromTriple())
|
||||
|
@ -220,15 +246,36 @@ proc parseFunctions(self: Debugger) =
|
|||
self.functions.add(Function(start: start, stop: stop, argc: argc, name: name))
|
||||
|
||||
|
||||
proc parseModules(self: Debugger) =
|
||||
## Parses module information in the chunk
|
||||
var
|
||||
start, stop: int
|
||||
name: string
|
||||
idx = 0
|
||||
size = 0
|
||||
while idx < self.chunk.modules.high():
|
||||
start = int([self.chunk.modules[idx], self.chunk.modules[idx + 1], self.chunk.modules[idx + 2]].fromTriple())
|
||||
idx += 3
|
||||
stop = int([self.chunk.modules[idx], self.chunk.modules[idx + 1], self.chunk.modules[idx + 2]].fromTriple())
|
||||
idx += 3
|
||||
size = int([self.chunk.modules[idx], self.chunk.modules[idx + 1]].fromDouble())
|
||||
idx += 2
|
||||
name = self.chunk.modules[idx..<idx + size].fromBytes()
|
||||
inc(idx, size)
|
||||
self.modules.add(Module(start: start, stop: stop, name: name))
|
||||
|
||||
|
||||
proc disassembleChunk*(self: Debugger, chunk: Chunk, name: string) =
|
||||
## Takes a chunk of bytecode and prints it
|
||||
self.chunk = chunk
|
||||
styledEcho fgBlue, &"==== Peon Bytecode Disassembler - Chunk '{name}' ====\n"
|
||||
self.current = 0
|
||||
self.parseFunctions()
|
||||
self.parseModules()
|
||||
while self.current < self.chunk.code.len:
|
||||
self.disassembleInstruction()
|
||||
echo ""
|
||||
|
||||
styledEcho fgBlue, &"==== Peon Bytecode Disassembler - Chunk '{name}' ===="
|
||||
|
||||
|
||||
|
|
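`parseModules` above walks the `modules` segment entry by entry. A standalone sketch of the same decoding loop is given below, assuming the layout from the chunk's doc comment (3-byte start, 3-byte stop, 2-byte name length, name bytes). `fromTriple` and `fromDouble` are local stand-ins with an assumed little-endian byte order, and this `parseModules` takes the raw segment as a parameter instead of reading it from a debugger object.

```nim
type
  Module = object
    start, stop: int
    name: string

# Stand-in helpers: byte order is assumed to be little-endian
proc fromTriple(b: openArray[uint8]): int =
  int(b[0]) or (int(b[1]) shl 8) or (int(b[2]) shl 16)

proc fromDouble(b: openArray[uint8]): int =
  int(b[0]) or (int(b[1]) shl 8)

proc parseModules(data: seq[uint8]): seq[Module] =
  ## Decodes every module entry in the segment. A truncated
  ## segment raises an IndexDefect when bounds checks are enabled
  var idx = 0
  while idx < data.len:
    let start = fromTriple(data[idx ..< idx + 3])
    idx += 3
    let stop = fromTriple(data[idx ..< idx + 3])
    idx += 3
    let size = fromDouble(data[idx ..< idx + 2])
    idx += 2
    var name = newString(size)
    for i in 0 ..< size:
      name[i] = char(data[idx + i])
    idx += size
    result.add(Module(start: start, stop: stop, name: name))

when isMainModule:
  # One entry: start = 5, stop = 265, name = "foo"
  echo parseModules(@[5'u8, 0, 0, 9, 1, 0, 3, 0, 102, 111, 111])
```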
|
@ -64,7 +64,8 @@ proc newSerializer*(self: Serializer = nil): Serializer =
|
|||
|
||||
|
||||
proc writeHeaders(self: Serializer, stream: var seq[byte]) =
|
||||
## Writes the Peon bytecode headers in-place into a byte stream
|
||||
## Writes the Peon bytecode headers in-place into the
|
||||
## given byte sequence
|
||||
stream.extend(PeonBytecodeMarker.toBytes())
|
||||
stream.add(byte(PEON_VERSION.major))
|
||||
stream.add(byte(PEON_VERSION.minor))
|
||||
|
@ -77,25 +78,31 @@ proc writeHeaders(self: Serializer, stream: var seq[byte]) =
|
|||
|
||||
proc writeLineData(self: Serializer, stream: var seq[byte]) =
|
||||
## Writes line information for debugging
|
||||
## bytecode instructions
|
||||
## bytecode instructions to the given byte
|
||||
## sequence
|
||||
stream.extend(len(self.chunk.lines).toQuad())
|
||||
for b in self.chunk.lines:
|
||||
stream.extend(b.toTriple())
|
||||
|
||||
|
||||
proc writeCFIData(self: Serializer, stream: var seq[byte]) =
|
||||
## Writes Call Frame Information for debugging
|
||||
## functions
|
||||
proc writeFunctions(self: Serializer, stream: var seq[byte]) =
|
||||
## Writes debug info about functions to the
|
||||
## given byte sequence
|
||||
stream.extend(len(self.chunk.functions).toQuad())
|
||||
stream.extend(self.chunk.functions)
|
||||
|
||||
|
||||
proc writeConstants(self: Serializer, stream: var seq[byte]) =
|
||||
## Writes the constants table in-place into the
|
||||
## given stream
|
||||
## byte sequence
|
||||
stream.extend(self.chunk.consts.len().toQuad())
|
||||
for constant in self.chunk.consts:
|
||||
stream.add(constant)
|
||||
stream.extend(self.chunk.consts)
|
||||
|
||||
|
||||
proc writeModules(self: Serializer, stream: var seq[byte]) =
|
||||
## Writes module information to the given byte sequence
|
||||
stream.extend(self.chunk.modules.len().toQuad())
|
||||
stream.extend(self.chunk.modules)
|
||||
|
||||
|
||||
proc writeCode(self: Serializer, stream: var seq[byte]) =
|
||||
|
@ -106,7 +113,7 @@ proc writeCode(self: Serializer, stream: var seq[byte]) =
|
|||
|
||||
|
||||
proc readHeaders(self: Serializer, stream: seq[byte], serialized: Serialized): int =
|
||||
## Reads the bytecode headers from a given stream
|
||||
## Reads the bytecode headers from a given sequence
|
||||
## of bytes
|
||||
var stream = stream
|
||||
if stream[0..<len(PeonBytecodeMarker)] != PeonBytecodeMarker.toBytes():
|
||||
|
@ -131,7 +138,6 @@ proc readHeaders(self: Serializer, stream: seq[byte], serialized: Serialized): i
|
|||
result += 8
|
||||
|
||||
|
||||
|
||||
proc readLineData(self: Serializer, stream: seq[byte]): int =
|
||||
## Reads line information from a stream
|
||||
## of bytes
|
||||
|
@ -142,10 +148,11 @@ proc readLineData(self: Serializer, stream: seq[byte]): int =
|
|||
self.chunk.lines.add(int([stream[0], stream[1], stream[2]].fromTriple()))
|
||||
result += 3
|
||||
stream = stream[3..^1]
|
||||
doAssert len(self.chunk.lines) == int(size)
|
||||
|
||||
|
||||
proc readCFIData(self: Serializer, stream: seq[byte]): int =
|
||||
## Reads Call Frame Information from a stream
|
||||
proc readFunctions(self: Serializer, stream: seq[byte]): int =
|
||||
## Reads the function segment from a stream
|
||||
## of bytes
|
||||
let size = [stream[0], stream[1], stream[2], stream[3]].fromQuad()
|
||||
result += 4
|
||||
|
@ -153,22 +160,34 @@ proc readCFIData(self: Serializer, stream: seq[byte]): int =
|
|||
for i in countup(0, int(size) - 1):
|
||||
self.chunk.functions.add(stream[i])
|
||||
inc(result)
|
||||
doAssert len(self.chunk.functions) == int(size)
|
||||
|
||||
|
||||
proc readConstants(self: Serializer, stream: seq[byte]): int =
|
||||
## Reads the constant table from the given stream
|
||||
## of bytes
|
||||
## Reads the constant table from the given
|
||||
## byte sequence
|
||||
let size = [stream[0], stream[1], stream[2], stream[3]].fromQuad()
|
||||
result += 4
|
||||
var stream = stream[4..^1]
|
||||
for i in countup(0, int(size) - 1):
|
||||
self.chunk.consts.add(stream[i])
|
||||
inc(result)
|
||||
doAssert len(self.chunk.consts) == int(size)
|
||||
|
||||
|
||||
proc readModules(self: Serializer, stream: seq[byte]): int =
|
||||
## Reads module information
|
||||
let size = [stream[0], stream[1], stream[2], stream[3]].fromQuad()
|
||||
result += 4
|
||||
var stream = stream[4..^1]
|
||||
for i in countup(0, int(size) - 1):
|
||||
self.chunk.modules.add(stream[i])
|
||||
inc(result)
|
||||
doAssert len(self.chunk.modules) == int(size)
|
||||
|
||||
|
||||
proc readCode(self: Serializer, stream: seq[byte]): int =
|
||||
## Reads the bytecode from a given stream and writes
|
||||
## it into the given chunk
|
||||
## Reads the bytecode from a given byte sequence
|
||||
let size = [stream[0], stream[1], stream[2]].fromTriple()
|
||||
var stream = stream[3..^1]
|
||||
for i in countup(0, int(size) - 1):
|
||||
|
@ -178,13 +197,16 @@ proc readCode(self: Serializer, stream: seq[byte]): int =
|
|||
|
||||
|
||||
proc dumpBytes*(self: Serializer, chunk: Chunk, filename: string): seq[byte] =
|
||||
## Dumps the given bytecode and file to a sequence of bytes and returns it.
|
||||
## Dumps the given chunk to a sequence of bytes and returns it.
|
||||
## The filename argument is for error reporting only, use dumpFile
|
||||
## to dump bytecode to a file
|
||||
self.filename = filename
|
||||
self.chunk = chunk
|
||||
self.writeHeaders(result)
|
||||
self.writeLineData(result)
|
||||
self.writeCFIData(result)
|
||||
self.writeFunctions(result)
|
||||
self.writeConstants(result)
|
||||
self.writeModules(result)
|
||||
self.writeCode(result)
|
||||
|
||||
|
||||
|
@ -207,8 +229,9 @@ proc loadBytes*(self: Serializer, stream: seq[byte]): Serialized =
|
|||
try:
|
||||
stream = stream[self.readHeaders(stream, result)..^1]
|
||||
stream = stream[self.readLineData(stream)..^1]
|
||||
stream = stream[self.readCFIData(stream)..^1]
|
||||
stream = stream[self.readFunctions(stream)..^1]
|
||||
stream = stream[self.readConstants(stream)..^1]
|
||||
stream = stream[self.readModules(stream)..^1]
|
||||
stream = stream[self.readCode(stream)..^1]
|
||||
except IndexDefect:
|
||||
self.error("truncated bytecode stream")
|
||||
|
|
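The serializer hunks above follow one pattern for the functions, constants and modules segments: write a 4-byte size followed by the raw bytes, then read them back with a procedure that returns how many bytes it consumed so the caller can slice the stream forward. A minimal sketch of that pattern is shown below. `writeSegment` and `readSegment` are hypothetical names, and `toQuad`/`fromQuad` are stand-ins with an assumed little-endian byte order.

```nim
# Stand-in helpers: byte order is assumed to be little-endian
proc toQuad(n: int): array[4, uint8] =
  [uint8(n and 0xFF), uint8((n shr 8) and 0xFF),
   uint8((n shr 16) and 0xFF), uint8((n shr 24) and 0xFF)]

proc fromQuad(b: openArray[uint8]): int =
  int(b[0]) or (int(b[1]) shl 8) or (int(b[2]) shl 16) or (int(b[3]) shl 24)

proc writeSegment(stream: var seq[uint8], payload: seq[uint8]) =
  ## Appends a size-prefixed segment to the stream
  stream.add(toQuad(payload.len))
  stream.add(payload)

proc readSegment(stream: seq[uint8], into: var seq[uint8]): int =
  ## Reads one size-prefixed segment from the front of the stream
  ## and returns the number of bytes consumed (prefix included)
  let size = fromQuad(stream[0 ..< 4])
  result = 4
  for i in 0 ..< size:
    into.add(stream[4 + i])
    inc(result)
  doAssert into.len == size

when isMainModule:
  var stream: seq[uint8]
  writeSegment(stream, @[1'u8, 2, 3])
  writeSegment(stream, @[9'u8])
  var first, second: seq[uint8]
  # Segments must be read back in the order they were written
  stream = stream[readSegment(stream, first) .. ^1]
  stream = stream[readSegment(stream, second) .. ^1]
  echo first, " ", second
```

The read order has to mirror the write order exactly, which is why `loadBytes` gains a `readModules` call in the same position where `dumpBytes` gains `writeModules`.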
|
@ -16,6 +16,7 @@
|
|||
|
||||
import std/strformat
|
||||
import std/strutils
|
||||
import std/tables
|
||||
import std/os
|
||||
|
||||
|
||||
|
@ -31,9 +32,6 @@ export token, ast, errors
|
|||
|
||||
|
||||
type
|
||||
|
||||
LoopContext {.pure.} = enum
|
||||
Loop, None
|
||||
Precedence {.pure.} = enum
|
||||
## Operator precedence
|
||||
## clearly stolen from
|
||||
|
@ -66,18 +64,16 @@ type
|
|||
# Only meaningful for parse errors
|
||||
file: string
|
||||
# The list of tokens representing
|
||||
# the source code to be parsed.
|
||||
# In most cases, those will come
|
||||
# from the builtin lexer, but this
|
||||
# behavior is not enforced and the
|
||||
# tokenizer is entirely separate from
|
||||
# the parser
|
||||
# the source code to be parsed
|
||||
tokens: seq[Token]
|
||||
# Little internal attribute that tells
|
||||
# us if we're inside a loop or not. This
|
||||
# allows us to detect errors like break
|
||||
# being used outside loops
|
||||
currentLoop: LoopContext
|
||||
# Just like scope depth tells us how
|
||||
# many nested scopes are above us, the
|
||||
# loop depth tells us how many nested
|
||||
# loops are above us. It's just a simple
|
||||
# way of statically detecting stuff like
|
||||
# the break statement being used outside
|
||||
# loops. Maybe a bit overkill for a parser?
|
||||
loopDepth: int
|
||||
# Stores the current function
|
||||
# being parsed. This is a reference
|
||||
# to either a FunDecl or LambdaExpr
|
||||
|
@ -96,8 +92,13 @@ type
|
|||
lines: seq[tuple[start, stop: int]]
|
||||
# The source of the current module
|
||||
source: string
|
||||
# Keeps track of imported modules
|
||||
modules: seq[tuple[name: string, loaded: bool]]
|
||||
# Keeps track of imported modules.
|
||||
# The key is the module's fully qualified
|
||||
# path, while the boolean indicates whether
|
||||
# it has been fully loaded. This is useful
|
||||
# to avoid importing a module twice and to
|
||||
# detect recursive dependency cycles
|
||||
modules: TableRef[string, bool]
|
||||
ParseError* = ref object of PeonException
|
||||
## A parsing exception
|
||||
parser*: Parser
|
||||
|
@ -140,7 +141,7 @@ proc newOperatorTable: OperatorTable =
|
|||
result.tokens = @[]
|
||||
for prec in Precedence:
|
||||
result.precedence[prec] = @[]
|
||||
# These operators are currently not built-in
|
||||
# These operators are currently hardcoded
|
||||
# due to compiler limitations
|
||||
result.addOperator("=")
|
||||
result.addOperator(".")
|
||||
|
@ -161,11 +162,12 @@ proc newParser*: Parser =
|
|||
result.file = ""
|
||||
result.tokens = @[]
|
||||
result.currentFunction = nil
|
||||
result.currentLoop = LoopContext.None
|
||||
result.loopDepth = 0
|
||||
result.scopeDepth = 0
|
||||
result.operators = newOperatorTable()
|
||||
result.tree = @[]
|
||||
result.source = ""
|
||||
result.modules = newTable[string, bool]()
|
||||
|
||||
|
||||
# Public getters for improved error formatting
|
||||
|
@ -180,7 +182,7 @@ template endOfLine(msg: string, tok: Token = nil) = self.expect(Semicolon, msg,
|
|||
|
||||
|
||||
|
||||
proc peek(self: Parser, distance: int = 0): Token =
|
||||
proc peek(self: Parser, distance: int = 0): Token {.inline.} =
|
||||
## Peeks at the token at the given distance.
|
||||
## If the distance is out of bounds, an EOF
|
||||
## token is returned. A negative distance may
|
||||
|
@ -201,7 +203,7 @@ proc done(self: Parser): bool {.inline.} =
|
|||
result = self.peek().kind == EndOfFile
|
||||
|
||||
|
||||
proc step(self: Parser, n: int = 1): Token =
|
||||
proc step(self: Parser, n: int = 1): Token {.inline.} =
|
||||
## Steps n tokens into the input,
|
||||
## returning the last consumed one
|
||||
if self.done():
|
||||
|
@ -227,7 +229,7 @@ proc error(self: Parser, message: string, token: Token = nil) {.raises: [ParseEr
|
|||
# as a symbol and in the cases where we need a specific token we just match the string
|
||||
# directly
|
||||
proc check[T: TokenType or string](self: Parser, kind: T,
|
||||
distance: int = 0): bool =
|
||||
distance: int = 0): bool {.inline.} =
|
||||
## Checks if the given token at the given distance
|
||||
## matches the expected kind and returns a boolean.
|
||||
## The distance parameter is passed directly to
|
||||
|
@ -239,7 +241,7 @@ proc check[T: TokenType or string](self: Parser, kind: T,
|
|||
self.peek(distance).lexeme == kind
|
||||
|
||||
|
||||
proc check[T: TokenType or string](self: Parser, kind: openarray[T]): bool =
|
||||
proc check[T: TokenType or string](self: Parser, kind: openarray[T]): bool {.inline.} =
|
||||
## Calls self.check() in a loop with each entry of
|
||||
## the given openarray of token kinds and returns
|
||||
## at the first match. Note that this assumes
|
||||
|
@ -251,7 +253,7 @@ proc check[T: TokenType or string](self: Parser, kind: openarray[T]): bool =
|
|||
return false
|
||||
|
||||
|
||||
proc match[T: TokenType or string](self: Parser, kind: T): bool =
|
||||
proc match[T: TokenType or string](self: Parser, kind: T): bool {.inline.} =
|
||||
## Behaves like self.check(), except that when a token
|
||||
## matches it is also consumed
|
||||
if self.check(kind):
|
||||
|
@ -261,7 +263,7 @@ proc match[T: TokenType or string](self: Parser, kind: T): bool =
|
|||
result = false
|
||||
|
||||
|
||||
proc match[T: TokenType or string](self: Parser, kind: openarray[T]): bool =
|
||||
proc match[T: TokenType or string](self: Parser, kind: openarray[T]): bool {.inline.} =
|
||||
## Calls self.match() in a loop with each entry of
|
||||
## the given openarray of token kinds and returns
|
||||
## at the first match. Note that this assumes
|
||||
|
@ -273,7 +275,7 @@ proc match[T: TokenType or string](self: Parser, kind: openarray[T]): bool =
|
|||
result = false
|
||||
|
||||
|
||||
proc expect[T: TokenType or string](self: Parser, kind: T, message: string = "", token: Token = nil) =
|
||||
proc expect[T: TokenType or string](self: Parser, kind: T, message: string = "", token: Token = nil) {.inline.} =
|
||||
## Behaves like self.match(), except that
|
||||
## when a token doesn't match, an error
|
||||
## is raised. If no error message is
|
||||
|
@ -285,7 +287,7 @@ proc expect[T: TokenType or string](self: Parser, kind: T, message: string = "",
|
|||
self.error(message)
|
||||
|
||||
|
||||
proc expect[T: TokenType or string](self: Parser, kind: openarray[T], message: string = "", token: Token = nil) {.used.} =
|
||||
proc expect[T: TokenType or string](self: Parser, kind: openarray[T], message: string = "", token: Token = nil) {.inline, used.} =
|
||||
## Behaves like self.expect(), except that
|
||||
## an error is raised only if none of the
|
||||
## given token kinds matches
|
||||
|
@@ -307,6 +309,7 @@ proc funDecl(self: Parser, isAsync: bool = false, isGenerator: bool = false,
|
|||
isLambda: bool = false, isOperator: bool = false, isTemplate: bool = false): Declaration
|
||||
proc declaration(self: Parser): Declaration
|
||||
proc parse*(self: Parser, tokens: seq[Token], file: string, lines: seq[tuple[start, stop: int]], source: string, persist: bool = false): seq[Declaration]
|
||||
proc findOperators(self: Parser, tokens: seq[Token])
|
||||
# End of forward declarations
|
||||
|
||||
|
||||
|
@@ -436,7 +439,7 @@ proc makeCall(self: Parser, callee: Expression): CallExpr =
|
|||
proc parseGenericArgs(self: Parser) =
|
||||
## Parses function generic arguments
|
||||
## like function[type](arg)
|
||||
discard
|
||||
discard # TODO
|
||||
|
||||
|
||||
proc call(self: Parser): Expression =
|
||||
|
@@ -596,12 +599,12 @@ proc assertStmt(self: Parser): Statement =
|
|||
result.file = self.file
|
||||
|
||||
|
||||
proc beginScope(self: Parser) =
|
||||
proc beginScope(self: Parser) {.inline.} =
|
||||
## Begins a new lexical scope
|
||||
inc(self.scopeDepth)
|
||||
|
||||
|
||||
proc endScope(self: Parser) =
|
||||
proc endScope(self: Parser) {.inline.} =
|
||||
## Ends the current lexical scope
|
||||
dec(self.scopeDepth)
|
||||
|
||||
|
@@ -631,8 +634,7 @@ proc namedBlockStmt(self: Parser): Statement =
|
|||
self.expect(Identifier, "expecting block name after 'block'")
|
||||
var name = newIdentExpr(self.peek(-1), self.scopeDepth)
|
||||
name.file = self.file
|
||||
let enclosingLoop = self.currentLoop
|
||||
self.currentLoop = Loop
|
||||
inc(self.loopDepth)
|
||||
self.expect(LeftBrace, "expecting '{' after 'block'")
|
||||
while not self.check(RightBrace) and not self.done():
|
||||
code.add(self.declaration())
|
||||
|
@@ -642,14 +644,14 @@ proc namedBlockStmt(self: Parser): Statement =
|
|||
result = newNamedBlockStmt(code, name, tok)
|
||||
result.file = self.file
|
||||
self.endScope()
|
||||
self.currentLoop = enclosingLoop
|
||||
dec(self.loopDepth)
|
||||
|
||||
|
||||
proc breakStmt(self: Parser): Statement =
|
||||
## Parses break statements
|
||||
let tok = self.peek(-1)
|
||||
var label: IdentExpr
|
||||
if self.currentLoop != Loop:
|
||||
if self.loopDepth == 0:
|
||||
self.error("'break' cannot be used outside loops")
|
||||
if self.match(Identifier):
|
||||
label = newIdentExpr(self.peek(-1), self.scopeDepth)
|
||||
|
@@ -673,7 +675,7 @@ proc continueStmt(self: Parser): Statement =
|
|||
## Parses continue statements
|
||||
let tok = self.peek(-1)
|
||||
var label: IdentExpr
|
||||
if self.currentLoop != Loop:
|
||||
if self.loopDepth == 0:
|
||||
self.error("'continue' cannot be used outside loops")
|
||||
if self.match(Identifier):
|
||||
label = newIdentExpr(self.peek(-1), self.scopeDepth)
|
||||
|
@@ -747,8 +749,7 @@ proc raiseStmt(self: Parser): Statement =
|
|||
proc forEachStmt(self: Parser): Statement =
|
||||
## Parses C#-like foreach loops
|
||||
let tok = self.peek(-1)
|
||||
let enclosingLoop = self.currentLoop
|
||||
self.currentLoop = Loop
|
||||
inc(self.loopDepth)
|
||||
self.expect(Identifier)
|
||||
let identifier = newIdentExpr(self.peek(-1), self.scopeDepth)
|
||||
self.expect("in")
|
||||
|
@@ -756,10 +757,7 @@ proc forEachStmt(self: Parser): Statement =
|
|||
self.expect(LeftBrace)
|
||||
result = newForEachStmt(identifier, expression, self.blockStmt(), tok)
|
||||
result.file = self.file
|
||||
self.currentLoop = enclosingLoop
|
||||
|
||||
|
||||
proc findOperators(self: Parser, tokens: seq[Token])
|
||||
dec(self.loopDepth)
|
||||
|
||||
|
||||
proc importStmt(self: Parser, fromStmt: bool = false): Statement =
|
||||
|
@@ -806,6 +804,10 @@ proc importStmt(self: Parser, fromStmt: bool = false): Statement =
|
|||
break
|
||||
elif i == searchPath.high():
|
||||
self.error(&"""could not import '{path}': module not found""")
|
||||
if not self.modules.getOrDefault(path, true):
|
||||
self.error(&"could not import '{path}' (recursive dependency detected)")
|
||||
else:
|
||||
self.modules[path] = false
|
||||
try:
|
||||
var source = readFile(path)
|
||||
var tree = self.tree
|
||||
|
@@ -819,6 +821,8 @@ proc importStmt(self: Parser, fromStmt: bool = false): Statement =
|
|||
self.tree = tree
|
||||
self.current = current
|
||||
self.tokens = tokens
|
||||
# Module has been fully loaded and can now be used
|
||||
self.modules[path] = true
|
||||
except IOError:
|
||||
self.error(&"could not import '{path}': {getCurrentExceptionMsg()}")
|
||||
except OSError:
|
||||
|
@@ -859,14 +863,13 @@ proc whileStmt(self: Parser): Statement =
|
|||
## Parses a C-style while loop statement
|
||||
let tok = self.peek(-1)
|
||||
self.beginScope()
|
||||
let enclosingLoop = self.currentLoop
|
||||
inc(self.loopDepth)
|
||||
let condition = self.expression()
|
||||
self.expect(LeftBrace)
|
||||
self.currentLoop = Loop
|
||||
result = newWhileStmt(condition, self.blockStmt(), tok)
|
||||
result.file = self.file
|
||||
self.currentLoop = enclosingLoop
|
||||
self.endScope()
|
||||
dec(self.loopDepth)
|
||||
|
||||
|
||||
proc ifStmt(self: Parser): Statement =
|
||||
|
@@ -1049,7 +1052,7 @@ proc parseFunExpr(self: Parser): LambdaExpr =
|
|||
|
||||
|
||||
proc parseGenericConstraint(self: Parser): Expression =
|
||||
## Recursivelt parses a generic constraint
|
||||
## Recursively parses a generic constraint
|
||||
## and returns it as an expression
|
||||
result = self.expression() # First value is always an identifier of some sort
|
||||
if not self.check(RightBracket):
|
||||
|
@@ -1301,6 +1304,7 @@ proc typeDecl(self: Parser): TypeDecl =
|
|||
var generics: seq[tuple[name: IdentExpr, cond: Expression]] = @[]
|
||||
var pragmas: seq[Pragma] = @[]
|
||||
result = newTypeDecl(name, fields, defaults, isPrivate, token, pragmas, generics, nil, false, false)
|
||||
result.file = self.file
|
||||
if self.match(LeftBracket):
|
||||
self.parseGenerics(result)
|
||||
self.expect("=", "expecting '=' after type name")
|
||||
|
@@ -1315,7 +1319,6 @@ proc typeDecl(self: Parser): TypeDecl =
|
|||
result.isEnum = true
|
||||
of "object":
|
||||
discard self.step()
|
||||
discard # Default case
|
||||
else:
|
||||
hasNone = true
|
||||
if hasNone:
|
||||
|
@@ -1334,7 +1337,7 @@ proc typeDecl(self: Parser): TypeDecl =
|
|||
self.expect(LeftBrace, "expecting '{' after type declaration")
|
||||
if self.match(TokenType.Pragma):
|
||||
for pragma in self.parsePragmas():
|
||||
pragmas.add(pragma)
|
||||
result.pragmas.add(pragma)
|
||||
var
|
||||
argName: IdentExpr
|
||||
argPrivate: bool
|
||||
|
@@ -1356,8 +1359,6 @@ proc typeDecl(self: Parser): TypeDecl =
|
|||
else:
|
||||
if not self.check(RightBrace):
|
||||
self.expect(",", "expecting comma after enum field declaration")
|
||||
result.pragmas = pragmas
|
||||
result.file = self.file
|
||||
|
||||
|
||||
proc declaration(self: Parser): Declaration =
|
||||
|
@@ -1420,11 +1421,12 @@ proc parse*(self: Parser, tokens: seq[Token], file: string, lines: seq[tuple[sta
|
|||
self.lines = lines
|
||||
self.current = 0
|
||||
self.scopeDepth = 0
|
||||
self.currentLoop = LoopContext.None
|
||||
self.loopDepth = 0
|
||||
self.currentFunction = nil
|
||||
self.tree = @[]
|
||||
if not persist:
|
||||
self.operators = newOperatorTable()
|
||||
self.modules = newTable[string, bool]()
|
||||
self.findOperators(tokens)
|
||||
while not self.done():
|
||||
self.tree.add(self.declaration())
|
||||
|
|
120
src/main.nim
|
@@ -51,28 +51,28 @@ proc getLineEditor: LineEditor =
|
|||
result.bindHistory(history)
|
||||
|
||||
|
||||
proc repl(warnings: seq[WarningKind] = @[], mismatches: bool = false, mode: CompileMode = Debug) =
|
||||
proc repl(warnings: seq[WarningKind] = @[], mismatches: bool = false, mode: CompileMode = Debug, breakpoints: seq[uint64] = @[]) =
|
||||
styledEcho fgMagenta, "Welcome to the peon REPL!"
|
||||
var
|
||||
keep = true
|
||||
tokens: seq[Token] = @[]
|
||||
tree: seq[Declaration] = @[]
|
||||
compiler = newBytecodeCompiler(replMode=true)
|
||||
compiled: Chunk
|
||||
compiled: Chunk = newChunk()
|
||||
serialized: Serialized
|
||||
tokenizer = newLexer()
|
||||
vm = newPeonVM()
|
||||
parser = newParser()
|
||||
debugger = newDebugger()
|
||||
serializer = newSerializer()
|
||||
editor = getLineEditor()
|
||||
input: string
|
||||
current: string
|
||||
first: bool = false
|
||||
tokenizer.fillSymbolTable()
|
||||
editor.bindEvent(jeQuit):
|
||||
stdout.styledWriteLine(fgGreen, "Goodbye!")
|
||||
keep = false
|
||||
input = ""
|
||||
current = ""
|
||||
editor.bindKey("ctrl+a"):
|
||||
editor.content.home()
|
||||
editor.bindKey("ctrl+e"):
|
||||
|
@@ -80,21 +80,15 @@ proc repl(warnings: seq[WarningKind] = @[], mismatches: bool = false, mode: Comp
|
|||
while keep:
|
||||
try:
|
||||
input = editor.read()
|
||||
if input == "#reset":
|
||||
compiled = newChunk()
|
||||
current = ""
|
||||
continue
|
||||
elif input == "#show":
|
||||
echo current
|
||||
elif input == "#clear":
|
||||
if input == "#clear":
|
||||
stdout.write("\x1Bc")
|
||||
continue
|
||||
elif input == "":
|
||||
continue
|
||||
tokens = tokenizer.lex(current & input & "\n", "stdin")
|
||||
tokens = tokenizer.lex(input, "stdin")
|
||||
if tokens.len() == 0:
|
||||
continue
|
||||
when debugLexer:
|
||||
if debugLexer:
|
||||
styledEcho fgCyan, "Tokenization step:"
|
||||
for i, token in tokens:
|
||||
if i == tokens.high():
|
||||
|
@@ -102,22 +96,22 @@ proc repl(warnings: seq[WarningKind] = @[], mismatches: bool = false, mode: Comp
|
|||
break
|
||||
styledEcho fgGreen, "\t", $token
|
||||
echo ""
|
||||
tree = newParser().parse(tokens, "stdin", tokenizer.getLines(), current & input & "\n")
|
||||
tree = parser.parse(tokens, "stdin", tokenizer.getLines(), input, persist=true)
|
||||
if tree.len() == 0:
|
||||
continue
|
||||
when debugParser:
|
||||
if debugParser:
|
||||
styledEcho fgCyan, "Parsing step:"
|
||||
for node in tree:
|
||||
styledEcho fgGreen, "\t", $node
|
||||
echo ""
|
||||
compiled = newBytecodeCompiler(replMode=true).compile(tree, "stdin", tokenizer.getLines(), current & input & "\n", showMismatches=mismatches, disabledWarnings=warnings, mode=mode)
|
||||
when debugCompiler:
|
||||
compiled = compiler.compile(tree, "stdin", tokenizer.getLines(), input, chunk=compiled, showMismatches=mismatches, disabledWarnings=warnings, mode=mode, incremental=true)
|
||||
if debugCompiler:
|
||||
styledEcho fgCyan, "Compilation step:\n"
|
||||
debugger.disassembleChunk(compiled, "stdin")
|
||||
echo ""
|
||||
|
||||
serialized = serializer.loadBytes(serializer.dumpBytes(compiled, "stdin"))
|
||||
when debugSerializer:
|
||||
if debugSerializer:
|
||||
styledEcho fgCyan, "Serialization step: "
|
||||
styledEcho fgBlue, "\t- Peon version: ", fgYellow, &"{serialized.version.major}.{serialized.version.minor}.{serialized.version.patch}", fgBlue, " (commit ", fgYellow, serialized.commit[0..8], fgBlue, ") on branch ", fgYellow, serialized.branch
|
||||
stdout.styledWriteLine(fgBlue, "\t- Compilation date & time: ", fgYellow, fromUnix(serialized.compileDate).format("d/M/yyyy HH:mm:ss"))
|
||||
|
@@ -141,8 +135,11 @@ proc repl(warnings: seq[WarningKind] = @[], mismatches: bool = false, mode: Comp
|
|||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
vm.run(serialized.chunk)
|
||||
current &= input & "\n"
|
||||
if not first:
|
||||
vm.run(serialized.chunk, repl=true, breakpoints=breakpoints)
|
||||
first = true
|
||||
else:
|
||||
vm.resume(serialized.chunk)
|
||||
except LexingError:
|
||||
print(LexingError(getCurrentException()))
|
||||
except ParseError:
|
||||
|
@@ -157,7 +154,7 @@ proc repl(warnings: seq[WarningKind] = @[], mismatches: bool = false, mode: Comp
|
|||
quit(0)
|
||||
|
||||
|
||||
proc runFile(f: string, fromString: bool = false, dump: bool = true, breakpoints: seq[uint64] = @[], dis: bool = false,
|
||||
proc runFile(f: string, fromString: bool = false, dump: bool = true, breakpoints: seq[uint64] = @[],
|
||||
warnings: seq[WarningKind] = @[], mismatches: bool = false, mode: CompileMode = Debug, run: bool = true,
|
||||
backend: PeonBackend = PeonBackend.Bytecode, output: string) =
|
||||
var
|
||||
|
@@ -186,7 +183,7 @@ proc runFile(f: string, fromString: bool = false, dump: bool = true, breakpoints
|
|||
tokens = tokenizer.lex(input, f)
|
||||
if tokens.len() == 0:
|
||||
return
|
||||
when debugLexer:
|
||||
if debugLexer:
|
||||
styledEcho fgCyan, "Tokenization step:"
|
||||
for i, token in tokens:
|
||||
if i == tokens.high():
|
||||
|
@@ -197,7 +194,7 @@ proc runFile(f: string, fromString: bool = false, dump: bool = true, breakpoints
|
|||
tree = parser.parse(tokens, f, tokenizer.getLines(), input)
|
||||
if tree.len() == 0:
|
||||
return
|
||||
when debugParser:
|
||||
if debugParser:
|
||||
styledEcho fgCyan, "Parsing step:"
|
||||
for node in tree:
|
||||
styledEcho fgGreen, "\t", $node
|
||||
|
@@ -205,11 +202,9 @@ proc runFile(f: string, fromString: bool = false, dump: bool = true, breakpoints
|
|||
case backend:
|
||||
of PeonBackend.Bytecode:
|
||||
compiled = compiler.compile(tree, f, tokenizer.getLines(), input, disabledWarnings=warnings, showMismatches=mismatches, mode=mode)
|
||||
when debugCompiler:
|
||||
if debugCompiler:
|
||||
styledEcho fgCyan, "Compilation step:\n"
|
||||
debugger.disassembleChunk(compiled, f)
|
||||
if dis:
|
||||
debugger.disassembleChunk(compiled, f)
|
||||
var path = splitFile(if output.len() > 0: output else: f).dir
|
||||
if path.len() > 0:
|
||||
path &= "/"
|
||||
|
@@ -224,31 +219,35 @@ proc runFile(f: string, fromString: bool = false, dump: bool = true, breakpoints
|
|||
stderr.styledWriteLine(fgRed, styleBright, "Error: ", fgDefault, "the selected backend is not implemented yet")
|
||||
elif backend == PeonBackend.Bytecode:
|
||||
serialized = serializer.loadFile(f)
|
||||
if backend == PeonBackend.Bytecode:
|
||||
when debugSerializer:
|
||||
styledEcho fgCyan, "Serialization step: "
|
||||
styledEcho fgBlue, "\t- Peon version: ", fgYellow, &"{serialized.version.major}.{serialized.version.minor}.{serialized.version.patch}", fgBlue, " (commit ", fgYellow, serialized.commit[0..8], fgBlue, ") on branch ", fgYellow, serialized.branch
|
||||
stdout.styledWriteLine(fgBlue, "\t- Compilation date & time: ", fgYellow, fromUnix(serialized.compileDate).format("d/M/yyyy HH:mm:ss"))
|
||||
stdout.styledWrite(fgBlue, &"\t- Constants segment: ")
|
||||
if serialized.chunk.consts == compiled.consts:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
stdout.styledWrite(fgBlue, &"\t- Code segment: ")
|
||||
if serialized.chunk.code == compiled.code:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
stdout.styledWrite(fgBlue, "\t- Line info segment: ")
|
||||
if serialized.chunk.lines == compiled.lines:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
stdout.styledWrite(fgBlue, "\t- Functions segment: ")
|
||||
if serialized.chunk.functions == compiled.functions:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
if backend == PeonBackend.Bytecode and debugSerializer:
|
||||
styledEcho fgCyan, "Serialization step: "
|
||||
styledEcho fgBlue, "\t- Peon version: ", fgYellow, &"{serialized.version.major}.{serialized.version.minor}.{serialized.version.patch}", fgBlue, " (commit ", fgYellow, serialized.commit[0..8], fgBlue, ") on branch ", fgYellow, serialized.branch
|
||||
stdout.styledWriteLine(fgBlue, "\t- Compilation date & time: ", fgYellow, fromUnix(serialized.compileDate).format("d/M/yyyy HH:mm:ss"))
|
||||
stdout.styledWrite(fgBlue, &"\t- Constants segment: ")
|
||||
if serialized.chunk.consts == compiled.consts:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
stdout.styledWrite(fgBlue, &"\t- Code segment: ")
|
||||
if serialized.chunk.code == compiled.code:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
stdout.styledWrite(fgBlue, "\t- Line info segment: ")
|
||||
if serialized.chunk.lines == compiled.lines:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
stdout.styledWrite(fgBlue, "\t- Functions segment: ")
|
||||
if serialized.chunk.functions == compiled.functions:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
stdout.styledWrite(fgBlue, "\t- Modules segment: ")
|
||||
if serialized.chunk.modules == compiled.modules:
|
||||
styledEcho fgGreen, "OK"
|
||||
else:
|
||||
styledEcho fgRed, "Corrupted"
|
||||
if run:
|
||||
case backend:
|
||||
of PeonBackend.Bytecode:
|
||||
|
@@ -284,7 +283,6 @@ when isMainModule:
|
|||
var dump: bool = true
|
||||
var warnings: seq[WarningKind] = @[]
|
||||
var breaks: seq[uint64] = @[]
|
||||
var dis: bool = false
|
||||
var mismatches: bool = false
|
||||
var mode: CompileMode = CompileMode.Debug
|
||||
var run: bool = true
|
||||
|
@@ -350,7 +348,7 @@ when isMainModule:
|
|||
stderr.styledWriteLine(fgRed, styleBright, "Error: ", fgDefault, &"error: invalid breakpoint value '{point}'")
|
||||
quit()
|
||||
of "disassemble":
|
||||
dis = true
|
||||
debugCompiler = true
|
||||
of "compile":
|
||||
run = false
|
||||
of "output":
|
||||
|
@@ -361,8 +359,12 @@ when isMainModule:
|
|||
backend = PeonBackend.Bytecode
|
||||
of "c":
|
||||
backend = PeonBackend.NativeC
|
||||
of "cpp":
|
||||
backend = PeonBackend.NativeCpp
|
||||
of "debug-dump":
|
||||
debugSerializer = true
|
||||
of "debug-lexer":
|
||||
debugLexer = true
|
||||
of "debug-parser":
|
||||
debugParser = true
|
||||
else:
|
||||
stderr.styledWriteLine(fgRed, styleBright, "Error: ", fgDefault, &"error: unknown option '{key}'")
|
||||
quit()
|
||||
|
@@ -403,14 +405,16 @@ when isMainModule:
|
|||
of "c":
|
||||
run = false
|
||||
of "d":
|
||||
dis = true
|
||||
debugCompiler = true
|
||||
else:
|
||||
stderr.styledWriteLine(fgRed, styleBright, "Error: ", fgDefault, &"unknown option '{key}'")
|
||||
quit()
|
||||
else:
|
||||
echo "usage: peon [options] [filename.pn]"
|
||||
quit()
|
||||
if breaks.len() == 0 and debugVM:
|
||||
breaks.add(0)
|
||||
if file == "":
|
||||
repl(warnings, mismatches, mode)
|
||||
repl(warnings, mismatches, mode, breaks)
|
||||
else:
|
||||
runFile(file, fromString, dump, breaks, dis, warnings, mismatches, mode, run, backend, output)
|
||||
runFile(file, fromString, dump, breaks, warnings, mismatches, mode, run, backend, output)
|
||||
|
|
|
@@ -2,6 +2,11 @@
|
|||
import values;
|
||||
|
||||
|
||||
operator `is`*[T: any](a, b: T): bool {
|
||||
#pragma[magic: "Identity", pure]
|
||||
}
|
||||
|
||||
|
||||
operator `>`*[T: UnsignedInteger](a, b: T): bool {
|
||||
#pragma[magic: "GreaterThan", pure]
|
||||
}
|
||||
|
@@ -12,7 +17,7 @@ operator `<`*[T: UnsignedInteger](a, b: T): bool {
|
|||
}
|
||||
|
||||
|
||||
operator `==`*[T: Number | inf](a, b: T): bool {
|
||||
operator `==`*[T: Number | inf | bool](a, b: T): bool {
|
||||
#pragma[magic: "Equal", pure]
|
||||
}
|
||||
|
||||
|
|
|
@@ -16,4 +16,9 @@ export comparisons;
|
|||
|
||||
var version* = 1;
|
||||
var _private = 5; # Invisible outside the module (underscore is to silence warning)
|
||||
var test* = 0x60;
|
||||
var test* = 0x60;
|
||||
|
||||
|
||||
fn testGlobals*: bool {
|
||||
return version == 1 and _private == 5 and test == 0x60;
|
||||
}
|
|
@@ -1,4 +1,5 @@
|
|||
import std;
|
||||
import time;
|
||||
|
||||
|
||||
fn fib(n: int): int {
|
||||
|
@@ -10,7 +11,7 @@ fn fib(n: int): int {
|
|||
|
||||
|
||||
print("Computing the value of fib(37)");
|
||||
var x = clock();
|
||||
var x = time.clock();
|
||||
print(fib(37));
|
||||
print(clock() - x);
|
||||
print(time.clock() - x);
|
||||
print("Done!");
|
||||
|
|
|
@@ -1,7 +1,7 @@
|
|||
import std;
|
||||
|
||||
|
||||
const max = 50000;
|
||||
const max = 500000;
|
||||
|
||||
var x = max;
|
||||
var s = "just a test";
|
||||
|
|