Compare commits
2 Commits
7bae3ad249...40d0f23135
Author | SHA1 | Date |
---|---|---|
Mattia Giambirtone | 40d0f23135 | |
Mattia Giambirtone | 20da594116 | |
|
@ -1,7 +1,8 @@
|
||||||
# Peon - Bytecode Specification
|
# Peon - Bytecode Specification
|
||||||
|
|
||||||
This document aims to document peon's bytecode as well as how it is (de-)serialized to/from files and
|
This document aims to document peon's bytecode as well as how it is (de-)serialized to/from files and
|
||||||
other file-like objects.
|
other file-like objects. Note that the segments in a bytecode dump appear in the order they are listed
|
||||||
|
in this document.
|
||||||
|
|
||||||
## Code Structure
|
## Code Structure
|
||||||
|
|
||||||
|
@ -9,12 +10,12 @@ A peon program is compiled into a tightly packed sequence of bytes that contain
|
||||||
the VM needs to execute said program. There is no dependence between the frontend and the backend outside of the
|
the VM needs to execute said program. There is no dependence between the frontend and the backend outside of the
|
||||||
bytecode format (which is implemented in a separate serialiazer module) to allow for maximum modularity.
|
bytecode format (which is implemented in a separate serialiazer module) to allow for maximum modularity.
|
||||||
|
|
||||||
A peon bytecode dump contains:
|
A peon bytecode file contains the following:
|
||||||
|
|
||||||
- Constants
|
- Constants
|
||||||
- The bytecode itself
|
- The program's code
|
||||||
- Debugging information
|
- Debugging information (file and version metadata, module info. Optional)
|
||||||
- File and version metadata
|
|
||||||
|
|
||||||
## File Headers
|
## File Headers
|
||||||
|
|
||||||
|
@ -34,7 +35,7 @@ in release builds.
|
||||||
### Line data segment
|
### Line data segment
|
||||||
|
|
||||||
The line data segment contains information about each instruction in the code segment and associates them
|
The line data segment contains information about each instruction in the code segment and associates them
|
||||||
1:1 with a line number in the original source file for easier debugging using run-length encoding. The section's
|
1:1 with a line number in the original source file for easier debugging using run-length encoding. The segment's
|
||||||
size is fixed and is encoded at the beginning as a sequence of 4 bytes (i.e. a single 32 bit integer). The data
|
size is fixed and is encoded at the beginning as a sequence of 4 bytes (i.e. a single 32 bit integer). The data
|
||||||
in this segment can be decoded as explained in [this file](../src/frontend/compiler/targgets/bytecode/opcodes.nim#L29), which is quoted
|
in this segment can be decoded as explained in [this file](../src/frontend/compiler/targgets/bytecode/opcodes.nim#L29), which is quoted
|
||||||
below:
|
below:
|
||||||
|
@ -57,7 +58,7 @@ below:
|
||||||
|
|
||||||
This segment contains details about each function in the original file. The segment's size is fixed and is encoded at the
|
This segment contains details about each function in the original file. The segment's size is fixed and is encoded at the
|
||||||
beginning as a sequence of 4 bytes (i.e. a single 32 bit integer). The data in this segment can be decoded as explained
|
beginning as a sequence of 4 bytes (i.e. a single 32 bit integer). The data in this segment can be decoded as explained
|
||||||
in [this file](../src/frontend/compiler/targgets/bytecode/opcodes.nim#L39), which is quoted below:
|
in [this file](../src/frontend/compiler/targets/bytecode/opcodes.nim#L39), which is quoted below:
|
||||||
|
|
||||||
```
|
```
|
||||||
[...]
|
[...]
|
||||||
|
@ -74,6 +75,26 @@ in [this file](../src/frontend/compiler/targgets/bytecode/opcodes.nim#L39), whic
|
||||||
[...]
|
[...]
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Modules segment
|
||||||
|
|
||||||
|
This segment contains details about the modules that make up the original source code which produced a given bytecode dump.
|
||||||
|
The data in this segment can be decoded as explained in [this file](../src/frontend/compiler/targets/bytecode/opcodes.nim#L49), which is quoted below:
|
||||||
|
```
|
||||||
|
[...]
|
||||||
|
## modules contains information about all the peon modules that the compiler has encountered,
|
||||||
|
## along with their start/end offset in the code. Unlike other bytecode-compiled languages like
|
||||||
|
## Python, peon does not produce a bytecode file for each separate module it compiles: everything
|
||||||
|
## is contained within a single binary blob. While this simplifies the implementation and makes
|
||||||
|
## bytecode files entirely "self-hosted", it also means that the original module information is
|
||||||
|
## lost: this segment serves to fix that. The segment's size is encoded at the beginning as a 4-byte
|
||||||
|
## sequence (i.e. a single 32-bit integer) and its encoding is similar to that of the functions segment:
|
||||||
|
## - First, the position into the bytecode where the module begins is encoded (as a 3 byte integer)
|
||||||
|
## - Second, the position into the bytecode where the module ends is encoded (as a 3 byte integer)
|
||||||
|
## - Lastly, the module's name is encoded in ASCII, prepended with its size as a 2-byte integer
|
||||||
|
[...]
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
## Constant segment
|
## Constant segment
|
||||||
|
|
||||||
The constant segment contains all the read-only values that the code will need at runtime, such as hardcoded
|
The constant segment contains all the read-only values that the code will need at runtime, such as hardcoded
|
||||||
|
@ -87,6 +108,6 @@ real-world scenarios it likely won't be.
|
||||||
|
|
||||||
## Code segment
|
## Code segment
|
||||||
|
|
||||||
The code segment contains the linear sequence of bytecode instructions of a peon program. It is to be read directly
|
The code segment contains the linear sequence of bytecode instructions of a peon program to be fed directly to
|
||||||
and without modifications. The segment's size is fixed and is encoded at the beginning as a sequence of 3 bytes
|
peon's virtual machine. The segment's size is fixed and is encoded at the beginning as a sequence of 3 bytes
|
||||||
(i.e. a single 24 bit integer). All the instructions are documented [here](../src/frontend/compiler/targgets/bytecode/opcodes.nim)
|
(i.e. a single 24 bit integer). All the instructions are documented [here](../src/frontend/compiler/targgets/bytecode/opcodes.nim)
|
|
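The modules segment layout quoted in the specification diff above (a 3-byte start offset, a 3-byte end offset, then the module's name prefixed with its 2-byte length) can be sketched as a small decoder. This is a minimal illustration, not peon's actual serializer: `ModuleEntry`, `readUInt` and `decodeModules` are hypothetical names, big-endian byte order is an assumption that the quoted text does not confirm, and the input is assumed to be the segment body with its 4-byte size prefix already stripped.

```nim
# Sketch only: names and byte order are assumptions, not taken from peon's source.
type
  ModuleEntry = object
    start: int    # Offset into the bytecode where the module begins
    stop: int     # Offset into the bytecode where the module ends
    name: string  # The module's name (ASCII)

proc readUInt(data: openArray[uint8], pos: var int, size: int): int =
  ## Reads `size` bytes starting at `pos` as an unsigned integer
  ## (big-endian, by assumption) and advances `pos` past them
  for i in 1 .. size:
    result = (result shl 8) or int(data[pos])
    inc pos

proc decodeModules(segment: openArray[uint8]): seq[ModuleEntry] =
  ## Decodes the body of a modules segment, i.e. the bytes
  ## that follow the 4-byte size prefix
  var pos = 0
  while pos < segment.len:
    var entry: ModuleEntry
    entry.start = readUInt(segment, pos, 3)
    entry.stop = readUInt(segment, pos, 3)
    let nameLen = readUInt(segment, pos, 2)
    for i in 1 .. nameLen:
      entry.name.add(char(segment[pos]))
      inc pos
    result.add(entry)

# One module named "main" spanning bytecode offsets 0 through 10
let sample = @[0'u8, 0, 0, 0, 0, 10, 0, 4,
               uint8('m'), uint8('a'), uint8('i'), uint8('n')]
let decoded = decodeModules(sample)
doAssert decoded.len == 1
doAssert decoded[0].start == 0 and decoded[0].stop == 10
doAssert decoded[0].name == "main"
```

The sketch does no bounds checking, so a truncated segment would raise an index error. A real reader would probably validate the remaining length against the size prefix before each read.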
@ -68,7 +68,8 @@ type
|
||||||
## this system and is not handled
|
## this system and is not handled
|
||||||
## manually by the VM
|
## manually by the VM
|
||||||
bytesAllocated: tuple[total, current: int]
|
bytesAllocated: tuple[total, current: int]
|
||||||
cycles: int
|
when debugGC or debugAlloc:
|
||||||
|
cycles: int
|
||||||
nextGC: int
|
nextGC: int
|
||||||
pointers: HashSet[uint64]
|
pointers: HashSet[uint64]
|
||||||
PeonVM* = object
|
PeonVM* = object
|
||||||
|
@ -93,9 +94,10 @@ type
|
||||||
frames: seq[uint64] # Stores the bottom of stack frames
|
frames: seq[uint64] # Stores the bottom of stack frames
|
||||||
results: seq[uint64] # Stores function return values
|
results: seq[uint64] # Stores function return values
|
||||||
gc: PeonGC # A reference to the VM's garbage collector
|
gc: PeonGC # A reference to the VM's garbage collector
|
||||||
breakpoints: seq[uint64] # Breakpoints where we call our debugger
|
when debugVM:
|
||||||
debugNext: bool # Whether to debug the next instruction
|
breakpoints: seq[uint64] # Breakpoints where we call our debugger
|
||||||
lastDebugCommand: string # The last debugging command input by the user
|
debugNext: bool # Whether to debug the next instruction
|
||||||
|
lastDebugCommand: string # The last debugging command input by the user
|
||||||
|
|
||||||
|
|
||||||
# Implementation of peon's memory manager
|
# Implementation of peon's memory manager
|
||||||
|
@ -105,7 +107,8 @@ proc newPeonGC*: PeonGC =
|
||||||
## garbage collector
|
## garbage collector
|
||||||
result.bytesAllocated = (0, 0)
|
result.bytesAllocated = (0, 0)
|
||||||
result.nextGC = FirstGC
|
result.nextGC = FirstGC
|
||||||
result.cycles = 0
|
when debugGC or debugAlloc:
|
||||||
|
result.cycles = 0
|
||||||
|
|
||||||
|
|
||||||
proc collect*(self: var PeonVM)
|
proc collect*(self: var PeonVM)
|
||||||
|
@ -214,6 +217,16 @@ proc markRoots(self: var PeonVM): HashSet[ptr HeapObject] =
|
||||||
# will mistakenly assume the object to be reachable, potentially
|
# will mistakenly assume the object to be reachable, potentially
|
||||||
# leading to a nasty memory leak. Let's just hope a 48+ bit address
|
# leading to a nasty memory leak. Let's just hope a 48+ bit address
|
||||||
# space makes this occurrence rare enough not to be a problem
|
# space makes this occurrence rare enough not to be a problem
|
||||||
|
# handles a single type (uint64), while Lox has a stack
|
||||||
|
# of heap-allocated structs (which is convenient, but slow).
|
||||||
|
# What we do instead is store all pointers allocated by us
|
||||||
|
# in a hash set and then check if any source of roots contained
|
||||||
|
# any of the integer values that we're keeping track of. Note
|
||||||
|
# that this means that if a primitive object's value happens to
|
||||||
|
# collide with an active pointer, the GC will mistakenly assume
|
||||||
|
# the object to be reachable (potentially leading to a nasty
|
||||||
|
# memory leak). Hopefully, in a 64-bit address space, this
|
||||||
|
# occurrence is rare enough for us to ignore
|
||||||
var result = initHashSet[uint64](self.gc.pointers.len())
|
var result = initHashSet[uint64](self.gc.pointers.len())
|
||||||
for obj in self.calls:
|
for obj in self.calls:
|
||||||
if obj in self.gc.pointers:
|
if obj in self.gc.pointers:
|
||||||
|
@ -285,7 +298,6 @@ proc sweep(self: var PeonVM) =
|
||||||
## during the mark phase.
|
## during the mark phase.
|
||||||
when debugGC:
|
when debugGC:
|
||||||
echo "DEBUG - GC: Beginning sweeping phase"
|
echo "DEBUG - GC: Beginning sweeping phase"
|
||||||
when debugGC:
|
|
||||||
var count = 0
|
var count = 0
|
||||||
var current: ptr HeapObject
|
var current: ptr HeapObject
|
||||||
var freed: HashSet[uint64]
|
var freed: HashSet[uint64]
|
||||||
|
@ -1050,10 +1062,11 @@ proc run*(self: var PeonVM, chunk: Chunk, breakpoints: seq[uint64] = @[], repl:
|
||||||
self.frames = @[]
|
self.frames = @[]
|
||||||
self.calls = @[]
|
self.calls = @[]
|
||||||
self.operands = @[]
|
self.operands = @[]
|
||||||
self.breakpoints = breakpoints
|
|
||||||
self.results = @[]
|
self.results = @[]
|
||||||
self.ip = 0
|
self.ip = 0
|
||||||
self.lastDebugCommand = ""
|
when debugVM:
|
||||||
|
self.breakpoints = breakpoints
|
||||||
|
self.lastDebugCommand = ""
|
||||||
try:
|
try:
|
||||||
self.dispatch()
|
self.dispatch()
|
||||||
except NilAccessDefect:
|
except NilAccessDefect:
|
||||||
|
|
|
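The comment added to `markRoots` in the diff above describes a conservative scheme: every pointer the allocator hands out is recorded in a hash set, and any `uint64` found in a root source that matches a recorded pointer is treated as a live reference. A stripped-down sketch of that idea follows. `findLiveRoots` and its parameters are hypothetical and stand in for the VM's real stacks, which are not modeled here.

```nim
import std/sets

proc findLiveRoots(pointers: HashSet[uint64],
                   roots: openArray[uint64]): HashSet[uint64] =
  ## Returns every value in `roots` that matches a tracked pointer.
  ## This is conservative: a plain integer that happens to equal a
  ## tracked address is also reported as live
  result = initHashSet[uint64]()
  for value in roots:
    if value in pointers:
      result.incl(value)

# Two tracked allocations; only one of them appears among the roots
let tracked = toHashSet([0x1000'u64, 0x2000'u64])
let live = findLiveRoots(tracked, [42'u64, 0x1000'u64, 7'u64])
doAssert 0x1000'u64 in live
doAssert 0x2000'u64 notin live
doAssert live.len == 1
```

A false positive in this scheme only keeps an unreachable object alive for longer. It never frees an object that is still in use, which is why the comment treats a collision as a possible leak, not as a correctness bug.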
@ -134,7 +134,7 @@ type
|
||||||
node*: Declaration
|
node*: Declaration
|
||||||
# Who is this name exported to? (Only makes sense if isPrivate
|
# Who is this name exported to? (Only makes sense if isPrivate
|
||||||
# equals false)
|
# equals false)
|
||||||
exportedTo*: HashSet[Name]
|
exportedTo*: HashSet[string]
|
||||||
# Has the compiler generated this name internally or
|
# Has the compiler generated this name internally or
|
||||||
# does it come from user code?
|
# does it come from user code?
|
||||||
isReal*: bool
|
isReal*: bool
|
||||||
|
@ -212,7 +212,7 @@ type
|
||||||
# The module importing us, if any
|
# The module importing us, if any
|
||||||
parentModule*: Name
|
parentModule*: Name
|
||||||
# Currently imported modules
|
# Currently imported modules
|
||||||
modules*: HashSet[Name]
|
modules*: HashSet[string]
|
||||||
|
|
||||||
TypedNode* = ref object
|
TypedNode* = ref object
|
||||||
## A wapper for AST nodes
|
## A wapper for AST nodes
|
||||||
|
@ -354,7 +354,7 @@ proc resolve*(self: Compiler, name: string): Name =
|
||||||
# module, so we definitely can't
|
# module, so we definitely can't
|
||||||
# use it
|
# use it
|
||||||
continue
|
continue
|
||||||
elif self.currentModule in obj.exportedTo:
|
elif self.currentModule.path in obj.exportedTo:
|
||||||
# The name is public in its owner
|
# The name is public in its owner
|
||||||
# module and said module has explicitly
|
# module and said module has explicitly
|
||||||
# exported it to us: we can use it
|
# exported it to us: we can use it
|
||||||
|
@ -713,7 +713,7 @@ method findByName*(self: Compiler, name: string): seq[Name] =
|
||||||
for obj in reversed(self.names):
|
for obj in reversed(self.names):
|
||||||
if obj.ident.token.lexeme == name:
|
if obj.ident.token.lexeme == name:
|
||||||
if obj.owner.path != self.currentModule.path:
|
if obj.owner.path != self.currentModule.path:
|
||||||
if obj.isPrivate or self.currentModule notin obj.exportedTo:
|
if obj.isPrivate or self.currentModule.path notin obj.exportedTo:
|
||||||
continue
|
continue
|
||||||
result.add(obj)
|
result.add(obj)
|
||||||
|
|
||||||
|
@ -727,11 +727,13 @@ method findInModule*(self: Compiler, name: string, module: Name): seq[Name] =
|
||||||
## the current one or not
|
## the current one or not
|
||||||
if name == "":
|
if name == "":
|
||||||
for obj in reversed(self.names):
|
for obj in reversed(self.names):
|
||||||
if not obj.isPrivate and obj.owner == module:
|
if obj.owner.isNil():
|
||||||
|
continue
|
||||||
|
if not obj.isPrivate and obj.owner.path == module.path:
|
||||||
result.add(obj)
|
result.add(obj)
|
||||||
else:
|
else:
|
||||||
for obj in self.findInModule("", module):
|
for obj in self.findInModule("", module):
|
||||||
if obj.ident.token.lexeme == name and self.currentModule in obj.exportedTo:
|
if obj.ident.token.lexeme == name and self.currentModule.path in obj.exportedTo:
|
||||||
result.add(obj)
|
result.add(obj)
|
||||||
|
|
||||||
|
|
||||||
|
@ -1034,7 +1036,7 @@ proc declare*(self: Compiler, node: ASTNode): Name {.discardable.} =
|
||||||
break
|
break
|
||||||
if name.ident.token.lexeme != declaredName:
|
if name.ident.token.lexeme != declaredName:
|
||||||
continue
|
continue
|
||||||
if name.owner != n.owner and (name.isPrivate or n.owner notin name.exportedTo):
|
if name.owner != n.owner and (name.isPrivate or n.owner.path notin name.exportedTo):
|
||||||
continue
|
continue
|
||||||
if name.kind in [NameKind.Var, NameKind.Module, NameKind.CustomType, NameKind.Enum]:
|
if name.kind in [NameKind.Var, NameKind.Module, NameKind.CustomType, NameKind.Enum]:
|
||||||
if name.depth < n.depth:
|
if name.depth < n.depth:
|
||||||
|
|
|
@ -124,11 +124,11 @@ type
|
||||||
of Reference:
|
of Reference:
|
||||||
# A managed reference
|
# A managed reference
|
||||||
nullable*: bool # Is null a valid value for this type? (false by default)
|
nullable*: bool # Is null a valid value for this type? (false by default)
|
||||||
value*: Type # The type the reference points to
|
value*: TypedNode # The type the reference points to
|
||||||
of Pointer:
|
of Pointer:
|
||||||
# An unmanaged reference. Much
|
# An unmanaged reference. Much
|
||||||
# like a raw pointer in C
|
# like a raw pointer in C
|
||||||
data*: Type # The type we point to
|
data*: TypedNode # The type we point to
|
||||||
of TypeDecl:
|
of TypeDecl:
|
||||||
# A user-defined type
|
# A user-defined type
|
||||||
fields*: seq[TypedArgument] # List of fields in the object. May be empty
|
fields*: seq[TypedArgument] # List of fields in the object. May be empty
|
||||||
|
@ -317,17 +317,17 @@ proc step*(self: Compiler): ASTNode {.inline.} =
|
||||||
|
|
||||||
# Some forward declarations
|
# Some forward declarations
|
||||||
proc compareUnions*(self: Compiler, a, b: seq[tuple[match: bool, kind: Type]]): bool
|
proc compareUnions*(self: Compiler, a, b: seq[tuple[match: bool, kind: Type]]): bool
|
||||||
proc expression*(self: Compiler, node: Expression, compile: bool = true): Type {.discardable.} = nil
|
proc expression*(self: Compiler, node: Expression, compile: bool = true): TypedNode {.discardable.} = nil
|
||||||
proc identifier*(self: Compiler, node: IdentExpr, name: Name = nil, compile: bool = true, strict: bool = true): Type {.discardable.} = nil
|
proc identifier*(self: Compiler, node: IdentExpr, name: Name = nil, compile: bool = true, strict: bool = true): TypedNode {.discardable.} = nil
|
||||||
proc call*(self: Compiler, node: CallExpr, compile: bool = true): Type {.discardable.} = nil
|
proc call*(self: Compiler, node: CallExpr, compile: bool = true): TypedNode {.discardable.} = nil
|
||||||
proc getItemExpr*(self: Compiler, node: GetItemExpr, compile: bool = true, matching: Type = nil): Type {.discardable.} = nil
|
proc getItemExpr*(self: Compiler, node: GetItemExpr, compile: bool = true, matching: Type = nil): TypedNode {.discardable.} = nil
|
||||||
proc unary*(self: Compiler, node: UnaryExpr, compile: bool = true): Type {.discardable.} = nil
|
proc unary*(self: Compiler, node: UnaryExpr, compile: bool = true): TypedNode {.discardable.} = nil
|
||||||
proc binary*(self: Compiler, node: BinaryExpr, compile: bool = true): Type {.discardable.} = nil
|
proc binary*(self: Compiler, node: BinaryExpr, compile: bool = true): TypedNode {.discardable.} = nil
|
||||||
proc lambdaExpr*(self: Compiler, node: LambdaExpr, compile: bool = true): Type {.discardable.} = nil
|
proc lambdaExpr*(self: Compiler, node: LambdaExpr, compile: bool = true): TypedNode {.discardable.} = nil
|
||||||
proc literal*(self: Compiler, node: ASTNode, compile: bool = true): Type {.discardable.} = nil
|
proc literal*(self: Compiler, node: ASTNode, compile: bool = true): TypedNode {.discardable.} = nil
|
||||||
proc infer*(self: Compiler, node: LiteralExpr): Type
|
proc infer*(self: Compiler, node: LiteralExpr): TypedNode
|
||||||
proc infer*(self: Compiler, node: Expression): Type
|
proc infer*(self: Compiler, node: Expression): TypedNode
|
||||||
proc inferOrError*(self: Compiler, node: Expression): Type
|
proc inferOrError*(self: Compiler, node: Expression): TypedNode
|
||||||
proc findByName*(self: Compiler, name: string): seq[Name]
|
proc findByName*(self: Compiler, name: string): seq[Name]
|
||||||
proc findInModule*(self: Compiler, name: string, module: Name): seq[Name]
|
proc findInModule*(self: Compiler, name: string, module: Name): seq[Name]
|
||||||
proc findByType*(self: Compiler, name: string, kind: Type): seq[Name]
|
proc findByType*(self: Compiler, name: string, kind: Type): seq[Name]
|
||||||
|
@ -420,7 +420,7 @@ proc compare*(self: Compiler, a, b: Type): bool =
|
||||||
# a and b are of either of the two
|
# a and b are of either of the two
|
||||||
# types in this branch, so we just need
|
# types in this branch, so we just need
|
||||||
# to compare their values
|
# to compare their values
|
||||||
return self.compare(a.value, b.value)
|
return self.compare(a.value.value, b.value.value)
|
||||||
of Function:
|
of Function:
|
||||||
# Functions are a bit trickier to compare
|
# Functions are a bit trickier to compare
|
||||||
if a.arguments.len() != b.arguments.len():
|
if a.arguments.len() != b.arguments.len():
|
||||||
|
@ -569,7 +569,7 @@ proc toIntrinsic*(name: string): Type =
|
||||||
return Type(kind: String)
|
return Type(kind: String)
|
||||||
|
|
||||||
|
|
||||||
proc infer*(self: Compiler, node: LiteralExpr): Type =
|
proc infer*(self: Compiler, node: LiteralExpr): TypedNode =
|
||||||
## Infers the type of a given literal expression
|
## Infers the type of a given literal expression
|
||||||
if node.isNil():
|
if node.isNil():
|
||||||
return nil
|
return nil
|
||||||
|
@ -577,32 +577,32 @@ proc infer*(self: Compiler, node: LiteralExpr): Type =
|
||||||
of intExpr, binExpr, octExpr, hexExpr:
|
of intExpr, binExpr, octExpr, hexExpr:
|
||||||
let size = node.token.lexeme.split("'")
|
let size = node.token.lexeme.split("'")
|
||||||
if size.len() == 1:
|
if size.len() == 1:
|
||||||
return Type(kind: Int64)
|
return TypedNode(node: node, value: Type(kind: Int64))
|
||||||
let typ = size[1].toIntrinsic()
|
let typ = size[1].toIntrinsic()
|
||||||
if not self.compare(typ, nil):
|
if not self.compare(typ, nil):
|
||||||
return typ
|
return TypedNode(node: node, value: typ)
|
||||||
else:
|
else:
|
||||||
self.error(&"invalid type specifier '{size[1]}' for int", node)
|
self.error(&"invalid type specifier '{size[1]}' for int", node)
|
||||||
of floatExpr:
|
of floatExpr:
|
||||||
let size = node.token.lexeme.split("'")
|
let size = node.token.lexeme.split("'")
|
||||||
if size.len() == 1:
|
if size.len() == 1:
|
||||||
return Type(kind: Float64)
|
return TypedNode(node: node, value: Type(kind: Float64))
|
||||||
let typ = size[1].toIntrinsic()
|
let typ = size[1].toIntrinsic()
|
||||||
if not typ.isNil():
|
if not typ.isNil():
|
||||||
return typ
|
return TypedNode(node: node, value: typ)
|
||||||
else:
|
else:
|
||||||
self.error(&"invalid type specifier '{size[1]}' for float", node)
|
self.error(&"invalid type specifier '{size[1]}' for float", node)
|
||||||
of trueExpr:
|
of trueExpr:
|
||||||
return Type(kind: Bool)
|
return TypedNode(node: node, value: Type(kind: Bool))
|
||||||
of falseExpr:
|
of falseExpr:
|
||||||
return Type(kind: Bool)
|
return TypedNode(node: node, value: Type(kind: Bool))
|
||||||
of strExpr:
|
of strExpr:
|
||||||
return Type(kind: String)
|
return TypedNode(node: node, value: Type(kind: String))
|
||||||
else:
|
else:
|
||||||
discard # Unreachable
|
discard # Unreachable
|
||||||
|
|
||||||
|
|
||||||
proc infer*(self: Compiler, node: Expression): Type =
|
proc infer*(self: Compiler, node: Expression): TypedNode =
|
||||||
## Infers the type of a given expression and
|
## Infers the type of a given expression and
|
||||||
## returns it
|
## returns it
|
||||||
if node.isNil():
|
if node.isNil():
|
||||||
|
@ -621,9 +621,9 @@ proc infer*(self: Compiler, node: Expression): Type =
|
||||||
of NodeKind.callExpr:
|
of NodeKind.callExpr:
|
||||||
result = self.call(CallExpr(node), compile=false)
|
result = self.call(CallExpr(node), compile=false)
|
||||||
of NodeKind.refExpr:
|
of NodeKind.refExpr:
|
||||||
result = Type(kind: Reference, value: self.infer(Ref(node).value))
|
result = TypedNode(node: node, value: Type(kind: Reference, value: self.infer(Ref(node).value)))
|
||||||
of NodeKind.ptrExpr:
|
of NodeKind.ptrExpr:
|
||||||
result = Type(kind: Pointer, data: self.infer(Ptr(node).value))
|
result = TypedNode(node: node, value: Type(kind: Pointer, data: self.infer(Ptr(node).value)))
|
||||||
of NodeKind.groupingExpr:
|
of NodeKind.groupingExpr:
|
||||||
result = self.infer(GroupingExpr(node).expression)
|
result = self.infer(GroupingExpr(node).expression)
|
||||||
of NodeKind.getItemExpr:
|
of NodeKind.getItemExpr:
|
||||||
|
@ -634,7 +634,7 @@ proc infer*(self: Compiler, node: Expression): Type =
|
||||||
discard # TODO
|
discard # TODO
|
||||||
|
|
||||||
|
|
||||||
proc inferOrError*(self: Compiler, node: Expression): Type =
|
proc inferOrError*(self: Compiler, node: Expression): TypedNode =
|
||||||
## Attempts to infer the type of
|
## Attempts to infer the type of
|
||||||
## the given expression and raises an
|
## the given expression and raises an
|
||||||
## error if it fails
|
## error if it fails
|
||||||
|
@ -648,16 +648,16 @@ proc stringify*(self: Compiler, typ: Type): string =
|
||||||
## type object
|
## type object
|
||||||
if typ.isNil():
|
if typ.isNil():
|
||||||
return "nil"
|
return "nil"
|
||||||
case typ.value.kind:
|
case typ.kind:
|
||||||
of Int8, UInt8, Int16, UInt16, Int32,
|
of Int8, UInt8, Int16, UInt16, Int32,
|
||||||
UInt32, Int64, UInt64, Float32, Float64,
|
UInt32, Int64, UInt64, Float32, Float64,
|
||||||
Char, Byte, String, Nil, TypeKind.Nan, Bool,
|
Char, Byte, String, Nil, TypeKind.Nan, Bool,
|
||||||
TypeKind.Inf, Auto:
|
TypeKind.Inf, Auto:
|
||||||
result &= ($typ.value.kind).toLowerAscii()
|
result &= ($typ.kind).toLowerAscii()
|
||||||
of Pointer:
|
of Pointer:
|
||||||
result &= &"ptr {self.stringify(typ.value)}"
|
result &= &"ptr {self.stringify(typ)}"
|
||||||
of Reference:
|
of Reference:
|
||||||
result &= &"ref {self.stringify(typ.value)}"
|
result &= &"ref {self.stringify(typ)}"
|
||||||
of Any:
|
of Any:
|
||||||
return "any"
|
return "any"
|
||||||
of Union:
|
of Union:
|
||||||
|
@ -770,9 +770,9 @@ proc check*(self: Compiler, term: Expression, kind: Type) {.inline.} =
|
||||||
## Raises an error if appropriate and returns
|
## Raises an error if appropriate and returns
|
||||||
## otherwise
|
## otherwise
|
||||||
let k = self.inferOrError(term)
|
let k = self.inferOrError(term)
|
||||||
if not self.compare(k, kind):
|
if not self.compare(k.value, kind):
|
||||||
self.error(&"expecting value of type {self.stringify(kind)}, got {self.stringify(k)}", term)
|
self.error(&"expecting value of type {self.stringify(kind)}, got {self.stringify(k)}", term)
|
||||||
elif k.kind == Any and kind.kind != Any:
|
elif k.value.kind == Any and kind.kind != Any:
|
||||||
self.error(&"any is not a valid type in this context")
|
self.error(&"any is not a valid type in this context")
|
||||||
|
|
||||||
|
|
||||||
|
@ -857,7 +857,7 @@ proc unpackGenerics*(self: Compiler, condition: Expression, list: var seq[tuple[
|
||||||
## Recursively unpacks a type constraint in a generic type
|
## Recursively unpacks a type constraint in a generic type
|
||||||
case condition.kind:
|
case condition.kind:
|
||||||
of identExpr:
|
of identExpr:
|
||||||
list.add((accept, self.inferOrError(condition)))
|
list.add((accept, self.inferOrError(condition).value))
|
||||||
if list[^1].kind.kind == Auto:
|
if list[^1].kind.kind == Auto:
|
||||||
self.error("automatic types cannot be used within generics", condition)
|
self.error("automatic types cannot be used within generics", condition)
|
||||||
of binaryExpr:
|
of binaryExpr:
|
||||||
|
@ -883,7 +883,7 @@ proc unpackUnion*(self: Compiler, condition: Expression, list: var seq[tuple[mat
|
||||||
## Recursively unpacks a type union
|
## Recursively unpacks a type union
|
||||||
case condition.kind:
|
case condition.kind:
|
||||||
of identExpr:
|
of identExpr:
|
||||||
list.add((accept, self.inferOrError(condition)))
|
list.add((accept, self.inferOrError(condition).value))
|
||||||
of binaryExpr:
|
of binaryExpr:
|
||||||
let condition = BinaryExpr(condition)
|
let condition = BinaryExpr(condition)
|
||||||
case condition.operator.lexeme:
|
case condition.operator.lexeme:
|
||||||
|
@ -966,13 +966,13 @@ proc declare*(self: Compiler, node: ASTNode): Name {.discardable.} =
|
||||||
n.isGeneric = true
|
n.isGeneric = true
|
||||||
var typ: Type
|
var typ: Type
|
||||||
for argument in node.arguments:
|
for argument in node.arguments:
|
||||||
typ = self.infer(argument.valueType)
|
typ = self.infer(argument.valueType).value
|
||||||
if not typ.isNil() and typ.kind == Auto:
|
if not typ.isNil() and typ.kind == Auto:
|
||||||
n.obj.value.isAuto = true
|
n.obj.value.isAuto = true
|
||||||
if n.isGeneric:
|
if n.isGeneric:
|
||||||
self.error("automatic types cannot be used within generics", argument.valueType)
|
self.error("automatic types cannot be used within generics", argument.valueType)
|
||||||
break
|
break
|
||||||
typ = self.infer(node.returnType)
|
typ = self.infer(node.returnType).value
|
||||||
if not typ.isNil() and typ.kind == Auto:
|
if not typ.isNil() and typ.kind == Auto:
|
||||||
n.obj.value.isAuto = true
|
n.obj.value.isAuto = true
|
||||||
if n.isGeneric:
|
if n.isGeneric:
|
||||||
|
@ -1023,7 +1023,7 @@ proc declare*(self: Compiler, node: ASTNode): Name {.discardable.} =
|
||||||
else:
|
else:
|
||||||
case node.value.kind:
|
case node.value.kind:
|
||||||
of identExpr:
|
of identExpr:
|
||||||
n.obj.value = self.inferOrError(node.value)
|
n.obj.value = self.inferOrError(node.value).value
|
||||||
of binaryExpr:
|
of binaryExpr:
|
||||||
# Type union
|
# Type union
|
||||||
n.obj.value = Type(kind: Union, types: @[])
|
n.obj.value = Type(kind: Union, types: @[])
|
||||||
|
|
|
@ -46,10 +46,21 @@ type
|
||||||
## - After that follows the argument count as a 1 byte integer
|
## - After that follows the argument count as a 1 byte integer
|
||||||
## - Lastly, the function's name (optional) is encoded in ASCII, prepended with
|
## - Lastly, the function's name (optional) is encoded in ASCII, prepended with
|
||||||
## its size as a 2-byte integer
|
## its size as a 2-byte integer
|
||||||
|
## modules contains information about all the peon modules that the compiler has encountered,
|
||||||
|
## along with their start/end offset in the code. Unlike other bytecode-compiled languages like
|
||||||
|
## Python, peon does not produce a bytecode file for each separate module it compiles: everything
|
||||||
|
## is contained within a single binary blob. While this simplifies the implementation and makes
|
||||||
|
## bytecode files entirely "self-hosted", it also means that the original module information is
|
||||||
|
## lost: this segment serves to fix that. The segment's size is encoded at the beginning as a 4-byte
|
||||||
|
## sequence (i.e. a single 32-bit integer) and its encoding is similar to that of the functions segment:
|
||||||
|
## - First, the position into the bytecode where the module begins is encoded (as a 3 byte integer)
|
||||||
|
## - Second, the position into the bytecode where the module ends is encoded (as a 3 byte integer)
|
||||||
|
## - Lastly, the module's name is encoded in ASCII, prepended with its size as a 2-byte integer
|
||||||
consts*: seq[uint8]
|
consts*: seq[uint8]
|
||||||
code*: seq[uint8]
|
code*: seq[uint8]
|
||||||
lines*: seq[int]
|
lines*: seq[int]
|
||||||
functions*: seq[uint8]
|
functions*: seq[uint8]
|
||||||
|
modules*: seq[uint8]
|
||||||
|
|
||||||
OpCode* {.pure.} = enum
|
OpCode* {.pure.} = enum
|
||||||
## Enum of Peon's bytecode opcodes
|
## Enum of Peon's bytecode opcodes
|
||||||
|
|
|
@ -1006,7 +1006,7 @@ proc terminateProgram(self: BytecodeCompiler, pos: int) =
|
||||||
self.emitByte(ReplExit, self.peek().token.line)
|
self.emitByte(ReplExit, self.peek().token.line)
|
||||||
else:
|
else:
|
||||||
self.emitByte(OpCode.Return, self.peek().token.line)
|
self.emitByte(OpCode.Return, self.peek().token.line)
|
||||||
self.emitByte(0, self.peek().token.line) # Entry point has no return value (TODO: Add easter eggs, cuz why not)
|
self.emitByte(0, self.peek().token.line) # Entry point has no return value
|
||||||
self.patchReturnAddress(pos)
|
self.patchReturnAddress(pos)
|
||||||
|
|
||||||
|
|
||||||
|
@ -1478,8 +1478,9 @@ method lambdaExpr(self: BytecodeCompiler, node: LambdaExpr, compile: bool = true
|
||||||
line: node.token.line,
|
line: node.token.line,
|
||||||
kind: NameKind.Function,
|
kind: NameKind.Function,
|
||||||
belongsTo: function,
|
belongsTo: function,
|
||||||
isReal: true)
|
isReal: true,
|
||||||
if compile and node notin self.lambdas:
|
)
|
||||||
|
if compile and node notin self.lambdas and not node.body.isNil():
|
||||||
self.lambdas.add(node)
|
self.lambdas.add(node)
|
||||||
let jmp = self.emitJump(JumpForwards, node.token.line)
|
let jmp = self.emitJump(JumpForwards, node.token.line)
|
||||||
if BlockStmt(node.body).code.len() == 0:
|
if BlockStmt(node.body).code.len() == 0:
|
||||||
|
@ -1687,7 +1688,7 @@ proc importStmt(self: BytecodeCompiler, node: ImportStmt, compile: bool = true)
|
||||||
# Importing a module automatically exports
|
# Importing a module automatically exports
|
||||||
# its public names to us
|
# its public names to us
|
||||||
for name in self.findInModule("", module):
|
for name in self.findInModule("", module):
|
||||||
name.exportedTo.incl(self.currentModule)
|
name.exportedTo.incl(self.currentModule.path)
|
||||||
except IOError:
|
except IOError:
|
||||||
self.error(&"could not import '{module.ident.token.lexeme}': {getCurrentExceptionMsg()}")
|
self.error(&"could not import '{module.ident.token.lexeme}': {getCurrentExceptionMsg()}")
|
||||||
except OSError:
|
except OSError:
|
||||||
|
@ -1705,22 +1706,22 @@ proc exportStmt(self: BytecodeCompiler, node: ExportStmt, compile: bool = true)
|
||||||
var name = self.resolveOrError(node.name)
|
var name = self.resolveOrError(node.name)
|
||||||
if name.isPrivate:
|
if name.isPrivate:
|
||||||
self.error("cannot export private names")
|
self.error("cannot export private names")
|
||||||
name.exportedTo.incl(self.parentModule)
|
name.exportedTo.incl(self.parentModule.path)
|
||||||
case name.kind:
|
case name.kind:
|
||||||
of NameKind.Module:
|
of NameKind.Module:
|
||||||
# We need to export everything
|
# We need to export everything
|
||||||
# this module defines!
|
# this module defines!
|
||||||
for name in self.findInModule("", name):
|
for name in self.findInModule("", name):
|
||||||
name.exportedTo.incl(self.parentModule)
|
name.exportedTo.incl(self.parentModule.path)
|
||||||
of NameKind.Function:
|
of NameKind.Function:
|
||||||
# Only exporting a single function (or, well
|
# Only exporting a single function (or, well
|
||||||
# all of its implementations)
|
# all of its implementations)
|
||||||
for name in self.findByName(name.ident.token.lexeme):
|
for name in self.findByName(name.ident.token.lexeme):
|
||||||
if name.kind != NameKind.Function:
|
if name.kind != NameKind.Function:
|
||||||
continue
|
continue
|
||||||
name.exportedTo.incl(self.parentModule)
|
name.exportedTo.incl(self.parentModule.path)
|
||||||
else:
|
else:
|
||||||
discard
|
self.error("unsupported export type")
|
||||||
|
|
||||||
|
|
||||||
proc breakStmt(self: BytecodeCompiler, node: BreakStmt) =
|
proc breakStmt(self: BytecodeCompiler, node: BreakStmt) =
|
||||||
|
@ -2073,6 +2074,7 @@ proc compile*(self: BytecodeCompiler, ast: seq[Declaration], file: string, lines
|
||||||
self.disabledWarnings = disabledWarnings
|
self.disabledWarnings = disabledWarnings
|
||||||
self.showMismatches = showMismatches
|
self.showMismatches = showMismatches
|
||||||
self.mode = mode
|
self.mode = mode
|
||||||
|
let start = self.chunk.code.len()
|
||||||
if not incremental:
|
if not incremental:
|
||||||
self.jumps = @[]
|
self.jumps = @[]
|
||||||
let pos = self.beginProgram()
|
let pos = self.beginProgram()
|
||||||
|
@ -2081,8 +2083,6 @@ proc compile*(self: BytecodeCompiler, ast: seq[Declaration], file: string, lines
|
||||||
while not self.done():
|
while not self.done():
|
||||||
self.declaration(Declaration(self.step()))
|
self.declaration(Declaration(self.step()))
|
||||||
self.terminateProgram(pos)
|
self.terminateProgram(pos)
|
||||||
# TODO: REPL is broken, we need a new way to make
|
|
||||||
# incremental compilation resume from where it stopped!
|
|
||||||
result = self.chunk
|
result = self.chunk
|
||||||
|
|
||||||
|
|
||||||
|
@ -2100,7 +2100,7 @@ proc compileModule(self: BytecodeCompiler, module: Name) =
|
||||||
break
|
break
|
||||||
elif i == searchPath.high():
|
elif i == searchPath.high():
|
||||||
self.error(&"""could not import '{path}': module not found""")
|
self.error(&"""could not import '{path}': module not found""")
|
||||||
if self.modules.contains(module):
|
if self.modules.contains(module.path):
|
||||||
return
|
return
|
||||||
let source = readFile(path)
|
let source = readFile(path)
|
||||||
let current = self.current
|
let current = self.current
|
||||||
|
@ -2115,11 +2115,19 @@ proc compileModule(self: BytecodeCompiler, module: Name) =
|
||||||
self.replMode = false
|
self.replMode = false
|
||||||
self.parentModule = currentModule
|
self.parentModule = currentModule
|
||||||
self.currentModule = module
|
self.currentModule = module
|
||||||
|
let start = self.chunk.code.len()
|
||||||
discard self.compile(self.parser.parse(self.lexer.lex(source, path),
|
discard self.compile(self.parser.parse(self.lexer.lex(source, path),
|
||||||
path, self.lexer.getLines(),
|
path, self.lexer.getLines(),
|
||||||
self.lexer.getSource(), persist=true),
|
self.lexer.getSource(), persist=true),
|
||||||
path, self.lexer.getLines(), self.lexer.getSource(), chunk=self.chunk, incremental=true,
|
path, self.lexer.getLines(), self.lexer.getSource(), chunk=self.chunk, incremental=true,
|
||||||
isMainModule=false, self.disabledWarnings, self.showMismatches, self.mode)
|
isMainModule=false, self.disabledWarnings, self.showMismatches, self.mode)
|
||||||
|
# Mark the end of a new module
|
||||||
|
self.chunk.modules.extend(start.toTriple())
|
||||||
|
self.chunk.modules.extend(self.chunk.code.high().toTriple())
|
||||||
|
# I swear to god if someone ever creates a peon module with a name that's
|
||||||
|
# longer than 2^16 bytes I will hit them with a metal pipe. Mark my words
|
||||||
|
self.chunk.modules.extend(self.currentModule.ident.token.lexeme.len().toDouble())
|
||||||
|
self.chunk.modules.extend(self.currentModule.ident.token.lexeme.toBytes())
|
||||||
module.file = path
|
module.file = path
|
||||||
# No need to save the old scope depth: import statements are
|
# No need to save the old scope depth: import statements are
|
||||||
# only allowed at the top level!
|
# only allowed at the top level!
|
||||||
|
@ -2133,4 +2141,4 @@ proc compileModule(self: BytecodeCompiler, module: Name) =
|
||||||
self.replMode = replMode
|
self.replMode = replMode
|
||||||
self.lines = lines
|
self.lines = lines
|
||||||
self.source = src
|
self.source = src
|
||||||
self.modules.incl(module)
|
self.modules.incl(module.path)
|
||||||
|
|
|
@ -22,12 +22,15 @@ import std/terminal
|
||||||
|
|
||||||
|
|
||||||
type
|
type
|
||||||
Function = ref object
|
Function = object
|
||||||
start, stop, bottom, argc: int
|
start, stop, argc: int
|
||||||
|
name: string
|
||||||
|
Module = object
|
||||||
|
start, stop: int
|
||||||
name: string
|
name: string
|
||||||
started, stopped: bool
|
|
||||||
Debugger* = ref object
|
Debugger* = ref object
|
||||||
chunk: Chunk
|
chunk: Chunk
|
||||||
|
modules: seq[Module]
|
||||||
functions: seq[Function]
|
functions: seq[Function]
|
||||||
current: int
|
current: int
|
||||||
|
|
||||||
|
@ -66,21 +69,38 @@ proc checkFunctionStart(self: Debugger, n: int) =
|
||||||
## Checks if a function begins at the given
|
## Checks if a function begins at the given
|
||||||
## bytecode offset
|
## bytecode offset
|
||||||
for i, e in self.functions:
|
for i, e in self.functions:
|
||||||
if n == e.start and not (e.started or e.stopped):
|
# Avoids duplicate output
|
||||||
e.started = true
|
if n == e.start:
|
||||||
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Function Start ", fgYellow, &"'{e.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Function Start ", fgYellow, &"'{e.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
||||||
styledEcho fgGreen, "\t- Start offset: ", fgYellow, $e.start
|
styledEcho fgGreen, "\t- Start offset: ", fgYellow, $e.start
|
||||||
styledEcho fgGreen, "\t- End offset: ", fgYellow, $e.stop
|
styledEcho fgGreen, "\t- End offset: ", fgYellow, $e.stop
|
||||||
styledEcho fgGreen, "\t- Argument count: ", fgYellow, $e.argc
|
styledEcho fgGreen, "\t- Argument count: ", fgYellow, $e.argc, "\n"
|
||||||
|
|
||||||
|
|
||||||
proc checkFunctionEnd(self: Debugger, n: int) =
|
proc checkFunctionEnd(self: Debugger, n: int) =
|
||||||
## Checks if a function ends at the given
|
## Checks if a function ends at the given
|
||||||
## bytecode offset
|
## bytecode offset
|
||||||
for i, e in self.functions:
|
for i, e in self.functions:
|
||||||
if n == e.stop and e.started and not e.stopped:
|
if n == e.stop:
|
||||||
e.stopped = true
|
|
||||||
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Function End ", fgYellow, &"'{e.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Function End ", fgYellow, &"'{e.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
||||||
|
|
||||||
|
|
||||||
|
proc checkModuleStart(self: Debugger, n: int) =
|
||||||
|
## Checks if a module begins at the given
|
||||||
|
## bytecode offset
|
||||||
|
for i, m in self.modules:
|
||||||
|
if m.start == n:
|
||||||
|
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Module Start ", fgYellow, &"'{m.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
||||||
|
styledEcho fgGreen, "\t- Start offset: ", fgYellow, $m.start
|
||||||
|
styledEcho fgGreen, "\t- End offset: ", fgYellow, $m.stop, "\n"
|
||||||
|
|
||||||
|
|
||||||
|
proc checkModuleEnd(self: Debugger, n: int) =
|
||||||
|
## Checks if a module ends at the given
|
||||||
|
## bytecode offset
|
||||||
|
for i, m in self.modules:
|
||||||
|
if m.stop == n:
|
||||||
|
styledEcho fgBlue, "\n==== Peon Bytecode Disassembler - Module End ", fgYellow, &"'{m.name}' ", fgBlue, "(", fgYellow, $i, fgBlue, ") ===="
|
||||||
|
|
||||||
|
|
||||||
proc simpleInstruction(self: Debugger, instruction: OpCode) =
|
proc simpleInstruction(self: Debugger, instruction: OpCode) =
|
||||||
|
@ -94,9 +114,6 @@ proc simpleInstruction(self: Debugger, instruction: OpCode) =
|
||||||
else:
|
else:
|
||||||
stdout.styledWriteLine(fgYellow, "No")
|
stdout.styledWriteLine(fgYellow, "No")
|
||||||
self.current += 1
|
self.current += 1
|
||||||
self.checkFunctionEnd(self.current - 2)
|
|
||||||
self.checkFunctionEnd(self.current - 1)
|
|
||||||
self.checkFunctionEnd(self.current)
|
|
||||||
|
|
||||||
|
|
||||||
proc stackTripleInstruction(self: Debugger, instruction: OpCode) =
|
proc stackTripleInstruction(self: Debugger, instruction: OpCode) =
|
||||||
|
@ -168,20 +185,27 @@ proc jumpInstruction(self: Debugger, instruction: OpCode) =
|
||||||
self.current += 4
|
self.current += 4
|
||||||
while self.chunk.code[self.current] == NoOp.uint8:
|
while self.chunk.code[self.current] == NoOp.uint8:
|
||||||
inc(self.current)
|
inc(self.current)
|
||||||
for i in countup(orig, self.current + 1):
|
|
||||||
self.checkFunctionStart(i)
|
|
||||||
|
|
||||||
|
|
||||||
proc disassembleInstruction*(self: Debugger) =
|
proc disassembleInstruction*(self: Debugger) =
|
||||||
## Takes one bytecode instruction and prints it
|
## Takes one bytecode instruction and prints it
|
||||||
|
let opcode = OpCode(self.chunk.code[self.current])
|
||||||
|
self.checkModuleStart(self.current)
|
||||||
|
self.checkFunctionStart(self.current)
|
||||||
printDebug("Offset: ")
|
printDebug("Offset: ")
|
||||||
stdout.styledWriteLine(fgYellow, $(self.current))
|
stdout.styledWriteLine(fgYellow, $(self.current))
|
||||||
printDebug("Line: ")
|
printDebug("Line: ")
|
||||||
stdout.styledWriteLine(fgYellow, &"{self.chunk.getLine(self.current)}")
|
stdout.styledWriteLine(fgYellow, &"{self.chunk.getLine(self.current)}")
|
||||||
var opcode = OpCode(self.chunk.code[self.current])
|
|
||||||
case opcode:
|
case opcode:
|
||||||
of simpleInstructions:
|
of simpleInstructions:
|
||||||
self.simpleInstruction(opcode)
|
self.simpleInstruction(opcode)
|
||||||
|
# Functions (and modules) only have a single return statement at the
|
||||||
|
# end of their body, so we never execute this more than once per module/function
|
||||||
|
if opcode == Return:
|
||||||
|
# -2 to skip the hardcoded argument to return
|
||||||
|
# and the increment by simpleInstruction()
|
||||||
|
self.checkFunctionEnd(self.current - 2)
|
||||||
|
self.checkModuleEnd(self.current - 1)
|
||||||
of constantInstructions:
|
of constantInstructions:
|
||||||
self.constantInstruction(opcode)
|
self.constantInstruction(opcode)
|
||||||
of stackDoubleInstructions:
|
of stackDoubleInstructions:
|
||||||
|
@ -197,7 +221,9 @@ proc disassembleInstruction*(self: Debugger) =
|
||||||
else:
|
else:
|
||||||
echo &"DEBUG - Unknown opcode {opcode} at index {self.current}"
|
echo &"DEBUG - Unknown opcode {opcode} at index {self.current}"
|
||||||
self.current += 1
|
self.current += 1
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
proc parseFunctions(self: Debugger) =
|
proc parseFunctions(self: Debugger) =
|
||||||
## Parses function information in the chunk
|
## Parses function information in the chunk
|
||||||
|
@ -206,7 +232,7 @@ proc parseFunctions(self: Debugger) =
|
||||||
name: string
|
name: string
|
||||||
idx = 0
|
idx = 0
|
||||||
size = 0
|
size = 0
|
||||||
while idx < len(self.chunk.functions) - 1:
|
while idx < self.chunk.functions.high():
|
||||||
start = int([self.chunk.functions[idx], self.chunk.functions[idx + 1], self.chunk.functions[idx + 2]].fromTriple())
|
start = int([self.chunk.functions[idx], self.chunk.functions[idx + 1], self.chunk.functions[idx + 2]].fromTriple())
|
||||||
idx += 3
|
idx += 3
|
||||||
stop = int([self.chunk.functions[idx], self.chunk.functions[idx + 1], self.chunk.functions[idx + 2]].fromTriple())
|
stop = int([self.chunk.functions[idx], self.chunk.functions[idx + 1], self.chunk.functions[idx + 2]].fromTriple())
|
||||||
|
@ -220,15 +246,36 @@ proc parseFunctions(self: Debugger) =
|
||||||
self.functions.add(Function(start: start, stop: stop, argc: argc, name: name))
|
self.functions.add(Function(start: start, stop: stop, argc: argc, name: name))
|
||||||
|
|
||||||
|
|
||||||
|
proc parseModules(self: Debugger) =
|
||||||
|
## Parses module information in the chunk
|
||||||
|
var
|
||||||
|
start, stop: int
|
||||||
|
name: string
|
||||||
|
idx = 0
|
||||||
|
size = 0
|
||||||
|
while idx < self.chunk.modules.high():
|
||||||
|
start = int([self.chunk.modules[idx], self.chunk.modules[idx + 1], self.chunk.modules[idx + 2]].fromTriple())
|
||||||
|
idx += 3
|
||||||
|
stop = int([self.chunk.modules[idx], self.chunk.modules[idx + 1], self.chunk.modules[idx + 2]].fromTriple())
|
||||||
|
idx += 3
|
||||||
|
size = int([self.chunk.modules[idx], self.chunk.modules[idx + 1]].fromDouble())
|
||||||
|
idx += 2
|
||||||
|
name = self.chunk.modules[idx..<idx + size].fromBytes()
|
||||||
|
inc(idx, size)
|
||||||
|
self.modules.add(Module(start: start, stop: stop, name: name))
|
||||||
|
|
||||||
|
|
||||||
proc disassembleChunk*(self: Debugger, chunk: Chunk, name: string) =
|
proc disassembleChunk*(self: Debugger, chunk: Chunk, name: string) =
|
||||||
## Takes a chunk of bytecode and prints it
|
## Takes a chunk of bytecode and prints it
|
||||||
self.chunk = chunk
|
self.chunk = chunk
|
||||||
styledEcho fgBlue, &"==== Peon Bytecode Disassembler - Chunk '{name}' ====\n"
|
styledEcho fgBlue, &"==== Peon Bytecode Disassembler - Chunk '{name}' ====\n"
|
||||||
self.current = 0
|
self.current = 0
|
||||||
self.parseFunctions()
|
self.parseFunctions()
|
||||||
|
self.parseModules()
|
||||||
while self.current < self.chunk.code.len:
|
while self.current < self.chunk.code.len:
|
||||||
self.disassembleInstruction()
|
self.disassembleInstruction()
|
||||||
echo ""
|
echo ""
|
||||||
|
|
||||||
styledEcho fgBlue, &"==== Peon Bytecode Disassembler - Chunk '{name}' ===="
|
styledEcho fgBlue, &"==== Peon Bytecode Disassembler - Chunk '{name}' ===="
|
||||||
|
|
||||||
|
|
||||||
|
|
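The module records written by `compileModule` and read back by `parseModules` appear to share one layout: a 3-byte start offset, a 3-byte end offset, a 2-byte name length, then the name bytes. The following is a minimal, self-contained sketch of that layout, not the project's code. `ModuleInfo`, `packUint`, `unpackUint`, `encodeModule` and `decodeModules` are hypothetical names, and the big-endian byte order is an assumption, since the diff does not show how peon's own `toTriple`/`toDouble` helpers order their bytes.

```nim
# Sketch only: peon's real helpers (toTriple, toDouble, fromTriple,
# fromDouble) are not shown in this diff. Big-endian order is assumed.

type ModuleInfo = object
  start, stop: int
  name: string

proc packUint(value: int, width: int): seq[byte] =
  ## Encodes value as `width` bytes, most significant byte first
  result = newSeq[byte](width)
  for i in 0 ..< width:
    result[i] = byte((value shr (8 * (width - 1 - i))) and 0xFF)

proc unpackUint(data: openArray[byte]): int =
  ## Decodes an unsigned integer stored most significant byte first
  for b in data:
    result = (result shl 8) or int(b)

proc encodeModule(m: ModuleInfo): seq[byte] =
  ## Record layout: start (3 bytes), stop (3 bytes),
  ## name length (2 bytes), name bytes
  result.add(packUint(m.start, 3))
  result.add(packUint(m.stop, 3))
  result.add(packUint(m.name.len, 2))
  for c in m.name:
    result.add(byte(c))

proc decodeModules(data: seq[byte]): seq[ModuleInfo] =
  ## Walks the segment record by record until every byte is consumed
  var idx = 0
  while idx < data.len:
    let start = unpackUint(data[idx ..< idx + 3])
    idx += 3
    let stop = unpackUint(data[idx ..< idx + 3])
    idx += 3
    let size = unpackUint(data[idx ..< idx + 2])
    idx += 2
    var name = newString(size)
    for i in 0 ..< size:
      name[i] = char(data[idx + i])
    idx += size
    result.add(ModuleInfo(start: start, stop: stop, name: name))

# Round trip of two records through a single byte sequence
let blob = encodeModule(ModuleInfo(start: 0, stop: 41, name: "std")) &
           encodeModule(ModuleInfo(start: 42, stop: 100, name: "main"))
let parsed = decodeModules(blob)
doAssert parsed.len == 2
doAssert parsed[1].name == "main"
```

Because each record carries its own name length, records can be concatenated without a separator, which is what lets the decoder loop until the segment is exhausted.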
|
@ -64,7 +64,8 @@ proc newSerializer*(self: Serializer = nil): Serializer =
|
||||||
|
|
||||||
|
|
||||||
proc writeHeaders(self: Serializer, stream: var seq[byte]) =
|
proc writeHeaders(self: Serializer, stream: var seq[byte]) =
|
||||||
## Writes the Peon bytecode headers in-place into a byte stream
|
## Writes the Peon bytecode headers in-place into the
|
||||||
|
## given byte sequence
|
||||||
stream.extend(PeonBytecodeMarker.toBytes())
|
stream.extend(PeonBytecodeMarker.toBytes())
|
||||||
stream.add(byte(PEON_VERSION.major))
|
stream.add(byte(PEON_VERSION.major))
|
||||||
stream.add(byte(PEON_VERSION.minor))
|
stream.add(byte(PEON_VERSION.minor))
|
||||||
|
@ -77,25 +78,31 @@ proc writeHeaders(self: Serializer, stream: var seq[byte]) =
|
||||||
|
|
||||||
proc writeLineData(self: Serializer, stream: var seq[byte]) =
|
proc writeLineData(self: Serializer, stream: var seq[byte]) =
|
||||||
## Writes line information for debugging
|
## Writes line information for debugging
|
||||||
## bytecode instructions
|
## bytecode instructions to the given byte
|
||||||
|
## sequence
|
||||||
stream.extend(len(self.chunk.lines).toQuad())
|
stream.extend(len(self.chunk.lines).toQuad())
|
||||||
for b in self.chunk.lines:
|
for b in self.chunk.lines:
|
||||||
stream.extend(b.toTriple())
|
stream.extend(b.toTriple())
|
||||||
|
|
||||||
|
|
||||||
proc writeCFIData(self: Serializer, stream: var seq[byte]) =
|
proc writeFunctions(self: Serializer, stream: var seq[byte]) =
|
||||||
## Writes Call Frame Information for debugging
|
## Writes debug info about functions to the
|
||||||
## functions
|
## given byte sequence
|
||||||
stream.extend(len(self.chunk.functions).toQuad())
|
stream.extend(len(self.chunk.functions).toQuad())
|
||||||
stream.extend(self.chunk.functions)
|
stream.extend(self.chunk.functions)
|
||||||
|
|
||||||
|
|
||||||
proc writeConstants(self: Serializer, stream: var seq[byte]) =
|
proc writeConstants(self: Serializer, stream: var seq[byte]) =
|
||||||
## Writes the constants table in-place into the
|
## Writes the constants table in-place into the
|
||||||
## given stream
|
## byte sequence
|
||||||
stream.extend(self.chunk.consts.len().toQuad())
|
stream.extend(self.chunk.consts.len().toQuad())
|
||||||
for constant in self.chunk.consts:
|
stream.extend(self.chunk.consts)
|
||||||
stream.add(constant)
|
|
||||||
|
|
||||||
|
proc writeModules(self: Serializer, stream: var seq[byte]) =
|
||||||
|
## Writes module information to the given stream
|
||||||
|
stream.extend(self.chunk.modules.len().toQuad())
|
||||||
|
stream.extend(self.chunk.modules)
|
||||||
|
|
||||||
|
|
||||||
proc writeCode(self: Serializer, stream: var seq[byte]) =
|
proc writeCode(self: Serializer, stream: var seq[byte]) =
|
||||||
|
@ -106,7 +113,7 @@ proc writeCode(self: Serializer, stream: var seq[byte]) =
|
||||||
|
|
||||||
|
|
||||||
proc readHeaders(self: Serializer, stream: seq[byte], serialized: Serialized): int =
|
proc readHeaders(self: Serializer, stream: seq[byte], serialized: Serialized): int =
|
||||||
## Reads the bytecode headers from a given stream
|
## Reads the bytecode headers from a given sequence
|
||||||
## of bytes
|
## of bytes
|
||||||
var stream = stream
|
var stream = stream
|
||||||
if stream[0..<len(PeonBytecodeMarker)] != PeonBytecodeMarker.toBytes():
|
if stream[0..<len(PeonBytecodeMarker)] != PeonBytecodeMarker.toBytes():
|
||||||
|
@ -131,7 +138,6 @@ proc readHeaders(self: Serializer, stream: seq[byte], serialized: Serialized): i
|
||||||
result += 8
|
result += 8
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
proc readLineData(self: Serializer, stream: seq[byte]): int =
|
proc readLineData(self: Serializer, stream: seq[byte]): int =
|
||||||
## Reads line information from a stream
|
## Reads line information from a stream
|
||||||
## of bytes
|
## of bytes
|
||||||
|
@ -142,10 +148,11 @@ proc readLineData(self: Serializer, stream: seq[byte]): int =
|
||||||
self.chunk.lines.add(int([stream[0], stream[1], stream[2]].fromTriple()))
|
self.chunk.lines.add(int([stream[0], stream[1], stream[2]].fromTriple()))
|
||||||
result += 3
|
result += 3
|
||||||
stream = stream[3..^1]
|
stream = stream[3..^1]
|
||||||
|
doAssert len(self.chunk.lines) == int(size)
|
||||||
|
|
||||||
|
|
||||||
proc readCFIData(self: Serializer, stream: seq[byte]): int =
|
proc readFunctions(self: Serializer, stream: seq[byte]): int =
|
||||||
## Reads Call Frame Information from a stream
|
## Reads the function segment from a stream
|
||||||
## of bytes
|
## of bytes
|
||||||
let size = [stream[0], stream[1], stream[2], stream[3]].fromQuad()
|
let size = [stream[0], stream[1], stream[2], stream[3]].fromQuad()
|
||||||
result += 4
|
result += 4
|
||||||
|
@ -153,22 +160,34 @@ proc readCFIData(self: Serializer, stream: seq[byte]): int =
|
||||||
for i in countup(0, int(size) - 1):
|
for i in countup(0, int(size) - 1):
|
||||||
self.chunk.functions.add(stream[i])
|
self.chunk.functions.add(stream[i])
|
||||||
inc(result)
|
inc(result)
|
||||||
|
doAssert len(self.chunk.functions) == int(size)
|
||||||
|
|
||||||
|
|
||||||
proc readConstants(self: Serializer, stream: seq[byte]): int =
|
proc readConstants(self: Serializer, stream: seq[byte]): int =
|
||||||
## Reads the constant table from the given stream
|
## Reads the constant table from the given
|
||||||
## of bytes
|
## byte sequence
|
||||||
let size = [stream[0], stream[1], stream[2], stream[3]].fromQuad()
|
let size = [stream[0], stream[1], stream[2], stream[3]].fromQuad()
|
||||||
result += 4
|
result += 4
|
||||||
var stream = stream[4..^1]
|
var stream = stream[4..^1]
|
||||||
for i in countup(0, int(size) - 1):
|
for i in countup(0, int(size) - 1):
|
||||||
self.chunk.consts.add(stream[i])
|
self.chunk.consts.add(stream[i])
|
||||||
inc(result)
|
inc(result)
|
||||||
|
doAssert len(self.chunk.consts) == int(size)
|
||||||
|
|
||||||
|
|
||||||
|
proc readModules(self: Serializer, stream: seq[byte]): int =
|
||||||
|
## Reads module information
|
||||||
|
let size = [stream[0], stream[1], stream[2], stream[3]].fromQuad()
|
||||||
|
result += 4
|
||||||
|
var stream = stream[4..^1]
|
||||||
|
for i in countup(0, int(size) - 1):
|
||||||
|
self.chunk.modules.add(stream[i])
|
||||||
|
inc(result)
|
||||||
|
doAssert len(self.chunk.modules) == int(size)
|
||||||
|
|
||||||
|
|
||||||
proc readCode(self: Serializer, stream: seq[byte]): int =
|
proc readCode(self: Serializer, stream: seq[byte]): int =
|
||||||
## Reads the bytecode from a given stream and writes
|
## Reads the bytecode from a given byte sequence
|
||||||
## it into the given chunk
|
|
||||||
let size = [stream[0], stream[1], stream[2]].fromTriple()
|
let size = [stream[0], stream[1], stream[2]].fromTriple()
|
||||||
var stream = stream[3..^1]
|
var stream = stream[3..^1]
|
||||||
for i in countup(0, int(size) - 1):
|
for i in countup(0, int(size) - 1):
|
||||||
|
@ -178,13 +197,16 @@ proc readCode(self: Serializer, stream: seq[byte]): int =
|
||||||
|
|
||||||
|
|
||||||
proc dumpBytes*(self: Serializer, chunk: Chunk, filename: string): seq[byte] =
|
proc dumpBytes*(self: Serializer, chunk: Chunk, filename: string): seq[byte] =
|
||||||
## Dumps the given bytecode and file to a sequence of bytes and returns it.
|
## Dumps the given chunk to a sequence of bytes and returns it.
|
||||||
|
## The filename argument is for error reporting only, use dumpFile
|
||||||
|
## to dump bytecode to a file
|
||||||
self.filename = filename
|
self.filename = filename
|
||||||
self.chunk = chunk
|
self.chunk = chunk
|
||||||
self.writeHeaders(result)
|
self.writeHeaders(result)
|
||||||
self.writeLineData(result)
|
self.writeLineData(result)
|
||||||
self.writeCFIData(result)
|
self.writeFunctions(result)
|
||||||
self.writeConstants(result)
|
self.writeConstants(result)
|
||||||
|
self.writeModules(result)
|
||||||
self.writeCode(result)
|
self.writeCode(result)
|
||||||
|
|
||||||
|
|
||||||
|
@ -207,8 +229,9 @@ proc loadBytes*(self: Serializer, stream: seq[byte]): Serialized =
|
||||||
try:
|
try:
|
||||||
stream = stream[self.readHeaders(stream, result)..^1]
|
stream = stream[self.readHeaders(stream, result)..^1]
|
||||||
stream = stream[self.readLineData(stream)..^1]
|
stream = stream[self.readLineData(stream)..^1]
|
||||||
stream = stream[self.readCFIData(stream)..^1]
|
stream = stream[self.readFunctions(stream)..^1]
|
||||||
stream = stream[self.readConstants(stream)..^1]
|
stream = stream[self.readConstants(stream)..^1]
|
||||||
|
stream = stream[self.readModules(stream)..^1]
|
||||||
stream = stream[self.readCode(stream)..^1]
|
stream = stream[self.readCode(stream)..^1]
|
||||||
except IndexDefect:
|
except IndexDefect:
|
||||||
self.error("truncated bytecode stream")
|
self.error("truncated bytecode stream")
|
||||||
|
|
|
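The functions, constants and modules segments above all follow the same pattern: a 4-byte size prefix followed by that many raw bytes, with each reader returning how many bytes it consumed so the caller can slice the stream forward. A minimal sketch of that pattern follows. `writeSegment` and `readSegment` are hypothetical names, the big-endian size prefix is an assumption (the byte order of peon's `toQuad` is not shown), and the explicit `ValueError` checks stand in for the `IndexDefect` handling that `loadBytes` relies on.

```nim
# Sketch only: a generic length-prefixed segment, independent of peon's
# Serializer type. The prefix byte order is assumed to be big-endian.

proc writeSegment(stream: var seq[byte], payload: openArray[byte]) =
  ## Appends a 4-byte length prefix followed by the payload
  let size = payload.len
  for shift in [24, 16, 8, 0]:
    stream.add(byte((size shr shift) and 0xFF))
  stream.add(payload)

proc readSegment(stream: openArray[byte], payload: var seq[byte]): int =
  ## Reads one length-prefixed segment into payload and returns
  ## the number of bytes consumed (prefix included)
  if stream.len < 4:
    raise newException(ValueError, "truncated segment header")
  var size = 0
  for i in 0 ..< 4:
    size = (size shl 8) or int(stream[i])
  if stream.len < 4 + size:
    raise newException(ValueError, "truncated segment body")
  for i in 0 ..< size:
    payload.add(stream[4 + i])
  result = 4 + size

# Two segments written back to back, then read in the same order
var stream: seq[byte]
writeSegment(stream, [1'u8, 2, 3])
writeSegment(stream, [9'u8])

var first, second: seq[byte]
let used = readSegment(stream, first)
discard readSegment(stream[used .. ^1], second)
doAssert first == @[1'u8, 2, 3]
doAssert second == @[9'u8]
```

Returning the consumed byte count keeps the readers independent of one another, at the cost of requiring the segments to be read in exactly the order they were written.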
@ -246,6 +246,11 @@ proc runFile(f: string, fromString: bool = false, dump: bool = true, breakpoints
|
||||||
styledEcho fgGreen, "OK"
|
styledEcho fgGreen, "OK"
|
||||||
else:
|
else:
|
||||||
styledEcho fgRed, "Corrupted"
|
styledEcho fgRed, "Corrupted"
|
||||||
|
stdout.styledWrite(fgBlue, "\t- Modules segment: ")
|
||||||
|
if serialized.chunk.modules == compiled.modules:
|
||||||
|
styledEcho fgGreen, "OK"
|
||||||
|
else:
|
||||||
|
styledEcho fgRed, "Corrupted"
|
||||||
if run:
|
if run:
|
||||||
case backend:
|
case backend:
|
||||||
of PeonBackend.Bytecode:
|
of PeonBackend.Bytecode:
|
||||||
|
|
|
@ -1,7 +1,7 @@
|
||||||
import std;
|
import std;
|
||||||
|
|
||||||
|
|
||||||
const max = 50000;
|
const max = 500000;
|
||||||
|
|
||||||
var x = max;
|
var x = max;
|
||||||
var s = "just a test";
|
var s = "just a test";
|
||||||
|
|