Initial commit from JAPL with some changes

Mattia Giambirtone 2022-04-04 12:29:23 +02:00
parent a20cfc532b
commit 76812a2091
24 changed files with 6001 additions and 53 deletions

222
LICENSE

@ -1,85 +1,201 @@
The Artistic License 2.0 (Copyright (c) 2000-2006, The Perl Foundation) is removed and replaced by the Apache License, Version 2.0, January 2004 (http://www.apache.org/licenses/). The new LICENSE carries the standard Apache 2.0 terms and conditions plus the appendix boilerplate notice; the same notice is reproduced at the top of every source file added by this commit.


@ -0,0 +1,196 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Implementation of a custom list data type for JAPL objects (also used internally by the VM)
import iterable
import ../../memory/allocator
import baseObject
import strformat
type
ArrayList*[T] = object of Iterable
## Implementation of a simple dynamic
## array with amortized O(1) append complexity
## and O(1) complexity when popping/deleting
## the last element
container: ptr UncheckedArray[T]
ArrayListIterator*[T] = object of Iterator
list: ArrayList[T]
current: int
proc newArrayList*[T]: ptr ArrayList[T] =
## Allocates a new, empty array list
result = allocateObj(ArrayList[T], ObjectType.List)
result.capacity = 0
result.container = nil
result.length = 0
proc append*[T](self: ptr ArrayList[T], elem: T) =
## Appends an object to the end of the list
## in amortized constant time (~O(1))
if self.capacity <= self.length:
self.capacity = growCapacity(self.capacity)
self.container = resizeArray(T, self.container, self.length, self.capacity)
self.container[self.length] = elem
self.length += 1
proc pop*[T](self: ptr ArrayList[T], idx: int = -1): T =
## Pops an item from the list. By default, the last
## element is popped, in which case the operation's
## time complexity is O(1). When an arbitrary element
## is popped, the complexity rises to O(k) where k
## is the number of elements that had to be shifted
## by 1 to avoid empty slots
var idx = idx
if self.length == 0:
raise newException(IndexDefect, "pop from empty ArrayList")
if idx == -1:
idx = self.length - 1
if idx notin 0..self.length - 1:
raise newException(IndexDefect, &"ArrayList index out of bounds: {idx} notin 0..{self.length - 1}")
result = self.container[idx]
if idx != self.length - 1:
for i in countup(idx, self.length - 2):
self.container[i] = self.container[i + 1]
self.capacity -= 1
self.length -= 1
proc `[]`*[T](self: ptr ArrayList[T], idx: int): T =
## Retrieves an item from the list, in constant
## time
if self.length == 0:
raise newException(IndexDefect, &"ArrayList index out of bounds: {idx} notin 0..{self.length - 1}")
if idx notin 0..self.length - 1:
raise newException(IndexDefect, &"ArrayList index out of bounds: {idx} notin 0..{self.length - 1}")
result = self.container[idx]
proc `[]`*[T](self: ptr ArrayList[T], slice: HSlice[int, int]): ptr ArrayList[T] =
## Retrieves a subset of the list, in O(k) time where k is the size
## of the slice
if self.length == 0:
raise newException(IndexDefect, "ArrayList index out of bounds")
if slice.a notin 0..self.length - 1 or slice.b notin 0..self.length:
raise newException(IndexDefect, "ArrayList index out of bounds")
result = newArrayList[T]()
for i in countup(slice.a, slice.b - 1):
result.append(self.container[i])
proc `[]=`*[T](self: ptr ArrayList[T], idx: int, obj: T) =
## Assigns an object to the given index, in constant
## time
if self.length == 0:
raise newException(IndexDefect, "ArrayList is empty")
if idx notin 0..self.length - 1:
raise newException(IndexDefect, "ArrayList index out of bounds")
self.container[idx] = obj
proc delete*[T](self: ptr ArrayList[T], idx: int) =
## Deletes an object from the given index.
## This method shares the time complexity
## of self.pop()
if self.length == 0:
raise newException(IndexDefect, "delete from empty ArrayList")
if idx notin 0..self.length - 1:
raise newException(IndexDefect, &"ArrayList index out of bounds: {idx} notin 0..{self.length - 1}")
discard self.pop(idx)
proc contains*[T](self: ptr ArrayList[T], elem: T): bool =
## Returns true if the given object is present
## in the list, false otherwise. O(n) complexity
if self.length > 0:
for i in 0..self.length - 1:
if self[i] == elem:
return true
return false
proc high*[T](self: ptr ArrayList[T]): int =
## Returns the index of the last
## element in the list, in constant time
if self.length == 0:
raise newException(IndexDefect, "ArrayList is empty")
result = self.length - 1
proc len*[T](self: ptr ArrayList[T]): int =
## Returns the length of the list
## in constant time
result = self.length
iterator pairs*[T](self: ptr ArrayList[T]): tuple[key: int, val: T] =
## Implements pairwise iteration (similar to python's enumerate)
for i in countup(0, self.length - 1):
yield (key: i, val: self[i])
iterator items*[T](self: ptr ArrayList[T]): T =
## Implements iteration
for i in countup(0, self.length - 1):
yield self[i]
proc reversed*[T](self: ptr ArrayList[T], first: int = -1, last: int = 0): ptr ArrayList[T] =
## Returns a reversed version of the given list, from first to last.
## First defaults to -1 (the end of the list) and last defaults to 0 (the
## beginning of the list)
var first = first
if first == -1:
first = self.length - 1
result = newArrayList[T]()
for i in countdown(first, last):
result.append(self[i])
proc extend*[T](self: ptr ArrayList[T], other: seq[T]) =
## Iteratively calls self.append() with the elements
## from a nim sequence
for elem in other:
self.append(elem)
proc extend*[T](self: ptr ArrayList[T], other: ptr ArrayList[T]) =
## Iteratively calls self.append() with the elements
## from another ArrayList
for elem in other:
self.append(elem)
proc `$`*[T](self: ptr ArrayList[T]): string =
## Returns a string representation
## of self
result = "["
if self.length > 0:
for i in 0..self.length - 1:
result = result & $self.container[i]
if i < self.length - 1:
result = result & ", "
result = result & "]"
proc getIter*[T](self: ptr ArrayList[T]): Iterator =
## Returns the iterator object of the
## arraylist
result = allocate(ArrayListIterator, ) # TODO
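As a quick usage sketch of the API above (the arrayList import name is an assumption, since the diff does not show this file's name; the element type is a plain Nim int, so no JAPL objects are involved):

import arrayList   # assumed module name for the file above

var list = newArrayList[int]()
for i in 1..5:
    list.append(i * 10)      # amortized O(1) append
echo list                    # -> [10, 20, 30, 40, 50]
echo list.pop()              # -> 50, removed in O(1)
echo list[0..2]              # end-exclusive slice copy -> [10, 20]
for i, item in list:
    echo i, " -> ", item     # pairwise iteration, like Python's enumerate()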


@ -0,0 +1,84 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## The base JAPL object
import ../../memory/allocator
type
ObjectType* {.pure.} = enum
## All the possible object types
String, Exception, Function,
Class, Module, BaseObject,
Native, Integer, Float,
Bool, NotANumber, Infinity,
Nil, List, Dict, Set, Tuple
Obj* = object of RootObj
## The base object for all
## JAPL types. Every object
## in JAPL implicitly inherits
## from this base type and extends
## its functionality
kind*: ObjectType
hashValue*: uint64
## Object constructors and allocators
proc allocateObject*(size: int, kind: ObjectType): ptr Obj =
## Wrapper around reallocate() to create a new generic JAPL object
result = cast[ptr Obj](reallocate(nil, 0, size))
result.kind = kind
template allocateObj*(kind: untyped, objType: ObjectType): untyped =
## Wrapper around allocateObject to cast a generic object
## to a more specific type
cast[ptr kind](allocateObject(sizeof kind, objType))
proc newObj*: ptr Obj =
## Allocates a generic JAPL object
result = allocateObj(Obj, ObjectType.BaseObject)
result.hashValue = 0x123FFFF
## Default object methods implementations
# In JAPL code, this method will be called
# stringify()
proc `$`*(self: ptr Obj): string = "<object>"
proc stringify*(self: ptr Obj): string = $self
proc hash*(self: ptr Obj): int64 = 0x123FFAA # Constant hash value
# I could've used mul, sub and div, but "div" is a reserved
# keyword and using `div` looks ugly. So to keep everything
# consistent I just made all names long
proc multiply*(self, other: ptr Obj): ptr Obj = nil
proc sum*(self, other: ptr Obj): ptr Obj = nil
proc divide*(self, other: ptr Obj): ptr Obj = nil
proc subtract*(self, other: ptr Obj): ptr Obj = nil
# Returns 0 if self == other, a negative number if self < other
# and a positive number if self > other. This is a convenience
# method to implement all basic comparison operators in one
# method
proc compare*(self, other: ptr Obj): ptr Obj = nil
# Specific methods for each comparison
proc equalTo*(self, other: ptr Obj): ptr Obj = nil
proc greaterThan*(self, other: ptr Obj): ptr Obj = nil
proc lessThan*(self, other: ptr Obj): ptr Obj = nil
proc greaterOrEqual*(self, other: ptr Obj): ptr Obj = nil
proc lessOrEqual*(self, other: ptr Obj): ptr Obj = nil
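To make the allocation pattern concrete, here is a minimal sketch of how a concrete type builds on Obj and the allocateObj template, mirroring the Integer and Float modules further down in this commit (the Boolean type shown here is hypothetical and not part of the diff):

import baseObject

type Boolean* = object of Obj   # hypothetical example type
    value: bool

proc newBoolean*(value: bool): ptr Boolean =
    ## Allocates a Boolean on the manually managed heap;
    ## allocateObj fills in the kind field for us
    result = allocateObj(Boolean, ObjectType.Bool)
    result.value = value

proc `$`*(self: ptr Boolean): string = $self.value
proc hash*(self: ptr Boolean): int64 = (if self.value: 1 else: 0)

echo newBoolean(true).kind   # -> Bool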


@ -0,0 +1,48 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## Type dispatching module
import baseObject
import intObject
import floatObject
proc dispatch*(obj: ptr Obj, p: proc (self: ptr Obj): ptr Obj): ptr Obj =
## Dispatches a given one-argument procedure according to
## the provided object's runtime type and returns its result
case obj.kind:
of BaseObject:
result = p(obj)
of ObjectType.Float:
result = p(cast[ptr Float](obj))
of ObjectType.Integer:
result = p(cast[ptr Integer](obj))
else:
discard
proc dispatch*(a, b: ptr Obj, p: proc (self: ptr Obj, other: ptr Obj): ptr Obj): ptr Obj =
## Dispatches a given two-argument procedure according to
## the provided object's runtime type and returns its result
case a.kind:
of BaseObject:
result = p(a, b)
of ObjectType.Float:
# Further type casting for b is expected to occur later
# in the given procedure
result = p(cast[ptr Float](a), b)
of ObjectType.Integer:
result = p(cast[ptr Integer](a), b)
else:
discard
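A small sketch of how a caller is expected to use this module (the typeDispatch import name is an assumption; with only the stub arithmetic procedures from the base object module, the dispatched call simply returns nil):

import baseObject, intObject, typeDispatch   # typeDispatch name is assumed

let a = newInteger(2)
let b = newInteger(3)
# Select an implementation of sum() based on a's runtime kind.
# The stubs shown earlier return nil; a concrete Integer
# implementation would return a new Integer holding 5.
let res = dispatch(a, b, sum)
echo res.isNil   # -> true with the stubs in this commit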


@ -0,0 +1,49 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## Implementation of floating point types
import baseObject
import lenientops
type Float* = object of Obj
value: float64
proc newFloat*(value: float): ptr Float =
## Initializes a new JAPL
## float object from
## a machine native float
result = allocateObj(Float, ObjectType.Float)
result.value = value
proc toNativeFloat*(self: ptr Float): float =
## Returns the float's machine
## native underlying value
result = self.value
proc `$`*(self: ptr Float): string = $self.value
proc hash*(self: ptr Float): int64 =
## Implements hashing
## for the given float
if self.value == float(int(self.value)):
result = int(self.value)
else:
result = 2166136261 xor int(self.value) # TODO: Improve this
result *= 16777619
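The special case for whole-valued floats presumably exists so that, for example, 2.0 hashes to the same value as the Integer 2 defined later in this commit, which matters once these objects are used as HashMap keys. A tiny sketch:

import floatObject, intObject

echo newFloat(2.0).hash()   # -> 2
echo newInteger(2).hash()   # -> 2, so 2.0 and 2 land in the same bucket
echo newFloat(2.5).hash()   # falls back to the mix built from the FNV-1 constants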


@ -0,0 +1,207 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ../../memory/allocator
import ../../config
import baseObject
import iterable
type
Entry = object
## Low-level object to store key/value pairs.
## Using an extra value for marking the entry as
## a tombstone instead of something like detecting
## tombstones as entries with null keys but full values
## may seem wasteful. The thing is, though, that since
## we want to implement sets on top of this hashmap and
## the implementation of a set is *literally* a dictionary
## with empty values and keys as the elements, this would
## confuse our findEntry method and would force us to override
## it to account for a different behavior.
## Using a third field takes up more space, but saves us
## from the hassle of rewriting code
key: ptr Obj
value: ptr Obj
tombstone: bool
HashMap* = object of Iterable
## An associative array with O(1) lookup time,
## similar to nim's Table type, but using raw
## memory to be more compatible with JAPL's runtime
## memory management
entries: ptr UncheckedArray[ptr Entry]
# This attribute counts *only* non-deleted entries
actual_length: int
proc newHashMap*: ptr HashMap =
## Initializes a new, empty hashmap
result = allocateObj(HashMap, ObjectType.Dict)
result.actual_length = 0
result.entries = nil
result.capacity = 0
result.length = 0
proc freeHashMap*(self: ptr HashMap) =
## Frees the memory associated with the hashmap
discard freeArray(UncheckedArray[ptr Entry], self.entries, self.capacity)
self.length = 0
self.actual_length = 0
self.capacity = 0
self.entries = nil
proc findEntry(self: ptr UncheckedArray[ptr Entry], key: ptr Obj, capacity: int): ptr Entry =
## Low-level method used to find entries in the underlying
## array, returns a pointer to an entry
var capacity = uint64(capacity)
var idx = uint64(key.hash()) mod capacity
while true:
result = self[idx]
if system.`==`(result.key, nil):
# We found an empty bucket
break
elif result.tombstone:
# We found a previously deleted
# entry. In this case, we need
# to make sure the tombstone
# will get overwritten when the
# user wants to add a new value
# that would replace it, BUT also
# for it to not stop our linear
# probe sequence. Hence, if the
# key of the tombstone is the same
# as the one we're looking for,
# we break out of the loop, otherwise
# we keep searching
if result.key == key:
break
elif result.key == key:
# We were looking for a specific key and
# we found it, so we also bail out
break
# If none of these conditions match, we have a collision!
# This means we can just move on to the next slot in our probe
# sequence until we find an empty slot. The way our resizing
# mechanism works makes the empty slot invariant easy to
# maintain since we increase the underlying array's size
# before we are actually full
idx = (idx + 1) mod capacity
proc adjustCapacity(self: ptr HashMap) =
var newCapacity = growCapacity(self.capacity)
var entries = allocate(UncheckedArray[ptr Entry], Entry, newCapacity)
var oldEntry: ptr Entry
var newEntry: ptr Entry
self.length = 0
for x in countup(0, newCapacity - 1):
entries[x] = allocate(Entry, Entry, 1)
entries[x].tombstone = false
entries[x].key = nil
entries[x].value = nil
for x in countup(0, self.capacity - 1):
oldEntry = self.entries[x]
if not system.`==`(oldEntry.key, nil):
newEntry = entries.findEntry(oldEntry.key, newCapacity)
newEntry.key = oldEntry.key
newEntry.value = oldEntry.value
self.length += 1
discard freeArray(UncheckedArray[ptr Entry], self.entries, self.capacity)
self.entries = entries
self.capacity = newCapacity
proc setEntry(self: ptr HashMap, key: ptr Obj, value: ptr Obj): bool =
if float64(self.length + 1) >= float64(self.capacity) * MAP_LOAD_FACTOR:
self.adjustCapacity()
var entry = findEntry(self.entries, key, self.capacity)
result = system.`==`(entry.key, nil)
if result:
self.actual_length += 1
self.length += 1
entry.key = key
entry.value = value
entry.tombstone = false
proc `[]`*(self: ptr HashMap, key: ptr Obj): ptr Obj =
var entry = findEntry(self.entries, key, self.capacity)
if system.`==`(entry.key, nil) or entry.tombstone:
raise newException(KeyError, "Key not found: " & $key)
result = entry.value
proc `[]=`*(self: ptr HashMap, key: ptr Obj, value: ptr Obj) =
discard self.setEntry(key, value)
proc len*(self: ptr HashMap): int =
result = self.actual_length
proc del*(self: ptr HashMap, key: ptr Obj) =
if self.len() == 0:
raise newException(KeyError, "delete from empty hashmap")
var entry = findEntry(self.entries, key, self.capacity)
if not system.`==`(entry.key, nil):
self.actual_length -= 1
entry.tombstone = true
else:
raise newException(KeyError, "Key not found: " & $key)
proc contains*(self: ptr HashMap, key: ptr Obj): bool =
let entry = findEntry(self.entries, key, self.capacity)
if not system.`==`(entry.key, nil) and not entry.tombstone:
result = true
else:
result = false
iterator keys*(self: ptr HashMap): ptr Obj =
var entry: ptr Entry
for i in countup(0, self.capacity - 1):
entry = self.entries[i]
if not system.`==`(entry.key, nil) and not entry.tombstone:
yield entry.key
iterator values*(self: ptr HashMap): ptr Obj =
for key in self.keys():
yield self[key]
iterator pairs*(self: ptr HashMap): tuple[key: ptr Obj, val: ptr Obj] =
for key in self.keys():
yield (key: key, val: self[key])
iterator items*(self: ptr HashMap): ptr Obj =
for k in self.keys():
yield k
proc `$`*(self: ptr HashMap): string =
var i = 0
result &= "{"
for key, value in self.pairs():
result &= $key & ": " & $value
if i < self.len() - 1:
result &= ", "
i += 1
result &= "}"


@ -0,0 +1,40 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## Implementation of integer types
import baseObject
type Integer* = object of Obj
value: int64
proc newInteger*(value: int64): ptr Integer =
## Initializes a new JAPL
## integer object from
## a machine native integer
result = allocateObj(Integer, ObjectType.Integer)
result.value = value
proc toNativeInteger*(self: ptr Integer): int64 =
## Returns the integer's machine
## native underlying value
result = self.value
proc `$`*(self: ptr Integer): string = $self.value
proc hash*(self: ptr Integer): int64 = self.value


@ -0,0 +1,45 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Implementation of iterable types and iterators in JAPL
import baseObject
type
Iterable* = object of Obj
## Defines the standard interface
## for iterable types in JAPL
length*: int
capacity*: int
Iterator* = object of Iterable
## This object drives iteration
## for every iterable type in JAPL except
## generators
iterable*: ptr Obj
iterCount*: int
proc getIter*(self: Iterable): ptr Iterator =
## Returns the iterator object of an
## iterable, which drives foreach
## loops
return nil
proc next*(self: Iterator): ptr Obj =
## Returns the next element from
## the iterator or nil if the
## iterator has been consumed
return nil
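Both procedures are stubs at this stage, so the sketch below only illustrates the drive loop a foreach construct could build on this interface once concrete getIter/next overloads exist; the forEach helper is hypothetical and not part of the commit:

import baseObject, iterable

proc forEach(obj: Iterable, body: proc (item: ptr Obj)) =
    ## Hypothetical driver: ask the iterable for its iterator,
    ## then call next() until it signals exhaustion with nil
    let iter = obj.getIter()
    if iter.isNil:
        return
    var item = iter[].next()
    while not item.isNil:
        body(item)
        item = iter[].next()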


@ -0,0 +1,15 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# JAPL string implementations

20
src/backend/vm.nim Normal file

@ -0,0 +1,20 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## The JAPL runtime environment
type
VM* = ref object
stack:

61
src/config.nim Normal file

@ -0,0 +1,61 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import strformat
const BYTECODE_MARKER* = "JAPL_BYTECODE"
const MAP_LOAD_FACTOR* = 0.75 # Load factor for builtin hashmaps
when MAP_LOAD_FACTOR >= 1.0:
{.fatal: "Hashmap load factor must be < 1".}
const HEAP_GROW_FACTOR* = 2 # How much extra memory to allocate for dynamic arrays and garbage collection when resizing
when HEAP_GROW_FACTOR <= 1:
{.fatal: "Heap growth factor must be > 1".}
const MAX_STACK_FRAMES* = 800 # The maximum number of stack frames at any one time. Acts as a recursion limiter (1 frame = 1 call)
when MAX_STACK_FRAMES <= 0:
{.fatal: "The frame limit must be > 0".}
const JAPL_VERSION* = (major: 0, minor: 4, patch: 0)
const JAPL_RELEASE* = "alpha"
const JAPL_COMMIT_HASH* = "ba9c8b4e5664c0670eb8925d65b307e397d6ed82"
when len(JAPL_COMMIT_HASH) != 40:
{.fatal: "The git commit hash must be exactly 40 characters long".}
const JAPL_BRANCH* = "master"
when len(JAPL_BRANCH) >= 255:
{.fatal: "The git branch name's length must be less than or equal to 255 characters".}
const DEBUG_TRACE_VM* = false # Traces VM execution
const SKIP_STDLIB_INIT* = false # Skips stdlib initialization (can be imported manually)
const DEBUG_TRACE_GC* = false # Traces the garbage collector (TODO)
const DEBUG_TRACE_ALLOCATION* = false # Traces memory allocation/deallocation
const DEBUG_TRACE_COMPILER* = false # Traces the compiler
const JAPL_VERSION_STRING* = &"JAPL {JAPL_VERSION.major}.{JAPL_VERSION.minor}.{JAPL_VERSION.patch} {JAPL_RELEASE} ({JAPL_BRANCH}, {CompileDate}, {CompileTime}, {JAPL_COMMIT_HASH[0..8]}) [Nim {NimVersion}] on {hostOS} ({hostCPU})"
const HELP_MESSAGE* = """The JAPL programming language, Copyright (C) 2022 Mattia Giambirtone & All Contributors
This program is free software, see the license distributed with this program or check
http://www.apache.org/licenses/LICENSE-2.0 for more info.
Basic usage
-----------
$ jpl Opens an interactive session (REPL)
$ jpl file.jpl Runs the given JAPL source file
Command-line options
--------------------
-h, --help Shows this help text and exits
-v, --version Prints the JAPL version number and exits
-s, --string Executes the passed string as if it was a file
-i, --interactive Enables interactive mode, which opens a REPL session after execution of a file or source string
-c, --nocache Disables dumping the result of bytecode compilation to files for caching
-d, --cache-delay Configures the bytecode cache invalidation threshold, in minutes (defaults to 60)
"""

1048
src/frontend/compiler.nim Normal file

File diff suppressed because it is too large

574
src/frontend/lexer.nim Normal file

@ -0,0 +1,574 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## A simple and modular tokenizer implementation with arbitrary lookahead
import strutils
import parseutils
import strformat
import tables
import meta/token
import meta/errors
export token # Makes Token available when importing the lexer module
export errors
type SymbolTable = object
## A table of symbols used
## to lex a source file
keywords: TableRef[string, Token]
operators: TableRef[string, Token]
# Table of all single-character tokens
var tokens = to_table({
'(': LeftParen, ')': RightParen,
'{': LeftBrace, '}': RightBrace,
'.': Dot, ',': Comma, '-': Minus,
'+': Plus, '*': Asterisk,
'>': GreaterThan, '<': LessThan, '=': Equal,
'~': Tilde, '/': Slash, '%': Percentage,
'[': LeftBracket, ']': RightBracket,
':': Colon, '^': Caret, '&': Ampersand,
'|': Pipe, ';': Semicolon})
# Table of all double-character tokens
const double = to_table({"**": DoubleAsterisk,
">>": RightShift,
"<<": LeftShift,
"==": DoubleEqual,
"!=": NotEqual,
">=": GreaterOrEqual,
"<=": LessOrEqual,
"//": FloorDiv,
"+=": InplaceAdd,
"-=": InplaceSub,
"/=": InplaceDiv,
"*=": InplaceMul,
"^=": InplaceXor,
"&=": InplaceAnd,
"|=": InplaceOr,
"%=": InplaceMod,
})
# Table of all triple-character tokens
const triple = to_table({"//=": InplaceFloorDiv,
"**=": InplacePow,
">>=": InplaceRightShift,
"<<=": InplaceLeftShift
})
# Constant table storing all the reserved keywords (which are parsed as identifiers)
const keywords = to_table({
"fun": Fun, "raise": Raise,
"if": If, "else": Else,
"for": For, "while": While,
"var": Var, "nil": Nil,
"true": True, "false": False,
"return": Return, "break": Break,
"continue": Continue, "inf": Infinity,
"nan": NotANumber, "is": Is,
"lambda": Lambda, "class": Class,
"async": Async, "import": Import,
"isnot": IsNot, "from": From,
"const": Const, "not": LogicalNot,
"assert": Assert, "or": LogicalOr,
"and": LogicalAnd, "del": Del,
"async": Async, "await": Await,
"foreach": Foreach, "yield": Yield,
"private": Private, "public": Public,
"static": Static, "dynamic": Dynamic,
"as": As, "of": Of, "defer": Defer,
"except": Except, "finally": Finally,
"try": Try
})
type
Lexer* = ref object
## A lexer object
source: string
tokens: seq[Token]
line: int
start: int
current: int
file: string
lines: seq[tuple[start, stop: int]]
lastLine: int
# Simple public getters
proc getStart*(self: Lexer): int = self.start
proc getCurrent*(self: Lexer): int = self.current
proc getLine*(self: Lexer): int = self.line
proc getSource*(self: Lexer): string = self.source
proc getRelPos*(self: Lexer, line: int): tuple[start, stop: int] = (if line > 1: self.lines[line - 2] else: (start: 0, stop: self.current))
proc initLexer*(self: Lexer = nil): Lexer =
## Initializes the lexer or resets
## the state of an existing one
new(result)
if self != nil:
result = self
result.source = ""
result.tokens = @[]
result.line = 1
result.start = 0
result.current = 0
result.file = ""
result.lines = @[]
result.lastLine = 0
proc done(self: Lexer): bool =
## Returns true if we reached EOF
result = self.current >= self.source.len
proc incLine(self: Lexer) =
## Increments the lexer's line
## and updates internal line
## metadata
self.lines.add((start: self.lastLine, stop: self.current))
self.line += 1
self.lastLine = self.current
proc step(self: Lexer, n: int = 1): string =
## Steps n characters forward in the
## source file (default = 1). A null
## terminator is returned if the lexer
## is at EOF. The consumed characters
## are returned as a string
if self.done():
return "\0"
self.current = self.current + n
result = self.source[self.current - n..<self.current]
proc peek(self: Lexer, distance: int = 0): string =
## Returns the character in the source file at
## the given distance, without consuming it.
## The character is converted to a string of
## length one for compatibility with the rest
## of the lexer.
## A null terminator is returned if the lexer
## is at EOF. The distance parameter may be
## negative to retrieve previously consumed
## tokens, while the default distance is 0
## (retrieves the next token to be consumed).
## If the given distance goes beyond EOF, a
## null terminator is returned
if self.done() or self.current + distance > self.source.high():
result = "\0"
else:
# hack to "convert" a char to a string
result = &"{self.source[self.current + distance]}"
proc peek(self: Lexer, distance: int = 0, length: int = 1): string =
## Behaves like self.peek(), but
## can peek more than one character,
## starting from the given distance.
## A string of exactly length characters
## is returned. If the length of the
## desired string goes beyond EOF,
## the resulting string is padded
## with null terminators
var i = distance
while i < distance + length:
result.add(self.peek(i))
inc(i)
proc error(self: Lexer, message: string) =
## Raises a lexing error with a formatted
## error message
raise newException(LexingError, &"A fatal error occurred while parsing '{self.file}', line {self.line} at '{self.peek()}' -> {message}")
proc check(self: Lexer, what: string, distance: int = 0): bool =
## Behaves like match, without consuming the
## token. False is returned if we're at EOF
## regardless of what the token to check is.
## The distance is passed directly to self.peek()
if self.done():
return false
return self.peek(distance) == what
proc check(self: Lexer, what: string): bool =
## Calls self.check() in a loop with
## each character from the given source
## string. Useful to check multi-character
## strings in one go
for i, chr in what:
# Why "i" you ask? Well, since check
# does not consume the tokens it checks
# against we need some way of keeping
# track where we are in the string the
# caller gave us, otherwise this will
# not behave as expected
if not self.check(&"{chr}", i):
return false
return true
proc check(self: Lexer, what: openarray[string]): bool =
## Calls self.check() in a loop with
## each character from the given seq of
## char and returns at the first match.
## Useful to check multiple tokens in a situation
## where only one of them may match at one time
for s in what:
if self.check(s):
return true
return false
proc match(self: Lexer, what: char): bool =
## Returns true if the next character matches
## the given character, and consumes it.
## Otherwise, false is returned
if self.done():
self.error("unexpected EOF")
return false
elif not self.check(what):
self.error(&"expecting '{what}', got '{self.peek()}' instead")
return false
self.current += 1
return true
proc match(self: Lexer, what: string): bool =
## Calls self.match() in a loop with
## each character from the given source
## string. Useful to match multi-character
## strings in one go
for chr in what:
if not self.match(chr):
return false
return true
proc createToken(self: Lexer, tokenType: TokenType) =
## Creates a token object and adds it to the token
## list
var tok: Token = new(Token)
tok.kind = tokenType
tok.lexeme = self.source[self.start..<self.current]
tok.line = self.line
tok.pos = (start: self.start, stop: self.current)
self.tokens.add(tok)
proc parseEscape(self: Lexer) =
# Boring escape sequence parsing. For more info check out
# https://en.wikipedia.org/wiki/Escape_sequences_in_C.
# As of now, \u and \U are not supported, but they'll
# likely be soon. Another notable limitation is that
# \xhhh and \nnn are limited to the size of a char
# (i.e. uint8, or 256 values)
case self.peek():
of 'a':
self.source[self.current] = cast[char](0x07)
of 'b':
self.source[self.current] = cast[char](0x08)
of 'e':
self.source[self.current] = cast[char](0x1B)
of 'f':
self.source[self.current] = cast[char](0x0C)
of 'n':
when defined(windows):
# We natively convert LF to CRLF on Windows, and
# gotta thank Microsoft for the extra boilerplate!
self.source[self.current] = cast[char](0x0D)
self.source.insert("\x0A", self.current + 1)
when defined(darwin):
# Thanks apple, lol
self.source[self.current] = cast[char](0x0A)
when defined(linux):
self.source[self.current] = cast[char](0x0A)
of 'r':
self.source[self.current] = cast[char](0x0D)
of 't':
self.source[self.current] = cast[char](0x09)
of 'v':
self.source[self.current] = cast[char](0x0B)
of '"':
self.source[self.current] = '"'
of '\'':
self.source[self.current] = '\''
of '\\':
self.source[self.current] = cast[char](0x5C)
of '0'..'9':
var code = ""
var value = 0
var i = self.current
while i < self.source.high() and (let c = self.source[
i].toLowerAscii(); c in '0'..'7') and len(code) < 3:
code &= self.source[i]
i += 1
assert parseOct(code, value) == code.len()
if value > uint8.high().int:
self.error("escape sequence value too large (> 255)")
self.source[self.current] = cast[char](value)
of 'u', 'U':
self.error("unicode escape sequences are not supported (yet)")
of 'x':
var code = ""
var value = 0
var i = self.current
while i < self.source.high() and (let c = self.source[
i].toLowerAscii(); c in 'a'..'f' or c in '0'..'9'):
code &= self.source[i]
i += 1
assert parseHex(code, value) == code.len()
if value > uint8.high().int:
self.error("escape sequence value too large (> 255)")
self.source[self.current] = cast[char](value)
else:
self.error(&"invalid escape sequence '\\{self.peek()}'")
proc parseString(self: Lexer, delimiter: char, mode: string = "single") =
## Parses string literals. They can be expressed using matching pairs
## of either single or double quotes. Most C-style escape sequences are
## supported, moreover, a specific prefix may be prepended
## to the string to instruct the lexer on how to parse it:
## - b -> declares a byte string, where each character is
## interpreted as an integer instead of a character
## - r -> declares a raw string literal, where escape sequences
## are not parsed and stay as-is
## - f -> declares a format string, where variables may be
## interpolated using curly braces like f"Hello, {name}!".
## Braces may be escaped using a pair of them, so to represent
## a literal "{" in an f-string, one would use {{ instead
## Multi-line strings can be declared using matching triplets of
## either single or double quotes. They can span across multiple
## lines and escape sequences in them are not parsed, like in raw
## strings, so a multi-line string prefixed with the "r" modifier
## is redundant, although multi-line byte/format strings are supported
while not self.check(delimiter) and not self.done():
if self.check('\n'):
if mode == "multi":
self.incLine()
else:
self.error("unexpected EOL while parsing string literal")
if mode in ["raw", "multi"]:
discard self.step()
if self.check('\\'):
# This madness here serves to get rid of the slash, since \x is mapped
# to a one-byte sequence but the string '\x' actually 2 bytes (or more,
# depending on the specific escape sequence)
self.source = self.source[0..<self.current] & self.source[
self.current + 1..^1]
self.parseEscape()
if mode == "format" and self.check('{'):
discard self.step()
if self.check('{'):
self.source = self.source[0..<self.current] & self.source[
self.current + 1..^1]
continue
while not self.check(['}', '"']):
discard self.step()
if self.check('"'):
self.error("unclosed '{' in format string")
elif mode == "format" and self.check('}'):
if not self.check('}', 1):
self.error("unmatched '}' in format string")
else:
self.source = self.source[0..<self.current] & self.source[
self.current + 1..^1]
discard self.step()
if mode == "multi":
if not self.match(delimiter.repeat(3)):
self.error("unexpected EOL while parsing multi-line string literal")
if self.done():
self.error("unexpected EOF while parsing string literal")
return
else:
discard self.step()
self.createToken(String)
proc parseBinary(self: Lexer) =
## Parses binary numbers
while self.peek().isDigit():
if not self.check(['0', '1']):
self.error(&"invalid digit '{self.peek()}' in binary literal")
discard self.step()
self.createToken(Binary)
# To make our life easier, we pad the binary number in here already
while (self.tokens[^1].lexeme.len() - 2) mod 8 != 0:
self.tokens[^1].lexeme = "0b" & "0" & self.tokens[^1].lexeme[2..^1]
proc parseOctal(self: Lexer) =
## Parses octal numbers
while self.peek().isDigit():
if self.peek() notin '0'..'7':
self.error(&"invalid digit '{self.peek()}' in octal literal")
discard self.step()
self.createToken(Octal)
proc parseHex(self: Lexer) =
## Parses hexadecimal numbers
while self.peek().isAlphaNumeric():
if not self.peek().isDigit() and self.peek().toLowerAscii() notin 'a'..'f':
self.error(&"invalid hexadecimal literal")
discard self.step()
self.createToken(Hex)
proc parseNumber(self: Lexer) =
## Parses numeric literals, which encompass
## integers and floats composed of arabic digits.
## Floats also support scientific notation
## (e.g. 3e14), while the fractional part
## must be separated from the integer one
## using a dot (which acts as the decimal separator).
## Literals such as 32.5e3 are also supported.
## The "e" for the scientific notation of floats
## is case-insensitive. Binary number literals are
## expressed using the prefix 0b, hexadecimal
## numbers with the prefix 0x and octal numbers
## with the prefix 0o
case self.peek():
of 'b':
discard self.step()
self.parseBinary()
of 'x':
discard self.step()
self.parseHex()
of 'o':
discard self.step()
self.parseOctal()
else:
var kind: TokenType = Integer
while isDigit(self.peek()):
discard self.step()
if self.check(['e', 'E']):
kind = Float
discard self.step()
while self.peek().isDigit():
discard self.step()
elif self.check('.'):
# TODO: Is there a better way?
discard self.step()
if not isDigit(self.peek()):
self.error("invalid float number literal")
kind = Float
while isDigit(self.peek()):
discard self.step()
if self.check(['e', 'E']):
discard self.step()
while isDigit(self.peek()):
discard self.step()
self.createToken(kind)
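# Literal forms recognized here, restating the doc comment as examples:
#
#   42        # integer
#   3.14      # float
#   3e14      # float in scientific notation ("e" is case-insensitive)
#   32.5e3    # fractional part and exponent combined
#   0b1010    # binary (parseBinary)
#   0o777     # octal (parseOctal)
#   0xFF      # hexadecimal (parseHex)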
proc parseIdentifier(self: Lexer) =
## Parses identifiers and keywords.
## Note that multi-byte characters
## such as UTF-8 runes are not supported
while self.peek().isAlphaNumeric() or self.check('_'):
discard self.step()
var name: string = self.source[self.start..<self.current]
if name in keywords:
# It's a keyword
self.createToken(keywords[name])
else:
# Identifier!
self.createToken(Identifier)
proc next(self: Lexer) =
## Scans a single token. This method is
## called iteratively until the source
## file reaches EOF
if self.done():
return
var single = self.step()
if single in [' ', '\t', '\r', '\f',
'\e']: # We skip whitespaces, tabs and other useless characters
return
elif single == '\n':
self.incLine()
elif single in ['"', '\'']:
if self.check(single) and self.check(single, 1):
# Multiline strings start with 3 quotes
discard self.step(2)
self.parseString(single, "multi")
else:
self.parseString(single)
elif single.isDigit():
self.parseNumber()
elif single.isAlphaNumeric() and self.check(['"', '\'']):
# Like Python, we support bytes and raw literals
case single:
of 'r':
self.parseString(self.step(), "raw")
of 'b':
self.parseString(self.step(), "bytes")
of 'f':
self.parseString(self.step(), "format")
else:
self.error(&"unknown string prefix '{single}'")
elif single.isAlphaNumeric() or single == '_':
self.parseIdentifier()
else:
# Comments are a special case
if single == '#':
while not (self.check('\n') or self.done()):
discard self.step()
return
# We start by checking for multi-character tokens,
# in descending length so //= doesn't translate
# to the pair of tokens (//, =) for example
for key in triple.keys():
if key[0] == single and self.check(key[1..^1]):
discard self.step(2) # We step 2 characters
self.createToken(triple[key])
return
for key in double.keys():
if key[0] == single and self.check(key[1]):
discard self.step()
self.createToken(double[key])
return
if single in tokens:
# Eventually we emit a single token
self.createToken(tokens[single])
else:
self.error(&"unexpected token '{single}'")
proc lex*(self: Lexer, source, file: string): seq[Token] =
## Lexes a source file, converting a stream
## of characters into a series of tokens
discard self.initLexer()
self.source = source
self.file = file
while not self.done():
self.next()
self.start = self.current
self.tokens.add(Token(kind: EndOfFile, lexeme: "",
line: self.line))
return self.tokens
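# Minimal usage sketch of the public API above. A Lexer constructor is not
# part of this hunk, so the name newLexer below is an assumption:
#
#   let lexer = newLexer()
#   for token in lexer.lex("var x = 5;", "test.jpl"):
#       echo token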

764
src/frontend/meta/ast.nim Normal file

@ -0,0 +1,764 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## An Abstract Syntax Tree (AST) structure for our recursive-descent
## top-down parser. For more info, check out docs/grammar.md
import strformat
import strutils
import token
type
NodeKind* = enum
## Enumeration of the AST
## node types, sorted by
## precedence
# Declarations
classDecl = 0u8,
funDecl,
varDecl,
# Statements
forStmt, # Unused for now (for loops are compiled to while loops)
ifStmt,
returnStmt,
breakStmt,
continueStmt,
whileStmt,
forEachStmt,
blockStmt,
raiseStmt,
assertStmt,
delStmt,
tryStmt,
yieldStmt,
awaitStmt,
fromImportStmt,
importStmt,
deferStmt,
# An expression followed by a semicolon
exprStmt,
# Expressions
assignExpr,
lambdaExpr,
awaitExpr,
yieldExpr,
setItemExpr, # Set expressions like a.b = "c"
binaryExpr,
unaryExpr,
sliceExpr,
callExpr,
getItemExpr, # Get expressions like a.b
# Primary expressions
groupingExpr, # Parenthesized expressions such as (true) and (3 + 4)
trueExpr,
listExpr,
tupleExpr,
dictExpr,
setExpr,
falseExpr,
strExpr,
intExpr,
floatExpr,
hexExpr,
octExpr,
binExpr,
nilExpr,
nanExpr,
infExpr,
identExpr, # Identifier
ASTNode* = ref object of RootObj
## An AST node
kind*: NodeKind
# Regardless of the type of node, we keep the token in the AST node for internal usage.
# This is not shown when the node is printed, but makes it a heck of a lot easier to report
# errors accurately even deep in the compilation pipeline
token*: Token
# Here I would've rather used object variants, and in fact that's what was in
# place before, but not being able to re-declare a field of the same type in
# another case branch is kind of a deal breaker long-term, so until that is
# fixed (check out https://github.com/nim-lang/RFCs/issues/368 for more info)
# I'll stick to using inheritance instead
LiteralExpr* = ref object of ASTNode
# Using a string for literals makes it much easier to handle numeric types, as
# there is no overflow nor underflow or float precision issues during parsing.
# Numbers are just serialized as strings and then converted back to numbers
# before being passed to the VM, which also keeps the door open in the future
# to implementing bignum arithmetic that can take advantage of natively supported
# machine types, meaning that if a numeric type fits into a 64 bit signed/unsigned
# int then it is stored in such a type to save space, otherwise it is just converted
# to a bigint. Bigfloats with arbitrary-precision arithmetic would also be nice,
# although arguably less useful (and probably significantly slower than bigints)
literal*: Token
IntExpr* = ref object of LiteralExpr
OctExpr* = ref object of LiteralExpr
HexExpr* = ref object of LiteralExpr
BinExpr* = ref object of LiteralExpr
FloatExpr* = ref object of LiteralExpr
StrExpr* = ref object of LiteralExpr
# These are technically keywords, not literals!
TrueExpr* = ref object of ASTNode
FalseExpr* = ref object of ASTNode
NilExpr* = ref object of ASTNode
NanExpr* = ref object of ASTNode
InfExpr* = ref object of ASTNode
# Although this is *technically* a literal, Nim doesn't
# allow us to redefine fields from supertypes, so it's
# tough luck for us
ListExpr* = ref object of ASTNode
members*: seq[ASTNode]
SetExpr* = ref object of ListExpr
TupleExpr* = ref object of ListExpr
DictExpr* = ref object of ASTNode
keys*: seq[ASTNode]
values*: seq[ASTNode]
IdentExpr* = ref object of ASTNode
name*: Token
GroupingExpr* = ref object of ASTNode
expression*: ASTNode
GetItemExpr* = ref object of ASTNode
obj*: ASTNode
name*: ASTNode
SetItemExpr* = ref object of GetItemExpr
# Since a setItem expression is just
# a getItem one followed by an assignment,
# inheriting it from getItem makes sense
value*: ASTNode
CallExpr* = ref object of ASTNode
callee*: ASTNode # The thing being called
arguments*: tuple[positionals: seq[ASTNode], keyword: seq[tuple[
name: ASTNode, value: ASTNode]]]
UnaryExpr* = ref object of ASTNode
operator*: Token
a*: ASTNode
BinaryExpr* = ref object of UnaryExpr
# Binary expressions can be seen here as unary
# expressions with an extra operand so we just
# inherit from that and add a second operand
b*: ASTNode
YieldExpr* = ref object of ASTNode
expression*: ASTNode
AwaitExpr* = ref object of ASTNode
awaitee*: ASTNode
LambdaExpr* = ref object of ASTNode
body*: ASTNode
arguments*: seq[ASTNode]
# This is, in order, the list of each default argument
# the function takes. It maps 1:1 with self.arguments
# although it may be shorter (in which case this maps
# 1:1 with what's left of self.arguments after all
# positional arguments have been consumed)
defaults*: seq[ASTNode]
isGenerator*: bool
SliceExpr* = ref object of ASTNode
slicee*: ASTNode
ends*: seq[ASTNode]
AssignExpr* = ref object of ASTNode
name*: ASTNode
value*: ASTNode
ExprStmt* = ref object of ASTNode
expression*: ASTNode
ImportStmt* = ref object of ASTNode
moduleName*: ASTNode
FromImportStmt* = ref object of ASTNode
fromModule*: ASTNode
fromAttributes*: seq[ASTNode]
DelStmt* = ref object of ASTNode
name*: ASTNode
AssertStmt* = ref object of ASTNode
expression*: ASTNode
RaiseStmt* = ref object of ASTNode
exception*: ASTNode
BlockStmt* = ref object of ASTNode
code*: seq[ASTNode]
ForStmt* = ref object of ASTNode
discard # Unused
ForEachStmt* = ref object of ASTNode
identifier*: ASTNode
expression*: ASTNode
body*: ASTNode
DeferStmt* = ref object of ASTNode
deferred*: ASTNode
TryStmt* = ref object of ASTNode
body*: ASTNode
handlers*: seq[tuple[body: ASTNode, exc: ASTNode, name: ASTNode]]
finallyClause*: ASTNode
elseClause*: ASTNode
WhileStmt* = ref object of ASTNode
condition*: ASTNode
body*: ASTNode
AwaitStmt* = ref object of ASTNode
awaitee*: ASTNode
BreakStmt* = ref object of ASTNode
ContinueStmt* = ref object of ASTNode
ReturnStmt* = ref object of ASTNode
value*: ASTNode
IfStmt* = ref object of ASTNode
condition*: ASTNode
thenBranch*: ASTNode
elseBranch*: ASTNode
YieldStmt* = ref object of ASTNode
expression*: ASTNode
Declaration* = ref object of ASTNode
owner*: string # Used for determining if a module can access a given field
closedOver*: bool
VarDecl* = ref object of Declaration
name*: ASTNode
value*: ASTNode
isConst*: bool
isStatic*: bool
isPrivate*: bool
FunDecl* = ref object of Declaration
name*: ASTNode
body*: ASTNode
arguments*: seq[ASTNode]
# This is, in order, the list of each default argument
# the function takes. It maps 1:1 with self.arguments
# although it may be shorter (in which case this maps
# 1:1 with what's left of self.arguments after all
# positional arguments have been consumed)
defaults*: seq[ASTNode]
isAsync*: bool
isGenerator*: bool
isStatic*: bool
isPrivate*: bool
ClassDecl* = ref object of Declaration
name*: ASTNode
body*: ASTNode
parents*: seq[ASTNode]
isStatic*: bool
isPrivate*: bool
Expression* = LiteralExpr | ListExpr | GetItemExpr | SetItemExpr | UnaryExpr | BinaryExpr | CallExpr | AssignExpr |
GroupingExpr | IdentExpr | DictExpr | TupleExpr | SetExpr |
TrueExpr | FalseExpr | NilExpr |
NanExpr | InfExpr
Statement* = ExprStmt | ImportStmt | FromImportStmt | DelStmt | AssertStmt | RaiseStmt | BlockStmt | ForStmt | WhileStmt |
BreakStmt | ContinueStmt | ReturnStmt | IfStmt
proc newASTNode*(kind: NodeKind, token: Token): ASTNode =
## Initializes a new generic ASTNode object
new(result)
result.kind = kind
result.token = token
proc isConst*(self: ASTNode): bool {.inline.} = self.kind in {intExpr, hexExpr, binExpr, octExpr, strExpr,
falseExpr,
trueExpr, infExpr,
nanExpr,
floatExpr, nilExpr}
proc isLiteral*(self: ASTNode): bool {.inline.} = self.isConst() or self.kind in
{tupleExpr, dictExpr, setExpr, listExpr}
proc newIntExpr*(literal: Token): IntExpr =
result = IntExpr(kind: intExpr)
result.literal = literal
result.token = literal
proc newOctExpr*(literal: Token): OctExpr =
result = OctExpr(kind: octExpr)
result.literal = literal
result.token = literal
proc newHexExpr*(literal: Token): HexExpr =
result = HexExpr(kind: hexExpr)
result.literal = literal
result.token = literal
proc newBinExpr*(literal: Token): BinExpr =
result = BinExpr(kind: binExpr)
result.literal = literal
result.token = literal
proc newFloatExpr*(literal: Token): FloatExpr =
result = FloatExpr(kind: floatExpr)
result.literal = literal
result.token = literal
proc newTrueExpr*(token: Token): LiteralExpr = LiteralExpr(kind: trueExpr, token: token)
proc newFalseExpr*(token: Token): LiteralExpr = LiteralExpr(kind: falseExpr, token: token)
proc newNaNExpr*(token: Token): LiteralExpr = LiteralExpr(kind: nanExpr, token: token)
proc newNilExpr*(token: Token): LiteralExpr = LiteralExpr(kind: nilExpr, token: token)
proc newInfExpr*(token: Token): LiteralExpr = LiteralExpr(kind: infExpr, token: token)
proc newStrExpr*(literal: Token): StrExpr =
result = StrExpr(kind: strExpr)
result.literal = literal
result.token = literal
proc newIdentExpr*(name: Token): IdentExpr =
result = IdentExpr(kind: identExpr)
result.name = name
result.token = name
proc newGroupingExpr*(expression: ASTNode, token: Token): GroupingExpr =
result = GroupingExpr(kind: groupingExpr)
result.expression = expression
result.token = token
proc newLambdaExpr*(arguments, defaults: seq[ASTNode], body: ASTNode,
isGenerator: bool, token: Token): LambdaExpr =
result = LambdaExpr(kind: lambdaExpr)
result.body = body
result.arguments = arguments
result.defaults = defaults
result.isGenerator = isGenerator
result.token = token
proc newGetItemExpr*(obj: ASTNode, name: ASTNode, token: Token): GetItemExpr =
result = GetItemExpr(kind: getItemExpr)
result.obj = obj
result.name = name
result.token = token
proc newListExpr*(members: seq[ASTNode], token: Token): ListExpr =
result = ListExpr(kind: listExpr)
result.members = members
result.token = token
proc newSetExpr*(members: seq[ASTNode], token: Token): SetExpr =
result = SetExpr(kind: setExpr)
result.members = members
result.token = token
proc newTupleExpr*(members: seq[ASTNode], token: Token): TupleExpr =
result = TupleExpr(kind: tupleExpr)
result.members = members
result.token = token
proc newDictExpr*(keys, values: seq[ASTNode], token: Token): DictExpr =
result = DictExpr(kind: dictExpr)
result.keys = keys
result.values = values
result.token = token
proc newSetItemExpr*(obj, name, value: ASTNode, token: Token): SetItemExpr =
result = SetItemExpr(kind: setItemExpr)
result.obj = obj
result.name = name
result.value = value
result.token = token
proc newCallExpr*(callee: ASTNode, arguments: tuple[positionals: seq[ASTNode],
keyword: seq[tuple[name: ASTNode, value: ASTNode]]],
token: Token): CallExpr =
result = CallExpr(kind: callExpr)
result.callee = callee
result.arguments = arguments
result.token = token
proc newSliceExpr*(slicee: ASTNode, ends: seq[ASTNode],
token: Token): SliceExpr =
result = SliceExpr(kind: sliceExpr)
result.slicee = slicee
result.ends = ends
result.token = token
proc newUnaryExpr*(operator: Token, a: ASTNode): UnaryExpr =
result = UnaryExpr(kind: unaryExpr)
result.operator = operator
result.a = a
result.token = result.operator
proc newBinaryExpr*(a: ASTNode, operator: Token, b: ASTNode): BinaryExpr =
result = BinaryExpr(kind: binaryExpr)
result.operator = operator
result.a = a
result.b = b
result.token = operator
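# Hedged example: building the AST for the expression 2 + 3 by hand with
# the constructors above (Token comes from meta/token):
#
#   let node = newBinaryExpr(
#       newIntExpr(Token(kind: Integer, lexeme: "2", line: 1)),
#       Token(kind: Plus, lexeme: "+", line: 1),
#       newIntExpr(Token(kind: Integer, lexeme: "3", line: 1))
#   )
#   echo node   # Binary(Literal(2), Operator('+'), Literal(3))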
proc newYieldExpr*(expression: ASTNode, token: Token): YieldExpr =
result = YieldExpr(kind: yieldExpr)
result.expression = expression
result.token = token
proc newAssignExpr*(name, value: ASTNode, token: Token): AssignExpr =
result = AssignExpr(kind: assignExpr)
result.name = name
result.value = value
result.token = token
proc newAwaitExpr*(awaitee: ASTNode, token: Token): AwaitExpr =
result = AwaitExpr(kind: awaitExpr)
result.awaitee = awaitee
result.token = token
proc newExprStmt*(expression: ASTNode, token: Token): ExprStmt =
result = ExprStmt(kind: exprStmt)
result.expression = expression
result.token = token
proc newImportStmt*(moduleName: ASTNode, token: Token): ImportStmt =
result = ImportStmt(kind: importStmt)
result.moduleName = moduleName
result.token = token
proc newFromImportStmt*(fromModule: ASTNode, fromAttributes: seq[ASTNode],
token: Token): FromImportStmt =
result = FromImportStmt(kind: fromImportStmt)
result.fromModule = fromModule
result.fromAttributes = fromAttributes
result.token = token
proc newDelStmt*(name: ASTNode, token: Token): DelStmt =
result = DelStmt(kind: delStmt)
result.name = name
result.token = token
proc newYieldStmt*(expression: ASTNode, token: Token): YieldStmt =
result = YieldStmt(kind: yieldStmt)
result.expression = expression
result.token = token
proc newAwaitStmt*(awaitee: ASTNode, token: Token): AwaitExpr =
result = AwaitExpr(kind: awaitExpr)
result.awaitee = awaitee
result.token = token
proc newAssertStmt*(expression: ASTNode, token: Token): AssertStmt =
result = AssertStmt(kind: assertStmt)
result.expression = expression
result.token = token
proc newDeferStmt*(deferred: ASTNode, token: Token): DeferStmt =
result = DeferStmt(kind: deferStmt)
result.deferred = deferred
result.token = token
proc newRaiseStmt*(exception: ASTNode, token: Token): RaiseStmt =
result = RaiseStmt(kind: raiseStmt)
result.exception = exception
result.token = token
proc newTryStmt*(body: ASTNode, handlers: seq[tuple[body: ASTNode, exc: ASTNode, name: ASTNode]],
finallyClause: ASTNode,
elseClause: ASTNode, token: Token): TryStmt =
result = TryStmt(kind: tryStmt)
result.body = body
result.handlers = handlers
result.finallyClause = finallyClause
result.elseClause = elseClause
result.token = token
proc newBlockStmt*(code: seq[ASTNode], token: Token): BlockStmt =
result = BlockStmt(kind: blockStmt)
result.code = code
result.token = token
proc newWhileStmt*(condition: ASTNode, body: ASTNode, token: Token): WhileStmt =
result = WhileStmt(kind: whileStmt)
result.condition = condition
result.body = body
result.token = token
proc newForEachStmt*(identifier: ASTNode, expression, body: ASTNode,
token: Token): ForEachStmt =
result = ForEachStmt(kind: forEachStmt)
result.identifier = identifier
result.expression = expression
result.body = body
result.token = token
proc newBreakStmt*(token: Token): BreakStmt =
result = BreakStmt(kind: breakStmt)
result.token = token
proc newContinueStmt*(token: Token): ContinueStmt =
result = ContinueStmt(kind: continueStmt)
result.token = token
proc newReturnStmt*(value: ASTNode, token: Token): ReturnStmt =
result = ReturnStmt(kind: returnStmt)
result.value = value
result.token = token
proc newIfStmt*(condition: ASTNode, thenBranch, elseBranch: ASTNode,
token: Token): IfStmt =
result = IfStmt(kind: ifStmt)
result.condition = condition
result.thenBranch = thenBranch
result.elseBranch = elseBranch
result.token = token
proc newVarDecl*(name: ASTNode, value: ASTNode = newNilExpr(Token()),
isStatic: bool = true, isConst: bool = false,
isPrivate: bool = true, token: Token, owner: string,
closedOver: bool): VarDecl =
result = VarDecl(kind: varDecl)
result.name = name
result.value = value
result.isConst = isConst
result.isStatic = isStatic
result.isPrivate = isPrivate
result.token = token
result.owner = owner
result.closedOver = closedOver
proc newFunDecl*(name: ASTNode, arguments, defaults: seq[ASTNode],
body: ASTNode, isStatic: bool = true, isAsync,
isGenerator: bool, isPrivate: bool = true, token: Token,
owner: string, closedOver: bool): FunDecl =
result = FunDecl(kind: funDecl)
result.name = name
result.arguments = arguments
result.defaults = defaults
result.body = body
result.isAsync = isAsync
result.isGenerator = isGenerator
result.isStatic = isStatic
result.isPrivate = isPrivate
result.token = token
result.owner = owner
result.closedOver = closedOver
proc newClassDecl*(name: ASTNode, body: ASTNode,
parents: seq[ASTNode], isStatic: bool = true,
isPrivate: bool = true, token: Token,
owner: string, closedOver: bool): ClassDecl =
result = ClassDecl(kind: classDecl)
result.name = name
result.body = body
result.parents = parents
result.isStatic = isStatic
result.isPrivate = isPrivate
result.token = token
result.owner = owner
result.closedOver = closedOver
proc `$`*(self: ASTNode): string =
if self == nil:
return "nil"
case self.kind:
of intExpr, floatExpr, hexExpr, binExpr, octExpr, strExpr, trueExpr,
falseExpr, nanExpr, nilExpr, infExpr:
if self.kind in {trueExpr, falseExpr, nanExpr, nilExpr, infExpr}:
result &= &"Literal({($self.kind)[0..^5]})"
elif self.kind == strExpr:
result &= &"Literal({LiteralExpr(self).literal.lexeme[1..^2].escape()})"
else:
result &= &"Literal({LiteralExpr(self).literal.lexeme})"
of identExpr:
result &= &"Identifier('{IdentExpr(self).name.lexeme}')"
of groupingExpr:
result &= &"Grouping({GroupingExpr(self).expression})"
of getItemExpr:
var self = GetItemExpr(self)
result &= &"GetItem(obj={self.obj}, name={self.name})"
of setItemExpr:
var self = SetItemExpr(self)
result &= &"SetItem(obj={self.obj}, name={self.value}, value={self.value})"
of callExpr:
var self = CallExpr(self)
result &= &"""Call({self.callee}, arguments=(positionals=[{self.arguments.positionals.join(", ")}], keyword=[{self.arguments.keyword.join(", ")}]))"""
of unaryExpr:
var self = UnaryExpr(self)
result &= &"Unary(Operator('{self.operator.lexeme}'), {self.a})"
of binaryExpr:
var self = BinaryExpr(self)
result &= &"Binary({self.a}, Operator('{self.operator.lexeme}'), {self.b})"
of assignExpr:
var self = AssignExpr(self)
result &= &"Assign(name={self.name}, value={self.value})"
of exprStmt:
var self = ExprStmt(self)
result &= &"ExpressionStatement({self.expression})"
of breakStmt:
result = "Break()"
of importStmt:
var self = ImportStmt(self)
result &= &"Import({self.moduleName})"
of fromImportStmt:
var self = FromImportStmt(self)
result &= &"""FromImport(fromModule={self.fromModule}, fromAttributes=[{self.fromAttributes.join(", ")}])"""
of delStmt:
var self = DelStmt(self)
result &= &"Del({self.name})"
of assertStmt:
var self = AssertStmt(self)
result &= &"Assert({self.expression})"
of raiseStmt:
var self = RaiseStmt(self)
result &= &"Raise({self.exception})"
of blockStmt:
var self = BlockStmt(self)
result &= &"""Block([{self.code.join(", ")}])"""
of whileStmt:
var self = WhileStmt(self)
result &= &"While(condition={self.condition}, body={self.body})"
of forEachStmt:
var self = ForEachStmt(self)
result &= &"ForEach(identifier={self.identifier}, expression={self.expression}, body={self.body})"
of returnStmt:
var self = ReturnStmt(self)
result &= &"Return({self.value})"
of yieldExpr:
var self = YieldExpr(self)
result &= &"Yield({self.expression})"
of awaitExpr:
var self = AwaitExpr(self)
result &= &"Await({self.awaitee})"
of ifStmt:
var self = IfStmt(self)
if self.elseBranch == nil:
result &= &"If(condition={self.condition}, thenBranch={self.thenBranch}, elseBranch=nil)"
else:
result &= &"If(condition={self.condition}, thenBranch={self.thenBranch}, elseBranch={self.elseBranch})"
of yieldStmt:
var self = YieldStmt(self)
result &= &"YieldStmt({self.expression})"
of awaitStmt:
var self = AwaitStmt(self)
result &= &"AwaitStmt({self.awaitee})"
of varDecl:
var self = VarDecl(self)
result &= &"Var(name={self.name}, value={self.value}, const={self.isConst}, static={self.isStatic}, private={self.isPrivate})"
of funDecl:
var self = FunDecl(self)
result &= &"""FunDecl(name={self.name}, body={self.body}, arguments=[{self.arguments.join(", ")}], defaults=[{self.defaults.join(", ")}], async={self.isAsync}, generator={self.isGenerator}, static={self.isStatic}, private={self.isPrivate})"""
of classDecl:
var self = ClassDecl(self)
result &= &"""Class(name={self.name}, body={self.body}, parents=[{self.parents.join(", ")}], static={self.isStatic}, private={self.isPrivate})"""
of tupleExpr:
var self = TupleExpr(self)
result &= &"""Tuple([{self.members.join(", ")}])"""
of setExpr:
var self = SetExpr(self)
result &= &"""Set([{self.members.join(", ")}])"""
of listExpr:
var self = ListExpr(self)
result &= &"""List([{self.members.join(", ")}])"""
of dictExpr:
var self = DictExpr(self)
result &= &"""Dict(keys=[{self.keys.join(", ")}], values=[{self.values.join(", ")}])"""
of lambdaExpr:
var self = LambdaExpr(self)
result &= &"""Lambda(body={self.body}, arguments=[{self.arguments.join(", ")}], defaults=[{self.defaults.join(", ")}], generator={self.isGenerator})"""
of deferStmt:
var self = DeferStmt(self)
result &= &"Defer({self.deferred})"
of sliceExpr:
var self = SliceExpr(self)
result &= &"""Slice({self.slicee}, ends=[{self.ends.join(", ")}])"""
of tryStmt:
var self = TryStmt(self)
result &= &"TryStmt(body={self.body}, handlers={self.handlers}"
if self.finallyClause != nil:
result &= &", finallyClause={self.finallyClause}"
else:
result &= ", finallyClause=nil"
if self.elseClause != nil:
result &= &", elseClause={self.elseClause}"
else:
result &= ", elseClause=nil"
result &= ")"
else:
discard


@ -0,0 +1,297 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## Low level bytecode implementation details
import ast
import ../../util/multibyte
import errors
import strutils
import strformat
export ast
type
Chunk* = ref object
## A piece of bytecode.
## Consts represents the constants table the code is referring to.
## Code is the linear sequence of compiled bytecode instructions.
## Lines maps bytecode instructions to line numbers using Run
## Length Encoding. The sequence is made up of (line, count) pairs:
## - The first integer of each pair is the line number
## - The second integer is how many consecutive instructions
## were emitted for that line
##
## A visual representation may be easier to understand: [1, 2, 3, 4]
## is to be interpreted as "the first 2 instructions belong to line 1
## and the following 4 instructions belong to line 3".
## This is more efficient than the naive approach, which would encode
## the same line number once per instruction and waste considerable amounts of space.
consts*: seq[ASTNode]
code*: seq[uint8]
lines*: seq[int]
reuseConsts*: bool
OpCode* {.pure.} = enum
## Enum of possible opcodes.
# Note: x represents the
# argument to unary opcodes, while
# a and b represent arguments to binary
# opcodes. Other variable names may be
# used for more complex opcodes. All
# arguments to opcodes (if they take
# arguments) come from popping off the
# stack. Unsupported operations will
# raise TypeError or ValueError exceptions
# and never fail silently
LoadConstant = 0u8, # Pushes constant at position x in the constant table onto the stack
## Binary operators
UnaryNegate, # Pushes the result of -x onto the stack
BinaryAdd, # Pushes the result of a + b onto the stack
BinarySubtract, # Pushes the result of a - b onto the stack
BinaryDivide, # Pushes the result of a / b onto the stack (true division). The result is a float
BinaryFloorDiv, # Pushes the result of a // b onto the stack (integer division). The result is always an integer
BinaryMultiply, # Pushes the result of a * b onto the stack
BinaryPow, # Pushes the result of a ** b (a to the power of b) onto the stack
BinaryMod, # Pushes the result of a % b onto the stack (modulo division)
BinaryShiftRight, # Pushes the result of a >> b (a with bits shifted b times to the right) onto the stack
BinaryShiftLeft, # Pushes the result of a << b (a with bits shifted b times to the left) onto the stack
BinaryXor, # Pushes the result of a ^ b (bitwise exclusive or) onto the stack
BinaryOr, # Pushes the result of a | b (bitwise or) onto the stack
BinaryAnd, # Pushes the result of a & b (bitwise and) onto the stack
UnaryNot, # Pushes the result of ~x (bitwise not) onto the stack
BinaryAs, # Pushes the result of a as b onto the stack (converts a to the type of b. Explicit support from a is required)
BinaryIs, # Pushes the result of a is b onto the stack (true if a and b point to the same object, false otherwise)
BinaryIsNot, # Pushes the result of not (a is b). This could be implemented in terms of BinaryIs, but it's more efficient this way
BinaryOf, # Pushes the result of a of b onto the stack (true if a is a subclass of b, false otherwise)
BinarySlice, # Perform slicing on supported objects (like "hello"[0:2], which yields "he"). The result is pushed onto the stack
BinarySubscript, # Subscript operator, like "hello"[0] (which pushes 'h' onto the stack)
## Binary comparison operators
GreaterThan, # Pushes the result of a > b onto the stack
LessThan, # Pushes the result of a < b onto the stack
EqualTo, # Pushes the result of a == b onto the stack
NotEqualTo, # Pushes the result of a != b onto the stack (optimization for not (a == b))
GreaterOrEqual, # Pushes the result of a >= b onto the stack
LessOrEqual, # Pushes the result of a <= b onto the stack
## Logical operators
LogicalNot, # Pushes the result of not x (logical negation) onto the stack
LogicalAnd,
LogicalOr,
## Constant opcodes (each of them pushes a singleton on the stack)
Nil,
True,
False,
Nan,
Inf,
## Basic stack operations
Pop, # Pops an element off the stack and discards it
Push, # Pushes x onto the stack
PopN, # Pops x elements off the stack (optimization for exiting scopes and returning from functions)
## Name resolution/handling
LoadAttribute,
DeclareName, # Declares a global dynamically bound name in the current scope
LoadName, # Loads a dynamically bound variable
LoadFast, # Loads a statically bound variable
StoreName, # Sets/updates a dynamically bound variable's value
StoreFast, # Sets/updates a statically bound variable's value
DeleteName, # Unbinds a dynamically bound variable's name from the current scope
DeleteFast, # Unbinds a statically bound variable's name from the current scope
LoadHeap, # Loads a closed-over variable
StoreHeap, # Stores a closed-over variable
## Looping and jumping
Jump, # Absolute, unconditional jump into the bytecode
JumpIfFalse, # Jumps to an absolute index in the bytecode if the value at the top of the stack is falsey
JumpIfTrue, # Jumps to an absolute index in the bytecode if the value at the top of the stack is truthy
JumpIfFalsePop, # Like JumpIfFalse, but it also pops off the stack (regardless of truthyness). Optimization for if statements
JumpIfFalseOrPop, # Jumps to an absolute index in the bytecode if the value at the top of the stack is falsey and pops it otherwise
JumpForwards, # Relative, unconditional, positive jump in the bytecode
JumpBackwards, # Relative, unconditional, negative jump into the bytecode
Break, # Temporary opcode used to signal exiting out of loops
## Long variants of jumps (they use a 24-bit operand instead of a 16-bit one)
LongJump,
LongJumpIfFalse,
LongJumpIfTrue,
LongJumpIfFalsePop,
LongJumpIfFalseOrPop,
LongJumpForwards,
LongJumpBackwards,
## Functions
Call, # Calls a callable object
Return # Returns from the current function
## Exception handling
Raise, # Raises exception x
ReRaise, # Re-raises active exception
BeginTry, # Initiates an exception handling context
FinishTry, # Closes the current exception handling context
## Generators
Yield,
## Coroutines
Await,
## Collection literals
BuildList,
BuildDict,
BuildSet,
BuildTuple,
## Misc
Assert, # Raises an AssertionFailed exception if the value at the top of the stack is falsey
MakeClass, # Builds a class instance from the values at the top of the stack (class object, constructor arguments, etc.)
Slice, # Slices an object (takes 3 arguments: start, stop, step). Pushes the result of a.subscript(b, c, d) onto the stack
GetItem, # Pushes the result of a.getItem(b) onto the stack
ImplicitReturn, # Optimization for returning nil from functions (saves us a VM "clock cycle")
# We group instructions by their operation/operand types for easier handling when debugging
# Simple instructions encompass:
# - Instructions that push onto/pop off the stack unconditionally (True, False, Pop, etc.)
# - Unary and binary operators
const simpleInstructions* = {Return, BinaryAdd, BinaryMultiply,
BinaryDivide, BinarySubtract,
BinaryMod, BinaryPow, Nil,
True, False, OpCode.Nan, OpCode.Inf,
BinaryShiftLeft, BinaryShiftRight,
BinaryXor, LogicalNot, EqualTo,
GreaterThan, LessThan, LoadAttribute,
BinarySlice, Pop, UnaryNegate,
BinaryIs, BinaryAs, GreaterOrEqual,
LessOrEqual, BinaryOr, BinaryAnd,
UnaryNot, BinaryFloorDiv, BinaryOf, Raise,
ReRaise, BeginTry, FinishTry, Yield, Await,
MakeClass, ImplicitReturn}
# Constant instructions are instructions that operate on the bytecode constant table
const constantInstructions* = {LoadConstant, DeclareName, LoadName, StoreName, DeleteName}
# Stack triple instructions operate on the stack at arbitrary offsets and pop arguments off of it in the form
# of 24 bit integers
const stackTripleInstructions* = {Call, StoreFast, DeleteFast, LoadFast, LoadHeap, StoreHeap}
# Stack double instructions operate on the stack at arbitrary offsets and pop arguments off of it in the form
# of 16 bit integers
const stackDoubleInstructions* = {}
# Argument-double instructions take a hardcoded argument encoded in the bytecode as a 16 bit integer
const argumentDoubleInstructions* = {PopN, }
# Jump instructions jump at relative or absolute bytecode offsets
const jumpInstructions* = {JumpIfFalse, JumpIfFalsePop, JumpForwards, JumpBackwards,
LongJumpIfFalse, LongJumpIfFalsePop, LongJumpForwards,
LongJumpBackwards, JumpIfTrue, LongJumpIfTrue}
# Collection instructions push a built-in collection type onto the stack
const collectionInstructions* = {BuildList, BuildDict, BuildSet, BuildTuple}
proc newChunk*(reuseConsts: bool = true): Chunk =
## Initializes a new, empty chunk
result = Chunk(consts: @[], code: @[], lines: @[], reuseConsts: reuseConsts)
proc `$`*(self: Chunk): string = &"""Chunk(consts=[{self.consts.join(", ")}], code=[{self.code.join(", ")}], lines=[{self.lines.join(", ")}])"""
proc write*(self: Chunk, newByte: uint8, line: int) =
## Adds the given instruction at the provided line number
## to the given chunk object
assert line > 0, "line must be greater than zero"
if self.lines.high() >= 1 and self.lines[^2] == line:
self.lines[^1] += 1
else:
self.lines.add(line)
self.lines.add(1)
self.code.add(newByte)
proc write*(self: Chunk, bytes: openarray[uint8], line: int) =
## Calls write in a loop with all members of the given
## array
for cByte in bytes:
self.write(cByte, line)
proc write*(self: Chunk, newByte: OpCode, line: int) =
## Adds the given instruction at the provided line number
## to the given chunk object
self.write(uint8(newByte), line)
proc write*(self: Chunk, bytes: openarray[OpCode], line: int) =
## Calls write in a loop with all members of the given
## array
for cByte in bytes:
self.write(uint8(cByte), line)
proc getLine*(self: Chunk, idx: int): int =
## Returns the associated line of a given
## instruction index
if self.lines.len < 2:
raise newException(IndexDefect, "the chunk object is empty")
var
count: int
current: int = 0
for n in countup(0, self.lines.high(), 2):
count = self.lines[n + 1]
if idx in current - count..<current + count:
return self.lines[n]
current += count
raise newException(IndexDefect, "index out of range")
proc findOrAddConstant(self: Chunk, constant: ASTNode): int =
## Small optimization function that reuses the same constant
## if it's already been written before (only if self.reuseConsts
## equals true)
if not self.reuseConsts:
# Constant reuse is disabled: always append a new entry and return its index
self.consts.add(constant)
return self.consts.high()
for i, c in self.consts:
# We cannot use simple equality because the nodes likely have
# different token objects with different values
if c.kind != constant.kind:
continue
if constant.isConst():
var c = LiteralExpr(c)
var constant = LiteralExpr(constant)
if c.literal.lexeme == constant.literal.lexeme:
# This wouldn't work for stuff like 2e3 and 2000.0, but those
# forms are collapsed in the compiler before being written
# to the constants table
return i
elif constant.kind == identExpr:
var c = IdentExpr(c)
var constant = IdentExpr(constant)
if c.name.lexeme == constant.name.lexeme:
return i
else:
continue
self.consts.add(constant)
result = self.consts.high()
proc addConstant*(self: Chunk, constant: ASTNode): array[3, uint8] =
## Writes a constant to a chunk. Returns its index casted to a 3-byte
## sequence (array). Constant indexes are reused if a constant is used
## more than once and self.reuseConsts equals true
if self.consts.len() == 16777215:
# The constant index is a 24 bit unsigned integer, so that's as far
# as we can index into the constant table (the same applies
# to our stack by the way). Not that anyone's ever gonna hit this
# limit in the real world, but you know, just in case
raise newException(CompileError, "cannot encode more than 16777215 constants")
result = self.findOrAddConstant(constant).toTriple()
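# Hedged usage sketch for addConstant: identical constants share a slot
# when reuseConsts is enabled and the 3-byte index comes from toTriple():
#
#   var chunk = newChunk(reuseConsts = true)
#   let a = chunk.addConstant(newIdentExpr(Token(lexeme: "x")))
#   let b = chunk.addConstant(newIdentExpr(Token(lexeme: "x")))
#   assert a == b                    # Same slot in the constant table
#   assert chunk.consts.len() == 1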


@ -0,0 +1,21 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
type
## Nim exceptions for internal JAPL failures
NimVMException* = object of CatchableError
LexingError* = object of NimVMException
ParseError* = object of NimVMException
CompileError* = object of NimVMException
SerializationError* = object of NimVMException


@ -0,0 +1,86 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import strformat
import strutils
type
TokenType* {.pure.} = enum
## Token types enumeration
# Booleans
True, False,
# Other singleton types
Infinity, NotANumber, Nil
# Control flow statements
If, Else,
# Looping statements
While, For,
# Keywords
Fun, Break, Lambda,
Continue, Var, Const, Is,
Return, Async, Class, Import, From,
IsNot, Raise, Assert, Del, Await,
Foreach, Yield, Static, Dynamic,
Private, Public, As, Of, Defer, Try,
Except, Finally
# Basic types
Integer, Float, String, Identifier,
Binary, Octal, Hex
# Brackets, parentheses and other
# symbols
LeftParen, RightParen, # ()
LeftBrace, RightBrace, # {}
LeftBracket, RightBracket, # []
Dot, Semicolon, Colon, Comma, # . ; : ,
Plus, Minus, Slash, Asterisk, # + - / *
Percentage, DoubleAsterisk, # % **
Caret, Pipe, Ampersand, Tilde, # ^ | & ~
Equal, GreaterThan, LessThan, # = > <
LessOrEqual, GreaterOrEqual, # <= >=
NotEqual, RightShift, LeftShift, # != >> <<
LogicalAnd, LogicalOr, LogicalNot, FloorDiv, # and or not //
InplaceAdd, InplaceSub, InplaceDiv, # += -= /=
InplaceMod, InplaceMul, InplaceXor, # %= *= ^=
InplaceAnd, InplaceOr, # &= |=
DoubleEqual, InplaceFloorDiv, InplacePow, # == //= **=
InplaceRightShift, InplaceLeftShift
# Miscellaneous
EndOfFile
Token* = ref object
## A token object
kind*: TokenType
lexeme*: string
line*: int
pos*: tuple[start, stop: int]
proc `$`*(self: Token): string =
if self != nil:
result = &"Token(kind={self.kind}, lexeme={$(self.lexeme)}, line={self.line}, pos=({self.pos.start}, {self.pos.stop}))"
else:
result = "nil"

402
src/frontend/optimizer.nim Normal file

@ -0,0 +1,402 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import meta/ast
import meta/token
import parseutils
import strformat
import strutils
import math
type
WarningKind* = enum
unreachableCode,
nameShadowing,
isWithALiteral,
equalityWithSingleton,
valueOverflow,
implicitConversion,
invalidOperation
Warning* = ref object
kind*: WarningKind
node*: ASTNode
Optimizer* = ref object
warnings: seq[Warning]
foldConstants*: bool
proc initOptimizer*(foldConstants: bool = true): Optimizer =
## Initializes a new optimizer object
new(result)
result.foldConstants = foldConstants
result.warnings = @[]
proc newWarning(self: Optimizer, kind: WarningKind, node: ASTNode) =
self.warnings.add(Warning(kind: kind, node: node))
proc `$`*(self: Warning): string = &"Warning(kind={self.kind}, node={self.node})"
# Forward declaration
proc optimizeNode(self: Optimizer, node: ASTNode): ASTNode
proc optimizeConstant(self: Optimizer, node: ASTNode): ASTNode =
## Performs some checks on constant AST nodes such as
## integers. This method converts all of the different
## integer forms (binary, octal and hexadecimal) to
## decimal integers. Overflows are checked here too
if not self.foldConstants:
return node
case node.kind:
of intExpr:
var x: int
var y = IntExpr(node)
try:
assert parseInt(y.literal.lexeme, x) == len(y.literal.lexeme)
except ValueError:
self.newWarning(valueOverflow, node)
result = node
of hexExpr:
var x: int
var y = HexExpr(node)
try:
assert parseHex(y.literal.lexeme, x) == len(y.literal.lexeme)
except ValueError:
self.newWarning(valueOverflow, node)
return node
result = IntExpr(kind: intExpr, literal: Token(kind: Integer, lexeme: $x, line: y.literal.line, pos: (start: -1, stop: -1)))
of binExpr:
var x: int
var y = BinExpr(node)
try:
assert parseBin(y.literal.lexeme, x) == len(y.literal.lexeme)
except ValueError:
self.newWarning(valueOverflow, node)
return node
result = IntExpr(kind: intExpr, literal: Token(kind: Integer, lexeme: $x, line: y.literal.line, pos: (start: -1, stop: -1)))
of octExpr:
var x: int
var y = OctExpr(node)
try:
assert parseOct(y.literal.lexeme, x) == len(y.literal.lexeme)
except ValueError:
self.newWarning(valueOverflow, node)
return node
result = IntExpr(kind: intExpr, literal: Token(kind: Integer, lexeme: $x, line: y.literal.line, pos: (start: -1, stop: -1)))
of floatExpr:
var x: float
var y = FloatExpr(node)
try:
discard parseFloat(y.literal.lexeme, x)
except ValueError:
self.newWarning(valueOverflow, node)
return node
result = FloatExpr(kind: floatExpr, literal: Token(kind: Float, lexeme: $x, line: y.literal.line, pos: (start: -1, stop: -1)))
else:
result = node
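# Hedged examples of what optimizeConstant folds when constant folding is
# enabled: non-decimal integer literals are rewritten as decimal Integer
# tokens, while literals that overflow a machine int only emit a
# valueOverflow warning and are returned unchanged:
#
#   0xff   ->  255
#   0b101  ->  5
#   0o777  ->  511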
proc optimizeUnary(self: Optimizer, node: UnaryExpr): ASTNode =
## Attempts to optimize unary expressions
var a = self.optimizeNode(node.a)
if self.warnings.len() > 0 and self.warnings[^1].kind == valueOverflow and self.warnings[^1].node == a:
# We can't optimize further, the overflow will be caught in the compiler
return UnaryExpr(kind: unaryExpr, a: a, operator: node.operator)
case a.kind:
of intExpr:
var x: int
assert parseInt(IntExpr(a).literal.lexeme, x) == len(IntExpr(a).literal.lexeme)
case node.operator.kind:
of Tilde:
x = not x
of Minus:
x = -x
else:
discard # Unreachable
result = IntExpr(kind: intExpr, literal: Token(kind: Integer, lexeme: $x, line: node.operator.line, pos: (start: -1, stop: -1)))
of floatExpr:
var x: float
discard parseFloat(FloatExpr(a).literal.lexeme, x)
case node.operator.kind:
of Minus:
x = -x
of Tilde:
self.newWarning(invalidOperation, node)
return node
else:
discard
result = FloatExpr(kind: floatExpr, literal: Token(kind: Float, lexeme: $x, line: node.operator.line, pos: (start: -1, stop: -1)))
else:
result = node
proc optimizeBinary(self: Optimizer, node: BinaryExpr): ASTNode =
## Attempts to optimize binary expressions
var a, b: ASTNode
a = self.optimizeNode(node.a)
b = self.optimizeNode(node.b)
if self.warnings.len() > 0 and self.warnings[^1].kind == valueOverflow and (self.warnings[^1].node == a or self.warnings[^1].node == b):
# We can't optimize further, the overflow will be caught in the compiler. We don't return the same node
# because optimizeNode might've been able to optimize one of the two operands and we don't know which
return BinaryExpr(kind: binaryExpr, a: a, b: b, operator: node.operator)
if node.operator.kind == DoubleEqual:
if a.kind in {trueExpr, falseExpr, nilExpr, nanExpr, infExpr}:
self.newWarning(equalityWithSingleton, a)
elif b.kind in {trueExpr, falseExpr, nilExpr, nanExpr, infExpr}:
self.newWarning(equalityWithSingleton, b)
elif node.operator.kind == Is:
if a.kind in {strExpr, intExpr, tupleExpr, dictExpr, listExpr, setExpr}:
self.newWarning(isWithALiteral, a)
elif b.kind in {strExpr, intExpr, tupleExpr, dictExpr, listExpr, setExpr}:
self.newWarning(isWithALiteral, b)
if a.kind == intExpr and b.kind == intExpr:
# Optimizes integer operations
var x, y, z: int
assert parseInt(IntExpr(a).literal.lexeme, x) == IntExpr(a).literal.lexeme.len()
assert parseInt(IntExpr(b).literal.lexeme, y) == IntExpr(b).literal.lexeme.len()
try:
case node.operator.kind:
of Plus:
z = x + y
of Minus:
z = x - y
of Asterisk:
z = x * y
of FloorDiv:
z = int(x / y)
of DoubleAsterisk:
if y >= 0:
z = x ^ y
else:
# Nim's builtin pow operator can't handle
# negative exponents, so we use math's
# pow and convert from/to floats instead
z = pow(x.float, y.float).int
of Percentage:
z = x mod y
of Caret:
z = x xor y
of Ampersand:
z = x and y
of Pipe:
z = x or y
of Slash:
# Special case, yields a float
return FloatExpr(kind: floatExpr, literal: Token(kind: Float, lexeme: $(x / y), line: IntExpr(a).literal.line, pos: (start: -1, stop: -1)))
else:
result = BinaryExpr(kind: binaryExpr, a: a, b: b, operator: node.operator)
except OverflowDefect:
self.newWarning(valueOverflow, node)
return BinaryExpr(kind: binaryExpr, a: a, b: b, operator: node.operator)
except RangeDefect:
# TODO: What warning do we raise here?
return BinaryExpr(kind: binaryExpr, a: a, b: b, operator: node.operator)
result = IntExpr(kind: intExpr, literal: Token(kind: Integer, lexeme: $z, line: IntExpr(a).literal.line, pos: (start: -1, stop: -1)))
elif a.kind == floatExpr or b.kind == floatExpr:
var x, y, z: float
if a.kind == intExpr:
var temp: int
assert parseInt(IntExpr(a).literal.lexeme, temp) == IntExpr(a).literal.lexeme.len()
x = float(temp)
self.newWarning(implicitConversion, a)
else:
discard parseFloat(FloatExpr(a).literal.lexeme, x)
if b.kind == intExpr:
var temp: int
assert parseInt(IntExpr(b).literal.lexeme, temp) == IntExpr(b).literal.lexeme.len()
y = float(temp)
self.newWarning(implicitConversion, b)
else:
discard parseFloat(FloatExpr(b).literal.lexeme, y)
# Optimizes float operations
try:
case node.operator.kind:
of Plus:
z = x + y
of Minus:
z = x - y
of Asterisk:
z = x * y
of FloorDiv, Slash:
z = x / y
of DoubleAsterisk:
z = pow(x, y)
of Percentage:
z = x mod y
else:
result = BinaryExpr(kind: binaryExpr, a: a, b: b, operator: node.operator)
except OverflowDefect:
self.newWarning(valueOverflow, node)
return BinaryExpr(kind: binaryExpr, a: a, b: b, operator: node.operator)
result = FloatExpr(kind: floatExpr, literal: Token(kind: Float, lexeme: $z, line: LiteralExpr(a).literal.line, pos: (start: -1, stop: -1)))
elif a.kind == strExpr and b.kind == strExpr:
var a = StrExpr(a)
var b = StrExpr(b)
case node.operator.kind:
of Plus:
result = StrExpr(kind: strExpr, literal: Token(kind: String, lexeme: "'" & a.literal.lexeme[1..<(^1)] & b.literal.lexeme[1..<(^1)] & "'", pos: (start: -1, stop: -1)))
else:
result = node
elif a.kind == strExpr and self.optimizeNode(b).kind == intExpr and not (self.warnings.len() > 0 and self.warnings[^1].kind == valueOverflow and self.warnings[^1].node == b):
var a = StrExpr(a)
var b = IntExpr(b)
var bb: int
assert parseInt(b.literal.lexeme, bb) == b.literal.lexeme.len()
case node.operator.kind:
of Asterisk:
result = StrExpr(kind: strExpr, literal: Token(kind: String, lexeme: "'" & a.literal.lexeme[1..<(^1)].repeat(bb) & "'"))
else:
result = node
elif b.kind == strExpr and self.optimizeNode(a).kind == intExpr and not (self.warnings.len() > 0 and self.warnings[^1].kind == valueOverflow and self.warnings[^1].node == a):
var b = StrExpr(b)
var a = IntExpr(a)
var aa: int
assert parseInt(a.literal.lexeme, aa) == a.literal.lexeme.len()
case node.operator.kind:
of Asterisk:
result = StrExpr(kind: strExpr, literal: Token(kind: String, lexeme: "'" & b.literal.lexeme[1..<(^1)].repeat(aa) & "'"))
else:
result = node
else:
# There's no constant folding we can do!
result = node
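# Hedged examples of the folding performed above:
#
#   2 ** 8     ->  256        (integer arithmetic)
#   10 / 4     ->  2.5        (Slash always yields a float)
#   1 + 2.0    ->  3.0        (plus an implicitConversion warning)
#   'ab' * 3   ->  'ababab'   (string repetition by an integer literal)
#   x == nil   ->  left unchanged, but flagged with equalityWithSingleton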
proc detectClosures(self: Optimizer, node: FunDecl) =
## Goes through a function's code and detects
## references to variables in enclosing local
## scopes
var names: seq[Declaration] = @[]
for line in BlockStmt(node.body).code:
case line.kind:
of varDecl:
names.add(VarDecl(line))
of funDecl:
names.add(FunDecl(line))
of classDecl:
names.add(ClassDecl(line))
else:
discard
for name in names:
# TODO: mark the declarations that nested scopes refer to as closed over
discard
proc optimizeNode(self: Optimizer, node: ASTNode): ASTNode =
## Analyzes an AST node and attempts to perform
## optimizations on it. If no optimizations can be
## applied or self.foldConstants is set to false,
## then the same node is returned
if not self.foldConstants:
return node
case node.kind:
of exprStmt:
result = newExprStmt(self.optimizeNode(ExprStmt(node).expression), ExprStmt(node).token)
of intExpr, hexExpr, octExpr, binExpr, floatExpr, strExpr:
result = self.optimizeConstant(node)
of unaryExpr:
result = self.optimizeUnary(UnaryExpr(node))
of binaryExpr:
result = self.optimizeBinary(BinaryExpr(node))
of groupingExpr:
# Recursively unnests groups
result = self.optimizeNode(GroupingExpr(node).expression)
of callExpr:
var node = CallExpr(node)
for i, positional in node.arguments.positionals:
node.arguments.positionals[i] = self.optimizeNode(positional)
for i, (key, value) in node.arguments.keyword:
node.arguments.keyword[i].value = self.optimizeNode(value)
result = node
of sliceExpr:
var node = SliceExpr(node)
for i, e in node.ends:
node.ends[i] = self.optimizeNode(e)
node.slicee = self.optimizeNode(node.slicee)
result = node
of tryStmt:
var node = TryStmt(node)
node.body = self.optimizeNode(node.body)
if node.finallyClause != nil:
node.finallyClause = self.optimizeNode(node.finallyClause)
if node.elseClause != nil:
node.elseClause = self.optimizeNode(node.elseClause)
for i, handler in node.handlers:
node.handlers[i].body = self.optimizeNode(node.handlers[i].body)
result = node
of funDecl:
var decl = FunDecl(node)
for i, node in decl.defaults:
decl.defaults[i] = self.optimizeNode(node)
decl.body = self.optimizeNode(decl.body)
result = decl
of blockStmt:
var node = BlockStmt(node)
for i, n in node.code:
node.code[i] = self.optimizeNode(n)
result = node
of varDecl:
var decl = VarDecl(node)
decl.value = self.optimizeNode(decl.value)
result = decl
of assignExpr:
var asgn = AssignExpr(node)
asgn.value = self.optimizeNode(asgn.value)
result = asgn
of listExpr:
var l = ListExpr(node)
for i, e in l.members:
l.members[i] = self.optimizeNode(e)
result = node
of setExpr:
var s = SetExpr(node)
for i, e in s.members:
s.members[i] = self.optimizeNode(e)
result = node
of tupleExpr:
var t = TupleExpr(node)
for i, e in t.members:
t.members[i] = self.optimizeNode(e)
result = node
of dictExpr:
var d = DictExpr(node)
for i, e in d.keys:
d.keys[i] = self.optimizeNode(e)
for i, e in d.values:
d.values[i] = self.optimizeNode(e)
result = node
else:
result = node
proc optimize*(self: Optimizer, tree: seq[ASTNode]): tuple[tree: seq[ASTNode], warnings: seq[Warning]] =
## Runs the optimizer on the given source
## tree and returns a new optimized tree
## as well as a list of warnings that may
## be of interest. The input tree may be
## identical to the output tree if no optimization
## could be performed. Constant folding can be
## turned off by setting foldConstants to false
## when initializing the optimizer object. This
## optimization step also takes care of detecting
## closed-over variables so that the compiler can
## emit appropriate instructions for them later on
var newTree: seq[ASTNode] = @[]
for node in tree:
newTree.add(self.optimizeNode(node))
result = (tree: newTree, warnings: self.warnings)
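# Hedged usage sketch (the tree value would normally come from the parser,
# which is not part of this hunk):
#
#   var optimizer = initOptimizer(foldConstants = true)
#   let (optimizedTree, warnings) = optimizer.optimize(tree)
#   for warning in warnings:
#       echo warning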

1096
src/frontend/parser.nim Normal file

File diff suppressed because it is too large

273
src/frontend/serializer.nim Normal file

@ -0,0 +1,273 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import meta/ast
import meta/errors
import meta/bytecode
import meta/token
import ../config
import ../util/multibyte
import strformat
import strutils
import nimSHA2
import times
export ast
type
Serializer* = ref object
file: string
filename: string
chunk: Chunk
Serialized* = ref object
## Wrapper returned by
## the Serializer.read*
## procedures to store
## metadata
fileHash*: string
japlVer*: tuple[major, minor, patch: int]
japlBranch*: string
commitHash*: string
compileDate*: int
chunk*: Chunk
proc `$`*(self: Serialized): string =
result = &"Serialized(fileHash={self.fileHash}, version={self.japlVer.major}.{self.japlVer.minor}.{self.japlVer.patch}, branch={self.japlBranch}), commitHash={self.commitHash}, date={self.compileDate}, chunk={self.chunk[]}"
proc error(self: Serializer, message: string) =
## Raises a formatted SerializationError exception
raise newException(SerializationError, &"A fatal error occurred while (de)serializing '{self.filename}' -> {message}")
proc initSerializer*(self: Serializer = nil): Serializer =
new(result)
if self != nil:
result = self
result.file = ""
result.filename = ""
result.chunk = nil
## Basic routines and helpers to convert various objects from and to their byte representation
proc toBytes(self: Serializer, s: string): seq[byte] =
for c in s:
result.add(byte(c))
proc toBytes(self: Serializer, s: int): array[8, uint8] =
result = cast[array[8, uint8]](s)
proc toBytes(self: Serializer, d: SHA256Digest): seq[byte] =
for b in d:
result.add(b)
proc bytesToString(self: Serializer, input: seq[byte]): string =
for b in input:
result.add(char(b))
proc bytesToInt(self: Serializer, input: array[8, byte]): int =
copyMem(result.addr, input.unsafeAddr, sizeof(int))
proc bytesToInt(self: Serializer, input: array[3, byte]): int =
copyMem(result.addr, input.unsafeAddr, sizeof(byte) * 3)
proc extend[T](s: var seq[T], a: openarray[T]) =
## Extends s with the elements of a
for e in a:
s.add(e)
proc writeHeaders(self: Serializer, stream: var seq[byte], file: string) =
## Writes the JAPL bytecode headers in-place into a byte stream
stream.extend(self.toBytes(BYTECODE_MARKER))
stream.add(byte(JAPL_VERSION.major))
stream.add(byte(JAPL_VERSION.minor))
stream.add(byte(JAPL_VERSION.patch))
stream.add(byte(len(JAPL_BRANCH)))
stream.extend(self.toBytes(JAPL_BRANCH))
if len(JAPL_COMMIT_HASH) != 40:
self.error("the commit hash must be exactly 40 characters long")
stream.extend(self.toBytes(JAPL_COMMIT_HASH))
stream.extend(self.toBytes(getTime().toUnixFloat().int()))
stream.extend(self.toBytes(computeSHA256(file)))
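# Header layout written above, with sizes inferred from the calls:
#
#   BYTECODE_MARKER | version: major, minor, patch (1 byte each) |
#   branch name length (1 byte) | branch name | commit hash (40 bytes) |
#   UNIX timestamp (8 bytes) | SHA256 of the source file (32 bytes)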
proc writeConstants(self: Serializer, stream: var seq[byte]) =
## Writes the constants table in-place into the given stream
for constant in self.chunk.consts:
case constant.kind:
of intExpr, floatExpr:
stream.add(0x1)
stream.extend(len(constant.token.lexeme).toTriple())
stream.extend(self.toBytes(constant.token.lexeme))
of strExpr:
stream.add(0x2)
var temp: byte
var strip: int = 2
var offset: int = 1
case constant.token.lexeme[0]:
of 'f':
strip = 3
inc(offset)
temp = 0x2
of 'b':
strip = 3
inc(offset)
temp = 0x1
else:
strip = 2
temp = 0x0
stream.extend((len(constant.token.lexeme) - strip).toTriple()) # Removes the quotes from the length count as they're not written
stream.add(temp)
stream.add(self.toBytes(constant.token.lexeme[offset..^2]))
of identExpr:
stream.add(0x0)
stream.extend(len(constant.token.lexeme).toTriple())
stream.add(self.toBytes(constant.token.lexeme))
else:
self.error(&"unknown constant kind in chunk table ({constant.kind})")
stream.add(0x59) # End marker
proc readConstants(self: Serializer, stream: seq[byte]): int =
## Reads the constant table from the given stream and
## adds each constant to the chunk object (note: most compile-time
## information such as the original token objects and line info is lost when
## serializing the data, so those fields are set to nil or some default
## value). Returns the number of bytes that were processed in the stream
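## Entry sizes mirror writeConstants: string entries take up size + 5
## bytes (kind byte, 3-byte length, modifier byte and contents), while
## integer, float and identifier entries take up size + 4 bytes each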
var stream = stream
var count: int = 0
while true:
case stream[0]:
of 0x59:
inc(count)
break
of 0x2:
stream = stream[1..^1]
let size = self.bytesToInt([stream[0], stream[1], stream[2]])
stream = stream[3..^1]
var s = newStrExpr(Token(lexeme: ""))
case stream[0]:
of 0x0:
discard
of 0x1:
s.token.lexeme.add("b")
of 0x2:
s.token.lexeme.add("f")
else:
self.error(&"unknown string modifier in chunk table (0x{stream[0].toHex()})")
stream = stream[1..^1]
s.token.lexeme.add("\"")
for i in countup(0, size - 1):
s.token.lexeme.add(cast[char](stream[i]))
s.token.lexeme.add("\"")
stream = stream[size..^1]
self.chunk.consts.add(s)
inc(count, size + 5)
of 0x1:
stream = stream[1..^1]
inc(count)
let size = self.bytesToInt([stream[0], stream[1], stream[2]])
stream = stream[3..^1]
inc(count, 3)
var tok: Token = new(Token)
tok.lexeme = self.bytesToString(stream[0..<size])
if "." in tok.lexeme:
tok.kind = Float
self.chunk.consts.add(newFloatExpr(tok))
else:
tok.kind = Integer
self.chunk.consts.add(newIntExpr(tok))
stream = stream[size..^1]
inc(count, size)
of 0x0:
stream = stream[1..^1]
let size = self.bytesToInt([stream[0], stream[1], stream[2]])
stream = stream[3..^1]
discard self.chunk.addConstant(newIdentExpr(Token(lexeme: self.bytesToString(stream[0..<size]))))
stream = stream[size..^1]
inc(count, size + 4)
else:
self.error(&"unknown constant kind in chunk table (0x{stream[0].toHex()})")
result = count
proc writeCode(self: Serializer, stream: var seq[byte]) =
## Writes the bytecode from the serializer's chunk in-place into
## the given stream, prefixed with its 3-byte length
stream.extend(self.chunk.code.len.toTriple())
stream.extend(self.chunk.code)
proc readCode(self: Serializer, stream: seq[byte]): int =
## Reads the bytecode from the given stream into the serializer's
## chunk and returns the number of code bytes that were read (the
## 3-byte length prefix is not counted)
let size = [stream[0], stream[1], stream[2]].fromTriple()
var stream = stream[3..^1]
for i in countup(0, int(size) - 1):
self.chunk.code.add(stream[i])
assert len(self.chunk.code) == int(size)
return int(size)
proc dumpBytes*(self: Serializer, chunk: Chunk, file, filename: string): seq[byte] =
## Dumps the given bytecode and file to a sequence of bytes and returns it.
## The file argument must be the actual file's content and is needed to compute its SHA256 hash.
self.file = file
self.filename = filename
self.chunk = chunk
self.writeHeaders(result, self.file)
self.writeConstants(result)
self.writeCode(result)
proc loadBytes*(self: Serializer, stream: seq[byte]): Serialized =
## Loads the result of dumpBytes into a Serialized object
## for use in the VM or for inspection
discard self.initSerializer()
new(result)
result.chunk = newChunk()
self.chunk = result.chunk
var stream = stream
try:
if stream[0..<len(BYTECODE_MARKER)] != self.toBytes(BYTECODE_MARKER):
self.error("malformed bytecode marker")
stream = stream[len(BYTECODE_MARKER)..^1]
result.japlVer = (major: int(stream[0]), minor: int(stream[1]), patch: int(stream[2]))
stream = stream[3..^1]
let branchLength = stream[0]
stream = stream[1..^1]
result.japlBranch = self.bytesToString(stream[0..<branchLength])
stream = stream[branchLength..^1]
result.commitHash = self.bytesToString(stream[0..<40]).toLowerAscii()
stream = stream[40..^1]
result.compileDate = self.bytesToInt([stream[0], stream[1], stream[2], stream[3], stream[4], stream[5], stream[6], stream[7]])
stream = stream[8..^1]
result.fileHash = self.bytesToString(stream[0..<32]).toHex().toLowerAscii()
stream = stream[32..^1]
stream = stream[self.readConstants(stream)..^1]
stream = stream[self.readCode(stream)..^1]
except IndexDefect:
self.error("truncated bytecode file")
except AssertionDefect:
self.error("corrupted bytecode file")
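# Minimal round-trip sketch: the chunk is left empty here, whereas in
# practice it comes straight from the compiler, and both the source snippet
# and the "example.jpl" file name are purely hypothetical.
when isMainModule:
    var serializer = initSerializer()
    let source = "print(1);"
    var chunk = newChunk()
    let raw = serializer.dumpBytes(chunk, source, "example.jpl")
    let loaded = serializer.loadBytes(raw)
    # The deserialized chunk must match what was dumped
    assert loaded.chunk.code == chunk.code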

186
src/main.nim Normal file

@ -0,0 +1,186 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## Test module to wire up JAPL components
import frontend/lexer
import frontend/parser
import frontend/optimizer
import frontend/compiler
import frontend/serializer
import util/debugger
import jale/editor
import jale/templates
import jale/plugin/defaults
import jale/plugin/editor_history
import jale/keycodes
import jale/multiline
import config
const debugLexer = false
const debugParser = false
const debugOptimizer = false
const debugCompiler = true
const debugSerializer = false
import strformat
import strutils
when debugSerializer:
import sequtils
import times
import nimSHA2
proc getLineEditor: LineEditor =
result = newLineEditor()
result.prompt = "=> "
result.populateDefaults() # Setup default keybindings
let hist = result.plugHistory() # Create history object
result.bindHistory(hist) # Set default history keybindings
proc main =
const filename = "test.jpl"
var source: string
var tokens: seq[Token]
var tree: seq[ASTNode]
var optimized: tuple[tree: seq[ASTNode], warnings: seq[Warning]]
var compiled: Chunk
when debugSerializer:
var serialized: Serialized
var serializedRaw: seq[byte]
var keep = true
var lexer = initLexer()
var parser = initParser()
var optimizer = initOptimizer(foldConstants=false)
var compiler = initCompiler()
when debugSerializer:
var serializer = initSerializer()
let lineEditor = getLineEditor()
lineEditor.bindEvent(jeQuit):
keep = false
lineEditor.bindKey("ctrl+a"):
lineEditor.content.home()
lineEditor.bindKey("ctrl+e"):
lineEditor.content.`end`()
echo JAPL_VERSION_STRING
while keep:
try:
stdout.write(">>> ")
source = lineEditor.read()
if source in ["# clear", "#clear"]:
echo "\x1Bc" & JAPL_VERSION_STRING
continue
elif source == "#exit" or source == "# exit":
echo "Goodbye!"
break
elif source == "":
continue
except IOError:
echo ""
break
try:
tokens = lexer.lex(source, filename)
when debugLexer:
echo "Tokenization step: "
for token in tokens:
echo "\t", token
echo ""
tree = parser.parse(tokens, filename)
when debugParser:
echo "Parsing step: "
for node in tree:
echo "\t", node
echo ""
optimized = optimizer.optimize(tree)
when debugOptimizer:
echo &"Optimization step (constant folding enabled: {optimizer.foldConstants}):"
for node in optimized.tree:
echo "\t", node
echo ""
stdout.write(&"Produced warnings: ")
if optimized.warnings.len() > 0:
echo ""
for warning in optimized.warnings:
echo "\t", warning
else:
stdout.write("No warnings produced\n")
echo ""
compiled = compiler.compile(optimized.tree, filename)
when debugCompiler:
echo "Compilation step:"
stdout.write("\t")
echo &"""Raw byte stream: [{compiled.code.join(", ")}]"""
echo "\nBytecode disassembler output below:\n"
disassembleChunk(compiled, filename)
echo ""
when debugSerializer:
serializedRaw = serializer.dumpBytes(compiled, source, filename)
echo "Serialization step: "
stdout.write("\t")
echo &"""Raw hex output: {serializedRaw.mapIt(toHex(it)).join("").toLowerAscii()}"""
echo ""
serialized = serializer.loadBytes(serializedRaw)
echo "Deserialization step:"
echo &"\t- File hash: {serialized.fileHash} (matches: {computeSHA256(source).toHex().toLowerAscii() == serialized.fileHash})"
echo &"\t- JAPL version: {serialized.japlVer.major}.{serialized.japlVer.minor}.{serialized.japlVer.patch} (commit {serialized.commitHash[0..8]} on branch {serialized.japlBranch})"
stdout.write("\t")
echo &"""- Compilation date & time: {fromUnix(serialized.compileDate).format("d/M/yyyy HH:mm:ss")}"""
stdout.write(&"\t- Reconstructed constants table: [")
for i, e in serialized.chunk.consts:
stdout.write(e)
if i < len(serialized.chunk.consts) - 1:
stdout.write(", ")
stdout.write("]\n")
stdout.write(&"\t- Reconstructed bytecode: [")
for i, e in serialized.chunk.code:
stdout.write($e)
if i < len(serialized.chunk.code) - 1:
stdout.write(", ")
stdout.write(&"] (matches: {serialized.chunk.code == compiled.code})\n")
except LexingError:
let lineNo = lexer.getLine()
let relPos = lexer.getRelPos(lineNo)
let line = lexer.getSource().splitLines()[lineNo - 1].strip()
echo getCurrentExceptionMsg()
echo &"Source line: {line}"
echo " ".repeat(relPos.start + len("Source line: ")) & "^".repeat(relPos.stop - relPos.start)
except ParseError:
let lineNo = parser.getCurrentToken().line
let relPos = lexer.getRelPos(lineNo)
let line = lexer.getSource().splitLines()[lineNo - 1].strip()
echo getCurrentExceptionMsg()
echo &"Source line: {line}"
echo " ".repeat(relPos.start + len("Source line: ")) & "^".repeat(relPos.stop - parser.getCurrentToken().lexeme.len())
except CompileError:
let lineNo = compiler.getCurrentNode().token.line
let relPos = lexer.getRelPos(lineNo)
let line = lexer.getSource().splitLines()[lineNo - 1].strip()
echo getCurrentExceptionMsg()
echo &"Source line: {line}"
echo " ".repeat(relPos.start + len("Source line: ")) & "^".repeat(relPos.stop - compiler.getCurrentNode().token.lexeme.len())
when isMainModule:
setControlCHook(proc {.noconv.} = quit(1))
main()

85
src/memory/allocator.nim Normal file

@ -0,0 +1,85 @@
# Copyright 2022 Mattia Giambirtone
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## Memory allocator from JAPL
import segfaults
import ../config
when DEBUG_TRACE_ALLOCATION:
import strformat
proc reallocate*(p: pointer, oldSize: int, newSize: int): pointer =
## Wrapper around realloc/dealloc
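## Behavior summary:
##   - newSize == 0 and p != nil -> the block is deallocated and nil is returned
##   - oldSize == 0              -> a fresh block of newSize bytes is allocated
##   - otherwise (p != nil)      -> the block is resized to newSize bytes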
try:
if newSize == 0 and p != nil:
when DEBUG_TRACE_ALLOCATION:
if oldSize > 1:
echo &"DEBUG - Memory manager: Deallocating {oldSize} bytes"
else:
echo "DEBUG - Memory manager: Deallocating 1 byte"
dealloc(p)
return nil
when DEBUG_TRACE_ALLOCATION:
if p == nil and newSize == 0:
echo &"DEBUG - Memory manager: Warning, asked to dealloc() nil pointer from {oldSize} to {newSize} bytes, ignoring request"
if oldSize > 0 and p != nil or oldSize == 0:
when DEBUG_TRACE_ALLOCATION:
if oldSize == 0:
if newSize > 1:
echo &"DEBUG - Memory manager: Allocating {newSize} bytes of memory"
else:
echo "DEBUG - Memory manager: Allocating 1 byte of memory"
else:
echo &"DEBUG - Memory manager: Resizing {oldSize} bytes of memory to {newSize} bytes"
result = realloc(p, newSize)
when DEBUG_TRACE_ALLOCATION:
if oldSize > 0 and p == nil:
echo &"DEBUG - Memory manager: Warning, asked to realloc() nil pointer from {oldSize} to {newSize} bytes, ignoring request"
except NilAccessDefect:
stderr.write("JAPL: could not manage memory, segmentation fault\n")
quit(139) # For now, there's not much we can do if we can't get the memory we need, so we exit
template resizeArray*(kind: untyped, pointr: pointer, oldCount, newCount: int): untyped =
## Handy macro (in the C sense of macro, not Nim's) to resize a dynamic array
cast[ptr UncheckedArray[kind]](reallocate(pointr, sizeof(kind) * oldCount, sizeof(kind) * newCount))
template freeArray*(kind: untyped, pointr: pointer, oldCount: int): untyped =
## Frees a dynamic array
reallocate(pointr, sizeof(kind) * oldCount, 0)
template free*(kind: untyped, pointr: pointer): untyped =
## Frees a pointer by reallocating its
## size to 0
reallocate(pointr, sizeof(kind), 0)
template growCapacity*(capacity: int): untyped =
## Handy macro used to calculate how much
## more memory is needed when reallocating
## dynamic arrays
if capacity < 8:
8
else:
capacity * ARRAY_GROW_FACTOR
template allocate*(castTo: untyped, sizeTo: untyped, count: int): untyped =
## Allocates enough memory for count objects of size sizeTo and casts the resulting pointer to the specified type
cast[ptr castTo](reallocate(nil, 0, sizeof(sizeTo) * count))
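# Illustrative sketch of how the templates above fit together when managing
# a growable array of integers; the variable names below are hypothetical.
when isMainModule:
    var capacity = growCapacity(0)   # below the minimum, so this yields 8
    var count = 0
    var data = allocate(UncheckedArray[int], int, capacity)
    data[][count] = 42
    inc(count)
    if count == capacity:
        # Once full, grow the array according to ARRAY_GROW_FACTOR
        let newCapacity = growCapacity(capacity)
        data = resizeArray(int, data, capacity, newCapacity)
        capacity = newCapacity
    discard freeArray(int, data, capacity)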

195
src/util/debugger.nim Normal file

@ -0,0 +1,195 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ../frontend/meta/bytecode
import ../frontend/meta/ast
import multibyte
import strformat
import strutils
import terminal
proc nl = stdout.write("\n")
proc printDebug(s: string, newline: bool = false) =
stdout.write(&"DEBUG - Disassembler -> {s}")
if newline:
nl()
proc printName(name: string, newline: bool = false) =
setForegroundColor(fgRed)
stdout.write(name)
setForegroundColor(fgGreen)
if newline:
nl()
proc printInstruction(instruction: OpCode, newline: bool = false) =
printDebug("Instruction: ")
printName($instruction)
if newline:
nl()
proc simpleInstruction(instruction: OpCode, offset: int): int =
printInstruction(instruction)
nl()
return offset + 1
proc stackTripleInstruction(instruction: OpCode, chunk: Chunk, offset: int): int =
## Debugs instructions that operate on a single value on the stack using a 24-bit operand
var slot = [chunk.code[offset + 1], chunk.code[offset + 2], chunk.code[offset + 3]].fromTriple()
printInstruction(instruction)
stdout.write(&", points to index ")
setForegroundColor(fgYellow)
stdout.write(&"{slot}")
nl()
return offset + 4
proc stackDoubleInstruction(instruction: OpCode, chunk: Chunk, offset: int): int =
## Debugs instructions that operate on a single value on the stack using a 16-bit operand
var slot = [chunk.code[offset + 1], chunk.code[offset + 2]].fromDouble()
printInstruction(instruction)
stdout.write(&", points to index ")
setForegroundColor(fgYellow)
stdout.write(&"{slot}")
nl()
return offset + 3
proc argumentDoubleInstruction(instruction: OpCode, chunk: Chunk, offset: int): int =
## Debugs instructions that operate on a hardcoded value on the stack using a 16-bit operand
var slot = [chunk.code[offset + 1], chunk.code[offset + 2]].fromDouble()
printInstruction(instruction)
stdout.write(&", has argument ")
setForegroundColor(fgYellow)
stdout.write(&"{slot}")
nl()
return offset + 3
proc constantInstruction(instruction: OpCode, chunk: Chunk, offset: int): int =
## Debugs instructions that operate on the constant table
var constant = [chunk.code[offset + 1], chunk.code[offset + 2], chunk.code[offset + 3]].fromTriple()
printInstruction(instruction)
stdout.write(&", points to constant at position ")
setForegroundColor(fgYellow)
stdout.write(&"{constant}")
nl()
let obj = chunk.consts[constant]
setForegroundColor(fgGreen)
printDebug("Operand: ")
setForegroundColor(fgYellow)
stdout.write(&"{obj}\n")
setForegroundColor(fgGreen)
printDebug("Value kind: ")
setForegroundColor(fgYellow)
stdout.write(&"{obj.kind}\n")
return offset + 4
proc jumpInstruction(instruction: OpCode, chunk: Chunk, offset: int): int =
## Debugs jumps
var jump: int
case instruction:
of JumpIfFalse, JumpIfTrue, JumpIfFalsePop, JumpForwards, JumpBackwards:
jump = [chunk.code[offset + 1], chunk.code[offset + 2]].fromDouble().int()
of LongJumpIfFalse, LongJumpIfTrue, LongJumpIfFalsePop, LongJumpForwards, LongJumpBackwards:
jump = [chunk.code[offset + 1], chunk.code[offset + 2], chunk.code[offset + 3]].fromTriple().int()
else:
discard # Unreachable
printInstruction(instruction, true)
printDebug("Jump size: ")
setForegroundColor(fgYellow)
stdout.write($jump)
nl()
if instruction in [LongJumpIfFalse, LongJumpIfTrue, LongJumpIfFalsePop, LongJumpForwards, LongJumpBackwards]:
return offset + 4   # Long jumps carry a 3-byte operand
return offset + 3
proc collectionInstruction(instruction: OpCode, chunk: Chunk, offset: int): int =
## Debugs instructions that push collection types on the stack
var elemCount = int([chunk.code[offset + 1], chunk.code[offset + 2], chunk.code[offset + 3]].fromTriple())
printInstruction(instruction, true)
case instruction:
of BuildList, BuildTuple, BuildSet:
var elements: seq[ASTNode] = @[]
for n in countup(0, elemCount - 1):
elements.add(chunk.consts[n])
printDebug("Elements: ")
setForegroundColor(fgYellow)
stdout.write(&"""[{elements.join(", ")}]""")
setForegroundColor(fgGreen)
of BuildDict:
var elements: seq[tuple[key: ASTNode, value: ASTNode]] = @[]
for n in countup(0, (elemCount - 1) * 2, 2):
elements.add((key: chunk.consts[n], value: chunk.consts[n + 1]))
printDebug("Elements: ")
setForegroundColor(fgYellow)
stdout.write(&"""[{elements.join(", ")}]""")
setForegroundColor(fgGreen)
else:
discard # Unreachable
echo ""
return offset + 4
proc disassembleInstruction*(chunk: Chunk, offset: int): int =
## Takes one bytecode instruction and prints it
setForegroundColor(fgGreen)
printDebug("Offset: ")
setForegroundColor(fgYellow)
echo offset
setForegroundColor(fgGreen)
printDebug("Line: ")
setForegroundColor(fgYellow)
stdout.write(&"{chunk.getLine(offset)}\n")
setForegroundColor(fgGreen)
var opcode = OpCode(chunk.code[offset])
case opcode:
of simpleInstructions:
result = simpleInstruction(opcode, offset)
of constantInstructions:
result = constantInstruction(opcode, chunk, offset)
of stackDoubleInstructions:
result = stackDoubleInstruction(opcode, chunk, offset)
of stackTripleInstructions:
result = stackTripleInstruction(opcode, chunk, offset)
of argumentDoubleInstructions:
result = argumentDoubleInstruction(opcode, chunk, offset)
of jumpInstructions:
result = jumpInstruction(opcode, chunk, offset)
of collectionInstructions:
result = collectionInstruction(opcode, chunk, offset)
else:
echo &"DEBUG - Unknown opcode {opcode} at index {offset}"
result = offset + 1
proc disassembleChunk*(chunk: Chunk, name: string) =
## Takes a chunk of bytecode, and prints it
echo &"==== JAPL Bytecode Debugger - Chunk '{name}' ====\n"
var index = 0
while index < chunk.code.len:
index = disassembleInstruction(chunk, index)
echo ""
setForegroundColor(fgDefault)
echo &"==== Debug session ended - Chunk '{name}' ===="

40
src/util/multibyte.nim Normal file

@ -0,0 +1,40 @@
# Copyright 2022 Mattia Giambirtone & All Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## Utilities to convert from/to our 16-bit and 24-bit representations
## of numbers
proc toDouble*(input: int | uint | uint16): array[2, uint8] =
## Converts an int (either int, uint or uint16)
## to an array[2, uint8]
result = cast[array[2, uint8]](uint16(input))
proc toTriple*(input: uint | int): array[3, uint8] =
## Converts an unsigned integer (int is converted
## to an uint and sign is lost!) to an array[3, uint8]
result = cast[array[3, uint8]](uint(input))
proc fromDouble*(input: array[2, uint8]): uint16 =
## Rebuilds the output of toDouble into
## an uint16
copyMem(result.addr, unsafeAddr(input), sizeof(uint16))
proc fromTriple*(input: array[3, uint8]): uint =
## Rebuilds the output of toTriple into
## an uint
copyMem(result.addr, unsafeAddr(input), sizeof(uint8) * 3)
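# Quick self-check of the round-trip behaviour: results are exact as long as
# the value fits in 16 bits (toDouble) or 24 bits (toTriple); like the procs
# above, this assumes a little-endian target.
when isMainModule:
    assert fromDouble(toDouble(512)) == 512'u16
    assert fromTriple(toTriple(70000)) == 70000'u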