Proto-typed (pt) is a compiled statically typed language with garbage collection and modern syntax.
As the name suggests, proto-typed was designed for writing prototypes and short simple programs in a compiled and typed manner.
This repository contains the proto-typed compiler (ptc) and the language definition in textual form.
The pt utility is a standalone program that can be used to compile and run proto-typed programs. Just like the language, it has a straightforward and simple design, but can be customized.
It has 3 commands:
- build - Compiles the program.
- run - Compiles the program and runs it.
- see - Compiles the program, runs it and deletes the binary.
Options for pt have to be specified before the command, options for ptc after the command, and arguments to the program after --.
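The ordering rule can be pictured as a small argument splitter. The Python sketch below only models the rule as stated here and is not how pt parses its arguments. The option names in the example call (--pt-opt, --ptc-opt) are placeholders rather than real flags, and treating the source file as part of the ptc arguments is an assumption.

```python
COMMANDS = {"build", "run", "see"}

def split_pt_args(argv):
    """Split arguments into (pt options, command, ptc arguments, program arguments)."""
    # Everything after the first "--" belongs to the compiled program.
    if "--" in argv:
        sep = argv.index("--")
        before, program_args = argv[:sep], argv[sep + 1:]
    else:
        before, program_args = argv, []
    # The first build/run/see token is the command; pt options precede it.
    for i, arg in enumerate(before):
        if arg in COMMANDS:
            return before[:i], arg, before[i + 1:], program_args
    raise ValueError("no command (build, run or see) given")

# --pt-opt and --ptc-opt are placeholder names, not real options.
parts = split_pt_args(["--pt-opt", "run", "hello.pt", "--ptc-opt", "--", "a", "b"])
print(parts)  # (['--pt-opt'], 'run', ['hello.pt', '--ptc-opt'], ['a', 'b'])
```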
- Create your pt program hello.pt:
print("Hello, World!\n")
- Run the program using pt:
pt run hello.pt
This will output:
Hello, World!
and create the binary hello.
Precompiled ptc releases should soon be available in the releases section on GitHub (https://github.com/mark-sed/proto-typed/releases).
ptc relies on LLVM and expects it to be installed. It can be compiled from source using CMake with the following steps.
- Clone the repository (or download it as a zip in GitHub):
git clone https://github.com/mark-sed/proto-typed.git
- Enter the repository and create a build directory:
cd proto-typed
mkdir build
- Run CMake for pt compiler:
cmake -S . -B build
cmake --build build --target ptc
- (Optional) Run CMake for pt utility:
cmake --build build --target pt
After running this, inside of build you should find the compiled proto-typed compiler named ptc and the compiled pt utility.
Note that pt is currently available only for Linux. To make ptc and pt accessible from anywhere and easy to use, you can run the install.sh script, which places all the needed files into the expected locations:
- (Optional) Setting up pt path and libs:
sudo bash install.sh
Proto-typed (pt) is a compiled statically typed language with garbage collection and modern syntax, but it offers a lot more. Here is a list of some of the key proto-typed features:
- Compiled.
- Statically typed.
- Support for modules and simple compilation of multiple modules (only main module path needs to be specified, rest will be found by the compiler).
- Automatic memory management (garbage collection).
- Dynamic any type.
- Maybe types (hold value of specified type or none).
- Function overloading.
- No need for forward declaration of functions.
- Implicit main function generation (code similar to interpreted scripts).
- C library functions and simple invocation of C functions (not yet implemented).
Before going through the language features in depth, here are some simple code examples to showcase the syntax.
print("Hello, World!\n")

int fib(int n) {
    if(n < 2) return n
    return fib(n - 1) + fib(n - 2)
}
int index = 42
print("Fibonacci for index "++index++" is "++fib(index))

Proto-typed offers simple types and user defined structures. These types can then be constructed into arrays or maybe versions.
All variables are initialized to their default value if not initialized explicitly.
There are in fact 2 groups of types: primitive types, which do not require runtime initialization (int, float, bool and the value none), and composite types, which require runtime initialization (string, maybe types including any, arrays and structs).
int is a signed whole number represented by 64 bits. Such a number can be written in decimal, binary (prefix 0b or 0B), octal (prefix 0q or 0Q) or hexadecimal (prefix 0x or 0X) format.
int a_dec = 42 // Decimal
int a_bin = 0b101010 // Binary
int a_oct = 0q52 // Octal
int a_hex = 0x2A // Hexadecimal
float is a floating point number represented by 64 bits. It can also be written using scientific notation.
float not_pi = 3.14159265
float avogadro = 6.02214076e23
float rnd = -22e-8
Note that for the purpose of correct range parsing, a float cannot start or end with a bare dot: you cannot write 2., but have to write 2.0, and likewise .5 has to be written as 0.5.
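The literal rules for int and float can be summarized in a short Python model. This is only a sketch of the rules as described here, not the lexer used by ptc; the function names are made up for the illustration, and treating a leading minus as a separate unary operator (so it is not part of the literal) is an assumption.

```python
import re

# Prefix -> base, as listed above (both letter cases are accepted).
INT_PREFIXES = {"0b": 2, "0q": 8, "0x": 16}

# Digits are required on both sides of the dot; the exponent is optional.
FLOAT_RE = re.compile(r"\d+(?:\.\d+(?:[eE][+-]?\d+)?|[eE][+-]?\d+)")

def parse_pt_int(literal):
    """Return the value of a pt-style integer literal such as 0q52."""
    prefix = literal[:2].lower()
    if prefix in INT_PREFIXES:
        return int(literal[2:], INT_PREFIXES[prefix])
    return int(literal, 10)

def is_pt_float(literal):
    """Check a float literal: 2.0 and 22e-8 pass, 2. and .5 do not."""
    return FLOAT_RE.fullmatch(literal) is not None

print(parse_pt_int("0b101010"), parse_pt_int("0q52"), parse_pt_int("0x2A"))  # 42 42 42
print(is_pt_float("2.0"), is_pt_float("2."), is_pt_float(".5"))  # True False False
```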
bool is a boolean value capable of containing either true or false.
bool slow // false (default value)
bool compiled = true
string is a dynamic Unicode array of characters. Strings support escape sequences.
string boring_text = "sushi\n\tramen."
string cool_text = "寿司\n\t拉麺"
string cooler_text = "🍣\n\t🍜."
string pozdrav = "Ahoj, jak se máš?"
Strings can also be written as "raw strings" or "rstrings", where no escape sequences are parsed. Such strings have to be prefixed with r:
string raw_text = r"New line is written as \n and \ has to be escaped by another \ in normal strings."
Characters can also be encoded using escape sequences, written in octal (prefix \q or \Q) or hexadecimal (prefix \x or \X) with the character code in braces {} after the prefix:
string space = "\x{20}"
string space8 = "\Q{40}"
Unicode characters can also be escaped using the \U or \u prefix followed by the character's hexadecimal code in braces.
string potassium_source = "\U{0001f34c}"
Strings are represented as simple objects and therefore their size can be determined in constant time. At the same time, a string's character buffer is zero terminated, making it compatible with many C functions.
Strings can also be sliced. Slicing has the form [start..end] or [start,next..end] (a range inside of []), where next - start gives the step:
string dna_sequence = "a-c-c-g-t-a-t-g"
string amino_acids = dna_sequence[0,2..length(dna_sequence)]
A slice can also use a descending range and therefore reverse a string. Keep in mind that end is not included in the range, but start is, so some adjustment is needed:
string dna_sequence = "a-c-c-g-t-a-t-g"
int l = length(dna_sequence) - 1
string reversed_aas = dna_sequence[l, l - 2 .. -1]
Structs can hold variables of any type, but cannot contain function definitions within their scope. When referring to a struct type, only the name is used (without the struct keyword).
struct Pair {
    int first
    int second
}
Pair p
p.first = foo()
p.second = bar()
print(p.first++", "++p.second)
Structs can also have default initializers for their elements:
struct Player {
    string name = "Unknown"
    float strength = 10.5
    int x = 350
    int y = 200
}
Arrays in pt are dynamic (you can think of them as vectors in C++).
An array type is defined by putting [] after the type name:
int[] values
for (int i : 0..20) {
    append(values, i)
}
for (int c: values) {
    print(c++" ")
}
Multidimensional arrays (matrices) work in the same way:
int[][] pos = [
[0, 1, 2, 8],
[9, 0, 8, 3],
[2, 3, 5, 6],
[0, 1, 1, 4]
]
string[][] tt = [
["x", "o", "x"],
["x", "x", "o"],
["o", "x", "o"]
]
string center = tt[1][1]
An array is also the result of slicing an array. Slicing has the form [start..end] or [start,next..end] (a range inside of []):
float[] x = [0.2, 1.3, 4.5, 5.0, 0.0, 9.9, 7.1, 1.0]
for (float i : x[1,3..6]) {
    print(i++" ") // 1.3 5.0 9.9
}
float[] y = x[0..2] // [0.2, 1.3]
A maybe value can either hold a value of its base type or none.
int? x // none by default
x = -7
Every maybe value is passed to a function by reference and can therefore be used to modify input arguments:
void pow2(int? x) {
    x = x * x
}
int v = 5
pow2(v)
print(v++"\n")
But don't think of maybe values as pointers or references, since any function taking a maybe value can accept any base value, including a constant:
void pow2(int? x) {
    x = x * x
}
pow2(42)
There is still a difference between passing a maybe value and an actual value to a function. Maybe to maybe assignment works only for an actual maybe value passed in. In the following example, the assignment in function getK assigns an address to the passed in argument. When the passed in value is not a maybe type, what is modified is the address held by the parameter (x), not the address of the passed in variable (k):
void getK(int? x) {
    any KVAL = 7542
    x = KVAL
}
int? maybe_k
int k
getK(maybe_k)
getK(k)
print(maybe_k++" "++k) // 7542 0
When assigning a maybe value to another maybe value, both of these will contain the same address and therefore the same value:
string? a = "hi"
string? b
b = a
b = "bye"
print(a) // bye
The any type (not surprisingly) can hold any value. But what it holds is known only to the user, not to the compiler, and therefore it is easy to get a runtime error. To extract a value, it has to be assigned or implicitly cast.
any x = 42
// code ...
x = "forty two"
string y = x
print(y)
The any type has to be internally represented as a maybe type (to be able to hold none), which might cause some problems with maybe to maybe assignment and memory sharing:
any x = 32
int? y
y = x
x = "Ca vas pas"
print(y++"\n") // Incorrect value - string read as int
The any type, unlike a maybe type, is checked as an address and not a value (as it is known only to the user what type is stored there).
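The address sharing between maybe and any values described above can be pictured with a toy model. The Python sketch below is an assumption-laden illustration, not the compiler's actual representation: Cell and Maybe are made-up names, a maybe variable is modelled as a holder of one cell address, and passing a plain variable to an int? parameter is modelled as a temporary maybe pointing at that variable's storage.

```python
class Cell:
    """A storage slot holding one value."""
    def __init__(self, value):
        self.value = value

class Maybe:
    """Toy model of a maybe variable: it stores the address of a cell, or none."""
    def __init__(self, cell=None):
        self.cell = cell

    def get(self):
        return None if self.cell is None else self.cell.value

    def assign_value(self, value):
        # Assigning a base value writes through the stored address.
        if self.cell is None:
            self.cell = Cell(value)
        else:
            self.cell.value = value

    def assign_maybe(self, other):
        # Maybe to maybe assignment copies the address, not the value.
        self.cell = other.cell

a = Maybe(Cell("hi"))
b = Maybe()
b.assign_maybe(a)       # b = a
b.assign_value("bye")   # b = "bye"
print(a.get())          # bye

k = Cell(0)             # storage of a plain `int k`
x = Maybe(k)            # parameter created when k is passed as int?
x.assign_maybe(Maybe(Cell(7542)))  # x = KVAL only rebinds the parameter
print(k.value)          # 0
```

In this model, writing a base value through a shared address changes what every holder of that address sees, while a maybe to maybe assignment only changes which address one holder points to.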
Function type is always a maybe type (the ? is implicit, just like with any) and therefore it can hold none.
Function type has the following syntax:
<return type>(<optional list of argument types>) <variable name>
For example:
bool isBig(int? a, bool warn) {
    if(a != none) {
        return a > 100
    }
    if(warn) print("is none")
    return false
}
bool(int?, bool) funIsBig
funIsBig = isBig
print(funIsBig(4, false) as string)
A function type can also be taken as an argument:
void err_print(string s) {
    print("Error: "++s++"\n")
}
void report(void(string) rep_fun, string msg) {
    rep_fun(msg)
}
report(err_print, "Oops!")
report(print, "Never mind!")
Proto-typed aims for a simple syntax that makes code easy to read and write, as its main purpose is prototyping and small programs.
The end of a statement in pt can be marked with (1) a new line (\n), (2) a semicolon (;), (3) the end of certain constructs or (4) the end of the file.
Example (1):
int a = 42
int b = 0
which is equivalent to (2, 4):
int a = 42; int b = 0
In case of constructs ending with } (3), there is no need for ; or \n:
if(a) {
    c = 42;
} print("hello")
Proto-typed supports C-style one line comments (//) and multiline comments (/**/):
// Single line comments
/*
Multiline
comment
*/
Importing a module is done using the import keyword followed by the module's name (without its extension) or a comma separated list of module names:
import bigmath
import window, controller, handler
Every function, global variable and type (struct) can be accessed from another module.
To access a symbol from an external module, the symbol always has to be prefixed by the imported module's name and the scope operator (::).
File mod2.pt:
string NAME = "mod2"
int get_key() {
    return 42
}
File main.pt:
import mod2
print("Key for module " ++ mod2::NAME ++ " is: " ++ mod2::get_key())
Functions use the C-style syntax of return type followed by the name, arguments and then the body:
int foo(float a, bool b) {
    int r
    // Code ...
    return r
}
Proto-typed supports function overloading (multiple functions with the same name, but different arguments):
void add(int a, int b) {
    print((a + b)++"\n")
}
void add(string a, string b) {
    print(a ++ b ++"\n")
}
add(4, 8) // 12
add("h", "i") // hi
The expression of an if statement has to be a boolean value (int or maybe values won't be implicitly converted); the same is true for while and do-while. If follows the C-style syntax as well:
if (a == 0) {
    foo(c)
} else if (a == 1) {
    bar(c)
} else {
    baz(c)
}
If does not require {} when it is followed by only one statement. Keep in mind, though, that in proto-typed a new line terminates a statement and therefore, unlike C, an arbitrary number of new lines after if is not allowed. Fortunately, proto-typed will emit an error in such a case.
if (a)
    print("Hi") // Syntax error
else print("Bye") // Syntax error
if (a) print("Hi") // Correct
else print("Bye") // Correct
Also keep in mind that the statement following if has to be terminated as well, either by a new line or a semicolon. Therefore, when writing a one-line if-else, one must use a semicolon (or {}) after each statement:
if (a) print("Hi") else print("Bye") // Syntax error
if (a) print("Hi"); else print("Bye") // Correct
While and do-while loops have the following syntax:
while (a < 10) {
    // code
    a += 1
}
do {
    // code
    a += 1
} while (a < 10)
For works more like a for each loop, iterating over ranges, arrays or strings:
float[] values
// init values
for (float i : values) {
    print(i++"\n")
}
string text = "Some text"
for (string letter: text) {
    print(letter++" ")
}
Proto-typed also offers a special range type (see more below), which can be used for counted for loops:
for (int i : 0..5) {
    print(i++" ") // 0 1 2 3 4
}
for (int j : 0,2..5) {
    print(j++" ") // 0 2 4
}
string text = "Some more text"
for (int k : 0..length(text)) {
    print(k++": "++text[k]++"\n")
}
The following table lists pt operators from the highest precedence to the lowest.
| Operator | Description | Associativity |
|---|---|---|
| :: | Module scope | none |
| (), [], [..] | Function call, array indexing, slicing | none |
| as | Type casting | left |
| . | Structure member access | left |
| not, ~ | Logical NOT, bitwise NOT | right |
| ** | Exponentiation (returns float) | right |
| *, /, % | Multiplication, division, remainder | left |
| +, - | Addition/array join, subtraction | left |
| <<, >> | Bitwise left shift, bitwise right shift | left |
| .. | Range | left |
| >, >=, <, <= | Bigger than, bigger than or equal, less than, less than or equal | left |
| ==, != | Equality, inequality | left |
| & | Bitwise AND | left |
| ^ | Bitwise XOR | left |
| \| | Bitwise OR | left |
| in | Membership | left |
| and | Logical AND | left |
| or | Logical OR | left |
| ++ | Concatenation | left |
| =, ++=, **=, +=, -=, /=, *=, %=, &=, \|=, ^=, ~=, <<=, >>= | Assignment, compound assignments | left |
Proto-typed does not provide some of the higher level constructs such as classes and objects, but at the same time tries to provide abstractions for simple and quick coding. Examples of such abstractions are memory management handled by the garbage collector and the absence of a pointer type.
The main function is implicitly generated by the compiler and its job is to initialize modules and execute the entry function of the current module.
This means that a pt module does not contain any main function; instead, the global scope acts as this function. Beware that all the functions and variables declared there are still global and accessible from other modules.
The entry function - _entry - contains all the global scope code (statements that are not declarations).
The entry function is called only for the main module, so if it needs to be executed for an imported module, it has to be called explicitly.
import mod2
mod2::_entry()
A module can also call its own entry function.
Casting can be done only to a non-maybe type, unless the cast value is of type any. The reason for this is that a maybe is dynamically allocated memory, and casting (reading) this memory as a different type (size) does not make sense. For any, this cast only reads the value.
void foo(string? c) {
    c = "changed"
}
int? a = 4
foo(a as string?) // Error
// This does not make sense as a would be
// modified to contain string
string? str_a
str_a = a as string
foo(str_a) // Works
If you really wish to play god and treat memory as a different type, you can utilize the any type for this:
any ivalue = 1
float? fvalue
fvalue = ivalue as float?
Calls to the standard library do not require any module name prefix and any of the functions can be overridden by custom definitions.
- length - String length.
  - int length(string s) - Returns the length of the string s.
- to_string - Conversion of types to string. This is used by the as operator.
  - string to_string(int i) - Converts int i to string.
  - string to_string(float f) - Converts float f to string.
  - string to_string(bool b) - Converts bool b to string.
- mto_string - Conversion of maybe types to string (for none will return "none"). This is used by the as operator.
  - string to_string(int? i) - Converts maybe int i to string.
  - string to_string(float? f) - Converts maybe float f to string.
  - string to_string(bool? b) - Converts maybe bool b to string.
  - string to_string(string? s) - Converts maybe string s to string. This is useful only for printing, as none will be indistinguishable from the string "none".
- From string - Conversion from string to other types. This is used by the as operator, but as expects a valid value (will ignore none and crash).
  - int? to_int(string str) - Converts string number in base 10 and returns it as int or none if it is not an integer.
  - int? to_int_base(string str, int base) - Converts string number in base base and returns it as int or none if it is not an integer in the given base.
  - float? to_float(string str) - Converts string float and returns it as a float or none if it is not a float.
- find - Find substring.
  - int find(string s1, string s2) - Returns first index of substring s2 in string s1 or -1 if not found.
- contains - Check if string contains substring (equivalent to s2 in s1).
  - bool contains(string s1, string s2) - Returns true if string s1 contains string s2, false otherwise.
- reverse - String in reverse.
  - string reverse(string s) - Returns copy of string s reversed.
- slice - Slices a string (equivalent to s[start, next..end]). Range can also be descending for a reversed string.
  - string slice(string s, int start, int end) - Slices string from index start, with step 1 or -1 until index end.
  - string slice(string s, int start, int next, int end) - Slices string from index start, with step next - start until index end.
- Case conversion - Conversion to uppercase or lowercase.
  - string upper(string s) - Returns copy of string s in uppercase.
  - string lower(string s) - Returns copy of string s in lowercase.
- ord - Converts letter of string to its integer value.
  - int ord(string s) - Converts first letter of s to its integer value.
- chr - Converts integer value to corresponding letter.
  - string chr(int i) - Converts integer i into corresponding letter and returns it as a string.
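The slice functions above take the same start, next and end values as the [start,next..end] syntax. As a rough model of which indices such a range selects, here is a Python sketch; pt_range and pt_slice are names invented for this illustration, and the behaviour is inferred from the descriptions here (end excluded, step next - start, default step 1 or -1), not taken from the library source.

```python
def pt_range(start, end, nxt=None):
    """Indices of start..end or start,nxt..end; end is excluded."""
    if nxt is None:
        # Without `next`, the step is 1 for ascending and -1 for descending ranges.
        step = 1 if end >= start else -1
    else:
        step = nxt - start
    if step == 0:
        raise ValueError("next must differ from start")
    return list(range(start, end, step))

def pt_slice(seq, start, end, nxt=None):
    """Pick the elements of a string or list at the indices of the range."""
    picked = [seq[i] for i in pt_range(start, end, nxt)]
    return "".join(picked) if isinstance(seq, str) else picked

dna = "a-c-c-g-t-a-t-g"
print(pt_slice(dna, 0, len(dna), nxt=2))      # accgtatg
last = len(dna) - 1
print(pt_slice(dna, last, -1, nxt=last - 2))  # gtatgcca
```

The second call shows why a descending slice that should reach index 0 has to use -1 as its end: the end index itself is never included.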
These functions are templated for any array (matrix) type, the type T stands for this general array type. Type TBase stands for the base type of T (e.g.: T might be int[][] and then TBase would be int[]). Type TElem is the base non-array type of T (e.g.: T might be int[][] and then TElem would be int).
- append - Append to an array.
  - void append(T a, TBase v) - Append v at the end of array a.
  - void mappend(T a, TBase? v) - Append maybe value v into array a.
- insert - Insert to an array.
  - void insert(T a, TBase v, int index) - Insert v at index index (from 0 to length(a)) into array a.
  - void minsert(T a, TBase? v, int index) - Insert maybe value v at index index (from 0 to length(a)) into array a.
- remove - Remove value from an array.
  - void remove(T a, int index) - Removes value of the array a at index index.
- length - Array length.
  - int length(T a) - Returns the length of the array a.
- equals - Array equality (equivalent to a1 == a2).
  - bool equals(T a1, T a2) - Returns true if all values in a1 are equal to those in a2.
- find - Find value in an array.
  - int find(T a, TElem e) - Returns index of value e in array a or -1 if it does not exist there.
- contains - Check if value is present in an array (equivalent to e in a).
  - bool contains(T a, TElem e) - Returns true if value e is in array a, false otherwise.
- reverse - Array in reverse.
  - T reverse(T a) - Returns copy of array a reversed.
- slice - Slices an array (equivalent to a[start, next..end]). Range can also be descending for a reversed array.
  - T slice(T a, int start, int end) - Slices array from index start, with step 1 or -1 until index end.
  - T slice(T a, int start, int next, int end) - Slices array from index start, with step next - start until index end.
- sort - Sorts an array.
  - void sort(T a, bool(TElem, TElem) cmp) - Sorts array a using comparison function cmp.
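The sort function takes a comparison function that returns a bool rather than a three-way result. How such a comparator can drive a sort is sketched below in Python. The text does not state what cmp has to return, so this sketch assumes that cmp(x, y) is true when x should be placed before y; pt_sort and ascending are hypothetical names used only here.

```python
from functools import cmp_to_key

def pt_sort(values, cmp):
    """Sort a list in place with a boolean comparator, like sort(a, cmp)."""
    def three_way(x, y):
        # Assumption: cmp(x, y) is true when x has to be placed before y.
        if cmp(x, y):
            return -1
        if cmp(y, x):
            return 1
        return 0
    values.sort(key=cmp_to_key(three_way))

def ascending(a, b):
    return a < b

data = [9, 0, 8, 3]
pt_sort(data, ascending)
print(data)  # [0, 3, 8, 9]
```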
These functions are templated and type S represents a generic struct type in this case.
- equals - Struct equality (equivalent to s1 == s2).
  - bool equals(S s1, S s2) - Returns true if values of all elements in s1 are equal to those in s2.
- Math constants
  - M_PI - Ludolph's number.
  - M_E - Euler's number.
  - M_PHI - Golden ratio ((1 + 5^0.5)/2).
  - M_EGAMMA - Euler–Mascheroni constant.
- Trigonometric functions
  - float sin(float x) - Sine of x.
  - float cos(float x) - Cosine of x.
  - float tan(float x) - Tangent of x.
- abs - Absolute value.
  - int abs(int x)
  - float abs(float x)
- sum - Sum of all values in an array.
  - int sum(int[] arr)
  - float sum(float[] arr)
- int gcd(int a, int b) - Greatest common divisor of a and b.
- int lcm(int a, int b) - Least common multiple of a and b.
- Logarithm - Computes logarithm.
  - float ln(float x) - Natural (base e) logarithm of x.
  - float log10(float x) - Common (base 10) logarithm of x.
- int system(string cmd) - Calls host environment command processor with cmd. Return value is an implementation-defined value.
- string getenv(string name) - Returns value of environment variable.
- bool setenv(string name, string value, bool overwrite) - Sets environment variable to passed in value.
- Pseudo-random number generation - Functions for pseudo-random number generation. Seed is initialized implicitly.
  - void set_seed(int) - Sets seed for generator.
  - int rand_int(int min, int max) - Random integer number between min and max (inclusive).
  - float rand_float(float min, float max) - Random float number between min and max (inclusive).
  - float rand_float() - Random float between 0.0 and 1.0 (inclusive).
  - bool rand_bool() - Random boolean.
  - int rand_uint() - Returns random unsigned integer.
- int timestamp() - Current time since epoch (timestamp).
IO works with the built-in structure File.
- File open(string path, string mode) - Opens a file and returns File structure handle. path is the relative or absolute path to the file and mode is one of: "r" (read), "w" (write), "a" (append), "r+" (read/update), "w+" (write/update), "a+" (append/update). For binary files suffix the mode with b ("rb"). On failure File.handle will be 0.
- bool close(File f) - Closes opened file. On success returns true, false otherwise.
- Reading whole input
  - string read(File f) - Reads the whole file f and returns it as a string.
- Reading one character
  - string getc(File f) - Reads one character from a file and returns it as a string (the string is empty if EOF is reached).
  - string getc() - Reads one character from stdin and returns it as a string (the string is empty if EOF is reached).
- Input
  - string input() - Reads one line from stdin.
  - string input(string prompt) - Prints out prompt and then reads input from stdin.
- Reading command line arguments
  - string[] args - This is a global variable, not a function. It contains all the command line arguments passed to the program.
The proto-typed compiler (ptc) uses LLVM and can target any of the many targets LLVM can compile for. ptc also relies on LibC.
If you encounter any issues not mentioned here, you can report them in the GitHub issues section.
Because some types require runtime initialization, only values of types that do not require it can be assigned to a global variable in its declaration. A workaround is to assign the value after the declaration:
int a = 4
int[] arr_wrong = [a, a+1, a+2] // Won't currently work
int[] arr_corr
arr_corr = [a, a+1, a+2] // Will work