home
menu
lock
Editing — Passerine
arrow_back
Back
add
Create a new page
delete
Delete this page
Editing
You must be authenticated for your changes to save. Styling with
Markdown
is supported.
Over the past few years or so, an idea for a novel programming language has been simmering in my mind. I've been calling this language Passerine, out of a lack of a better name. In short, Passerine is a small core language that is easily extensible, which employs clever memory-management and execution techniques. Passerine has roots in Scheme and ML-flavored languages — it arose from the slight dissatisfaction yet enormous potential I feel those languages have. So, how does Passerine work? At its core, Passerine is lambda-calculus with pattern-matching, structural types, fiber-based concurrency, and syntactic extension. At the core of Passerine lies the *pattern*, which when given an *object*, will de-structure it into a set of bindings. It's best to just show you an example. Here, we're de-structuring a tuple into a set of bindings: ```plain (x, y) = (1, (2, 3)) ``` In this case, `x` would be bound to `1`, and `y` would be bound to the tuple `(2, 3)`. The `=` sign in the above example denotes assignment, mimetic of other programming languages. To encapsulate computations, functions are used. Functions take *n* patterns and an expression whose evaluation is delayed. Upon calling a function, the arguments are matched against the parametric patterns, and the expression in then evaluated in a new *scope* containing the produced bindings. Here, we're defining a function that increments a number: ```plain n -> n + 1 ``` The `->` (read *lambda*) operator constructs a function, with the pattern being on the left, and the expression being on the right. Of course, the above function merely produces a function object. To evaluate a function, it must be called; to call a function, simply append the necessary arguments: ```plain (n -> n + 1) 3 ``` The value of this expression would be `4`. Parenthesis are used for grouping; otherwise, the semantics of the above expression might be ambiguous, or straight-up incorrect. Assignment, in combination with lambdas is a powerful thing. To name a function, simply assign it to an identifier. For example, here's a definition of Fibonacci number calculation through recursive means: ```plain fib = n -> if n <= 1 { 1 } else { fib (n - 1) + fib (n - 2) } ``` Curly-braces define a *block* which can be thought of as fancy grouping. Within a block, *n* expressions can be listed, separated by newlines or semicolons; the block takes on the value of the last expression. Although parenthesis would've sufficed in the above example, it's stylistically more coherent to use parenthesis for grouping, and blocks for regions of computation. So, for example, the block: ```plain { x = 1; y = 2 x + y } ``` Would have the value `3`. At this point, you may be thinking that the syntax being used is a bit obscure. After all, in how many other languages are all functions anonymous and defined with `->`? Rather than obscure, I like to think of this syntax as simple - we've already covered 3/5ths of the core language. So what remains? Just syntactic-extension and pattern-matching. To extent passerine's syntax, syntactic-extensions are used. Extensions, quite simply, are bits of code that hygienically produce more code when invoked at compile time. Extensions use a simple yet powerful templating language to transform code. Here's a simple example of a swap operator: ```plain syntax a '<-> b { tmp = a a = b b = tmp } x = 7 y = 3 x <-> y ``` Extensions are defined with the `syntax` keyword, followed by *n* arguments, followed by the code that the captured arguments will be spliced into. Note that the above code is completely hygienic. the expanded macro looks something like this: ```plain _tmp = x x = y x = _tmp ``` because `tmp` was not passed as a syntactic parameter, all uses of tmp in the function body are unique unrepresentable variables that do not collide with any other variables currently bound in scope. Essentially: ```plain tmp = 1 x = 2 y = 3 x <-> y ``` Will not affect the value of `tmp`; `tmp` will still be `1`. So, what does the syntax look like? We're talking about what goes in between: ```plain syntax ... {} ``` Each item between `syntax` and the body is referred to as a syntactic argument. arguments can be: - Literal *textual identifiers*, which prefixed with a quote (`'`). - Sub-syntactic arguments, followed by *modifiers*. - Patterns with support for sub-syntactic arguments. So, first off, what are textual identifiers? Identifiers are literal names that must be present for the pattern to match. Each syntactic extension is required to have at least one. For example, an extension that matches a *for loop*: ```plain syntax 'for binding 'in values do { ... } ``` In this case, `'for` and `'in` are textual identifiers. For example, this definition could be used as follows: ```plain for a in [1, 2, 3] { print a } ``` Here, `a` is the `binding`, `[1, 2, 3]` are the `values`, and `{ print a }` is the `do`. Operators can be defined this way as well: ```plain syntax sequence 'contains value { c = s -> { h:t = s if h == val { True } else { match t { [] -> False rest -> c rest } } } c sequence } ``` This defines a contains operator that could be used as follows: ```plain print { if [1, 2, 3] contains 2 { "It contains 2" } else { "It does not contain 2" } } ``` Evidently, `It contains 2` would be printed. What are modifiers? Modifiers are postfix symbols that allow for argument flexibility. The following modifiers exist: - Zero or more (`...`) - Optional (`?`) Additionally, parenthesis are used for grouping, and `{ ... }` are used to match expressions within blocks. Here are some syntactic arguments that match an `if else` statement like this one: ```plain if x == 0 { print "Zero" } else if x % 2 == 0 { print "Even" } else { print "Not even or zero" } ``` The syntactic arg. must match a beginning `if`: ```plain syntax 'if { ... } ``` Then, a condition: ```plain syntax 'if condition { ... } ``` Then, the first block: ```plain syntax 'if condition then { ... } ``` Next, we'll need a number of `else if
` statements: ```plain syntax 'if condition then ('else 'if other do)... { ... } ``` Followed by a required closing else (Passerine is expression oriented and type-checked, so a closing `else` ensures that an `if` expression always returns a value.): ```plain syntax 'if condition then ('else 'if other do)... 'else finally { ... } ``` Of course, if statements are already baked into the language - let's try something else - a `match` expression. A match expression takes a value and a number of functions, and tries to apply the value to each function until one successfully matches and runs. A match statement looks as follows: ```plain x = Some (1, 2, 3) match x { Some (m, x, b) -> m * x + b None -> 0 } ``` If no branches are matched, an error is raised. The syntactic arguments for this extension follows the following layout: ```plain syntax 'match value { arms... } { ... } ``` > ## Note: > At this point, I realized that full functional anonymity makes non-local returns really hard; Here's what I'm getting at: > say we have the following definition: > > ```plain > foo = a -> { > bar = n -> { > return n + 1 > } > bar a > } > ``` > > Now obviously, we'd expect the `return` in `bar` to return from `bar`, simple enough. Now look at the following example: > > ```plain > -- takes a list and position, > -- returns `Some index` if the value exists, > -- `None` otherwise. > find = list value -> { > for (index, v) in (enumerate list) { > if v == value { > return Some index > } > } > > return None > } > ``` > > Now this is where it gets tricky. `for` is a macro that construct an invisible function, so the above really looks more like the following: > > ```plain > find = list value -> { > _body = (index, v) -> if v == value { > return Some index > } > _loop = _remaining { > _maybe_item = (head _remaining) > _remaining = (tail _remaining) > if _item is Some _ { > Some _item = _maybe_item > body _item > _loop _remaining > } > } > } > ``` > > The main issue with this is that the return in the generated function `_body` is ambiguous - does it return from `_body`, or does it return from find? > > One way to get around this is by explicitly naming which function is being returned from; for instance: > > ```plain > find = list value -> { > for (index, v) in (enumerate list) { > if v == value { > find return Some index -- note func return body > } > } > > return None > } > ``` > > The issue with this is that it's slightly more verbose, and doesn't work for unnamed functions (unless you want to get all `super` `self`). Another method is just dropping the imperative style entirely; for instance: > > ```plain > find = list value -> { > let [head | tail] = list > if head == value { > Some 0 > } else { > match (find tail value) { > Some position -> Some (1 + position) > None -> None > } > } > } > ``` > > This, while pretty cool, is limiting. Writing functional code is nice and all, but imperative style code is easier to follow. If you think of a semantical solution that doesn't require any syntactic changes, please reach out :) main challenge - how to effectively implement returning and yielding? ```plain { for arm in arms { let result = try (value arm) if result is (Ok v) { } else if result is (Err (Match _)) { () } else { raise result } } } ```