Introducing Minerva

Posted September 15th, 2021

Introduction

For the past few months I have been working on a new language I'm calling Minerva. It's still in the early stages, but I think it's a good time for me to announce it publicly, to get some feedback and formulate my next steps.

Why make another programming language?

Because it's interesting and fun

It doesn't take much work to make something that can do a lot. But a lot of design decisions go into making a complete language.

It gives me a new appreciation for some programming languages.

Some things that seem trivial are actually very complicated. And some things that took a decade for some languages to implement took me a weekend.

To Remove GoTo ish constructs

In Dijkstra's paper Go To Statement Considered Harmful, Edgar Dijkstra argued that GoTo statements make it harder to reason about code, because they make it harder to bridge the gap between the static structure of the code and the dynamic structure of the execution. He advocated using proper control flow structures instead, so that the textual structure of the code matched the execution.

I think that there are still language constructs that are similar to GoTo statements. For example, return statements, exceptions, break statements, and continue statements. Code with any of these are harder to reason about, and they impact all of the surrounding code. They require the programmer to mentally simulate a computer, rather than glance at the code and tell what it's doing.

Provide Convenient Control Flow Structures

Given that I've removed some common control flow features, some things have to take their place. Some of these are already provided - such as if and switch in mainstream languages, and some patterns are not. I want to provide a few more common structures, as well as a way to create even more.

There is a way we talk about programs to people who use different languages or even non-programmers, and then there is how our code actually looks like. Some of this is a result of developer discipline, and some of this is to do with the affordances of a language.

I'm not saying that we should execute natural language. What I am saying is that I want a programming language with structures that correspond to the concepts programmers talk about in a way that is obvious at a glance.

The following stack exchange threads were motivations for this: https://stackoverflow.com/questions/4293744/useful-alternative-control-structures

https://softwareengineering.stackexchange.com/questions/22070/which-useful-alternative-control-structures-do-you-know

To give a few examples, in pseudocode a programmer might come up with something like this:

until (condition) {
	// do stuff
}

But when they actually write the code, they have to do this:

while (!condition) {
	// do stuff
}

This is a trivial example, but it's an extra step, and makes it harder to reason about your code when it's combined with other code. I think there are lots of situations like this, where there is a control flow structure that the programmer is imagining, but then they have to translate it to the few available control flow structures, often by adding extra variable to track state whilst doing the other operations that are unrelated to control flow. It's not quite as bad as using GoTos to write loops whilst writing code to calculate accounts, but it's the same principle. If it turns out that there are too many control flow structures to provide all of them from the language, then I will either need to provide macro support or maybe synctactic support for DSLs. E.g, in Kotlin unless would look like this:

fun unless(condition: Boolean, block: () -> Unit) {
	while (!condition) {
		block()
	}
}

unless (condition) {
	// do stuff
}

Be similar to mainstream languages

I'm responding to pain points I've encountered in mainstream languages that I think are tractable. It may be that the best programming language is fundamentally different to the current paradigm, or to any existing paradigm. But I think the problems I'm trying to solve can be greatly improved in current languages. I'd like to investigate that, and see if I have something mainstream languages can adopt.

Highlights

You can create variables

var message = "Hello world";

It has static type inference, so you don't need to write this:

var message: String = "Hello world";

Blocks are expressions, they evaluate to the last expression:

var message = {
	"Hello world";
};

There are if statements, and if expressions:

var age = 20;

if (age > 18) {
	print "You are an adult";
};

var ageCategory = 
	if (age < 18)
		"minor"
	else
		"adult";

You can also create while statements:

var iterations = 0;

while (iterations < 10) {
	print iterations;
	iterations = iterations + 1;
}

And functions:

function foo() => {
	print "Hello World!";
};

function double(number: Int) => number * 2;

var square(number: Int) => number * number;

function factorial(number: Int) : Int =>
	if (number == 0) 1
	else number * factorial(number-1);

foo(); // expect "Hello World!"
print double(4); // expect 8
print square(3); // expect 9
print factorial(5); // expect 120

And match expressions:

function fibonacci(number: Int): number => 
	match(number) {
		0 => 1;
		1 => 1;
		else => fibonacci(number-1) + fibonacci(number-2);
	};

You can also create classes:

// constructor parameters
// those prefixed with var are automatically set as fields
class Parent(var name: String, message: String) {
	constructor {
		print message;
	};
	
	function printName() => {
		print this.name;
	};
};

class Child(name, var age: Int) extends Parent(name) {
}

	function printInfo() {
		this.printName();
		print this.age;
	};
};

var child = Child("John", 25);
child.printInfo();

The type system can do other cool stuff:


var magic: Int | String | (Int)=>String = "hmm";

print typematch(magic) {
	Int => magic;
	String => magic;
	(Int)=> String => magic(3);
}

Notice that I have not used a single return statement. I think that return statements, break statements, and continue statements are go to like features, and evidence of not having the right control flow structures. So far, turning most statements into expressions has eliminated the need for return statements.

Next Steps

I'll start with my high level goals, and the things I would have to implement to get there. This will often depend on some other feature, so it will probably be implemented in reverse order:

Bootstrapping. I want to implement the compiler in itself. I think this would be a good way to make sure that the language adaquately meets its goals - better control flow strucures in an imperative framework.
Backends: To achieve the previous goal, I would have to create a more performant implementation. At the moment I only have a tree-walking interpreter. Writing a tree walking interpreter in a tree walking interpreter would be very slow and memory intensive. I also want to see if this hypothesis is true: creating more structured forms for control flow can be better optimised for things like cache friendliness. Since my main focus is the frontend, targeting other backends is not very urgent. That said, targeting other backends can make it more usable, and it's fun. I'd like to create the following backends.
- Pallas VM. Minerva is the Roman equivalent of the Greek Goddess Athena. Athena is often given the epithet "Pallas". Pallas is a custom Bytecode interpreter / VM I intend to write.
- WASM. This will probably have to wait until I've created my own VM, and WASM have come up with a good way to support garbage collected languages. Compiling to Web assembly could give this language a leg up in terms of usefulness.
- JavaScript
- LLVM
- JVM
Iterators and collections I'm still not quite sure what new syntactic control flow structures I will create, but I know that it will be syntactic sugar around iterators.
Interfaces / traits To create iterators, I will need an iterator interface to allow things like linked lists and array lists. To do this I need interfaces. I could also have traits, which are like interfaces but with default methods. This can also be structurally typed.
Generics What if I want to iterate over a list of strings, or cars? To make the same list interface implementation work, I will need generic types
Characters, pairs, hashmaps, other common datatypes. To write a lexer I need characters. Other parts of a compiler implementation will need hashmaps.
Figure out identity and equality - for hashmaps
Enums: Convenient for handling different tokens for the compiler
Standard library, for loading files and stuff
Way to type standard library functions
Module system,
Tuples / Records
Mutual Recursion - handle declarations in a separate pass
Better type inference - function parameters
Better Type Match Scoping Type matching for union types changes the type of the variable in each case block. E.g, if I have a variable called id that is union of string and int, then in the string case block the id variable will be of type string, not a union, so I can't change it after checking it. The variable is under superposition, until it's observed. This is fine for some cases, but want if I want to implement a linked list? In many linked list implementations, there is often a current node pointer that either is a linked list node or null, and is set to the next node until it is null. To handle this now, I need a recursive implementation. An alternate solution would allow me to bind the variable to a new name in each case scope, so I can still modify the original one.
Pratt parser? Might allow user to write macros?
Constants, useful language feature, helpful for optimisation
Optimisation Tail calls, bytecode, constant folding, etc
Packaging / distribution
Documentation
Langauge server for editors
Multiline comments
Arithmetic assignment operators
Create the control structures I'm not too sure what they are yet. I've mentioned until statements. Another common problem is doing something differently on the first or last iteration of a loop. Or executing a block of code if the loop never iterated.