Introduction
This is a new small series I wanted to write for a while. Next to machine code programming on old gaming consoles, I’ve been spending a while now on learning Haskell. Why Haskell?
While I did start programming in C/C++ and assembly, I’ve never ventured outside of the classic “procedural OOP dogma” most textbooks teach today. Learning the basics of functional programming with Haskell was an eye-opener. I seriously no longer understand how I could never live without pattern matching, function mapping, and list folding. Writing a loop to modify an array feels almost obscene once you’ve seen the elegant and simple beauty of function mapping.
Haskell is just a polymorphic lambda calculus with lazy evaluation and algebraic data types, what’s the deal?
Even learning about lambda calculus on its own gave me a lot of insight into how programming languages work. So if you haven’t ever considered giving functional programming a try, stop reading this and go straight to Learn You a Haskell For Great Good! - a great, easy-to-follow, and free introduction to Haskell. Reading the first three or four chapters should give you enough understanding of Haskell to follow the code in this post (except for the input/output part, that’s a bit weird in Haskell; but even without the details it should be fairly obvious what each function does).
So, the goal here is to write a simple Interpreter/REPL for GEORGE.
What is GEORGE?
GEORGE is the greatest programming language you have never heard of. It was developed by Australian philosopher and logician Charles Leonard Hamblin in 1957 for use on the DEUCE, an early British computer. He’s credited with introducing Reverse Polish Notation into computer science. You can read his original article here. It defines a complete programming language in only 12 pages! You should definitely read it.
But why choose GEORGE? One, I like the name. Two, it reminds me of Forth a lot, another fun programming language that uses RPN. Three, I really could find an implementation for GEORGE anywhere so maybe I’ll claim a first here :)
I’m not going to go into GEORGE itself here. This will be done in a future post. Today, I just want to write a “quick and dirty” interpreter for GEORGE. This will be the base for a more complex solution later on.
Haskell lends itself pretty good to writing interpreters and compilers, so without further ado, here we go.
A Simple REPL
Let’s first write a simple loop that will take an input string and print it unmodified back to the user. This is easily done in Haskell, I shamelessly took the code from this excellent respository by Joel Chelliah on how to write a REPL:
module Main where
import Control.Monad (unless)
import System.IO
import Resolver
-- Simple REPL Loop
main :: IO ()
main = do
input <- read'
unless (input == ":quit")
$ print' (eval' input) >> main
-- Read the input from console
read' :: IO String
read' = putStr "GEORGE> "
>> hFlush stdout
>> getLine
-- Evaluate the given input; does nothing...yet
eval' :: String -> String
eval' input = input
-- Print the evaluated string
print' :: String -> IO ()
print' = putStrLn
Nothing really surprising here, it’s expertly explained in the repository I linked above. Please check it out if you’re struggling to understand this code.
Running this in your console yields (don’t forget to ghc Main.hs Resolver.hs
):
GEORGE> Hello, George!
Hello, George!
GEORGE>
Now, how to evaluate the input string? Since we’re going to evaluate the input string from left to right, we can use one of Haskell’s most powerful facilities, folding lists. If we interpret our input string as a list of commands, we simply execute all “commands” from left to right and return the result.
Here’s the code, which is basically a simple RPN interpreter (there’s a chapter on Learn You a Haskell that explains a similar solution):
module Resolver where
-- fold the stack
rpn :: ([Double] -> [String] -> [Double])
rpn = foldl workStack
-- read and resolve string
resolve :: String -> Double
resolve = head . rpn [] . words
-- manipulate stack
workStack :: [Double] -> String -> [Double]
workStack (top:next:xs) "+" = (top + next):xs
workStack (top:next:xs) "-" = (next - top):xs
workStack (top:next:xs) "*" = (top * next):xs
workStack (top:next:xs) "/" = (next / top):xs
workStack (top:next:xs) "^" = (next ** top):xs
workStack xs n = (read n::Double):xs
Again, this is as simple as it gets. resolve
first uses words
to split the input string at whitespaces. Then, rpn
uses workStack
as the folding function with foldl
over the split input string. If everything goes right, we’re left with a simple double list with only one element, a.k.a, the result. head
extracts that result for us.
(The name workStack
is a bit misleading at this point, but I plan to add a “real” stack to this interpreter like the parameter stack in Forth.)
Now, we only need to update eval'
function we wrote earlier to make this work:
-- Evaluate the given input
eval' :: String -> String
eval' input = (show . resolve) input
Purists, of course, will do an eta-reduction here:
-- Evaluate the given input
eval' :: String -> String
eval' = show . resolve
Much better.
And again testing in the console:
GEORGE> 5 5 +
10.0
GEORGE> 4 4 + 2 /
4.0
GEORGE> 10 2 ^ 2 /
50.0
See, a simple RPN resolver that can easily be extended with pattern matching! Adding functions like sqrt
or sin
is trivial.
Problems and Next Steps
Now, as I said, this is quick and dirty. There are a lot of problems with this. For example, if I give it more operands than operators, the result will be “wrong”, inasmuch that there’s still unevaluated numbers on the stack:
GEORGE> 5 6 7 +
13.0
This is undesirable for obvious reasons.
I have also cheated a bit by making the interpreter work with floating-point numbers natively. This can be undesirable when numbers are used as indices for accessing variables or in loops.
Another look at workStack
shows something we don’t like in Haskell: A lot of code repetition. This is begging to be abstracted.
So, a lot of problems to take care of. Next time, we’ll take a closer look at this function and try to abstract it further before we implement additional facilities of GEORGE into this interpreter. We’ll also replace String
with Text
since it is considered a superior type for complex text interpretation/manipulation.
Credits
The picture at the top of this post is George Davis sitting at a DEUCE computer, taken from http://users.tpg.com.au/eedeuce/people.html#2
(see what I did there?)