Capturing union values with FParsec
I just started playing with FParsec, a parser combinator library that lets you chain small parsers together to parse DSLs. After having built my own parser, lexer, and interpreter, playing with other libraries is really fun; I like seeing how others have done it. Unlike my mutable parser written in C#, FParsec wraps the underlying stream state and the result behind a parser function. Since F# code is mostly immutable, this is how the stream state gets threaded from one parser to the next without you managing it by hand. I actually like this kind of workflow, since you don't need to write a separate grammar that a tool turns into generated code (which is what ANTLR does). There's something very appealing about building the parser dynamically in code.
As a quick example, I was following the tutorial on the FParsec site and wanted to understand how to capture a value and store it in a discriminated union. For example, if I have a type:
type Token =
| Literal of string
How do I get a Literal("foo") created?
None of the examples I saw instantiated a union case directly. After a bit of poking around, I noticed that they were using the |>> operator, which passes the parser's result value to a function. So when you do
pstring "foo" |>> Literal
You’ve invoked the constructor of the discriminated union similar to this:
let literal = "foo" |> Literal
Which is equivalent to
let literal = Literal("foo")
This works because most of the FParsec functions and operators give you back a Parser type:
type Parser<'TResult, 'TUserState> = CharStream<'TUserState> -> Reply<'TResult>
This is just an alias for a function that takes a UTF-16 character stream carrying a user state and returns a reply holding the value you wanted. If you look at CharStream, it looks similar to my simple tokenizer. The operators |>>, >>=, and >>% are all combinators that help you chain parsers and get your result back. If you are curious, you can trace through their types in the FParsec source and reference documentation.
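To make that concrete, here is a minimal sketch of an F# script that runs the |>> parser from above and compares it with a hand-rolled version of the same idea. The names mapResult and tryParse are hypothetical helpers I made up for illustration (they are not part of FParsec), and the `#r "nuget: FParsec"` line assumes you are running this with `dotnet fsi`. The real |>> is implemented inside the library and may differ in its details; this only shows the shape of the thing.

```fsharp
#r "nuget: FParsec"
open FParsec

type Token =
    | Literal of string

// The parser from the text: match "foo", then hand the matched string to Literal.
// The annotation pins the user state to unit so the value is not left generic.
let literal : Parser<Token, unit> = pstring "foo" |>> Literal

// Hypothetical helper (not FParsec's implementation): since a Parser is just a
// function from CharStream to Reply, mapping the result means running the inner
// parser and wrapping a successful value with f.
let mapResult (f: 'a -> 'b) (p: Parser<'a, 'u>) : Parser<'b, 'u> =
    fun stream ->
        let reply = p stream
        if reply.Status = ReplyStatus.Ok then
            Reply(f reply.Result)
        else
            // Pass the failure status and error messages through unchanged.
            Reply(reply.Status, reply.Error)

let literalByHand : Parser<Token, unit> = mapResult Literal (pstring "foo")

// Hypothetical helper: collapse FParsec's ParserResult into an option.
let tryParse (p: Parser<'a, unit>) (input: string) : 'a option =
    match run p input with
    | Success (value, _, _) -> Some value
    | Failure (_, _, _) -> None

printfn "%A" (tryParse literal "foo")
printfn "%A" (tryParse literalByHand "foo")
printfn "%A" (tryParse literal "bar")
```

Both parsers should produce the same Literal value for "foo" and no value for "bar", which is a decent sanity check that |>> really is just "run the parser, then apply the function to whatever it captured".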
Now, if you don't need to capture the result value and just want to create a union case that carries no data, you can use the >>% operator, which discards the parsed value and returns a result you supply:
type Token =
| Null
let nullTest = pstring "null" >>% Null
FParsec has a bunch of helper functions and custom operators, and many of them overlap. For example:
type Token =
| Null
let nullTest = stringReturn "null" Null
is equivalent to the >>% example.
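If you want to convince yourself of that, a small script along these lines should do it. As before, this is only a sketch: tryParse is a hypothetical helper of mine rather than an FParsec function, the parser names are made up, and the `#r "nuget: FParsec"` line assumes `dotnet fsi`.

```fsharp
#r "nuget: FParsec"
open FParsec

type Token =
    | Null

// Two spellings of the same parser: match the text "null", ignore the
// matched string, and return the Null case instead.
let nullViaOperator : Parser<Token, unit> = pstring "null" >>% Null
let nullViaFunction : Parser<Token, unit> = stringReturn "null" Null

// Hypothetical helper: collapse FParsec's ParserResult into an option.
let tryParse (p: Parser<'a, unit>) (input: string) : 'a option =
    match run p input with
    | Success (value, _, _) -> Some value
    | Failure (_, _, _) -> None

let fromOperator = tryParse nullViaOperator "null"
let fromFunction = tryParse nullViaFunction "null"

printfn "%A" fromOperator
printfn "%A" fromFunction
printfn "%A" (tryParse nullViaFunction "nil")
```

Both versions should give back the Null case for the input "null" and nothing for "nil". Which one you reach for is mostly a matter of taste: the operator form reads well in the middle of a longer chain, while stringReturn is tidier when the whole parser is just "this text means that value".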
It’s a little overwhelming trying to figure out how all the combinators are pieced together, but that’s part of the fun of learning something new.