# Pattern-matching-based AST Evaluation as Prisms

Posted on October 10, 2021 by ubikium## Introduction

When performing evaluation on an Abstract Syntax Tree (AST) for an Embedded Domain-Specific Language (EDSL), it would be nice to have a “pattern matching” feature to define semantic rules. For example, “for an expression of the form `Add M N`

, if `M`

and `N`

are integers, it evaluates to `M+N`

”.

This blog post will introduce a way to use prism-like functions to express such complex pattern-matching-based evaluation rules. For most of the parts, the reader need not to know anything about optics. Just basic concepts of algebraic data types, functors, and typeclasses would suffice.

In Haskell, if you write the syntax of the EDSL with algebraic data types, basic pattern matching is already built in. For a simple imperative language defined by the data types:

```
data AExp = ALit Int | AVar String
| AAdd AExp AExp
data BExp = BLit Bool
-- less than or equal to
| BLe AExp AExp
| BAnd BExp BExp
data Stmt = SAssign String AExp
-- If-then-else
| SIte BExp Stmt Stmt
```

We can pattern match on an `Stmt`

value with

```
case e of
SAssign v expr -> ...
SIte b stmt stmt -> ...
```

These are all good for top-level definitions, however, it can be quite cumbersome to pattern match something that’s deep in the structure. Say we want to write a renaming function. It accepts a function from variable names to variable names, like `\s -> "new_" ++ s`

. Then it will change a `Stmt`

tree to a new tree with variable names replaced. This is usually defined recursively:

```
subs_s :: (String -> String) -> Stmt -> Stmt
= case s of
subs_s rho s SAssign v a -> SAssign (rho v) (subs_a rho a)
SIte b s1 s2 -> SIte (subs_b rho b) (subs_s rho s1) (subs_s rho s2)
subs_a :: (String -> String) -> AExp -> AExp
= case a of
subs_a rho a ALit _ -> a
AVar v -> AVar (rho v)
AAdd a1 a2 -> AAdd (subs_a rho a1) (subs_a rho a2)
subs_b :: (String -> String) -> BExp -> BExp
-> case bexp of
subs_b rho b BLit _ -> b
BLe a1 a2 -> BLe (subs_a rho a1) (subs_a rho a2)
BAnd b1 b2 -> BAnd (subs_b rho b1) (subs_b rho b2)
```

If we then proceed to write another function to perform application. This function will take a function from variables to integer numbers. And then it takes a `Stmt`

tree and returns a new tree with variables replaced with the numbers.

At first glance, we seem to be able to write a more general function `subs_s_var`

that takes a substitution function of the type `String -> AExp`

to unify them. But remember we are in an imperative language. The substitution from variables to values should not happen to the target variable for an assignment statement (the so-called “left value”). Therefore, the implementation is slightly different:

```
apply_s :: (String -> Int) -> Stmt -> Stmt
= case s of
apply_s rho s SAssign v a -> SAssign v (apply_a rho a)
SIte b s1 s2 -> SIte (apply_b rho b) (apply_s rho s1) (apply_s rho s2)
apply_a :: (String -> Int) -> AExp -> AExp
= case a of
apply_a rho a ALit _ -> a
AVar v -> ALit (rho v)
AAdd a1 a2 -> AAdd (apply_a rho a1) (apply_a rho a2)
apply_b :: (String -> Int) -> BExp -> BExp
-> case bexp of
apply_b rho b BLit _ -> b
BLe a1 a2 -> BLe (apply_a rho a1) (apply_a rho a2)
BAnd b1 b2 -> BAnd (apply_b rho b1) (apply_b rho b2)
```

Notice in the function `apply_s`

, we leave the `v`

in `SAssign v a`

unchanged. In contrast, the function `subs_s`

changes `v`

to `rho v`

. Actually in the application case, the only place where we replace the variable with a value is in `apply_a`

: if we find an `AVar v`

, we change it to `ALit (rho v)`

.

From these two sets of functions, we can see that the pattern we want to match is not “every appearance of a variable”, but “every appearance of a variable within an expression (either an `AExp`

or a `BExp`

)”. This is an example of the need for nuanced patterns. The pattern “every variable in an expression” can be abstracted from the common parts of `subs_s`

and `apply_s`

:

```
aexp_avar :: (String -> AExp) -> AExp -> AExp
= case a of
aexp_avar rho a ALit _ -> a
AVar v -> rho v
AAdd a1 a2 -> AAdd (aexp_avar rho a1) (aexp_avar rho a2)
bexp_avar :: (String -> AExp) -> BExp -> BExp
-> case bexp of
bexp_avar rho b BLit _ -> b
BLe a1 a2 -> BLe (aexp_avar rho a1) (aexp_avar rho a2)
BAnd b1 b2 -> BAnd (bexp_avar rho b1) (bexp_avar rho b2)
```

These two functions together express the idea of pattern matching a variable inside an `AExp`

or a `BExp`

, when such a variable is found, it will be changed into an `AExp`

by applying `rho`

, and a new tree is constructed accordingly.

The functions `subs_s`

and `apply_s`

can share the same pattern matching path when they go from `AExp`

to `AVar`

. So they only need to specify how to go from a `Stmt`

to an `AExp`

.

For `subs_s`

, we may find `AVar`

in the target and value of `SAssign`

, as well as in the condition and clauses of `SIte`

. On the other hand, for `apply_s`

, we will go from a `Stmt`

to an `AExp`

by leaving out the assignment target part.

```
subs_s :: (String -> String) -> Stmt -> Stmt
= case s of
subs_s rho s SAssign v a -> SAssign (rho v) (subs_a rho a)
SIte b s1 s2 -> SIte (subs_b rho b) (subs_s rho s1) (subs_s rho s2)
apply_s :: (String -> Int) -> Stmt -> Stmt
= case s of
apply_s rho s SAssign v a -> SAssign v (apply_a rho a)
SIte b s1 s2 -> SIte (apply_b rho b) (apply_s rho s1) (apply_s rho s2)
```

The above functions specifies how to go from a `Stmt`

to an `AExp`

or a `BExp`

. Then we only need to specify how to go from an `AExp`

or a `BExp`

to a `AVar`

and what to do when such a pattern matching succeeds. This can be easily done by:

```
subs_a :: (String -> String) -> AExp -> AExp
= aexp_avar (AVar . rho)
subs_a rho
subs_b :: (String -> String) -> BExp -> BExp
= bexp_avar (AVar . rho)
subs_b rho
apply_a :: (String -> Int) -> AExp -> AExp
= aexp_avar (ALit . rho)
apply_a rho
apply_b :: (String -> Int) -> BExp -> BExp
= bexp_avar (ALit . rho) apply_b rho
```

Let’s review what we’ve done here. We break up the process of going from a `Stmt`

to `AVar`

into three steps:

- We specify how to go from a
`Stmt`

to`AExp`

and`BExp`

for each function, because we are taking different pattern matching paths to expressions. - Then we abstract a common pattern to go from
`AExp`

and`BExp`

to`AVar`

. This pattern matching path is the same for both functions. So we come up with`aexp_avar`

and`bexp_avar`

to do just that, then`subs_a`

,`apply_a`

, and other functions are implemented using these patterns. - Finally, we plug the pieces back to
`subs_s`

and`apply_s`

, to get the whole pattern matching path.

In this way, we can write less code when implementing `subs_a`

and so on. Furthermore, actions are decoupled with the pattern matching process. `aexp_avar`

and `bexp_avar`

can be viewed as individual stored pattern matching paths. They leave out what to do if the pattern matching succeeds, as long as it gets an `AExp`

in the end to replace the matched variable.

As is with all decouplings, we also get benefits when the structure changes. Suppose we want to add a new constructor to the definition of `BExp`

:

```
data BExp = ...
| BNeg BExp
```

We only need to change how we go from a `BExp`

to an `AExp`

:

```
bexp_avar :: (String -> AExp) -> BExp -> BExp
-> case b of
bexp_avar rho b ... -> ...
BNeg b' -> BNeg (bexp_avar rho b')
```

All other functions remain the same. After all, we are only adding a new way to go from a `BExp`

to an `AExp`

. This shouldn’t change how to go from a `Stmt`

to an `AExp`

and other paths.

These “stored pattern matching paths” also combine with each other if the types match. Say we want to do some evaluation, but we want to restrict the process to only the condition of each “if-then-else” block, maybe for some optimization reasons. We already know how to get from an `AExp`

to a variable, now we only need to figure out how to get from a `Stmt`

to `AExp`

in `if-then-else`

conditions.

```
stmt_ite_aexp :: (String -> AExp) -> Stmt -> Stmt
= case s of
stmt_ite_aexp rho s SAssign _ _ -> s
SIte b s1 s2 -> SIte (bexp_avar rho b) s1 s2
```

Notice how we “stringed” the two paths together: from `Stmt`

to `AExp`

, then from `AExp`

to `AVar`

.

In conclusion, we have obtained a good of way of doing pattern matching for our EDSL, with the following advantages:

- It is expressive enough for complex patterns.
- Many patterns can be reused, due to the decoupling of actions from the pattern matching process.
- When the syntax changes, the patterns can be changed in a small, predictable, and local scope.
- Patterns are composable.

However, this representation still has two shortcomings:

- We don’t have the concept of “failure” in the process of pattern matching. A new tree is returned, but we don’t know whether any pattern matching took place. We may need such information, for example, when trying to answer whether there are no more patterns to match, i.e. whether there are no applicable evaluation rules.
- The composition is not clear. In
`stmt_ite_a`

, it is not obvious we are going from a`Stmt`

to a condition`BExp`

of an “if-then-else” block, and then from an`BExp`

to`AExp`

. This is because the sub-pattern-matching is inserted in the middle of the function.

We now take the same idea, but use a new way to represent “pattern matching functions”, which aims to address these two problems.

## Targeted language

To demonstrate the advantages of the representation, we expand the above toy language to a more complete yet still small EDSL.

We target a simple imperative language called IMP from a K-framework’s tutorial. The language embedded in Haskell looks like this:

```
data AExp v = ALit Int | AVar v
| ANeg Int
| ADiv (AExp v) (AExp v)
| AAdd (AExp v) (AExp v)
data BExp v = BLit Bool
-- less than or equal to
| BLe (AExp v) (AExp v)
| BNeg (BExp v)
| BAnd (BExp v) (BExp v)
-- empty block is Nothing
type Block v = Maybe (Stmt v)
data Stmt v = SBlock (Block v)
| SAssign v (AExp v)
-- If-then-else
| SIte (BExp v) (Block v) (Block v)
| SWhile (BExp v) (Block v)
| SSeq (Stmt v) (Stmt v)
-- identity of SSeq
| SUnit
```

Compared with the example from the introduction, IMP has a richer set of structures. You can write quite a lot of programs with IMP. For some example programs written with this embedded IMP, see this page. The corresponding actual IMP programs that can be interpreted by K-framework tools like `kompiler`

can be found in test files.

## Method

### Representation of pattern matchings

We will now proceed to define pattern matchings, and use a type to express our definition.

**Definition.** A pattern matching for a algebraic data type is a function, when given a (possibly failing) substitution of the substructure, can give a substitution of the whole structure.

For example, when pattern matching on `AVar`

in `AExp`

, what we are saying is that given a substitution for `AVar`

, we know how to construct an `AExp`

as a result for this substitution.

This obviously doesn’t cover every meaning of the term “pattern matching”, but here we consider it to be a good fit for our purpose of AST evaluation. We can use the following type to express our definition:

`type MatchInto s t a b = (a -> Maybe b) -> s -> Maybe t`

This type means if given a possibly failing substitution of the substructure `a`

to `b`

(i.e. a function `a -> Maybe b`

), we can construct a possibly failing substitution for the overall structure `s`

and change it into `t`

.

For our evaluation process, an evaluation rule always preserves the type of the value (or more precisely, the reducible expression), i.e. no rule can change an `AExp`

to a `BExp`

. Therefore, we can use a less general type:

```
type MatchInto' s a = MatchInto s s a a
type MatchSelf a = MatchInto' a a
```

It can be illustrating to expand the type definitions:

```
type MatchInto' s a = (a -> Maybe a) -> s -> Maybe s
type MatchSelf a = (a -> Maybe a) -> a -> Maybe a
```

The difference between the two types is that `MatchInto'`

may move into a different type to continue the pattern matching, while `MatchSelf`

says that the substructure has the same type of the matched pattern.

For example, an `AExp`

in the `BLe AExp AExp`

will change the type of pattern matching from a `BExp`

to an `AExp`

, while the `AExp`

in `AAdd AExp AExp`

will not change. This is quite handy when we discuss recursively pattern matching into substructures, since a successful pattern matching will return a same type, we can continue to match into the substructures recursively.

### Simple patten matching functions

Let’s see some examples to get a hang of it.

```
aLit :: MatchSelf (AExp v)
= case e of
aLit f e ALit _ -> f e
-> Nothing _
```

This means `aLit`

is a pattern matching from an `AExp v`

to `AExp v`

. When the constructor is `ALit`

, the matching succeeds, it will apply the substitution function `f`

, which will produce a result of `Maybe (AExp v)`

. We use this substitution result to replace the matched `AExp v`

in the whole tree. Otherwise, the pattern matching fails and returns a `Nothing`

.

Note: it is not guaranteed that if the pattern matching succeeds, the substitution will also succeed and return a `Just`

value. The substitution may be partial, for example:

```
keep5 :: AExp v -> Maybe (AExp v)
ALit 5) = Just (ALit 5)
keep5 (= Nothing keep5 _
```

`keep5`

will fail on `ALit 6`

when performing the pattern matching, i.e. `aLit keep5 (ALit 6)`

will return a `Nothing`

, although the pattern matching on the constructor succeeds.

One might be tempted to conclude that for each constructor of the sum type, we will produce exactly one pattern matching function. However, we are free to choose onto which substructure to apply the substitution function, for example, the whole structure, one of the arguments, or some subset of the arguments. An example of this distinction is as follows:

```
bNeg :: MatchSelf (BExp v)
= case e of
bNeg f e BNeg _ -> f e
-> Nothing
_
bNegArg :: MatchSelf (BExp v)
= case e of
bNegArg f e BNeg e' -> BNeg <$> f e'
-> Nothing _
```

Although both functions match on the same constructor `BNeg`

, the pattern `bNeg`

applies the substitution on the whole structure, while `bNegArg`

only applies it into the subexpression, and wrap it up with the same `BNeg`

constructor. The former function is more general because it can choose to change the constructor to any `BExp v`

constructor, e.g.

```
negLit :: BExp v -> Maybe (BExp v)
= bNeg $ \b -> case b of
negLit BNeg (BLit b) -> Just $ BLit (not b)
-> Nothing _
```

Then `bNeg negLit (BNeg (BLit True))`

returns `Just (BLit False)`

, while `bNegArg negLit (BNeg (BLit True))`

will return `Nothing`

, because it expects to see a `BNeg`

in the argument of the top structure, but gets a `BLit`

instead. On the other hand `bNegArg negLit (BNeg (BNeg (BLit True)))`

will succeed by returning `Just (BNeg (BLit False))`

.

### Composing pattern matching functions

Recall the type of `MatchInto'`

:

```
type MatchInto s t a b = (a -> Maybe b) -> s -> Maybe t
type MatchInto' s a = MatchInto s s a a
```

Observe the expanded type of `type MatchInto' s a = (a -> Maybe a) -> (s -> Maybe s)`

. We can easily see if we have `aToB :: MatchInto' a b`

and `bToC :: MatchInto' b c`

, then the composed pattern `aToC = aToB . bToC`

has the type `MatchInto' a c`

, which will pattern match into the overall big structure `a`

and get matches of the smallest substructure `c`

.

For example, even if we don’t know the implementation, just by reading the types, we will be able to tell whether and how the pattern matchings will compose.

```
aAddArg :: MatchInto' (AExp v) (AExp v)
aLit :: MatchInto' (AExp v) (AExp v)
. aLit :: MatchInto' (AExp v) (AExp v) aAddArg
```

Supposedly, the composed pattern will match arithmetic literals in arguments of `AAdd`

. For example, `(aAddArg . aLit) (\_ -> Just (ALit 0)) (AAdd (ALit 1) (AVar "x"))`

will return `Just (AAdd (ALit 0) (AVar "x"))`

.

The following composed pattern matching function will match an `ADiv`

expression which appears in an argument of a `BLe`

expression.

```
bLeArg :: MatchInto' (BExp v) (AExp v)
aDiv :: MatchInto' (AExp v) (AExp v)
. aDiv :: MatchInto' (BExp v) (AExp v) bLeArg
```

We can also compose more than one pattern matching functions, to get a more complex pattern:

```
sIteCond :: MatchInto' (Stmt v) (BExp v)
bNeg :: MatchInto' (BExp v) (BExp v)
bLit :: MatchInto' (BExp v) (BExp v)
. bNeg . bLit :: MatchInto' (Stmt v) (BExp v) sIteCond
```

For those who are familiar with the concept, this should immediately remind you of the power of optics. In fact, `MatchInto`

is indeed similar to a kind of optics called prisms. We will return to that in the end. But right now, neither the definition nor the implementation requires any knowledge about optics. And we have a quite good intuition of the meaning of each part.

### Pattern matching combinators

It is straightforward to manually write out all the pattern matching functions for the entire syntax tree. However, there are some patterns that can be extracted into combinators to facilitate the process.

A basic example is a binary constructor, by which we mean a constructor with two substructures of the same type. This is only a contingent feature of the IMP language, it may not be useful to other EDSL. However, it’s easy to come up with new pattern matching paths and combine them together.

For binary constructors, what we want to do is to match into both of the substructures. If either of them succeeds, we will replace that succeeded substructure with the result of applying the substitution. The other substructure remains the same. And we construct a new tree using the same constructor.

```
binaryConstr :: (a -> Maybe a) -> (a -> a -> b) -> a -> a -> Maybe b
=
binaryConstr f constr a1 a2 let aL = flip constr a2 <$> f a1
= constr a1 <$> f a2
aR in aL <|> aR
```

Here we are using the `Alternative`

instance of `Maybe`

. The definition is as follows:

```
instance Alternative Maybe where
= Nothing
empty Nothing <|> r = r
<|> _ = l l
```

In short, this means we will go through some `Maybe`

values, if we meet a `Nothing`

, the search continues, otherwise, we’ve found a `Just`

value, then we stop the search and return the `Just`

value. If all we’ve met along the way are `Nothing`

values, we will return a `Nothing`

value in the end.

An example of `binaryConstr`

is:

```
bLeArg :: MatchInto' (BExp v) (AExp v)
= \e -> case e of
bLeArg f BLe e1 e2 -> binaryConstr f BLe e1 e2
-> Nothing _
```

`bLeArg`

matches into two arguments of a `BLe`

constructor.

Remember the discussion about matching a thing itself and matching its substructures? If the matched substructure has the same type of the overall structure, then we can recursively match into it. For binary constructors, this provide a different combinator:

```
binaryRec :: (a -> Maybe a) -> (a -> a -> a) -> a -> a -> a -> Maybe a
=
binaryRec f constr a1 a2 a let aL = flip constr a2 <$> f a1
= constr a1 <$> f a2
aR = f a
aS in aL <|> aR <|> aS
```

We add the result of matching the term itself `aS = f a`

to the possible results, by connecting `aS`

to other results with `<|>`

.

If we really want, we can write such a pattern:

```
aAddWithArgs :: MatchSelf (AExp v)
= case e of
aAddWithArgs f e AAdd e1 e2 -> binaryRec f AAdd e1 e2 e
-> Nothing _
```

However, as we will explain later, we choose to use a more general way for recursively pattern matching substructures. For now, we only provide two patterns for the constructor `AAdd`

:

```
aAdd :: MatchSelf (AExp v)
= case e of
aAdd f e AAdd _ _ -> f e
-> Nothing
_
aAddArg :: MatchSelf (AExp v)
= case e of
aAddArg f e AAdd e1 e2 -> binaryConstr f AAdd e1 e2
-> Nothing _
```

`aAdd`

matches the structure itself, while `aAddArg`

matches exactly the two arguments without touching the overall structure.

### Recursive pattern matching

A recursively pattern matching into substructures process can be done through the following steps:

- Try to match itself.
- If 1 fails, find the substructures with the same type.
- Recursively match into substructures.

This combinator is provided as a typeclass.

```
class RecursiveMatch a where
recursiveMatch :: MatchSelf a
= f e <|> subMatch (recursiveMatch f) e
recursiveMatch f e
subMatch :: MatchSelf a
```

`recursiveMatch`

does exactly steps 1-3. For any `RecursiveMatch`

instance, we have to specify how to find substructures.

For example, recall the definition of `AExp`

:

```
data AExp v = ALit Int | AVar v
| ANeg Int
| ADiv (AExp v) (AExp v)
| AAdd (AExp v) (AExp v)
```

For a substructure to also have the type `AExp`

, it must belong to either the arguments of `ADiv`

or the arguments of `AAdd`

. Therefore, the instance for `AExp`

can be defined as:

```
instance RecursiveMatch (AExp v) where
subMatch :: MatchSelf (AExp v)
= aDivArg f s <|> aAddArg f s subMatch f s
```

And now when we can use `recursiveMatch`

from the typeclass to get all `AExp`

in an `AExp`

with an arbitrary depth of unfolding (and also itself). This is a technique called anamorphism, which can be intuitively understood as a mechanism to unfold over a structure. This is why we choose to provide only the patterns to match substructures instead of using `binaryRec`

.

One last combinator in our toolbox, the `possibly`

combinator will try pattern matching functions from a list, if any of them succeeds (i.e. returns a `Just`

value), then the whole pattern matching succeeds. This is like `concat`

from `Data.Foldable`

, except we are using the `Alternative`

instance, so what we want is `asum`

:

```
possibly :: [MatchInto' s a] -> MatchInto' s a
= asum . map (\p -> p f s) $ patterns possibly patterns f s
```

Then the recursively matching turns into listing substructures:

```
instance RecursiveMatch (AExp v) where
subMatch :: MatchSelf (AExp v)
= possibly
subMatch
[ aDivArg
, aAddArg
]
instance RecursiveMatch (BExp v) where
subMatch :: MatchSelf (BExp v)
= possibly
subMatch
[ bNegArg
, bAndArg
]
instance RecursiveMatch (Stmt v) where
subMatch :: MatchSelf (Stmt v)
= possibly
subMatch . block
[ sBlock
, sSeqFirst ]
```

Now we can combine simple patterns to create complex patterns. For example:

```
aExpAVar :: MatchInto' (AExp v) (AExp v)
= recursiveMatch . aVar aExpAVar
```

This pattern `aExpAVar`

will match to all variables inside an `AExp`

, no matter how deep it is in the structure.

Similarly, we can find all `AExp`

inside a `BExp`

by `bExpAExp`

:

```
bExpAExp :: MatchInto' (BExp v) (AExp v)
= recursiveMatch . bLeArg bExpAExp
```

For more complex patterns, we may need to mix combinators and paths:

```
stmtBExp :: MatchInto' (Stmt v) (BExp v)
= recursiveMatch . possibly
stmtBExp
[ sIteCond. recursiveMatch ]
```

In the implementation of `stmtAExp`

, since both `Stmt`

and `AExp`

can recursively match substructures with the same type, we add `recursiveMatch`

to both ends to thoroughly explore the tree.

The `recursiveMatch`

on the first line of the definition matches first. It uses the `Stmt`

instance to recursively find all `Stmt`

in a `Stmt`

. And the middle pattern will lead us from a `Stmt`

to a `BExp`

. Although it’s unnecessary to use `possibly`

here, we use a list to remind us to add more cases when the definition is expanded, because this pattern matching path is more likely to be subject to change. Finally, the `recursiveMatch`

on the last line of the definition will find all `BExp`

inside a `BExp`

. In conclusion, `stmtBExp`

will return all `BExp`

inside a `Stmt`

.

Similarly, we have `stmtAExp`

to pattern match from a `Stmt`

to all `AExp`

inside it.

```
stmtAExp :: MatchInto' (Stmt v) (AExp v)
= recursiveMatch . possibly
stmtAExp
[ sAssignArg. bExpAExp
, stmtBExp . recursiveMatch ]
```

And recall the example in the introduction, where we want to get from a `Stmt`

to all the variables inside it. Well, we can implement that now with:

```
stmtAVar :: MatchInto' (Stmt v) (AExp v)
= stmtAExp . aVar stmtAVar
```

That is, we match a `Stmt`

to find all `AExp`

, and then match them to find all `AVar`

.

Interestingly, this is not the only way to match from `AExp`

to `AVar`

. We can also write:

```
stmtAVar' :: MatchInto' (Stmt v) (AExp v)
= stmtAExp . aExpAVar stmtAVar'
```

That is, we match from a `Stmt`

to `AExp`

, and then go from `AExp`

to all `AVar`

. The difference is that, inside an `AExp`

, `stmtAVar`

will give up if it sees the top level constructor is not `AVar`

, because it is using the `aVar`

pattern, while `stmtAVar'`

will keep searching into substructures.

We are not leaving out any possible matches if we use the first approach, since `stmtAExp`

will look into the substructures later. This can be proven by the following reasoning: for any `AVar`

, `stmtAExp`

will guarantee to match `AVar`

as `AExp`

, then the last match must be on the constructor `AVar`

.

In the case of a mismatch, `stmtAExp`

will once again search into substructures of `AExp`

, without knowing `aExpAVar`

has already failed. So the first implementation is correct and more efficient. This gives us an example of thinking about the search order and optimization when dealing with these patterns. It takes a bit of effort to reason, so you should design your patterns carefully.

### Evaluation

With these patterns ready at hand, the evaluation process is straightforward.

Suppose by evaluation, we only need to transform the AST (meaning no states are mutated), then an evaluation rule can be defined as:

`type Rule = Stmt -> Maybe Stmt`

This is just half of the type of `MatchInto`

. The reason is that we define a pattern matching to be a thing that asks what to do when a pattern matching succeeds, and it will tell you want will happen as a result. Well, an evaluation rule will specifies what to do when a pattern matching succeeds, so the only thing left is what will happen.

The type basically means that if the pattern matching succeeds and the substitution function returns a `Just`

value, we will get a new tree wrapped in `Just`

. Otherwise, we will get a `Nothing`

.

For example, to evaluate integer addition:

```
ruleIntAdd :: Rule
= stmtAExp $ \e -> case e of
ruleIntAdd AAdd (ALit i1) (ALit i2)) -> Just $ ALit (i1 + i2)
(-> Nothing _
```

Of course, you can change the pattern to reflect different evaluation orders. For example, the `BAnd`

can be made into only strict in the first argument, to exhibit a “short circuit” behaviour:

```
ruleBoolAnd :: Rule
= stmtBExp . bAnd $ \e -> case e of
ruleBoolAnd BAnd (BLit True) e' -> Just e'
BAnd (BLit False) _ -> Just $ BLit False
-> Nothing _
```

Of course, for an imperative language, it’s not enough to evaluate expressions. We need to maintain a state to be passed between statements. We can provide a pattern into the state-decorated expressions and combine it seamlessly with normal patterns. (For those familiar with optics, we obtain a prism by composing a lens with a prism.)

For a complete example of evaluation rules for IMP, see this page.

## Relation to Prisms

A `Prism`

is a kind of optics. To some extent, it can be understood as a way to extract a value from a sum type. This is very close to what we have defined to be a pattern matching process.

The definition of `Prism`

encoded in the so-called van Laarhoven representation is as follows:

```
type Prism s t a b = forall p f. (Choice p, Applicative f)
=> p a (f b) -> p s (f t)
```

As a matter of fact, `(->)`

is an instance of `Choice`

, and `Maybe`

is an instance of `Applicative`

. If we instantiate `p`

to `(->)`

and `f`

to `Maybe`

, we get the exact type of `MatchInto`

.

Therefore, although the type `MatchInto`

is not a prism, it is composable with other prisms. The reason is that a prism must work with every `p`

and `f`

that are instances of `Choice`

and `Applicative`

respectively. Then, a prism can always be instantiated to `(->)`

and `Maybe`

, so that it can compose with pattern matching functions. For example,

```
_Just :: Prism (Maybe a) (Maybe b) a b
. aLit :: (AExp v -> Maybe (AExp v))
_Just -> Maybe (AExp v) -> Maybe (Maybe (AExp v))
```

This allows us to apply a pattern matching function within a `Maybe`

value and the pattern matching is only performed in a `Just`

case.

I will not go into the details of the implementation of prisms. However, I hope the above introduction of pattern matching functions should at least help to build some intuition for prisms and their composition.

From the this perspective, it should be possible to write prisms for constructors of everyday sum types. In fact, for a class of sum types, some optics packages can use template Haskell to automatically generate prisms that cover all the above pattern matching functions. To do this with the lens package, after introducing the related package into scope, you only need to add the following lines after the type definition.

```
'AExp
makePrisms ''BExp
makePrisms ''Stmt makePrisms '
```

For a constructor like `ALit`

, a prism with the name `_ALit`

will be generated. Not only can it perform everything we do with the pattern matching function `aLit`

, it has the full power of a complete prism. The implication is that we can use a lot of the functions from the `lens`

package, to support a wider range of actions with these pattern matching paths.

`_ALit :: Prism (AExp v) (AExp v) Int Int`

We can query if there is a value inside a type with the `^?`

operator.

```
>>> (ALit 5) ^? _ALit
Just 5
>>> (ALit 5) ^? _AVar
Nothing
```

We can apply a function to the value pointed by the prism with the function `over`

:

```
>>> over (_AAdd . _1 . _ALit) (+1) (AAdd (ALit 5) (AVar "x"))
AAdd (ALit 6) (AVar "x")) (
```

Here `_1`

will select the first component of the arguments of `_AAdd`

.

These may seem confusing if you are not familiar with optics. If that’s the case, you can use what we have introduced before as a reference. Because of their additional power, optics packages provide many more functions and operators to interact with them. To use those tools, one need to get used to the specific symbols, but at a user level, it’s not that magical. Basically you need to specific three things:

- How to get to the place? Usually by composition of prisms and other optics.
- What to do after we get there? We can view the value, apply a function to change the value, and do many other things.
- In addition, prisms also allow you to choose how to pack the structure back. In our application, we only allow replacement to the same place, but for different optics, you may be able to keep the structure, fold it with a special typeclass instance, or make a list of all matched values.

## Conclusions

In this blog post, we introduced a way to perform pattern matchings for AST evaluation. The core of this representation lies in the type definition `type MatchInto s t a b = (a -> Maybe b) -> (s -> Maybe t)`

. We applied this method to the IMP language and introduced some combinators that allow us to write complex patterns.

We then explained the relation between our method and prisms, namely, prisms are more general structures that come with many tools. Our method can be replaced by prisms. And the original definition composes with prisms to some extent.

A complete IMP interpreter that utilises the aforementioned method of pattern-matching-based tree evaluation can be found at this repository, along with some descriptions and tests.