This tutorial is a literate Haskell program whwre we start by specifying the language pragmas and imports we will need for this module.
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE FlexibleContexts #-}
import Data.Functor.Identity
import qualified Data.HashMap.Lazy as HML
import Text.RE.REOptions
import qualified Text.RE.TDFA as TDFA
import Text.RE.TDFA.String
import Text.RE.TestBench
regex supports macros in regular expressions. There are a bunch of standard macros and you can define your own.
RE macros are enclosed in@{
… ‘}’. By convention the macros in the standard environment start with a ‘%’. @{%date}
will match an ISO 8601 date, this
> countMatches $ "2016-01-09 2015-12-5 2015-10-05" *=~ [re|@{%date}|]
ghci2
picking out the two dates.
See the tables listing the standard macros in the tables folder of the distribution.
See the log-processor example and the Text.RE.TestBench
for more on how you can develop, document and test RE macros with the regex test bench.
You can use the regex test bench to add you own macros. As a simple example we will add an ‘epsilon’ macro to the standard ‘prelude’ macro environment. (See the re-nginx-log-processor
for a more extensive example of macro environments.)
The @{epsilon}
macro will expand to a RE that matches only the empty string:
.{0}
(A use such a seemingly useless RE macro will become apparent in the test example below.)
Firstly we define a two argument function function to create a MacroDescriptor
from:
the MacroEnv
macro environment argument will be used to compile the macro RE (we don’t need it in this instance, of course, but we are following a general recipe);
the macroId
name of the macro (which is passed into us because the calling context need the name of the macro).
epsilon_macro :: MacroEnv -> MacroID -> MacroDescriptor
=
epsilon_macro env mid Just samples env mid
runTests TDFA.regexType MacroDescriptor
= RegexSource ".{0}" -- the RE to be substituted for the macro
{ macroSource = map fst samples -- list of string that should match the above macro RE
, macroSamples = counter_samples -- list of string that should **not** match the above macro RE
, macroCounterSamples = [] -- for bookkeeping
, macroTestResults = Nothing -- no parser needed for this one!
, macroParser = "an epsilon parser, matching the empty string only"
, macroDescription
}where
samples :: [(String,String)]
=
samples ""
[ dup
]where
= (x,x)
dup x
=
counter_samples "not an empty string"
[ ]
The compiled Macros RE
that we will slot into the REOptions
used to compile the RE is constructed in two steps. Firstly we provide a function that takes the @MacroEnv@ that all of the macros will use to build their REs and returns the augmented MacroEnv
with the new macro definitions.
MacroEnv
is generic and not dependent upon any back end — none of the macros have been compiled.
my_env :: MacroEnv -> MacroEnv
= env
my_env env0 where
= env0 `HML.union` HML.fromList
env "epsilon" epsilon_macro
[ f
]
= (mid, mk env mid)
f nm mk where
= MacroID nm mid
MacroEnv
we compile the macros into a Macros RE
macro table that we can insert into an REOptions
that can be used to compile REs in the application.
my_macros :: Macros RE
= runIdentity $ mkMacros mk TDFA.regexType ExclCaptures $ my_env TDFA.preludeEnv
my_macros where
= maybe oops Identity . TDFA.compileRegexWithOptions TDFA.noPreludeREOptions
mk
= error "my_macros: unexpected RE compilation error" oops
The makeREOptions
function can be used to construct an REOptions
for compiling REs with [re_| ... |]
and [ed_| ... /// ... |]
quasi quoters.
myOptions :: TDFA.REOptions
= TDFA.makeREOptions my_macros myOptions
Now we can try out the @{epsilon}
macro, using it to match nothing!
> matched $ "///" ?=~ [re_|^//@{epsilon}/$|] myOptions
ghciTrue
Why would we we want to match nothing? To break up three ‘/’ in the RE part of a [ed_| ... /// ... |]
SearchReplace
template.
> "a <///> replacement example" *=~/ [ed_|<//@{epsilon}/>///<three slashes>|] myOptions
ghci"a <three slashes> replacement example"
For a more extensive example of macro environments see the re-nginx-log-processor