├── README.md ├── Setup.hs ├── ChangeLog.md ├── .gitignore ├── LICENSE ├── yoda.cabal └── Text └── Yoda.lhs /README.md: -------------------------------------------------------------------------------- 1 | Text/Yoda.lhs -------------------------------------------------------------------------------- /Setup.hs: -------------------------------------------------------------------------------- 1 | import Distribution.Simple 2 | main = defaultMain 3 | -------------------------------------------------------------------------------- /ChangeLog.md: -------------------------------------------------------------------------------- 1 | # Revision history for yoda 2 | 3 | ## 0.1.0.0 -- 2018-10-26 4 | 5 | * First version. Released on an unsuspecting world. 6 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | dist 2 | dist-* 3 | cabal-dev 4 | *.o 5 | *.hi 6 | *.chi 7 | *.chs.h 8 | *.dyn_o 9 | *.dyn_hi 10 | .hpc 11 | .hsenv 12 | .cabal-sandbox/ 13 | cabal.sandbox.config 14 | *.prof 15 | *.aux 16 | *.hp 17 | *.eventlog 18 | .stack-work/ 19 | cabal.project.local 20 | cabal.project.local~ 21 | .HTF/ 22 | .ghc.environment.* 23 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2018, Nicolas Wu 2 | 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without 6 | modification, are permitted provided that the following conditions are met: 7 | 8 | * Redistributions of source code must retain the above copyright 9 | notice, this list of conditions and the following disclaimer. 
10 | 11 | * Redistributions in binary form must reproduce the above 12 | copyright notice, this list of conditions and the following 13 | disclaimer in the documentation and/or other materials provided 14 | with the distribution. 15 | 16 | * Neither the name of Nicolas Wu nor the names of other 17 | contributors may be used to endorse or promote products derived 18 | from this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 21 | "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 22 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 23 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 24 | OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 25 | SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 26 | LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 27 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 28 | THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 29 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 30 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 31 | -------------------------------------------------------------------------------- /yoda.cabal: -------------------------------------------------------------------------------- 1 | -- Initial yoda.cabal generated by cabal init. For further documentation, 2 | -- see http://haskell.org/cabal/users-guide/ 3 | 4 | -- The name of the package. 5 | name: yoda 6 | 7 | -- The package version. See the Haskell package versioning policy (PVP) 8 | -- for standards guiding when and how versions should be incremented. 
9 | -- https://wiki.haskell.org/Package_versioning_policy 10 | -- PVP summary: +-+------- breaking API changes 11 | -- | | +----- non-breaking API additions 12 | -- | | | +--- code changes with no API change 13 | version: 0.1.3.0 14 | 15 | -- A short (one-line) description of the package. 16 | synopsis: Parser combinators for young padawans 17 | 18 | -- A longer description of the package. 19 | description: Yoda is a small parser combinator library. It is not efficient, nor 20 | beautiful, but it hopes to teach young padawans to use the source 21 | and learn to write a parser. 22 | 23 | -- URL for the project homepage or repository. 24 | homepage: https://github.com/zenzike/yoda 25 | 26 | 27 | -- The license under which the package is released. 28 | license: BSD3 29 | 30 | -- The file containing the license text. 31 | license-file: LICENSE 32 | 33 | -- The package author(s). 34 | author: Nicolas Wu 35 | 36 | -- An email address to which users can send suggestions, bug reports, and 37 | -- patches. 38 | maintainer: nicolas.wu@gmail.com 39 | 40 | -- A copyright notice. 41 | -- copyright: 42 | 43 | category: Text 44 | 45 | build-type: Simple 46 | 47 | -- Extra files to be distributed with the package, such as examples or a 48 | -- README. 49 | extra-source-files: ChangeLog.md, README.md 50 | 51 | -- Constraint on the version of Cabal needed to build this package. 52 | cabal-version: >=1.10 53 | 54 | 55 | library 56 | -- Modules exported by the library. 57 | exposed-modules: Text.Yoda 58 | 59 | -- Modules included in this library but not exported. 60 | -- other-modules: 61 | 62 | -- LANGUAGE extensions used by modules in this package. 63 | other-extensions: InstanceSigs 64 | 65 | -- Other library packages from which modules are imported. 66 | build-depends: base >=4.10 && <4.11 67 | 68 | -- Directories containing source files. 69 | -- hs-source-dirs: 70 | 71 | -- Base language which the package is written in. 
72 | default-language: Haskell2010 73 | 74 | source-repository head 75 | type: git 76 | location: https://github.com/zenzike/yoda 77 | -------------------------------------------------------------------------------- /Text/Yoda.lhs: -------------------------------------------------------------------------------- 1 | ``` 2 | 3 | ██╗ ██╗ ██████╗ ██████╗ █████╗ 4 | ╚██╗ ██╔╝██╔═══██╗██╔══██╗██╔══██╗ 5 | ╚████╔╝ ██║ ██║██║ ██║███████║ 6 | ╚██╔╝ ██║ ██║██║ ██║██╔══██║ 7 | ██║ ╚██████╔╝██████╔╝██║ ██║ 8 | ╚═╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═╝ 9 | 10 | parser combinators for young padawans 11 | 12 | ``` 13 | 14 | Introduction 15 | ============ 16 | 17 | Yoda is a small parser combinator library. It is not efficient, nor 18 | beautiful, but it hopes to teach young padawans to use the source 19 | and learn to write a parser. 20 | 21 | ╔═════════════════════════════════════════════════════════════╗ 22 | ║ ║ 23 | ║ <(-,-)> Do, or do not, there is no try. -- Master Yoda ║ 24 | ║ ║ 25 | ╚═════════════════════════════════════════════════════════════╝ 26 | 27 | Yoda is a parser in the Parsec family of libraries, which includes 28 | Parsec, attoparsec, Megaparsec, and trifecta. The main difference is 29 | that Yoda does not require you to use the `try` function: it 30 | automatically tries all alternatives for you. 31 | 32 | The module exports the following functions and types. Some of these 33 | functions are defined outside of this file, namely, those marked under 34 | `Functor`, `Applicative`, `Alternative`, `Monad`. 
35 | 36 | ```lhs 37 | 38 | > {-# LANGUAGE InstanceSigs #-} 39 | > module Text.Yoda 40 | > ( Parser 41 | > , parse 42 | > , parseMaybe 43 | > , parseIO 44 | > 45 | > -- Functor 46 | > , (<$>), (<$), skip 47 | > 48 | > -- Applicative 49 | > , pure, (<*>), (<*), (*>), (<**>) 50 | > 51 | > -- Alternative 52 | > , (<|>), empty, some, many, optional, choice 53 | > , chainl, chainl1, chainr, chainr1 54 | > , prefix, postfix 55 | 56 | > -- Monoidal 57 | > , unit, mult, (<~>), (<~), (~>) 58 | > 59 | > -- Monad 60 | > , return, (>>=) 61 | > 62 | > -- Miscellaneous 63 | > , item, look, eof, char, string, satisfy 64 | > , oneOf, noneOf, sepBy, sepBy1 65 | > , (<:>), between 66 | > 67 | > , cull 68 | > , try -- not needed, but here for historic reasons 69 | > 70 | > ) where 71 | 72 | ``` 73 | 74 | We have to import some classes whose instances we will be 75 | implementing for our parsers. 76 | ```lhs 77 | 78 | > import Control.Monad 79 | > import Control.Applicative 80 | > import Data.List 81 | 82 | ``` 83 | 84 | Parser 85 | ====== 86 | 87 | Our parsers will take in a `String` and produce a list of possible 88 | parses, along with remaining unparsed strings. 89 | ```lhs 90 | 91 | > newtype Parser a = Parser (String -> [(a, String)]) 92 | 93 | > parse :: Parser a -> (String -> [(a, String)]) 94 | > parse (Parser p) = p 95 | 96 | ``` 97 | 98 | ``` 99 | 100 | > parseIO :: Parser a -> String -> IO a 101 | > parseIO p fileName = do 102 | > file <- readFile fileName 103 | > let Just result = parseMaybe p file 104 | > return result 105 | 106 | > parseMaybe :: Parser a -> String -> Maybe a 107 | > parseMaybe px ts = case parse px ts of 108 | > [] -> Nothing 109 | > ((x, ts'):txs) -> Just x 110 | 111 | ``` 112 | This parser tries to push out a character from the incoming stream. It 113 | fails to parse if there is no remaining input. 
114 | ```lhs 115 | 116 | > item :: Parser Char 117 | > item = Parser (\ts -> case ts of 118 | > [] -> [] 119 | > (t:ts') -> [(t, ts')]) 120 | 121 | ``` 122 | Now we implement Luke, I mean, look: 123 | ```lhs 124 | 125 | > look :: Parser String 126 | > look = Parser (\ts -> [(ts, ts)]) 127 | 128 | ``` 129 | It is also useful to know if we have reached the end of the input: 130 | ```lhs 131 | 132 | > eof :: Parser () 133 | > eof = Parser (\ts -> case ts of 134 | > [] -> [((), ts)] 135 | > _ -> []) 136 | 137 | ``` 138 | At this stage, we can output what has been given to us on the input, 139 | but we have no way to change the outcome of what we do based on that 140 | input. 141 | 142 | We'll now start climbing the class hierarchy. Each class provides its 143 | own ways of combining and working with parsers, and extends the power 144 | of our combinator language with new functionality. 145 | 146 | 147 | Functor 148 | ======= 149 | 150 | The functor instance captures the idea of modifying the output of 151 | successful parses. 152 | ```lhs 153 | 154 | > instance Functor Parser where 155 | > fmap :: (a -> b) -> Parser a -> Parser b 156 | > fmap f (Parser px) = Parser (\ts -> [ (f x, ts') | (x, ts') <- px ts]) 157 | 158 | ``` 159 | Derived combinators: 160 | ```lhs 161 | 162 | < (<$>) :: Functor f => (a -> b) -> f a -> f b 163 | < (<$>) = fmap 164 | < 165 | < (<$) :: Functor f => a -> f b -> f a 166 | < x <$ py = const x <$> py 167 | 168 | > skip :: Functor f => f a -> f () 169 | > skip px = () <$ px 170 | 171 | ``` 172 | 173 | Applicative 174 | =========== 175 | 176 | The applicative instance shows how parsers can be chained together. 
177 | ```lhs 178 | 179 | > instance Applicative Parser where 180 | > pure :: a -> Parser a 181 | > pure x = Parser (\ts -> [(x, ts)]) 182 | > 183 | > (<*>) :: Parser (a -> b) -> Parser a -> Parser b 184 | > Parser pf <*> Parser px = 185 | > Parser (\ts -> [ (f x, ts'') | (f, ts') <- pf ts 186 | > , (x, ts'') <- px ts']) 187 | 188 | ``` 189 | Derived combinators: 190 | ```lhs 191 | 192 | < (<*) :: Applicative f => f a -> f b -> f a 193 | < px <* py = const <$> px <*> py 194 | < 195 | < (*>) :: Applicative f => f a -> f b -> f b 196 | < px *> py = flip const <$> px <*> py 197 | < -- = id <$ px <*> py 198 | < 199 | < (<**>) :: Applicative f => f a -> f (a -> b) -> f b 200 | < px <**> pf = (flip ($)) <$> px <*> pf 201 | 202 | 203 | > (<:>) :: Applicative f => f a -> f [a] -> f [a] 204 | > px <:> pxs = (:) <$> px <*> pxs 205 | 206 | > between :: Applicative f => f open -> f close -> f a -> f a 207 | > between popen pclose px = popen *> px <* pclose 208 | 209 | ``` 210 | 211 | Monoidal 212 | ======== 213 | 214 | An equivalent alternative class to `Applicative` is `Monoidal`. 215 | ```lhs 216 | 217 | > class Functor f => Monoidal f where 218 | > unit :: f () 219 | > mult :: f a -> f b -> f (a, b) 220 | 221 | > instance Monoidal Parser where 222 | 223 | ``` 224 | The `unit` parser returns `()` without parsing any input. 225 | ```lhs 226 | 227 | > unit :: Parser () 228 | > unit = Parser (\ts -> [((), ts)]) 229 | 230 | ``` 231 | For example: 232 | ```lhs 233 | 234 | < parse (unit) "Hello" = [((), "Hello")] 235 | 236 | ``` 237 | The `mult` combinator takes two parsers `px` and `py` and returns 238 | pairs of values containing the results of parsing `px` followed by 239 | `py`. 
240 | ```lhs 241 | 242 | > mult :: Parser a -> Parser b -> Parser (a, b) 243 | > mult (Parser px) (Parser py) = 244 | > Parser (\ts -> [((x, y), ts'') | (x, ts') <- px ts 245 | > , (y, ts'') <- py ts']) 246 | 247 | ``` 248 | It is convenient to write this as a binary operator: 249 | ```lhs 250 | 251 | > (<~>) :: Monoidal f => f a -> f b -> f (a, b) 252 | > px <~> py = mult px py 253 | 254 | ``` 255 | The following derived combinators project out an element of the pair: 256 | ```lhs 257 | 258 | > (<~) :: Monoidal f => f a -> f b -> f a 259 | > px <~ py = fst <$> px <~> py 260 | > 261 | > (~>) :: Monoidal f => f a -> f b -> f b 262 | > px ~> py = snd <$> px <~> py 263 | 264 | ``` 265 | The combinators for `Applicative` and `Monoidal` can be defined in 266 | terms of one another. 267 | ```lhs 268 | 269 | < pure x = const x <$> unit 270 | < pf <*> px = uncurry ($) <$> (pf <~> px) 271 | 272 | < unit = pure () 273 | < mult px py = (,) <$> px <*> py 274 | 275 | < px <* py = px <~ py 276 | < px *> py = px ~> py 277 | 278 | ``` 279 | 280 | Alternative 281 | =========== 282 | 283 | Choices between parsers are given by the `Alternative` class. This 284 | class assumes that the given Parser is already `Applicative`. 285 | ```lhs 286 | 287 | > instance Alternative Parser where 288 | > empty :: Parser a 289 | > empty = Parser (\ts -> []) 290 | > 291 | > (<|>) :: Parser a -> Parser a -> Parser a 292 | > Parser px <|> Parser py = Parser (\ts -> px ts ++ py ts) 293 | 294 | ``` 295 | 296 | Derived combinators 297 | ------------------- 298 | 299 | A simple convenience function that offers a choice between a list of 300 | parsers is given by `choice`: 301 | ```lhs 302 | 303 | > choice :: Alternative f => [f a] -> f a 304 | > choice = foldr (<|>) empty 305 | 306 | ``` 307 | 308 | It's useful to repeat a parser multiple times. The `some px` parser 309 | parses one or more instances of `px`, whereas the `many px` parser 310 | parses zero or more instances of `px`.
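Because `<|>` keeps the results of both of its arguments, repetition gives one result for every point at which the parser could stop. The standalone sketch below illustrates this. It re-declares a cut-down copy of `Parser` so that it compiles without this module, and it relies on the default `some` and `many` from `Control.Applicative`. The names `manyAs` and `someAs` are hypothetical and exist only for this illustration.

```haskell
import Control.Applicative (Alternative (..))

-- Cut-down copy of the list-of-successes parser, re-declared here only so
-- that this sketch is self-contained.
newtype Parser a = Parser (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (Parser p) = p

instance Functor Parser where
  fmap f (Parser px) = Parser (\ts -> [(f x, ts') | (x, ts') <- px ts])

instance Applicative Parser where
  pure x = Parser (\ts -> [(x, ts)])
  Parser pf <*> Parser px =
    Parser (\ts -> [(f x, ts'') | (f, ts') <- pf ts, (x, ts'') <- px ts'])

instance Alternative Parser where
  empty = Parser (const [])
  Parser px <|> Parser py = Parser (\ts -> px ts ++ py ts)

-- Accept exactly the given character.
char :: Char -> Parser Char
char c = Parser (\ts -> case ts of
  (t:ts') | t == c -> [(t, ts')]
  _                -> [])

-- Hypothetical names: every way of reading zero or more (respectively one or
-- more) 'a' characters from the front of "aab".
manyAs, someAs :: [(String, String)]
manyAs = parse (many (char 'a')) "aab"
someAs = parse (some (char 'a')) "aab"
```

Here `manyAs` evaluates to `[("aa","b"),("a","ab"),("","aab")]` and `someAs` to `[("aa","b"),("a","ab")]`: the longest match comes first, and `many` additionally offers the empty match.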
311 | ```lhs 312 | 313 | < some :: Alternative f => f a -> f [a] 314 | < some px = px <:> many px 315 | < 316 | < many :: Alternative f => f a -> f [a] 317 | < many px = some px <|> pure [] 318 | 319 | ``` 320 | 321 | The `optional` combinator wraps the result of `v` in `Just`, and also offers `Nothing` without consuming any input: 322 | ```lhs 323 | < optional :: Alternative f => f a -> f (Maybe a) 324 | < optional v = Just <$> v <|> pure Nothing 325 | ``` 326 | The following combinators build on `many` to parse operator chains, prefix and postfix operators, and separated lists: 327 | ```lhs 328 | 329 | > chainl :: Alternative f => f a -> f (a -> a -> a) -> a -> f a 330 | > chainl px pf x = chainl1 px pf <|> pure x 331 | 332 | > chainl1 :: Alternative f => f a -> f (a -> a -> a) -> f a 333 | > chainl1 px pf = foldl' (flip ($)) <$> px <*> (many (flip <$> pf <*> px)) 334 | 335 | > chainr :: Alternative f => f a -> f (a -> a -> a) -> a -> f a 336 | > chainr px pf x = chainr1 px pf <|> pure x 337 | 338 | > chainr1 :: Alternative f => f a -> f (a -> a -> a) -> f a 339 | > chainr1 px pf = flip (foldr ($)) <$> (many (px <**> pf)) <*> px 340 | 341 | > prefix :: Alternative f => f (a -> a) -> f a -> f a 342 | > prefix op p = flip (foldr ($)) <$> many op <*> p 343 | 344 | > postfix :: Alternative f => f a -> f (a -> a) -> f a 345 | > postfix p op = foldl (flip ($)) <$> p <*> many op 346 | 347 | > sepBy :: Alternative f => f a -> f sep -> f [a] 348 | > sepBy px psep = sepBy1 px psep <|> pure [] 349 | > 350 | > sepBy1 :: Alternative f => f a -> f sep -> f [a] 351 | > sepBy1 px psep = px <:> (many (psep *> px)) 352 | 353 | ``` 354 | 355 | Monad 356 | ===== 357 | 358 | The monad instance allows the result of one parser to determine which 359 | parser is run on the rest of the input. 360 | ```lhs 361 | 362 | > instance Monad Parser where 363 | > return :: a -> Parser a 364 | > return ofTheJedi = pure ofTheJedi -- sorry, I couldn't help it.
365 | > 366 | > (>>=) :: Parser a -> (a -> Parser b) -> Parser b 367 | > Parser px >>= f = Parser (\ts -> concat [ parse (f x) ts' | (x, ts') <- px ts ]) 368 | 369 | ``` 370 | Satisfy 371 | ======= 372 | 373 | The `satisfy` parser accepts characters that satisfy a given 374 | predicate. It can be derived from the monadic interface as 375 | follows: 376 | 377 | 378 | 379 | ```lhs 380 | 381 | < satisfy :: (Char -> Bool) -> Parser Char 382 | < satisfy p = item >>= \t -> if p t then pure t else empty 383 | 384 | ``` 385 | 386 | More directly, we can define it without the monadic combinators: 387 | 388 | ```lhs 389 | 390 | > satisfy :: (Char -> Bool) -> Parser Char 391 | > satisfy p = Parser (\ts -> case ts of 392 | > [] -> [] 393 | > (t:ts') -> [(t, ts') | p t]) 394 | > 395 | > oneOf :: [Char] -> Parser Char 396 | > oneOf = satisfy . flip elem 397 | > 398 | > noneOf :: [Char] -> Parser Char 399 | > noneOf cs = satisfy (not . flip elem cs) 400 | 401 | ``` 402 | Using `satisfy` we can build a useful array of smaller parsers, such 403 | as one for recognising a particular character, or a particular string. 404 | ```lhs 405 | 406 | > char :: Char -> Parser Char 407 | > char c = satisfy (c ==) 408 | 409 | > 410 | > string :: String -> Parser String 411 | > string [] = return "" 412 | > string (c:cs) = char c <:> string cs 413 | 414 | ``` 415 | 416 | Miscellaneous 417 | ============= 418 | 419 | The `cull` combinator keeps only the first result of a parse and discards the rest. 420 | ```lhs 421 | 422 | > cull :: Parser a -> Parser a 423 | > cull (Parser px) = Parser (\ts -> take 1 (px ts)) 424 | 425 | ``` 426 | 427 | 428 | There is a try after all, but it is only here to make this work with 429 | code written for other members of the Parsec family.
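To see why `try` has nothing left to do, consider two alternatives that share a prefix. In a library that commits to a branch once it has consumed input, the second branch would need the first to be wrapped in `try`. Here both branches are simply run on the same input. The following is a minimal standalone sketch under that reading: it re-declares a cut-down `Parser` so that it compiles alone, defines `string` with `traverse` rather than as in this file, and uses the hypothetical names `abOrAc`, `sharedPrefix` and `wholeInput`.

```haskell
import Control.Applicative (Alternative (..))

-- Cut-down copy of the list-of-successes parser, re-declared here only so
-- that this sketch is self-contained.
newtype Parser a = Parser (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (Parser p) = p

instance Functor Parser where
  fmap f (Parser px) = Parser (\ts -> [(f x, ts') | (x, ts') <- px ts])

instance Applicative Parser where
  pure x = Parser (\ts -> [(x, ts)])
  Parser pf <*> Parser px =
    Parser (\ts -> [(f x, ts'') | (f, ts') <- pf ts, (x, ts'') <- px ts'])

instance Alternative Parser where
  empty = Parser (const [])
  Parser px <|> Parser py = Parser (\ts -> px ts ++ py ts)

-- Accept exactly the given character.
char :: Char -> Parser Char
char c = Parser (\ts -> case ts of
  (t:ts') | t == c -> [(t, ts')]
  _                -> [])

-- Accept exactly the given string, one character at a time.
string :: String -> Parser String
string = traverse char

-- Succeed only when no input remains.
eof :: Parser ()
eof = Parser (\ts -> [((), ts) | null ts])

-- Hypothetical example: both branches start with 'a'. The first branch fails
-- after reading 'a', yet the second still starts from the original input.
abOrAc :: Parser String
abOrAc = string "ab" <|> string "ac"

sharedPrefix :: [(String, String)]
sharedPrefix = parse abOrAc "ac"

-- Requiring end of input filters out results that stop too early.
wholeInput :: [(String, String)]
wholeInput = parse ((string "let" <|> string "letrec") <* eof) "letrec"
```

Both `sharedPrefix` and `wholeInput` contain exactly one result, `("ac","")` and `("letrec","")` respectively, and no `try` was needed to obtain either.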
430 | ```lhs 431 | 432 | > try :: Parser a -> Parser a 433 | > try = id 434 | 435 | ``` 436 | 437 | 438 | 439 | Pronunciation /prəˌnʌnsɪˈeɪʃ(ə)n/ 440 | ==================================== 441 | 442 | Most of the symbols in this file are not easily pronounced, so let's establish 443 | some nomenclature. 444 | 445 | Symbol Name 446 | 447 | <$> fmap 448 | <$ const fmap 449 | 450 | <*> tie fighter, or just "tie", ap 451 | <* tie left 452 | *> tie right 453 | <**> tie bomber, pa 454 | 455 | >>= bind 456 | 457 | <|> or 458 | 459 | <:> lift cons 460 | 461 | --------------------------------------------------------------------------------