├── .gitignore
├── README.md
├── elm-package.json
├── LICENSE
├── comparison.md
└── src
    ├── Parser
        ├── Internal.elm
        ├── LowLevel.elm
        └── LanguageKit.elm
    └── Parser.elm


/.gitignore:
--------------------------------------------------------------------------------
1 | elm-stuff


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Moved to [elm/parser](https://github.com/elm/parser)
2 | 


--------------------------------------------------------------------------------
/elm-package.json:
--------------------------------------------------------------------------------
 1 | {
 2 |     "version": "2.0.1",
 3 |     "summary": "a parsing library, focused on simplicity and great error messages",
 4 |     "repository": "https://github.com/elm-tools/parser.git",
 5 |     "license": "BSD-3-Clause",
 6 |     "source-directories": [
 7 |         "src"
 8 |     ],
 9 |     "exposed-modules": [
10 |         "Parser",
11 |         "Parser.LanguageKit",
12 |         "Parser.LowLevel"
13 |     ],
14 |     "dependencies": {
15 |         "elm-lang/core": "5.1.0 <= v < 6.0.0",
16 |         "elm-tools/parser-primitives": "1.0.0 <= v < 2.0.0"
17 |     },
18 |     "elm-version": "0.18.0 <= v < 0.19.0"
19 | }
20 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | Copyright (c) 2017-present, Evan Czaplicki
 2 | All rights reserved.
 3 | 
 4 | Redistribution and use in source and binary forms, with or without
 5 | modification, are permitted provided that the following conditions are met:
 6 | 
 7 | * Redistributions of source code must retain the above copyright notice, this
 8 |   list of conditions and the following disclaimer.
 9 | 
10 | * Redistributions in binary form must reproduce the above copyright notice,
11 |   this list of conditions and the following disclaimer in the documentation
12 |   and/or other materials provided with the distribution.
13 | 
14 | * Neither the name of the {organization} nor the names of its
15 |   contributors may be used to endorse or promote products derived from
16 |   this software without specific prior written permission.
17 | 
18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
19 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
20 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
21 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
22 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
23 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
24 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
25 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
26 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
27 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
28 | 


--------------------------------------------------------------------------------
/comparison.md:
--------------------------------------------------------------------------------
 1 | ## Comparison with Prior Work
 2 | 
 3 | I have not seen the [parser pipeline][1] or the [context stack][2] ideas in other libraries, but [delayed commits][3] relate to prior work.
 4 | 
 5 | [1]: README.md#parser-pipelines
 6 | [2]: README.md#tracking-context
 7 | [3]: README.md#delayed-commits
 8 | 
 9 | Most parser combinator libraries I have seen are based on Haskell’s Parsec library, which has primitives named `try` and `lookAhead`. I believe [`delayedCommitMap`][delayedCommitMap] is a better primitive for two reasons.
10 | 
11 | [delayedCommitMap]: http://package.elm-lang.org/packages/elm-tools/parser/latest/Parser#delayedCommitMap
12 | 
13 | 
14 | ### Performance and Composition
15 | 
16 | Say we want to create a precise error message for `length [1,,3]`. The naive approach with Haskell’s Parsec library produces very bad error messages:
17 | 
18 | ```haskell
19 | spaceThenArg :: Parser Expr
20 | spaceThenArg =
21 |   try (spaces >> term)
22 | ```
23 | 
24 | This means we get a precise error from `term`, but then throw it away and say something went wrong at the space before the `[`. Very confusing! To improve quality, we must write something like this:
25 | 
26 | ```haskell
27 | spaceThenArg :: Parser Expr
28 | spaceThenArg =
29 |   choice
30 |     [ do  lookAhead (spaces >> char '[')
31 |           spaces
32 |           term
33 |     , try (spaces >> term)
34 |     ]
35 | ```
36 | 
37 | Notice that we parse `spaces` twice no matter what.
38 | 
39 | Notice that we also had to hardcode `[` in the `lookAhead`. What if we update `term` to parse records that start with `{` as well? To get good commits on records, we must remember to update `lookAhead` to look for `oneOf "[{"`. Implementation details are leaking out of `term`!
40 | 
41 | With `delayedCommit` in this Elm library, you can just say:
42 | 
43 | ```elm
44 | spaceThenArg : Parser Expr
45 | spaceThenArg =
46 |   delayedCommit spaces term
47 | ```
48 | 
49 | It does less work, and is more reliable as `term` evolves. I believe `delayedCommit` makes `lookAhead` pointless.
50 | 
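As a rough, self-contained sketch of the commit behavior described above, the following uses only functions exported by this package's `Parser` module. The `Arg` type and the `arg` and `args` parsers are hypothetical stand-ins for `Expr` and `term`, introduced only for illustration.

```elm
import Char
import Parser exposing (Parser, delayedCommit, ignore, int, keep, map, oneOf, oneOrMore, repeat, zeroOrMore)


-- Hypothetical stand-in for `Expr`, for illustration only.
type Arg
  = Number Int
  | Name String


spaces : Parser ()
spaces =
  ignore oneOrMore (\char -> char == ' ')


-- Hypothetical stand-in for `term`.
arg : Parser Arg
arg =
  oneOf
    [ map Number int
    , map Name (keep oneOrMore Char.isLower)
    ]


-- If `spaces` fails, or `arg` fails without consuming anything, the
-- parser backtracks to where it started. As soon as `arg` makes
-- progress, the parser is committed and later failures keep their
-- precise position.
spaceThenArg : Parser Arg
spaceThenArg =
  delayedCommit spaces arg


-- Zero or more space-prefixed arguments.
args : Parser (List Arg)
args =
  repeat zeroOrMore spaceThenArg
```

Note that nothing in `spaceThenArg` mentions what `arg` starts with, so adding another alternative to `arg` should not require touching it.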
51 | 
52 | ### Expressiveness
53 | 
54 | You can define `try` in terms of [`delayedCommitMap`][delayedCommitMap] like this:
55 | 
56 | ```elm
57 | try : Parser a -> Parser a
58 | try parser =
59 |   delayedCommitMap always parser (succeed ())
60 | ```
61 | 
62 | No expressiveness is lost!
63 | 
64 | While it is possible to define `try`, I left it out of this package. In practice, `try` often leads to “bad commits” where your parser fails in a very specific way, but you then backtrack to a less specific error message. I considered naming it `allOrNothing` to better explain how it changes commit behavior, but ultimately, I thought it was best to encourage users to express their parsers with `delayedCommit` directly.
65 | 
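To make the “bad commit” concrete, here is a small sketch that contrasts a `try`-wrapped parser with the same parser left unwrapped. The names `bracketInt`, `withTry`, and `withoutTry` are hypothetical, and `try` is the definition given earlier in this section, not something exported by this package.

```elm
import Parser exposing (Parser, (|.), (|=), delayedCommitMap, int, succeed, symbol)


-- `try` as defined earlier in this section; not exported by the package.
try : Parser a -> Parser a
try parser =
  delayedCommitMap always parser (succeed ())


-- Hypothetical example parser: a `[` followed by an integer.
bracketInt : Parser Int
bracketInt =
  succeed identity
    |. symbol "["
    |= int


-- On input like "[x", the failure from `int` is reported back at the
-- position where `try` started, because `try` undoes all progress
-- when its argument fails.
withTry : Parser Int
withTry =
  try bracketInt


-- On the same input, the failure stays where `int` gave up, just
-- after the `[`, which is the more useful location to report.
withoutTry : Parser Int
withoutTry =
  bracketInt
```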
66 | 
67 | ### Summary
68 | 
69 | Compared to previous work, `delayedCommit` lets you produce precise error messages **more efficiently**. By thinking about “commit behavior” directly, you also end up with **cleaner composition** of parsers. And these benefits come **without any loss of expressiveness**.
70 | 


--------------------------------------------------------------------------------
/src/Parser/Internal.elm:
--------------------------------------------------------------------------------
  1 | module Parser.Internal exposing
  2 |   ( Parser(..)
  3 |   , Step(..)
  4 |   , State
  5 |   , chomp
  6 |   , chompDigits
  7 |   , chompDotAndExp
  8 |   , isBadIntEnd
  9 |   )
 10 | 
 11 | 
 12 | import Char
 13 | import ParserPrimitives as Prim
 14 | 
 15 | 
 16 | 
 17 | -- PARSERS
 18 | 
 19 | 
 20 | type Parser ctx x a =
 21 |   Parser (State ctx -> Step ctx x a)
 22 | 
 23 | 
 24 | type Step ctx x a
 25 |   = Good a (State ctx)
 26 |   | Bad x (State ctx)
 27 | 
 28 | 
 29 | type alias State ctx =
 30 |   { source : String
 31 |   , offset : Int
 32 |   , indent : Int
 33 |   , context : List ctx
 34 |   , row : Int
 35 |   , col : Int
 36 |   }
 37 | 
 38 | 
 39 | 
 40 | -- CHOMPERS
 41 | 
 42 | 
 43 | chomp : (Char -> Bool) -> Int -> String -> Int
 44 | chomp isGood offset source =
 45 |   let
 46 |     newOffset =
 47 |       Prim.isSubChar isGood offset source
 48 |   in
 49 |     if newOffset < 0 then
 50 |       offset
 51 | 
 52 |     else
 53 |       chomp isGood newOffset source
 54 | 
 55 | 
 56 | 
 57 | -- CHOMP DIGITS
 58 | 
 59 | 
 60 | chompDigits : (Char -> Bool) -> Int -> String -> Result Int Int
 61 | chompDigits isValidDigit offset source =
 62 |   let
 63 |     newOffset =
 64 |       chomp isValidDigit offset source
 65 |   in
 66 |     -- no digits
 67 |     if newOffset == offset then
 68 |       Err newOffset
 69 | 
 70 |     -- ends with non-digit characters
 71 |     else if Prim.isSubChar isBadIntEnd newOffset source /= -1 then
 72 |       Err newOffset
 73 | 
 74 |     -- all valid digits!
 75 |     else
 76 |       Ok newOffset
 77 | 
 78 | 
 79 | isBadIntEnd : Char -> Bool
 80 | isBadIntEnd char =
 81 |   Char.isDigit char
 82 |   || Char.isUpper char
 83 |   || Char.isLower char
 84 |   || char == '.'
 85 | 
 86 | 
 87 | 
 88 | -- CHOMP FLOAT STUFF
 89 | 
 90 | 
 91 | chompDotAndExp : Int -> String -> Result Int Int
 92 | chompDotAndExp offset source =
 93 |   let
 94 |     dotOffset =
 95 |       Prim.isSubChar isDot offset source
 96 |   in
 97 |     if dotOffset == -1 then
 98 |       chompExp offset source
 99 | 
100 |     else
101 |       chompExp (chomp Char.isDigit dotOffset source) source
102 | 
103 | 
104 | isDot : Char -> Bool
105 | isDot char =
106 |   char == '.'
107 | 
108 | 
109 | chompExp : Int -> String -> Result Int Int
110 | chompExp offset source =
111 |   let
112 |     eOffset =
113 |       Prim.isSubChar isE offset source
114 |   in
115 |     if eOffset == -1 then
116 |       Ok offset
117 | 
118 |     else
119 |       let
120 |         opOffset =
121 |           Prim.isSubChar isPlusOrMinus eOffset source
122 | 
123 |         expOffset =
124 |           if opOffset == -1 then eOffset else opOffset
125 |       in
126 |         if Prim.isSubChar isZero expOffset source /= -1 then
127 |           Err expOffset
128 | 
129 |         else if Prim.isSubChar Char.isDigit expOffset source == -1 then
130 |           Err expOffset
131 | 
132 |         else
133 |           chompDigits Char.isDigit expOffset source
134 | 
135 | 
136 | isE : Char -> Bool
137 | isE char =
138 |   char == 'e' || char == 'E'
139 | 
140 | 
141 | isZero : Char -> Bool
142 | isZero char =
143 |   char == '0'
144 | 
145 | 
146 | isPlusOrMinus : Char -> Bool
147 | isPlusOrMinus char =
148 |   char == '+' || char == '-'
149 | 
150 | 


--------------------------------------------------------------------------------
/src/Parser/LowLevel.elm:
--------------------------------------------------------------------------------
  1 | module Parser.LowLevel exposing
  2 |   ( getIndentLevel
  3 |   , withIndentLevel
  4 | 
  5 |   , getPosition
  6 |   , getRow
  7 |   , getCol
  8 | 
  9 |   , getOffset
 10 |   , getSource
 11 |   )
 12 | 
 13 | {-| You are unlikely to need any of this under normal circumstances.
 14 | 
 15 | # Indentation
 16 | @docs getIndentLevel, withIndentLevel
 17 | 
 18 | # Row, Column, Offset, and Source
 19 | @docs getPosition, getRow, getCol, getOffset, getSource
 20 | 
 21 | -}
 22 | 
 23 | import Parser exposing (Parser)
 24 | import Parser.Internal as I exposing (State)
 25 | 
 26 | 
 27 | 
 28 | -- INDENTATION
 29 | 
 30 | 
 31 | {-| This parser tracks “indentation level” so you can parse
 32 | indentation-sensitive languages. Indentation levels correspond to column
 33 | numbers, so they start at 1.
 34 | -}
 35 | getIndentLevel : Parser Int
 36 | getIndentLevel =
 37 |   I.Parser <| \state -> I.Good state.indent state
 38 | 
 39 | 
 40 | {-| Run a parser with a given indentation level. You will likely
 41 | use `getCol` to get the current column, then use `andThen` to pass it
 42 | to `withIndentLevel`.
 43 | -}
 44 | withIndentLevel : Int -> Parser a -> Parser a
 45 | withIndentLevel newIndent (I.Parser parse) =
 46 |   I.Parser <| \state1 ->
 47 |     case parse (changeIndent newIndent state1) of
 48 |       I.Good a state2 ->
 49 |         I.Good a (changeIndent state1.indent state2)
 50 | 
 51 |       I.Bad x state2 ->
 52 |         I.Bad x (changeIndent state1.indent state2)
 53 | 
 54 | 
 55 | changeIndent : Int -> State ctx -> State ctx
 56 | changeIndent newIndent { source, offset, context, row, col } =
 57 |   { source = source
 58 |   , offset = offset
 59 |   , indent = newIndent
 60 |   , context = context
 61 |   , row = row
 62 |   , col = col
 63 |   }
 64 | 
 65 | 
 66 | 
 67 | -- POSITION
 68 | 
 69 | 
 70 | {-| Code editors treat code like a grid. There are rows and columns.
 71 | In most editors, rows and columns are 1-indexed. You move to a new row
 72 | whenever you see a `\n` character.
 73 | 
 74 | The `getPosition` parser succeeds with your current row and column
 75 | within the string you are parsing.
 76 | -}
 77 | getPosition : Parser (Int, Int)
 78 | getPosition =
 79 |   I.Parser <| \state -> I.Good (state.row, state.col) state
 80 | 
 81 | 
 82 | {-| The `getRow` parser succeeds with your current row within
 83 | the string you are parsing.
 84 | -}
 85 | getRow : Parser Int
 86 | getRow =
 87 |   I.Parser <| \state -> I.Good state.row state
 88 | 
 89 | 
 90 | {-| The `getCol` parser succeeds with your current column within
 91 | the string you are parsing.
 92 | -}
 93 | getCol : Parser Int
 94 | getCol =
 95 |   I.Parser <| \state -> I.Good state.col state
 96 | 
 97 | 
 98 | {-| Editors think of code as a grid, but behind the scenes it is just
 99 | a flat array of UTF16 characters. `getOffset` tells you your index in
100 | that flat array. So if you have read `"\n\n\n\n"` you are on row 5,
101 | column 1, and offset 4.
102 | 
103 | **Note:** Browsers use UTF-16 strings, so a character may take one or two
104 | 16-bit words. This means you can read 4 characters while your offset moves by as much as 8.
105 | -}
106 | getOffset : Parser Int
107 | getOffset =
108 |   I.Parser <| \state -> I.Good state.offset state
109 | 
110 | 
111 | {-| Get the entire string you are parsing right now. Paired with
112 | `getOffset` this can let you use `String.slice` to grab substrings
113 | with very little intermediate allocation.
114 | -}
115 | getSource : Parser String
116 | getSource =
117 |   I.Parser <| \state -> I.Good state.source state
118 | 
119 | 


--------------------------------------------------------------------------------
/src/Parser/LanguageKit.elm:
--------------------------------------------------------------------------------
  1 | module Parser.LanguageKit exposing
  2 |   ( variable
  3 |   , list, record, tuple, sequence, Trailing(..)
  4 |   , whitespace, LineComment(..), MultiComment(..)
  5 |   )
  6 | 
  7 | 
  8 | {-|
  9 | 
 10 | # Variables
 11 | @docs variable
 12 | 
 13 | # Lists, records, and that sort of thing
 14 | @docs list, record, tuple, sequence, Trailing
 15 | 
 16 | # Whitespace
 17 | @docs whitespace, LineComment, MultiComment
 18 | 
 19 | -}
 20 | 
 21 | 
 22 | import Set exposing (Set)
 23 | import Parser exposing (..)
 24 | import Parser.Internal as I exposing (Step(..), State)
 25 | import ParserPrimitives as Prim
 26 | 
 27 | 
 28 | 
 29 | -- VARIABLES
 30 | 
 31 | 
 32 | {-| Create a parser for variables. It takes two `Char` checkers. The
 33 | first one is for the first character. The second one is for all the
 34 | other characters.
 35 | 
 36 | In Elm, we distinguish between upper and lower case variables, so we
 37 | can do something like this:
 38 | 
 39 |     import Char
 40 |     import Parser exposing (..)
 41 |     import Parser.LanguageKit exposing (variable)
 42 |     import Set
 43 | 
 44 |     lowVar : Parser String
 45 |     lowVar =
 46 |       variable Char.isLower isVarChar keywords
 47 | 
 48 |     capVar : Parser String
 49 |     capVar =
 50 |       variable Char.isUpper isVarChar keywords
 51 | 
 52 |     isVarChar : Char -> Bool
 53 |     isVarChar char =
 54 |       Char.isLower char
 55 |       || Char.isUpper char
 56 |       || Char.isDigit char
 57 |       || char == '_'
 58 | 
 59 |     keywords : Set.Set String
 60 |     keywords =
 61 |       Set.fromList [ "let", "in", "case", "of" ]
 62 | -}
 63 | variable : (Char -> Bool) -> (Char -> Bool) -> Set String -> Parser String
 64 | variable isFirst isOther keywords =
 65 |   I.Parser <| \({ source, offset, indent, context, row, col } as state1) ->
 66 |     let
 67 |       firstOffset =
 68 |         Prim.isSubChar isFirst offset source
 69 |     in
 70 |       if firstOffset == -1 then
 71 |         Bad ExpectingVariable state1
 72 | 
 73 |       else
 74 |         let
 75 |           state2 =
 76 |             if firstOffset == -2 then
 77 |               varHelp isOther (offset + 1) (row + 1) 1 source indent context
 78 |             else
 79 |               varHelp isOther firstOffset row (col + 1) source indent context
 80 | 
 81 |           name =
 82 |             String.slice offset state2.offset source
 83 |         in
 84 |           if Set.member name keywords then
 85 |             Bad ExpectingVariable state1
 86 | 
 87 |           else
 88 |             Good name state2
 89 | 
 90 | 
 91 | varHelp : (Char -> Bool) -> Int -> Int -> Int -> String -> Int -> List ctx -> State ctx
 92 | varHelp isGood offset row col source indent context =
 93 |   let
 94 |     newOffset =
 95 |       Prim.isSubChar isGood offset source
 96 |   in
 97 |     if newOffset == -1 then
 98 |       { source = source
 99 |       , offset = offset
100 |       , indent = indent
101 |       , context = context
102 |       , row = row
103 |       , col = col
104 |       }
105 | 
106 |     else if newOffset == -2 then
107 |       varHelp isGood (offset + 1) (row + 1) 1 source indent context
108 | 
109 |     else
110 |       varHelp isGood newOffset row (col + 1) source indent context
111 | 
112 | 
113 | 
114 | -- SEQUENCES
115 | 
116 | 
117 | {-| Parse a comma-separated list like `[ 1, 2, 3 ]`. You provide
118 | a parser for the spaces and for the list items. So if you want
119 | to parse a list of integers, you would say:
120 | 
121 |     import Parser exposing (Parser, zeroOrMore)
122 |     import Parser.LanguageKit as Parser
123 | 
124 |     intList : Parser (List Int)
125 |     intList =
126 |       Parser.list spaces Parser.int
127 | 
128 |     spaces : Parser ()
129 |     spaces =
130 |       Parser.ignore zeroOrMore (\char -> char == ' ')
131 | 
132 |     -- run intList "[]"            == Ok []
133 |     -- run intList "[ ]"           == Ok []
134 |     -- run intList "[1,2,3]"       == Ok [1,2,3]
135 |     -- run intList "[ 1, 2, 3 ]"   == Ok [1,2,3]
136 |     -- run intList "[ 1 , 2 , 3 ]" == Ok [1,2,3]
137 |     -- run intList "[ 1, 2, 3, ]"  == Err ...
138 |     -- run intList "[, 1, 2, 3 ]"  == Err ...
139 | 
140 | **Note:** If you want trailing commas, check out the
141 | [`sequence`](#sequence) function.
142 | -}
143 | list : Parser () -> Parser a -> Parser (List a)
144 | list spaces item =
145 |   sequence
146 |     { start = "["
147 |     , separator = ","
148 |     , end = "]"
149 |     , spaces = spaces
150 |     , item = item
151 |     , trailing = Forbidden
152 |     }
153 | 
154 | 
155 | {-| Help parse records like `{ a = 2, b = 2 }`. You provide
156 | a parser for the spaces and for the record fields. For example:
157 | 
158 |     import Parser exposing ( Parser, (|.), (|=), zeroOrMore )
159 |     import Parser.LanguageKit as Parser
160 | 
161 |     record : Parser (List (String, Int))
162 |     record =
163 |       Parser.record spaces field
164 | 
165 |     field : Parser (String, Int)
166 |     field =
167 |       Parser.succeed (,)
168 |         |= lowVar
169 |         |. spaces
170 |         |. Parser.symbol "="
171 |         |. spaces
172 |         |= Parser.int
173 | 
174 |     spaces : Parser ()
175 |     spaces =
176 |       Parser.ignore zeroOrMore (\char -> char == ' ')
177 | 
178 |     -- run record "{}"               == Ok []
179 |     -- run record "{ }"              == Ok []
180 |     -- run record "{ x = 3 }"        == Ok [ ("x",3) ]
181 |     -- run record "{ x = 3, }"       == Err ...
182 |     -- run record "{ x = 3, y = 4 }" == Ok [ ("x",3), ("y",4) ]
183 |     -- run record "{ x = 3, y = }"   == Err ...
184 | 
185 | **Note:** If you want trailing commas, check out the
186 | [`sequence`](#sequence) function.
187 | -}
188 | record : Parser () -> Parser a -> Parser (List a)
189 | record spaces item =
190 |   sequence
191 |     { start = "{"
192 |     , separator = ","
193 |     , end = "}"
194 |     , spaces = spaces
195 |     , item = item
196 |     , trailing = Forbidden
197 |     }
198 | 
199 | 
200 | {-| Help parse tuples like `(3, 4)`. Works just like [`list`](#list)
201 | and [`record`](#record). And if you need something custom, check out
202 | the [`sequence`](#sequence) function.
203 | -}
204 | tuple : Parser () -> Parser a -> Parser (List a)
205 | tuple spaces item =
206 |   sequence
207 |     { start = "("
208 |     , separator = ","
209 |     , end = ")"
210 |     , spaces = spaces
211 |     , item = item
212 |     , trailing = Forbidden
213 |     }
214 | 
215 | 
216 | {-| Handle things *like* lists and records, but you can customize the
217 | details however you need. Say you want to parse C-style code blocks:
218 | 
219 |     import Parser exposing (Parser)
220 |     import Parser.LanguageKit as Parser exposing (Trailing(..))
221 | 
222 |     block : Parser (List Stmt)
223 |     block =
224 |       Parser.sequence
225 |         { start = "{"
226 |         , separator = ";"
227 |         , end = "}"
228 |         , spaces = spaces
229 |         , item = statement
230 |         , trailing = Mandatory -- demand a trailing semi-colon
231 |         }
232 | 
233 |     -- spaces : Parser ()
234 |     -- statement : Parser Stmt
235 | 
236 | **Note:** If you need something more custom, do not be afraid to check
237 | out the implementation and customize it for your case. It is better to
238 | get nice error messages with a lower-level implementation than to try
239 | to hack high-level parsers to do things they are not made for.
240 | -}
241 | sequence
242 |   : { start : String
243 |     , separator : String
244 |     , end : String
245 |     , spaces : Parser ()
246 |     , item : Parser a
247 |     , trailing : Trailing
248 |     }
249 |   -> Parser (List a)
250 | sequence { start, end, spaces, item, separator, trailing } =
251 |   symbol start
252 |     |- spaces
253 |     |- sequenceEnd end spaces item separator trailing
254 | 
255 | 
256 | {-| What’s the deal with trailing commas? Are they `Forbidden`?
257 | Are they `Optional`? Are they `Mandatory`? Welcome to [shapes
258 | club](http://poorlydrawnlines.com/comic/shapes-club/)!
259 | -}
260 | type Trailing = Forbidden | Optional | Mandatory
261 | 
262 | 
263 | ignore : Parser ignore -> Parser keep -> Parser keep
264 | ignore ignoreParser keepParser =
265 |   map2 revAlways ignoreParser keepParser
266 | 
267 | 
268 | (|-) : Parser ignore -> Parser keep -> Parser keep
269 | (|-) =
270 |   ignore
271 | 
272 | 
273 | revAlways : ignore -> keep -> keep
274 | revAlways _ keep =
275 |   keep
276 | 
277 | 
278 | sequenceEnd : String -> Parser () -> Parser a -> String -> Trailing -> Parser (List a)
279 | sequenceEnd end spaces parseItem sep trailing =
280 |   let
281 |     chompRest item =
282 |       case trailing of
283 |         Forbidden ->
284 |           sequenceEndForbidden end spaces parseItem sep [item]
285 | 
286 |         Optional ->
287 |           sequenceEndOptional end spaces parseItem sep [item]
288 | 
289 |         Mandatory ->
290 |           spaces
291 |             |- symbol sep
292 |             |- spaces
293 |             |- sequenceEndMandatory end spaces parseItem sep [item]
294 |   in
295 |     oneOf
296 |       [ parseItem
297 |           |> andThen chompRest
298 |       , symbol end
299 |           |- succeed []
300 |       ]
301 | 
302 | 
303 | sequenceEndForbidden : String -> Parser () -> Parser a -> String -> List a -> Parser (List a)
304 | sequenceEndForbidden end spaces parseItem sep revItems =
305 |   let
306 |     chompRest item =
307 |       sequenceEndForbidden end spaces parseItem sep (item :: revItems)
308 |   in
309 |     ignore spaces <|
310 |       oneOf
311 |         [ symbol sep
312 |             |- spaces
313 |             |- andThen chompRest parseItem
314 |         , symbol end
315 |             |- succeed (List.reverse revItems)
316 |         ]
317 | 
318 | 
319 | sequenceEndOptional : String -> Parser () -> Parser a -> String -> List a -> Parser (List a)
320 | sequenceEndOptional end spaces parseItem sep revItems =
321 |   let
322 |     parseEnd =
323 |       andThen (\_ -> succeed (List.reverse revItems)) (symbol end)
324 | 
325 |     chompRest item =
326 |       sequenceEndOptional end spaces parseItem sep (item :: revItems)
327 |   in
328 |     ignore spaces <|
329 |       oneOf
330 |         [ symbol sep
331 |             |- spaces
332 |             |- oneOf [ andThen chompRest parseItem, parseEnd ]
333 |         , parseEnd
334 |         ]
335 | 
336 | 
337 | sequenceEndMandatory : String -> Parser () -> Parser a -> String -> List a -> Parser (List a)
338 | sequenceEndMandatory end spaces parseItem sep revItems =
339 |   let
340 |     chompRest item =
341 |       sequenceEndMandatory end spaces parseItem sep (item :: revItems)
342 |   in
343 |     oneOf
344 |       [ andThen chompRest <|
345 |           parseItem
346 |             |. spaces
347 |             |. symbol sep
348 |             |. spaces
349 |       , symbol end
350 |           |- succeed (List.reverse revItems)
351 |       ]
352 | 
353 | 
354 | 
355 | -- WHITESPACE
356 | 
357 | 
358 | {-| Create a custom whitespace parser. It will always chomp the
359 | `' '`, `'\r'`, and `'\n'` characters, but you can customize some
360 | other things. Here are some examples:
361 | 
362 |     elm : Parser ()
363 |     elm =
364 |       whitespace
365 |         { allowTabs = False
366 |         , lineComment = LineComment "--"
367 |         , multiComment = NestableComment "{-" "-}"
368 |         }
369 | 
370 |     js : Parser ()
371 |     js =
372 |       whitespace
373 |         { allowTabs = True
374 |         , lineComment = LineComment "//"
375 |         , multiComment = UnnestableComment "/*" "*/"
376 |         }
377 | 
378 | If you need further customization, please open an issue describing your
379 | scenario or check out the source code and write it yourself. This is all
380 | built using stuff from the root `Parser` module.
381 | -}
382 | whitespace
383 |   : { allowTabs : Bool
384 |     , lineComment : LineComment
385 |     , multiComment : MultiComment
386 |     }
387 |   -> Parser ()
388 | whitespace { allowTabs, lineComment, multiComment } =
389 |   let
390 |     tabParser =
391 |       if allowTabs then
392 |         [ Parser.ignore zeroOrMore isTab ]
393 |       else
394 |         []
395 | 
396 |     lineParser =
397 |       case lineComment of
398 |         NoLineComment ->
399 |           []
400 | 
401 |         LineComment start ->
402 |           [ symbol start
403 |               |. ignoreUntil "\n"
404 |           ]
405 | 
406 |     multiParser =
407 |       case multiComment of
408 |         NoMultiComment ->
409 |           []
410 | 
411 |         UnnestableComment start end ->
412 |           [ symbol start
413 |               |. ignoreUntil end
414 |           ]
415 | 
416 |         NestableComment start end ->
417 |           [ nestableComment start end
418 |           ]
419 |   in
420 |     whitespaceHelp <|
421 |       oneOf (tabParser ++ lineParser ++ multiParser)
422 | 
423 | 
424 | chompSpaces : Parser ()
425 | chompSpaces =
426 |   Parser.ignore zeroOrMore isSpace
427 | 
428 | 
429 | isSpace : Char -> Bool
430 | isSpace char =
431 |   char == ' ' || char == '\n' || char == '\r'
432 | 
433 | 
434 | isTab : Char -> Bool
435 | isTab char =
436 |   char == '\t'
437 | 
438 | 
439 | whitespaceHelp : Parser a -> Parser ()
440 | whitespaceHelp parser =
441 |   ignore chompSpaces <|
442 |     oneOf [ andThen (\_ -> whitespaceHelp parser) parser, succeed () ]
443 | 
444 | 
445 | {-| Are line comments allowed? If so, what symbol do they start with?
446 | 
447 |     LineComment "--"   -- Elm
448 |     LineComment "//"   -- JS
449 |     LineComment "#"    -- Python
450 |     NoLineComment      -- OCaml
451 | -}
452 | type LineComment = NoLineComment | LineComment String
453 | 
454 | 
455 | {-| Are multi-line comments allowed? If so, what symbols do they start
456 | and end with?
457 | 
458 |     NestableComment "{-" "-}"    -- Elm
459 |     UnnestableComment "/*" "*/"  -- JS
460 |     NoMultiComment               -- Python
461 | 
462 | In Elm, you can nest multi-line comments. In C-like languages, like JS,
463 | this is not allowed. As soon as you see a `*/` the comment is over no
464 | matter what.
465 | -}
466 | type MultiComment
467 |   = NoMultiComment
468 |   | NestableComment String String
469 |   | UnnestableComment String String
470 | 
471 | 
472 | nestableComment : String -> String -> Parser ()
473 | nestableComment start end =
474 |   case (String.uncons start, String.uncons end) of
475 |     (Nothing, _) ->
476 |       fail "Trying to parse a multi-line comment, but the start token cannot be the empty string!"
477 | 
478 |     (_, Nothing) ->
479 |       fail "Trying to parse a multi-line comment, but the end token cannot be the empty string!"
480 | 
481 |     ( Just (startChar, _), Just (endChar, _) ) ->
482 |       let
483 |         isNotRelevant char =
484 |           char /= startChar && char /= endChar
485 |       in
486 |         symbol start
487 |           |. nestableCommentHelp isNotRelevant start end 1
488 | 
489 | 
490 | nestableCommentHelp : (Char -> Bool) -> String -> String -> Int -> Parser ()
491 | nestableCommentHelp isNotRelevant start end nestLevel =
492 |   lazy <| \_ ->
493 |     ignore (Parser.ignore zeroOrMore isNotRelevant) <|
494 |       oneOf
495 |         [ ignore (symbol end) <|
496 |             if nestLevel == 1 then
497 |               succeed ()
498 |             else
499 |               nestableCommentHelp isNotRelevant start end (nestLevel - 1)
500 |         , ignore (symbol start) <|
501 |             nestableCommentHelp isNotRelevant start end (nestLevel + 1)
502 |         , ignore (Parser.ignore (Exactly 1) isChar) <|
503 |             nestableCommentHelp isNotRelevant start end nestLevel
504 |         ]
505 | 
506 | 
507 | isChar : Char -> Bool
508 | isChar char =
509 |   True
510 | 


--------------------------------------------------------------------------------
/src/Parser.elm:
--------------------------------------------------------------------------------
   1 | module Parser exposing
   2 |   ( Parser
   3 |   , run
   4 |   , int, float, symbol, keyword, end
   5 |   , Count(..), zeroOrMore, oneOrMore, keep, ignore, repeat
   6 |   , succeed, fail, map, oneOf, (|=), (|.), map2, lazy, andThen
   7 |   , delayedCommit, delayedCommitMap
   8 |   , source, sourceMap, ignoreUntil
   9 |   , Error, Problem(..), Context, inContext
  10 |   )
  11 | 
  12 | {-|
  13 | 
  14 | # Parsers
  15 | @docs Parser, run
  16 | 
  17 | # Numbers and Keywords
  18 | @docs int, float, symbol, keyword, end
  19 | 
  20 | # Repeat Parsers
  21 | @docs Count, zeroOrMore, oneOrMore, keep, ignore, repeat
  22 | 
  23 | # Combining Parsers
  24 | @docs succeed, fail, map, oneOf, (|=), (|.), map2, lazy, andThen
  25 | 
  26 | # Delayed Commits
  27 | @docs delayedCommit, delayedCommitMap
  28 | 
  29 | # Efficiency Tricks
  30 | @docs source, sourceMap, ignoreUntil
  31 | 
  32 | # Errors
  33 | @docs Error, Problem, Context, inContext
  34 | -}
  35 | 
  36 | import Char
  37 | import Parser.Internal as Internal exposing (Parser(..), Step(..))
  38 | import ParserPrimitives as Prim
  39 | 
  40 | 
  41 | 
  42 | -- PARSER
  43 | 
  44 | 
  45 | {-| A parser! If you have a `Parser Int`, it is a parser that turns
  46 | strings into integers.
  47 | -}
  48 | type alias Parser a =
  49 |   Internal.Parser Context Problem a
  50 | 
  51 | 
  52 | type alias Step a =
  53 |   Internal.Step Context Problem a
  54 | 
  55 | 
  56 | type alias State =
  57 |   Internal.State Context
  58 | 
  59 | 
  60 | {-| Actually run a parser.
  61 | 
  62 |     run (keyword "true") "true"  == Ok ()
  63 |     run (keyword "true") "True"  == Err ...
  64 |     run (keyword "true") "false" == Err ...
  65 | -}
  66 | run : Parser a -> String -> Result Error a
  67 | run (Parser parse) source =
  68 |   let
  69 |     initialState =
  70 |       { source = source
  71 |       , offset = 0
  72 |       , indent = 1
  73 |       , context = []
  74 |       , row = 1
  75 |       , col = 1
  76 |       }
  77 |   in
  78 |     case parse initialState of
  79 |       Good a _ ->
  80 |         Ok a
  81 | 
  82 |       Bad problem { row, col, context } ->
  83 |         Err
  84 |           { row = row
  85 |           , col = col
  86 |           , source = source
  87 |           , problem = problem
  88 |           , context = context
  89 |           }
  90 | 
  91 | 
  92 | -- ERRORS
  93 | 
  94 | 
  95 | {-| Parse errors as data. You can format it however makes the most
  96 | sense for your application. Maybe that is all text, or maybe it is fancy
  97 | interactive HTML. Up to you!
  98 | 
  99 | You get:
 100 | 
 101 |   - The `row` and `col` of the error.
 102 |   - The full `source` provided to the [`run`](#run) function.
 103 |   - The actual `problem` you ran into.
 104 |   - A stack of `context` that describes where the error is *conceptually*.
 105 | 
 106 | **Note:** `context` is a stack. That means [`inContext`](#inContext)
 107 | adds to the *front* of this list, not the back. So if you want the
 108 | [`Context`](#Context) closest to the error, you want the first element
 109 | of the `context` stack.
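     | 
     | As a minimal sketch of consuming these fields (`toMessage` is a
     | hypothetical helper, not part of this library), you might turn an
     | `Error` into a one-line summary like this:
     | 
     |     toMessage : Error -> String
     |     toMessage { row, col, context } =
     |       let
     |         location =
     |           "row " ++ toString row ++ ", col " ++ toString col
     |       in
     |         case context of
     |           [] ->
     |             "Problem at " ++ location
     | 
     |           closest :: _ ->
     |             "Problem at " ++ location ++ " while parsing " ++ closest.description
     | 
     | It reads the first element of `context` because that is the entry closest
     | to the error.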
 110 | -}
 111 | type alias Error =
 112 |   { row : Int
 113 |   , col : Int
 114 |   , source : String
 115 |   , problem : Problem
 116 |   , context : List Context
 117 |   }
 118 | 
 119 | 
 120 | {-| The particular problem you ran into.
 121 | 
  122 | The tricky one here is `BadRepeat`. It means that you are running
  123 | `zeroOrMore parser` where `parser` can succeed without consuming any
  124 | input. Left unchecked, that would loop forever, consuming no input until
  125 | the program crashes, so the parser reports this problem instead.
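     | 
     | For instance, this sketch (not taken from the original docs) should
     | produce `BadRepeat`, because `succeed ()` never consumes any input:
     | 
     |     -- run (repeat zeroOrMore (succeed ())) "abc" == Err ... BadRepeat ...
     | 
     | Going by the implementation in this module, [`ignore`](#ignore) and
     | [`keep`](#keep) also report `BadRepeat` when fewer characters match than
     | the `Count` requires:
     | 
     |     -- run (ignore oneOrMore Char.isDigit) "abc" == Err ... BadRepeat ...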
 126 | -}
 127 | type Problem
 128 |   = BadOneOf (List Problem)
 129 |   | BadInt
 130 |   | BadFloat
 131 |   | BadRepeat
 132 |   | ExpectingEnd
 133 |   | ExpectingSymbol String
 134 |   | ExpectingKeyword String
 135 |   | ExpectingVariable
 136 |   | ExpectingClosing String
 137 |   | Fail String
 138 | 
 139 | 
 140 | {-| Most parsers only let you know the row and column where the error
  141 | occurred. But what if you could *also* say “the error occurred **while
 142 | parsing a list**” and let folks know what the *parser* thinks it is
 143 | doing?!
 144 | 
  145 | The error messages would be a lot nicer! That is what the Elm compiler does,
 146 | and it is what `Context` helps you do in this library! **See the
 147 | [`inContext`](#inContext) docs for a nice example!**
 148 | 
 149 | About the actual fields:
 150 | 
 151 |   - `description` is set by [`inContext`](#inContext)
 152 |   - `row` and `col` are where [`inContext`](#inContext) began
 153 | 
  154 | Say you use `inContext` in your list parser. And say you get an error trying
 155 | to parse `[ 1, 23zm5, 3 ]`. In addition to error information about `23zm5`,
 156 | you would have `Context` with the row and column of the starting `[` symbol.
 157 | -}
 158 | type alias Context =
 159 |   { row : Int
 160 |   , col : Int
 161 |   , description : String
 162 |   }
 163 | 
 164 | 
 165 | 
 166 | -- PRIMITIVES
 167 | 
 168 | 
 169 | {-| A parser that succeeds without consuming any text.
 170 | 
 171 |     run (succeed 90210  ) "mississippi" == Ok 90210
 172 |     run (succeed 3.141  ) "mississippi" == Ok 3.141
 173 |     run (succeed ()     ) "mississippi" == Ok ()
 174 |     run (succeed Nothing) "mississippi" == Ok Nothing
 175 | 
 176 | Seems weird, but it is often useful in combination with
 177 | [`oneOf`](#oneOf) or [`andThen`](#andThen).
 178 | -}
 179 | succeed : a -> Parser a
 180 | succeed a =
 181 |   Parser <| \state -> Good a state
 182 | 
 183 | 
  184 | {-| A parser that always fails.
 185 | 
 186 |     run (fail "bad list") "[1,2,3]" == Err ..
 187 | 
 188 | Seems weird, but it is often useful in combination with
 189 | [`oneOf`](#oneOf) or [`andThen`](#andThen).
 190 | -}
 191 | fail : String -> Parser a
 192 | fail message =
 193 |   Parser <| \state -> Bad (Fail message) state
 194 | 
 195 | 
 196 | 
 197 | -- MAPPING
 198 | 
 199 | 
 200 | {-| Transform the result of a parser. Maybe you have a value that is
 201 | an integer or `null`:
 202 | 
 203 |     nullOrInt : Parser (Maybe Int)
 204 |     nullOrInt =
 205 |       oneOf
 206 |         [ map Just int
 207 |         , map (\_ -> Nothing) (keyword "null")
 208 |         ]
 209 | 
 210 |     -- run nullOrInt "0"    == Ok (Just 0)
 211 |     -- run nullOrInt "13"   == Ok (Just 13)
 212 |     -- run nullOrInt "null" == Ok Nothing
 213 |     -- run nullOrInt "zero" == Err ...
 214 | 
 215 | -}
 216 | map : (a -> b) -> Parser a -> Parser b
 217 | map func (Parser parse) =
 218 |   Parser <| \state1 ->
 219 |     case parse state1 of
 220 |       Good a state2 ->
 221 |         Good (func a) state2
 222 | 
 223 |       Bad x state2 ->
 224 |         Bad x state2
 225 | 
 226 | 
 227 | {-| **This function is not used much in practice.** It is nicer to use
 228 | the [parser pipeline][pp] operators [`(|.)`](#|.) and [`(|=)`](#|=)
 229 | instead.
 230 | 
 231 | [pp]: https://github.com/elm-tools/parser/blob/master/README.md#parser-pipeline
 232 | 
 233 | That said, this function can combine two parsers. Maybe you
 234 | want to parse some spaces followed by an integer:
 235 | 
 236 |     spacesThenInt : Parser Int
 237 |     spacesThenInt =
 238 |       map2 (\_ n -> n) spaces int
 239 | 
 240 |     spaces : Parser ()
 241 |     spaces =
 242 |       ignore zeroOrMore (\char -> char == ' ')
 243 | 
 244 | We can also use `map2` to define `(|.)` and `(|=)` like this:
 245 | 
 246 |     (|.) : Parser keep -> Parser ignore -> Parser keep
 247 |     (|.) keepParser ignoreParser =
 248 |       map2 (\keep _ -> keep) keepParser ignoreParser
 249 | 
 250 |     (|=) : Parser (a -> b) -> Parser a -> Parser b
 251 |     (|=) funcParser argParser =
 252 |       map2 (\func arg -> func arg) funcParser argParser
 253 | -}
 254 | map2 : (a -> b -> value) -> Parser a -> Parser b -> Parser value
 255 | map2 func (Parser parseA) (Parser parseB) =
 256 |   Parser <| \state1 ->
 257 |     case parseA state1 of
 258 |       Bad x state2 ->
 259 |         Bad x state2
 260 | 
 261 |       Good a state2 ->
 262 |         case parseB state2 of
 263 |           Bad x state3 ->
 264 |             Bad x state3
 265 | 
 266 |           Good b state3 ->
 267 |             Good (func a b) state3
 268 | 
 269 | 
 270 | {-| **Keep** a value in a parser pipeline.
 271 | 
 272 | Read about parser pipelines **[here][]**. They are really nice!
 273 | 
 274 | [here]: https://github.com/elm-tools/parser/blob/master/README.md#parser-pipeline
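     | 
     | As a quick sketch (the `Point` type and `point` parser are illustrative
     | names, not part of this module), values parsed with `(|=)` are fed to
     | the function given to `succeed`, while values parsed with `(|.)` are
     | dropped:
     | 
     |     type alias Point =
     |       { x : Int, y : Int }
     | 
     |     point : Parser Point
     |     point =
     |       succeed Point
     |         |. symbol "("
     |         |= int
     |         |. symbol ","
     |         |= int
     |         |. symbol ")"
     | 
     |     -- run point "(3,4)" == Ok { x = 3, y = 4 }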
 275 | -}
 276 | (|=) : Parser (a -> b) -> Parser a -> Parser b
 277 | (|=) parseFunc parseArg =
 278 |   map2 apply parseFunc parseArg
 279 | 
 280 | 
 281 | apply : (a -> b) -> a -> b
 282 | apply f a =
 283 |   f a
 284 | 
 285 | 
 286 | {-| **Ignore** a value in a parser pipeline.
 287 | 
 288 | Read about parser pipelines **[here][]**. They are really nice!
 289 | 
 290 | [here]: https://github.com/elm-tools/parser/blob/master/README.md#parser-pipeline
 291 | -}
 292 | (|.) : Parser keep -> Parser ignore -> Parser keep
 293 | (|.) keepParser ignoreParser =
 294 |   map2 always keepParser ignoreParser
 295 | 
 296 | 
 297 | infixl 5 |.
 298 | infixl 5 |=
 299 | 
 300 | 
 301 | 
 302 | -- AND THEN
 303 | 
 304 | 
 305 | {-| Run a parser *and then* run another parser!
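     | 
     | This is handy when the second parser depends on the result of the
     | first. A sketch, using the hypothetical names `zipCode` and
     | `checkZipCode`:
     | 
     |     import Char
     | 
     |     zipCode : Parser String
     |     zipCode =
     |       keep oneOrMore Char.isDigit
     |         |> andThen checkZipCode
     | 
     |     checkZipCode : String -> Parser String
     |     checkZipCode code =
     |       if String.length code == 5 then
     |         succeed code
     |       else
     |         fail "a U.S. zip code has exactly 5 digits"
     | 
     |     -- run zipCode "90210" == Ok "90210"
     |     -- run zipCode "1234"  == Err ... (Fail ...) ...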
 306 | -}
 307 | andThen : (a -> Parser b) -> Parser a -> Parser b
 308 | andThen callback (Parser parseA) =
 309 |   Parser <| \state1 ->
 310 |     case parseA state1 of
 311 |       Bad x state2 ->
 312 |         Bad x state2
 313 | 
 314 |       Good a state2 ->
 315 |         let
 316 |           (Parser parseB) =
 317 |             callback a
 318 |         in
 319 |           parseB state2
 320 | 
 321 | 
 322 | 
 323 | -- LAZY
 324 | 
 325 | 
 326 | {-| Helper to define recursive parsers. Say we want a parser for simple
 327 | boolean expressions:
 328 | 
 329 |     true
 330 |     false
 331 |     (true || false)
 332 |     (true || (true || false))
 333 | 
 334 | Notice that a boolean expression might contain *other* boolean expressions.
 335 | That means we will want to define our parser in terms of itself:
 336 | 
 337 |     type Boolean
 338 |       = MyTrue
 339 |       | MyFalse
 340 |       | MyOr Boolean Boolean
 341 | 
 342 |     boolean : Parser Boolean
 343 |     boolean =
 344 |       oneOf
 345 |         [ succeed MyTrue
 346 |             |. keyword "true"
 347 |         , succeed MyFalse
 348 |             |. keyword "false"
 349 |         , succeed MyOr
 350 |             |. symbol "("
 351 |             |. spaces
 352 |             |= lazy (\_ -> boolean)
 353 |             |. spaces
 354 |             |. symbol "||"
 355 |             |. spaces
 356 |             |= lazy (\_ -> boolean)
 357 |             |. spaces
 358 |             |. symbol ")"
 359 |         ]
 360 | 
 361 |     spaces : Parser ()
 362 |     spaces =
 363 |       ignore zeroOrMore (\char -> char == ' ')
 364 | 
 365 | **Notice that `boolean` uses `boolean` in its definition!** In Elm, you can
  366 | only define a value in terms of itself if it is behind a function call. So
 367 | `lazy` helps us define these self-referential parsers.
 368 | 
 369 | **Note:** In some cases, it may be more natural or efficient to use
 370 | `andThen` to hide a self-reference behind a function.
 371 | -}
 372 | lazy : (() -> Parser a) -> Parser a
 373 | lazy thunk =
 374 |   Parser <| \state ->
 375 |     let
 376 |       (Parser parse) =
 377 |         thunk ()
 378 |     in
 379 |       parse state
 380 | 
 381 | 
 382 | 
 383 | -- ONE OF
 384 | 
 385 | 
 386 | {-| Try a bunch of different parsers. If a parser does not commit, we
 387 | move on and try the next one. If a parser *does* commit, we give up on any
 388 | remaining parsers.
 389 | 
 390 | The idea is: if you make progress and commit to a parser, you want to
  391 | get error messages from *that path*. If you backtrack and keep trying stuff
 392 | you will get a much less precise error.
 393 | 
 394 | So say we are parsing “language terms” that include integers and lists
 395 | of integers:
 396 | 
 397 |     term : Parser Expr
 398 |     term =
 399 |       oneOf
 400 |         [ listOf int
 401 |         , int
 402 |         ]
 403 | 
 404 |     listOf : Parser a -> Parser (List a)
 405 |     listOf parser =
 406 |       succeed identity
 407 |         |. symbol "["
 408 |         |. spaces
 409 |         ...
 410 | 
 411 | When we get to `oneOf`, we first try the `listOf int` parser. If we see a
 412 | `[` we *commit* to that parser. That means if something goes wrong, we do
 413 | not backtrack. Instead the parse fails! If we do not see a `[` we move on
 414 | to the second option and just try the `int` parser.
 415 | -}
 416 | oneOf : List (Parser a) -> Parser a
 417 | oneOf parsers =
 418 |   Parser <| \state -> oneOfHelp state [] parsers
 419 | 
 420 | 
 421 | oneOfHelp : State -> List Problem -> List (Parser a) -> Step a
 422 | oneOfHelp state problems parsers =
 423 |   case parsers of
 424 |     [] ->
 425 |       Bad (BadOneOf (List.reverse problems)) state
 426 | 
 427 |     Parser parse :: remainingParsers ->
 428 |       case parse state of
 429 |         Good _ _ as step ->
 430 |           step
 431 | 
 432 |         Bad problem { row, col } as step ->
 433 |           if state.row == row && state.col == col then
 434 |             oneOfHelp state (problem :: problems) remainingParsers
 435 | 
 436 |           else
 437 |             step
 438 | 
 439 | 
 440 | 
 441 | -- REPEAT
 442 | 
 443 | 
 444 | {-| Try to use the parser as many times as possible. Say we want to parse
 445 | `NaN` a bunch of times:
 446 | 
 447 |     batman : Parser Int
 448 |     batman =
 449 |       map List.length (repeat zeroOrMore (keyword "NaN"))
 450 | 
 451 |     -- run batman "whatever"       == Ok 0
 452 |     -- run batman ""               == Ok 0
 453 |     -- run batman "NaN"            == Ok 1
 454 |     -- run batman "NaNNaN"         == Ok 2
 455 |     -- run batman "NaNNaNNaN"      == Ok 3
 456 |     -- run batman "NaNNaN batman!" == Ok 2
 457 | 
 458 | **Note:** If you are trying to parse things like `[1,2,3]` or `{ x = 3 }`
 459 | check out the [`list`](Parser-LanguageKit#list) and
 460 | [`record`](Parser-LanguageKit#record) functions in the
 461 | [`Parser.LanguageKit`](Parser-LanguageKit) module.
 462 | -}
 463 | repeat : Count -> Parser a -> Parser (List a)
 464 | repeat count (Parser parse) =
 465 |   case count of
 466 |     Exactly n ->
 467 |       Parser <| \state ->
 468 |         repeatExactly n parse [] state
 469 | 
 470 |     AtLeast n ->
 471 |       Parser <| \state ->
 472 |         repeatAtLeast n parse [] state
 473 | 
 474 | 
 475 | repeatExactly : Int -> (State -> Step a) -> List a -> State -> Step (List a)
 476 | repeatExactly n parse revList state1 =
 477 |   if n <= 0 then
 478 |     Good (List.reverse revList) state1
 479 | 
 480 |   else
 481 |     case parse state1 of
 482 |       Good a state2 ->
 483 |         if state1.row == state2.row && state1.col == state2.col then
 484 |           Bad BadRepeat state2
 485 |         else
 486 |           repeatExactly (n - 1) parse (a :: revList) state2
 487 | 
 488 |       Bad x state2 ->
 489 |         Bad x state2
 490 | 
 491 | 
 492 | repeatAtLeast : Int -> (State -> Step a) -> List a -> State -> Step (List a)
 493 | repeatAtLeast n parse revList state1 =
 494 |   case parse state1 of
 495 |     Good a state2 ->
 496 |       if state1.row == state2.row && state1.col == state2.col then
 497 |         Bad BadRepeat state2
 498 |       else
 499 |         repeatAtLeast (n - 1) parse (a :: revList) state2
 500 | 
 501 |     Bad x state2 ->
 502 |       if state1.row == state2.row && state1.col == state2.col && n <= 0 then
 503 |         Good (List.reverse revList) state1
 504 | 
 505 |       else
 506 |         Bad x state2
 507 | 
 508 | 
 509 | 
 510 | -- DELAYED COMMIT
 511 | 
 512 | 
 513 | {-| Only commit if `Parser a` succeeds and `Parser value` makes some progress.
 514 | 
 515 | This is very important for generating high quality error messages! Read more
 516 | about this [here][1] and [here][2].
 517 | 
 518 | [1]: https://github.com/elm-tools/parser/blob/master/README.md#delayed-commits
 519 | [2]: https://github.com/elm-tools/parser/blob/master/comparison.md
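     | 
     | As a rough sketch (`spacesThenInt` and `spaces` are illustrative names),
     | say an integer argument must be preceded by at least one space:
     | 
     |     spacesThenInt : Parser Int
     |     spacesThenInt =
     |       delayedCommit spaces int
     | 
     |     spaces : Parser ()
     |     spaces =
     |       ignore oneOrMore (\c -> c == ' ')
     | 
     | If `spaces` succeeds but `int` fails without making progress, the failure
     | is reported at the position where `spacesThenInt` started, so an enclosing
     | [`oneOf`](#oneOf) can still try its other options. Once `int` makes
     | progress, the parser is committed.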
 520 | -}
 521 | delayedCommit : Parser a -> Parser value -> Parser value
 522 | delayedCommit filler realStuff =
 523 |   delayedCommitMap (\_ v -> v) filler realStuff
 524 | 
 525 | 
 526 | {-| Like [`delayedCommit`](#delayedCommit), but lets you extract values from
 527 | both parsers. Read more about it [here][1] and [here][2].
 528 | 
 529 | [1]: https://github.com/elm-tools/parser/blob/master/README.md#delayed-commits
 530 | [2]: https://github.com/elm-tools/parser/blob/master/comparison.md
 531 | -}
 532 | delayedCommitMap : (a -> b -> value) -> Parser a -> Parser b -> Parser value
 533 | delayedCommitMap func (Parser parseA) (Parser parseB) =
 534 |   Parser <| \state1 ->
 535 |     case parseA state1 of
 536 |       Bad x _ ->
 537 |         Bad x state1
 538 | 
 539 |       Good a state2 ->
 540 |         case parseB state2 of
 541 |           Good b state3 ->
 542 |             Good (func a b) state3
 543 | 
 544 |           Bad x state3 ->
 545 |             if state2.row == state3.row && state2.col == state3.col then
 546 |               Bad x state1
 547 |             else
 548 |               Bad x state3
 549 | 
 550 | 
 551 | 
 552 | -- SYMBOLS and KEYWORDS
 553 | 
 554 | 
 555 | {-| Parse symbols like `,`, `(`, and `&&`.
 556 | 
 557 |     run (symbol "[") "[" == Ok ()
 558 |     run (symbol "[") "4" == Err ... (ExpectingSymbol "[") ...
 559 | -}
 560 | symbol : String -> Parser ()
 561 | symbol str =
 562 |   token ExpectingSymbol str
 563 | 
 564 | 
 565 | {-| Parse keywords like `let`, `case`, and `type`.
 566 | 
 567 |     run (keyword "let") "let" == Ok ()
 568 |     run (keyword "let") "var" == Err ... (ExpectingKeyword "let") ...
 569 | -}
 570 | keyword : String -> Parser ()
 571 | keyword str =
 572 |   token ExpectingKeyword str
 573 | 
 574 | 
 575 | token : (String -> Problem) -> String -> Parser ()
 576 | token makeProblem str =
 577 |   Parser <| \({ source, offset, indent, context, row, col } as state) ->
 578 |     let
 579 |       (newOffset, newRow, newCol) =
 580 |         Prim.isSubString str offset row col source
 581 |     in
 582 |       if newOffset == -1 then
 583 |         Bad (makeProblem str) state
 584 | 
 585 |       else
 586 |         Good ()
 587 |           { source = source
 588 |           , offset = newOffset
 589 |           , indent = indent
 590 |           , context = context
 591 |           , row = newRow
 592 |           , col = newCol
 593 |           }
 594 | 
 595 | 
 596 | -- INT
 597 | 
 598 | 
  599 | {-| Parse integers. It accepts decimal and hexadecimal formats.
 600 | 
 601 |     -- decimal
 602 |     run int "1234" == Ok 1234
 603 |     run int "1.34" == Err ...
 604 |     run int "1e31" == Err ...
 605 |     run int "123a" == Err ...
 606 |     run int "0123" == Err ...
 607 | 
  608 |     -- hexadecimal
 609 |     run int "0x001A" == Ok 26
 610 |     run int "0x001a" == Ok 26
 611 |     run int "0xBEEF" == Ok 48879
 612 |     run int "0x12.0" == Err ...
 613 |     run int "0x12an" == Err ...
 614 | 
 615 | **Note:** If you want a parser for both `Int` and `Float` literals,
 616 | check out [`Parser.LanguageKit.number`](Parser-LanguageKit#number).
 617 | It does not backtrack, so it should be faster and give better error
 618 | messages than using `oneOf` and combining `int` and `float` yourself.
 619 | 
 620 | **Note:** If you want to enable octal or binary `Int` literals,
 621 | check out [`Parser.LanguageKit.int`](Parser-LanguageKit#int).
 622 | -}
 623 | int : Parser Int
 624 | int =
 625 |   Parser <| \{ source, offset, indent, context, row, col } ->
 626 |     case intHelp offset (Prim.isSubChar isZero offset source) source of
 627 |       Err badOffset ->
 628 |         Bad BadInt
 629 |           { source = source
 630 |           , offset = badOffset
 631 |           , indent = indent
 632 |           , context = context
 633 |           , row = row
 634 |           , col = col + (badOffset - offset)
 635 |           }
 636 | 
 637 |       Ok goodOffset ->
 638 |         case String.toInt (String.slice offset goodOffset source) of
 639 |           Err _ ->
 640 |             Debug.crash badIntMsg
 641 | 
 642 |           Ok n ->
 643 |             Good n
 644 |               { source = source
 645 |               , offset = goodOffset
 646 |               , indent = indent
 647 |               , context = context
 648 |               , row = row
 649 |               , col = col + (goodOffset - offset)
 650 |               }
 651 | 
 652 | 
 653 | intHelp : Int -> Int -> String -> Result Int Int
 654 | intHelp offset zeroOffset source =
 655 |   if zeroOffset == -1 then
 656 |     Internal.chompDigits Char.isDigit offset source
 657 | 
 658 |   else if Prim.isSubChar isX zeroOffset source /= -1 then
 659 |     Internal.chompDigits Char.isHexDigit (offset + 2) source
 660 | 
 661 | --  else if Prim.isSubChar isO zeroOffset source /= -1 then
 662 | --    Internal.chompDigits Char.isOctDigit (offset + 2) source
 663 | 
 664 |   else if Prim.isSubChar Internal.isBadIntEnd zeroOffset source == -1 then
 665 |     Ok zeroOffset
 666 | 
 667 |   else
 668 |     Err zeroOffset
 669 | 
 670 | 
 671 | isZero : Char -> Bool
 672 | isZero char =
 673 |   char == '0'
 674 | 
 675 | 
 676 | isO : Char -> Bool
 677 | isO char =
 678 |   char == 'o'
 679 | 
 680 | 
 681 | isX : Char -> Bool
 682 | isX char =
 683 |   char == 'x'
 684 | 
 685 | 
 686 | badIntMsg : String
 687 | badIntMsg =
 688 |   """The `Parser.int` parser seems to have a bug.
 689 | Please report an SSCCE to <https://github.com/elm-tools/parser/issues>."""
 690 | 
 691 | 
 692 | 
 693 | -- FLOAT
 694 | 
 695 | 
 696 | {-| Parse floats.
 697 | 
 698 |     run float "123"       == Ok 123
 699 |     run float "3.1415"    == Ok 3.1415
 700 |     run float "0.1234"    == Ok 0.1234
 701 |     run float ".1234"     == Ok 0.1234
 702 |     run float "1e-42"     == Ok 1e-42
 703 |     run float "6.022e23"  == Ok 6.022e23
 704 |     run float "6.022E23"  == Ok 6.022e23
 705 |     run float "6.022e+23" == Ok 6.022e23
 706 |     run float "6.022e"    == Err ..
 707 |     run float "6.022n"    == Err ..
 708 |     run float "6.022.31"  == Err ..
 709 | 
 710 | **Note:** If you want a parser for both `Int` and `Float` literals,
 711 | check out [`Parser.LanguageKit.number`](Parser-LanguageKit#number).
 712 | It does not backtrack, so it should be faster and give better error
 713 | messages than using `oneOf` and combining `int` and `float` yourself.
 714 | 
  715 | **Note:** If you want to disable literals like `.123`, as Elm does,
 716 | check out [`Parser.LanguageKit.float`](Parser-LanguageKit#float).
 717 | -}
 718 | float : Parser Float
 719 | float =
 720 |   Parser <| \{ source, offset, indent, context, row, col } ->
 721 |     case floatHelp offset (Prim.isSubChar isZero offset source) source of
 722 |       Err badOffset ->
 723 |         Bad BadFloat
 724 |           { source = source
 725 |           , offset = badOffset
 726 |           , indent = indent
 727 |           , context = context
 728 |           , row = row
 729 |           , col = col + (badOffset - offset)
 730 |           }
 731 | 
 732 |       Ok goodOffset ->
 733 |         case String.toFloat (String.slice offset goodOffset source) of
 734 |           Err _ ->
 735 |             Debug.crash badFloatMsg
 736 | 
 737 |           Ok n ->
 738 |             Good n
 739 |               { source = source
 740 |               , offset = goodOffset
 741 |               , indent = indent
 742 |               , context = context
 743 |               , row = row
 744 |               , col = col + (goodOffset - offset)
 745 |               }
 746 | 
 747 | 
 748 | floatHelp : Int -> Int -> String -> Result Int Int
 749 | floatHelp offset zeroOffset source =
 750 |   if zeroOffset >= 0 then
 751 |     Internal.chompDotAndExp zeroOffset source
 752 | 
 753 |   else
 754 |     let
 755 |       dotOffset =
 756 |         Internal.chomp Char.isDigit offset source
 757 | 
 758 |       result =
 759 |         Internal.chompDotAndExp dotOffset source
 760 |     in
 761 |       case result of
 762 |         Err _ ->
 763 |           result
 764 | 
 765 |         Ok n ->
 766 |           if n == offset then Err n else result
 767 | 
 768 | 
 769 | badFloatMsg : String
 770 | badFloatMsg =
 771 |   """The `Parser.float` parser seems to have a bug.
 772 | Please report an SSCCE to <https://github.com/elm-tools/parser/issues>."""
 773 | 
 774 | 
 775 | 
 776 | -- END
 777 | 
 778 | 
 779 | {-| Check if you have reached the end of the string you are parsing.
 780 | 
 781 |     justAnInt : Parser Int
 782 |     justAnInt =
 783 |       succeed identity
 784 |         |= int
 785 |         |. end
 786 | 
 787 |     -- run justAnInt "90210" == Ok 90210
 788 |     -- run justAnInt "1 + 2" == Err ...
 789 |     -- run int       "1 + 2" == Ok 1
 790 | 
 791 | Parsers can succeed without parsing the whole string. Ending your parser
 792 | with `end` guarantees that you have successfully parsed the whole string.
 793 | -}
 794 | end : Parser ()
 795 | end =
 796 |   Parser <| \state ->
 797 |     if String.length state.source == state.offset then
 798 |       Good () state
 799 | 
 800 |     else
 801 |       Bad ExpectingEnd state
 802 | 
 803 | 
 804 | 
 805 | -- SOURCE
 806 | 
 807 | 
 808 | {-| Run a parser, but return the underlying source code that actually
 809 | got parsed.
 810 | 
 811 |     -- run (source (ignore oneOrMore Char.isLower)) "abc" == Ok "abc"
 812 |     -- keep count isOk = source (ignore count isOk)
 813 | 
 814 | This becomes a useful optimization when you need to [`keep`](#keep)
 815 | something very specific. For example, say we want to parse capitalized
 816 | words:
 817 | 
 818 |     import Char
 819 | 
 820 |     variable : Parser String
 821 |     variable =
 822 |       succeed (++)
 823 |         |= keep (Exactly 1) Char.isUpper
 824 |         |= keep zeroOrMore Char.isLower
 825 | 
 826 | In this case, each `keep` allocates a string. Then we use `(++)` to create the
 827 | final string. That means *three* strings are allocated.
 828 | 
 829 | In contrast, using `source` with `ignore` lets you grab the final string
 830 | directly. It tracks where the parser starts and ends, so it can use
 831 | `String.slice` to grab that part directly.
 832 | 
 833 |     variable : Parser String
 834 |     variable =
 835 |       source <|
 836 |         ignore (Exactly 1) Char.isUpper
 837 |           |. ignore zeroOrMore Char.isLower
 838 | 
 839 | This version only allocates *one* string.
 840 | -}
 841 | source : Parser a -> Parser String
 842 | source parser =
 843 |   sourceMap always parser
 844 | 
 845 | 
 846 | {-| Like `source`, but it allows you to combine the source string
 847 | with the value that is produced by the parser. So maybe you want
 848 | a float, but you also want to know exactly how it looked.
 849 | 
 850 |     number : Parser (String, Float)
 851 |     number =
 852 |       sourceMap (,) float
 853 | 
 854 |     -- run number "100" == Ok ("100", 100)
 855 |     -- run number "1e2" == Ok ("1e2", 100)
 856 | -}
 857 | sourceMap : (String -> a -> b) -> Parser a -> Parser b
 858 | sourceMap func (Parser parse) =
 859 |   Parser <| \({source, offset} as state1) ->
 860 |     case parse state1 of
 861 |       Bad x state2 ->
 862 |         Bad x state2
 863 | 
 864 |       Good a state2 ->
 865 |         let
 866 |           subString =
 867 |             String.slice offset state2.offset source
 868 |         in
 869 |           Good (func subString a) state2
 870 | 
 871 | 
 872 | 
 873 | -- REPEAT
 874 | 
 875 | 
 876 | {-| How many characters to [`keep`](#keep) or [`ignore`](#ignore).
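     | The same type tells [`repeat`](#repeat) how many times to run a parser.
     | 
     | A small sketch of `Exactly` (`twoDigits` is a hypothetical name):
     | 
     |     import Char
     | 
     |     twoDigits : Parser String
     |     twoDigits =
     |       keep (Exactly 2) Char.isDigit
     | 
     |     -- run twoDigits "42"  == Ok "42"
     |     -- run twoDigits "421" == Ok "42"
     |     -- run twoDigits "4"   == Err ... BadRepeat ...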
 877 | -}
 878 | type Count = AtLeast Int | Exactly Int
 879 | 
 880 | 
 881 | {-| A simple alias for `AtLeast 0` so your code reads nicer:
 882 | 
 883 |     import Char
 884 | 
 885 |     spaces : Parser String
 886 |     spaces =
 887 |       keep zeroOrMore (\c -> c == ' ')
 888 | 
 889 |     -- same as: keep (AtLeast 0) (\c -> c == ' ')
 890 | -}
 891 | zeroOrMore : Count
 892 | zeroOrMore =
 893 |   AtLeast 0
 894 | 
 895 | 
 896 | {-| A simple alias for `AtLeast 1` so your code reads nicer:
 897 | 
 898 |     import Char
 899 | 
 900 |     lows : Parser String
 901 |     lows =
 902 |       keep oneOrMore Char.isLower
 903 | 
 904 |     -- same as: keep (AtLeast 1) Char.isLower
 905 | -}
 906 | oneOrMore : Count
 907 | oneOrMore =
 908 |   AtLeast 1
 909 | 
 910 | 
 911 | {-| Keep some characters. If you want a capital letter followed by
 912 | zero or more lower case letters, you could say:
 913 | 
 914 |     import Char
 915 | 
 916 |     capitalized : Parser String
 917 |     capitalized =
 918 |       succeed (++)
 919 |         |= keep (Exactly 1) Char.isUpper
 920 |         |= keep zeroOrMore  Char.isLower
 921 | 
 922 |     -- good: Cat, Tom, Sally
 923 |     -- bad: cat, tom, TOM, tOm
 924 | 
 925 | **Note:** Check out [`source`](#source) for a more efficient
 926 | way to grab the underlying source of a complex parser.
 927 | -}
 928 | keep : Count -> (Char -> Bool) -> Parser String
 929 | keep count predicate =
 930 |   source (ignore count predicate)
 931 | 
 932 | 
 933 | {-| Ignore some characters. If you want to ignore one or more
 934 | spaces, you might say:
 935 | 
 936 |     spaces : Parser ()
 937 |     spaces =
 938 |       ignore oneOrMore (\c -> c == ' ')
 939 | 
 940 | -}
 941 | ignore : Count -> (Char -> Bool) -> Parser ()
 942 | ignore count predicate =
 943 |   case count of
 944 |     Exactly n ->
 945 |       Parser <| \{ source, offset, indent, context, row, col } ->
 946 |         ignoreExactly n predicate source offset indent context row col
 947 | 
 948 |     AtLeast n ->
 949 |       Parser <| \{ source, offset, indent, context, row, col } ->
 950 |         ignoreAtLeast n predicate source offset indent context row col
 951 | 
 952 | 
 953 | ignoreExactly : Int -> (Char -> Bool) -> String -> Int -> Int -> List Context -> Int -> Int -> Step ()
 954 | ignoreExactly n predicate source offset indent context row col =
 955 |   if n <= 0 then
 956 |     Good ()
 957 |       { source = source
 958 |       , offset = offset
 959 |       , indent = indent
 960 |       , context = context
 961 |       , row = row
 962 |       , col = col
 963 |       }
 964 | 
 965 |   else
 966 |     let
 967 |       newOffset =
 968 |         Prim.isSubChar predicate offset source
 969 |     in
 970 |       if newOffset == -1 then
 971 |         Bad BadRepeat
 972 |           { source = source
 973 |           , offset = offset
 974 |           , indent = indent
 975 |           , context = context
 976 |           , row = row
 977 |           , col = col
 978 |           }
 979 | 
 980 |       else if newOffset == -2 then
 981 |         ignoreExactly (n - 1) predicate source (offset + 1) indent context (row + 1) 1
 982 | 
 983 |       else
 984 |         ignoreExactly (n - 1) predicate source newOffset indent context row (col + 1)
 985 | 
 986 | 
 987 | ignoreAtLeast : Int -> (Char -> Bool) -> String -> Int -> Int -> List Context -> Int -> Int -> Step ()
 988 | ignoreAtLeast n predicate source offset indent context row col =
 989 |   let
 990 |     newOffset =
 991 |       Prim.isSubChar predicate offset source
 992 |   in
 993 |     -- no match
 994 |     if newOffset == -1 then
 995 |       let
 996 |         state =
 997 |           { source = source
 998 |           , offset = offset
 999 |           , indent = indent
1000 |           , context = context
1001 |           , row = row
1002 |           , col = col
1003 |           }
1004 |       in
1005 |         if n <= 0 then Good () state else Bad BadRepeat state
1006 | 
1007 |     -- matched a newline
1008 |     else if newOffset == -2 then
1009 |       ignoreAtLeast (n - 1) predicate source (offset + 1) indent context (row + 1) 1
1010 | 
1011 |     -- normal match
1012 |     else
1013 |       ignoreAtLeast (n - 1) predicate source newOffset indent context row (col + 1)
1014 | 
1015 | 
1016 | 
1017 | -- IGNORE UNTIL
1018 | 
1019 | 
1020 | {-| Ignore characters until *after* the given string.
1021 | So maybe we want to parse Elm-style single-line comments:
1022 | 
1023 |     elmComment : Parser ()
1024 |     elmComment =
1025 |       symbol "--"
1026 |         |. ignoreUntil "\n"
1027 | 
1028 | Or maybe you want to parse JS-style multi-line comments:
1029 | 
1030 |     jsComment : Parser ()
1031 |     jsComment =
1032 |       symbol "/*"
1033 |         |. ignoreUntil "*/"
1034 | 
1035 | **Note:** You must take more care when parsing Elm-style multi-line
1036 | comments. Elm can recognize nested comments, but the `jsComment` parser
1037 | cannot. See [`Parser.LanguageKit.whitespace`](Parser-LanguageKit#whitespace)
1038 | for help with this.
1039 | -}
1040 | ignoreUntil : String -> Parser ()
1041 | ignoreUntil str =
1042 |   Parser <| \({ source, offset, indent, context, row, col } as state) ->
1043 |     let
1044 |       (newOffset, newRow, newCol) =
1045 |         Prim.findSubString False str offset row col source
1046 |     in
1047 |       if newOffset == -1 then
1048 |         Bad (ExpectingClosing str) state
1049 | 
1050 |       else
1051 |         Good ()
1052 |           { source = source
1053 |           , offset = newOffset
1054 |           , indent = indent
1055 |           , context = context
1056 |           , row = newRow
1057 |           , col = newCol
1058 |           }
1059 | 
1060 | 
1061 | 
1062 | -- CONTEXT
1063 | 
1064 | 
1065 | {-| Specify what you are parsing right now. So if you have a parser
1066 | for lists like `[ 1, 2, 3 ]` you could say:
1067 | 
1068 |     list : Parser (List Int)
1069 |     list =
1070 |       inContext "list" <|
1071 |         succeed identity
1072 |           |. symbol "["
1073 |           |. spaces
1074 |           |= commaSep int
1075 |           |. spaces
1076 |           |. symbol "]"
1077 | 
1078 |     -- spaces : Parser ()
1079 |     -- commaSep : Parser a -> Parser (List a)
1080 | 
1081 | Now you get that extra context information if there is a parse error anywhere
1082 | in the list. For example, if you have `[ 1, 23zm5, 3 ]` you could generate an
1083 | error message like this:
1084 | 
1085 |     I ran into a problem while parsing this list:
1086 | 
1087 |         [ 1, 23zm5, 3 ]
1088 |              ^
1089 |     Looking for a valid integer, like 6 or 90210.
1090 | 
1091 | Notice that the error message knows you are parsing a list right now!
1092 | -}
1093 | inContext : String -> Parser a -> Parser a
1094 | inContext ctx (Parser parse) =
1095 |   Parser <| \({ context, row, col } as initialState) ->
1096 |     let
1097 |       state1 =
1098 |         changeContext (Context row col ctx :: context) initialState
1099 |     in
1100 |       case parse state1 of
1101 |         Good a state2 ->
1102 |           Good a (changeContext context state2)
1103 | 
1104 |         Bad _ _ as step ->
1105 |           step
1106 | 
1107 | 
1108 | changeContext : List Context -> State -> State
1109 | changeContext newContext { source, offset, indent, row, col } =
1110 |   { source = source
1111 |   , offset = offset
1112 |   , indent = indent
1113 |   , context = newContext
1114 |   , row = row
1115 |   , col = col
1116 |   }
1117 | 


--------------------------------------------------------------------------------