├── README.md
├── diagrams
│   ├── README.md
│   ├── png
│   │   ├── array.png
│   │   ├── boolean.png
│   │   ├── codepointEscape.png
│   │   ├── digit.png
│   │   ├── fraction.png
│   │   ├── member.png
│   │   ├── nonZeroDigit.png
│   │   ├── number.png
│   │   ├── object.png
│   │   ├── positiveInteger.png
│   │   ├── shortcutEscape.png
│   │   ├── string.png
│   │   ├── unescaped.png
│   │   └── value.png
│   └── svg
│       ├── array.svg
│       ├── boolean.svg
│       ├── codepointEscape.svg
│       ├── digit.svg
│       ├── fraction.svg
│       ├── member.svg
│       ├── nonZeroDigit.svg
│       ├── number.svg
│       ├── object.svg
│       ├── positiveInteger.svg
│       ├── shortcutEscape.svg
│       ├── string.svg
│       ├── unescaped.svg
│       └── value.svg
├── implementation
│   ├── MIT-LICENSE.txt
│   ├── Main.hs
│   ├── README.md
│   ├── son.cabal
│   ├── src
│   │   ├── Son.hs
│   │   └── Son
│   │       ├── Generator.hs
│   │       └── Parser.hs
│   ├── stack.yaml
│   ├── stack.yaml.lock
│   └── test
│       ├── JQ.hs
│       └── Test.hs
├── json.abnf
├── references
│   └── rfc7159.txt
└── son.ebnf
/README.md:
--------------------------------------------------------------------------------
1 | # Son
2 |
3 | A subset of JSON.
4 |
5 | JSON contains redundant syntax such as allowing both `10e2` and `10E2`. This helps when writing it by hand but isn't good for machine-to-machine communication. Piping JSON through multiple programs creates lots of trivial changes, which makes it hard to do things like take meaningful diffs.
6 |
7 | Son is a subset of JSON without redundant options. It's intended for machine-to-machine communication by programs that want to follow [Postel's Law](https://tools.ietf.org/html/rfc761#section-2.10) -- they can accept normal JSON for flexibility and output Son for consistency.
8 |
9 | ## Son Numbers
10 |
11 | 
12 |
13 | + No exponential notation
14 | + No trailing zeros in fractions
15 | + No negative zero
16 | + No positive sign
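
The four rules above can be sketched as a small validator. This is only an illustration under stated assumptions, not the reference implementation (that lives in `Son.Parser`): the names `isSonNumber`, `isNonZero`, `isPositiveInteger`, and `isFraction` are hypothetical, and the sketch works on plain `String` using only `base`.

```haskell
import Data.Char (isDigit)

-- | Hypothetical helper: is this string a valid Son number?
isSonNumber :: String -> Bool
isSonNumber "0"          = True            -- the only spelling of zero
isSonNumber ('-' : rest) = isNonZero rest  -- "-0" fails: no negative zero
isSonNumber s            = isNonZero s     -- a leading '+' fails the digit checks

-- | A non-zero magnitude: an integer part, optionally followed by a fraction.
isNonZero :: String -> Bool
isNonZero s =
  case break (== '.') s of
    (int, "")       -> isPositiveInteger int
    (int, _ : frac) -> (int == "0" || isPositiveInteger int) && isFraction frac

-- | One or more ASCII digits with no leading zero.
isPositiveInteger :: String -> Bool
isPositiveInteger []         = False
isPositiveInteger ds@(d : _) = all isDigit ds && d /= '0'

-- | One or more ASCII digits with no trailing zero.
isFraction :: String -> Bool
isFraction [] = False
isFraction ds = all isDigit ds && last ds /= '0'
```

Loaded in GHCi, this accepts `"0"`, `"0.5"`, and `"-1.25"`, and rejects `"10e2"`, `"1.50"`, `"-0"`, and `"+1"`.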
17 |
18 | ### positiveInteger:
19 |
20 | 
21 |
22 | ### fraction:
23 |
24 | 
25 |
26 | ## Son Strings
27 |
28 | 
29 |
30 | + No unnecessary escape sequences
31 |
32 | JSON doesn't allow unescaped `"`, `\`, or control characters (code points below U+0020) in strings. So that these can still be encoded, Son keeps a few escape sequences. We use two-character escape sequences (e.g. `\n`) when available, and six-character ones (e.g. `\u0001`) when not.
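
As a rough sketch of that escaping rule, assuming plain `String` input and using only `base`: the helper names `escapeChar` and `escapeString` are hypothetical and are not the reference implementation's API.

```haskell
import Data.Char (ord)
import Numeric (showHex)

-- | Hypothetical helper: escape a single character for a Son string.
escapeChar :: Char -> String
escapeChar c = case c of
  '"'  -> "\\\""   -- quotation mark
  '\\' -> "\\\\"   -- reverse solidus
  '\b' -> "\\b"    -- backspace (U+0008)
  '\t' -> "\\t"    -- tab (U+0009)
  '\n' -> "\\n"    -- line feed (U+000A)
  '\f' -> "\\f"    -- form feed (U+000C)
  '\r' -> "\\r"    -- carriage return (U+000D)
  _ | ord c < 0x20 -> "\\u" ++ pad4 (showHex (ord c) "")  -- other control characters
    | otherwise    -> [c]                                  -- everything else is unescaped
  where
    -- Left-pad the lowercase hex digits to four characters.
    pad4 :: String -> String
    pad4 hex = replicate (4 - length hex) '0' ++ hex

-- | Hypothetical helper: escape a whole string and wrap it in quotes.
escapeString :: String -> String
escapeString s = "\"" ++ concatMap escapeChar s ++ "\""
```

Note that `/` is passed through unchanged: JSON permits `\/`, but the escape is never required, so Son has no use for it.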
33 |
34 | ### shortcutEscape:
35 |
36 | 
37 |
38 | ## Other Changes from JSON
39 |
40 | + No insignificant whitespace
41 |
42 | # Status
43 |
44 | Unreleased. Like JSON, the intention is that once Son is released it will never change.
45 |
46 | # Specification
47 |
48 | The formal part of its specification is [./son.ebnf](./son.ebnf). It uses the EBNF notation described [here](https://www.w3.org/TR/2004/REC-xml11-20040204/#sec-notation).
49 |
50 | Additionally:
51 |
52 | + The only valid byte encoding of Son is UTF-8. Byte order marks are forbidden.
53 |
54 | + Object keys must be unique.
55 |
56 | + Object members must be sorted by ascending lexicographic order of their keys.
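
The two rules about object keys can be checked together: keys are unique and in ascending order exactly when each key is strictly greater than the one before it. Below is a minimal sketch of that check. The name `keysStrictlyAscending` is hypothetical, and it assumes that "lexicographic" means the code-point-wise comparison Haskell's `Ord` instance gives for `String`.

```haskell
-- | Hypothetical helper: True when every key is strictly greater than its
-- predecessor, which rules out both duplicates and out-of-order keys.
keysStrictlyAscending :: [String] -> Bool
keysStrictlyAscending ks = and (zipWith (<) ks (drop 1 ks))
```

The reference parser's `checkOrder` takes the same approach, failing unless each key satisfies `k > acc` relative to the previous one.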
57 |
58 | # Implementations
59 |
60 | + Haskell reference implementation: [./implementation](./implementation).
61 |
62 | # Special Thanks
63 |
64 | + [@chmike](https://github.com/chmike) and [@etherealvisage](https://github.com/etherealvisage): key ordering should specify "ascending".
65 |
66 | + [@John-Nagle](https://github.com/John-Nagle): UTF-8 should be the only allowed encoding.
67 |
68 | # Differences between this and related projects
69 |
70 | See [here](https://housejeffries.com/page/7).
71 |
72 | # Notes
73 |
74 | + The diagrams were generated with [GrammKit](https://github.com/dundalek/GrammKit).
75 |
76 | + `./references/rfc7159.txt` is from [here](https://tools.ietf.org/rfc/rfc7159.txt).
77 |
--------------------------------------------------------------------------------
/diagrams/README.md:
--------------------------------------------------------------------------------
1 |
2 | # son.ebnf
3 |
4 | ## value
5 |
6 | 
7 |
8 | Used by: [member](#member), [array](#array)
9 | References: [object](#object), [array](#array), [string](#string), [number](#number), [boolean](#boolean)
10 |
11 | ## boolean
12 |
13 | 
14 |
15 | Used by: [value](#value)
16 |
17 | ## object
18 |
19 | 
20 |
21 | Used by: [value](#value)
22 | References: [member](#member)
23 |
24 | ## member
25 |
26 | 
27 |
28 | Used by: [object](#object)
29 | References: [string](#string), [value](#value)
30 |
31 | ## array
32 |
33 | 
34 |
35 | Used by: [value](#value)
36 | References: [value](#value)
37 |
38 | ## number
39 |
40 | 
41 |
42 | Used by: [value](#value)
43 | References: [positiveInteger](#positiveInteger), [fraction](#fraction)
44 |
45 | ## fraction
46 |
47 | 
48 |
49 | Used by: [number](#number)
50 | References: [digit](#digit), [nonZeroDigit](#nonZeroDigit)
51 |
52 | ## positiveInteger
53 |
54 | 
55 |
56 | Used by: [number](#number)
57 | References: [nonZeroDigit](#nonZeroDigit), [digit](#digit)
58 |
59 | ## digit
60 |
61 | 
62 |
63 | Used by: [fraction](#fraction), [positiveInteger](#positiveInteger)
64 |
65 | ## nonZeroDigit
66 |
67 | 
68 |
69 | Used by: [fraction](#fraction), [positiveInteger](#positiveInteger)
70 |
71 | ## string
72 |
73 | 
74 |
75 | Used by: [value](#value), [member](#member)
76 | References: [unescaped](#unescaped), [shortcutEscape](#shortcutEscape), [codepointEscape](#codepointEscape)
77 |
78 | ## unescaped
79 |
80 | 
81 |
82 | Used by: [string](#string)
83 |
84 | ## shortcutEscape
85 |
86 | 
87 |
88 | Used by: [string](#string)
89 |
90 | ## codepointEscape
91 |
92 | 
93 |
94 | Used by: [string](#string)
95 |
--------------------------------------------------------------------------------
/diagrams/png/array.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/array.png
--------------------------------------------------------------------------------
/diagrams/png/boolean.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/boolean.png
--------------------------------------------------------------------------------
/diagrams/png/codepointEscape.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/codepointEscape.png
--------------------------------------------------------------------------------
/diagrams/png/digit.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/digit.png
--------------------------------------------------------------------------------
/diagrams/png/fraction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/fraction.png
--------------------------------------------------------------------------------
/diagrams/png/member.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/member.png
--------------------------------------------------------------------------------
/diagrams/png/nonZeroDigit.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/nonZeroDigit.png
--------------------------------------------------------------------------------
/diagrams/png/number.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/number.png
--------------------------------------------------------------------------------
/diagrams/png/object.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/object.png
--------------------------------------------------------------------------------
/diagrams/png/positiveInteger.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/positiveInteger.png
--------------------------------------------------------------------------------
/diagrams/png/shortcutEscape.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/shortcutEscape.png
--------------------------------------------------------------------------------
/diagrams/png/string.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/string.png
--------------------------------------------------------------------------------
/diagrams/png/unescaped.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/unescaped.png
--------------------------------------------------------------------------------
/diagrams/png/value.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seagreen/Son/f52c54d810b3d711aa94c71b7069aeb9bc83cca6/diagrams/png/value.png
--------------------------------------------------------------------------------
/implementation/MIT-LICENSE.txt:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 | Copyright (c) 2017 Ian Grant Jeffries
3 |
4 | Permission is hereby granted, free of charge, to any person obtaining a copy
5 | of this software and associated documentation files (the "Software"), to deal
6 | in the Software without restriction, including without limitation the rights
7 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
8 | copies of the Software, and to permit persons to whom the Software is
9 | furnished to do so, subject to the following conditions:
10 |
11 | The above copyright notice and this permission notice shall be included in all
12 | copies or substantial portions of the Software.
13 |
14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
17 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
18 | DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
19 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
20 | OR OTHER DEALINGS IN THE SOFTWARE.
21 |
--------------------------------------------------------------------------------
/implementation/Main.hs:
--------------------------------------------------------------------------------
module Main where

import Data.Aeson
import Options.Applicative
import Protolude
import Son (Son(Son))

import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Son

data Mode
  = JSONToSon
  | Verify
  | VerifyNoNewline
  deriving (Eq, Show)

execInfo :: ParserInfo Mode
execInfo = info (options <**> helper)
  ( fullDesc
  <> progDesc (fold [ "Convert JSON to Son."
                    , " Note that no newline will be appended."
                    , " `stdin JSON -> Either (stderr String) (stdout Son)`"
                    ])
  )
  where
    options :: Parser Mode
    options = fromMaybe JSONToSon
          <$> optional (verifyFlag <|> verifyNoNewlineFlag)

    verifyFlag :: Parser Mode
    verifyFlag =
      flag' Verify
        ( long "verify"
        <> help (fold [ "Check if Son is valid."
                      , " Warning: any error messages are likely to be terrible."
                      , " `stdin SonWithNewline -> Either (stderr String) ()`"
                      ])
        )

    verifyNoNewlineFlag :: Parser Mode
    verifyNoNewlineFlag =
      flag' VerifyNoNewline
        ( long "verify-no-newline"
        <> help (fold [ "Check if Son is valid."
                      , " Warning: any error messages are likely to be terrible."
                      , " `stdin Son -> Either (stderr String) ()`"
                      ])
        )

sonFromJSON :: ByteString -> IO ByteString
sonFromJSON bts =
  case eitherDecodeStrict bts of
    Left e -> panic (T.pack e)
    Right v -> pure (Son.encode (Son v))

isSon :: ByteString -> IO ()
isSon bts =
  case Son.decode bts of
    Left e -> print bts >> panic (show e)
    Right _ -> pure ()

main :: IO ()
main = do
  mode <- execParser execInfo
  bts <- BS.getContents
  case mode of
    JSONToSon -> sonFromJSON bts >>= BS.putStr
    Verify -> stripNewline bts >>= isSon
    VerifyNoNewline -> isSon bts
  where
    stripNewline :: ByteString -> IO ByteString
    stripNewline bts =
      case BS.unsnoc bts of
        Nothing -> panic "empty input"
        Just (noNewline, c) ->
          if BS.singleton c == "\n"
            then pure noNewline
            else panic ( "instead of ending with a newline, input ends with: "
                      <> show c
                       )
--------------------------------------------------------------------------------
/implementation/README.md:
--------------------------------------------------------------------------------
1 | # Son reference implementation
2 |
3 | ## Goals
4 |
5 | (in order)
6 |
7 | 1. Be correct.
8 |
9 | 2. Be clear.
10 |
11 | 3. Be fast.
12 |
13 | ## Notes
14 |
15 | This is a reference implementation, not a production library. Parsing is currently 3x slower and serializing is 6x slower than `aeson`. Additionally, I've only run a few tests so it probably fares even worse for large values.
16 |
--------------------------------------------------------------------------------
/implementation/son.cabal:
--------------------------------------------------------------------------------
name:                son
version:             0.1.0.0
synopsis:            Son parser and generator
homepage:            https://github.com/seagreen/son
license:             MIT
license-file:        MIT-LICENSE.txt
author:              Ian Grant Jeffries
maintainer:          ian@housejeffries.com
category:            Data
build-type:          Simple
cabal-version:       >=1.10

library
  hs-source-dirs:
    src
  default-language: Haskell2010
  default-extensions:
    GeneralizedNewtypeDeriving
    NoImplicitPrelude
    OverloadedStrings
    ScopedTypeVariables
    TupleSections
  ghc-options:
    -Wall
  exposed-modules:
    Son
    Son.Generator
    Son.Parser
  build-depends:
      base
    , aeson
    , attoparsec
    , bytestring
    , protolude
    , QuickCheck
    , scientific
    , text
    , unordered-containers
    , vector

executable son
  hs-source-dirs:
    .
  default-language: Haskell2010
  default-extensions:
    GeneralizedNewtypeDeriving
    NoImplicitPrelude
    OverloadedStrings
    ScopedTypeVariables
    TupleSections
  ghc-options:
    -Wall
  main-is: Main.hs
  build-depends:
      base
    , aeson
    , attoparsec
    , bytestring
    , protolude
    , QuickCheck
    , scientific
    , text
    , unordered-containers
    , vector
    , optparse-applicative
    , son

test-suite test
  hs-source-dirs:
    test
  main-is: Test.hs
  other-modules:
    JQ
  default-language: Haskell2010
  default-extensions:
    NoImplicitPrelude
    OverloadedStrings
    ScopedTypeVariables
    TupleSections
  type: exitcode-stdio-1.0
  ghc-options:
    -Wall
  build-depends:
      base
    , aeson
    , bytestring
    , process
    , protolude
    , QuickCheck
    , son
    , tasty
    , tasty-hunit
    , tasty-quickcheck
    , text
--------------------------------------------------------------------------------
/implementation/src/Son.hs:
--------------------------------------------------------------------------------
module Son where

import Data.Aeson
import Data.Attoparsec.Text (endOfInput, parseOnly)
import Data.Scientific (Scientific, fromFloatDigits)
import Data.Text.Encoding.Error (UnicodeException)
import Protolude
import Son.Generator (generateSon)
import Son.Parser (sonValue)
import Test.QuickCheck hiding (generate)

import qualified Data.HashMap.Strict as HM
import qualified Data.Text as T
import qualified Data.Vector as V

newtype Son
  = Son { _unSon :: Value }
  deriving (Eq, Show, NFData)

-- * Parsing

parse :: Text -> Either Text Son
parse = bimap T.pack Son . parseOnly (sonValue <* endOfInput)

data DecodingError
  = UTF8Error UnicodeException
  | SyntaxError Text
  deriving (Eq, Show)

decode :: ByteString -> Either DecodingError Son
decode = first SyntaxError . parse
     <=< first UTF8Error . decodeUtf8'

-- * Generation

generate :: Son -> Text
generate = generateSon . _unSon

encode :: Son -> ByteString
encode = encodeUtf8 . generate

-- * Random Son value creation (for testing)

instance Arbitrary Son where
  arbitrary = Son <$> sized arbitraryValue
    where
      arbitraryValue :: Int -> Gen Value
      arbitraryValue n
        | n <= 1 = oneof nonRecursive
        | otherwise = oneof $
            (Array . V.fromList <$> arbitraryArray (n `div` 10))
          : (Object . HM.fromList <$> arbitraryObject (n `div` 10))
          : nonRecursive

      arbitraryArray :: Int -> Gen [Value]
      arbitraryArray n = traverse (const (arbitraryValue n))
                     =<< (arbitrary :: Gen [()])

      arbitraryObject :: Int -> Gen [(Text, Value)]
      arbitraryObject n = traverse (const textAndValue)
                      =<< (arbitrary :: Gen [()])
        where
          textAndValue :: Gen (Text, Value)
          textAndValue = (,) <$> arbitraryText <*> arbitraryValue n

      nonRecursive :: [Gen Value]
      nonRecursive =
        [ pure Null
        , Bool <$> arbitrary
        , String <$> arbitraryText
        , Number <$> arbitraryScientific
        ]

      arbitraryText :: Gen Text
      arbitraryText = T.pack <$> arbitrary

      arbitraryScientific :: Gen Scientific
      arbitraryScientific = (fromFloatDigits :: Double -> Scientific)
                        <$> arbitrary
--------------------------------------------------------------------------------
/implementation/src/Son/Generator.hs:
--------------------------------------------------------------------------------
module Son.Generator where

import Data.Aeson
import Data.Char (intToDigit)
import Data.HashMap.Strict (HashMap)
import Data.Scientific (Scientific)
import Data.String (String)
import Data.Text.Lazy.Builder (Builder)
import Data.Vector (Vector)
import Protolude

import qualified Data.HashMap.Strict as HM
import qualified Data.Scientific as Sci
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as TB

generateSon :: Value -> Text
generateSon = TL.toStrict . TB.toLazyText . genValue

genValue :: Value -> Builder
genValue (Object hm)  = genObject hm
genValue (Array xs)   = genArray xs
genValue (String s)   = genString s
genValue (Number n)   = genNumber n
genValue (Bool True)  = "true"
genValue (Bool False) = "false"
genValue Null         = "null"

genObject :: HashMap Text Value -> Builder
genObject hm = "{" <> foldl' addMember mempty sortedMembers <> "}"
  where
    -- Son requires members in ascending key order.
    sortedMembers :: [(Text, Value)]
    sortedMembers = sortOn fst (HM.toList hm)

    addMember :: Builder -> (Text, Value) -> Builder
    addMember a (k,v)
      | a == mempty = pair
      | otherwise   = a <> "," <> pair
      where
        pair :: Builder
        pair = genString k <> ":" <> genValue v

genArray :: Vector Value -> Builder
genArray xs = "[" <> foldl' addElement mempty xs <> "]"
  where
    addElement :: Builder -> Value -> Builder
    addElement a v
      | a == mempty = genValue v
      | otherwise   = a <> "," <> genValue v

genString :: Text -> Builder
genString t = TB.singleton '"'
           <> TB.fromText (T.concatMap escape t)
           <> TB.singleton '"'
  where
    escape :: Char -> Text
    escape c =
      case c of
        '"'    -> '\\' `T.cons` T.singleton '"'
        '\\'   -> '\\' `T.cons` T.singleton '\\'
        '\x00' -> "\\u0000"
        '\x01' -> "\\u0001"
        '\x02' -> "\\u0002"
        '\x03' -> "\\u0003"
        '\x04' -> "\\u0004"
        '\x05' -> "\\u0005"
        '\x06' -> "\\u0006"
        '\x07' -> "\\u0007"
        '\x08' -> "\\b" -- backspace
        '\x09' -> "\\t" -- tab
        '\x0a' -> "\\n" -- line feed
        '\x0b' -> "\\u000b"
        '\x0c' -> "\\f" -- form feed
        '\x0d' -> "\\r" -- carriage return
        '\x0e' -> "\\u000e"
        '\x0f' -> "\\u000f"
        '\x10' -> "\\u0010"
        '\x11' -> "\\u0011"
        '\x12' -> "\\u0012"
        '\x13' -> "\\u0013"
        '\x14' -> "\\u0014"
        '\x15' -> "\\u0015"
        '\x16' -> "\\u0016"
        '\x17' -> "\\u0017"
        '\x18' -> "\\u0018"
        '\x19' -> "\\u0019"
        '\x1a' -> "\\u001a"
        '\x1b' -> "\\u001b"
        '\x1c' -> "\\u001c"
        '\x1d' -> "\\u001d"
        '\x1e' -> "\\u001e"
        '\x1f' -> "\\u001f"
        _      -> T.singleton c

genNumber :: Scientific -> Builder
genNumber = TB.fromString . formatScientific

-- | Based on @scientific@ 0.3.4.10.
--
-- Modified to remove the scientific notation options
-- and trailing zeros in fractions.
formatScientific :: Scientific -> String
formatScientific s
  | Sci.coefficient s < 0 = '-':formatPositiveScientific (-s)
  | otherwise             = formatPositiveScientific s
  where
    formatPositiveScientific :: Scientific -> String
    formatPositiveScientific = fmtAsFixed . Sci.toDecimalDigits

-- | Based on @scientific@ 0.3.4.10.
--
-- Modified not to print trailing zeros in fractions.
fmtAsFixed :: ([Int], Int) -> String
fmtAsFixed (is, e)
  | e <= 0    = '0':mkFractional (replicate (-e) '0' <> ds)
  | otherwise = f e "" ds
  where
    mk0 :: String -> String
    mk0 "" = "0"
    mk0 ls = ls

    mkFractional :: String -> String
    mkFractional ""  = ""
    mkFractional "0" = ""
    mkFractional ls  = '.':ls

    ds :: String
    ds = intToDigit <$> is

    f :: Int -> String -> String -> String
    f 0 s rs     = mk0 (reverse s) <> mkFractional rs
    f n s ""     = f (n-1) ('0':s) ""
    f n s (r:rs) = f (n-1) (r:s) rs
--------------------------------------------------------------------------------
/implementation/src/Son/Parser.hs:
--------------------------------------------------------------------------------
module Son.Parser where

import Control.Monad.Fail (fail)
import Data.Aeson
import Data.Attoparsec.Text
import Data.Char (isDigit)
import Data.HashMap.Strict (HashMap)
import Data.Scientific (Scientific)
import Data.String (unlines)
import Data.Vector (Vector)
import Protolude hiding (option, take)

import qualified Data.HashMap.Strict as HM
import qualified Data.Text as T
import qualified Data.Vector as V

sonValue :: Parser Value
sonValue = fmap Object sonObject
       <|> fmap Array sonArray
       <|> fmap String sonString
       <|> fmap Number sonNumber
       <|> fmap Bool sonBoolean
       <|> fmap (const Null) sonNull

sonObject :: Parser (HashMap Text Value)
sonObject = do
  void (char '{')
  es <- element `sepBy` char ','
  void (char '}')
  checkOrder es
  pure (HM.fromList es)
  where
    element :: Parser (Text, Value)
    element = do
      t <- sonString
      void (char ':')
      x <- sonValue
      pure (t,x)

    -- Each key must be strictly greater than the previous one, which
    -- rejects both duplicate and out-of-order keys.
    checkOrder :: [(Text, Value)] -> Parser ()
    checkOrder = void . foldM f Nothing
      where
        f :: Maybe Text -> (Text, a) -> Parser (Maybe Text)
        f Nothing (k,_) = pure (Just k)
        f (Just acc) (k,_)
          | k > acc   = pure (Just k)
          | otherwise = fail ("Key out of order: " <> T.unpack k)

sonArray :: Parser (Vector Value)
sonArray = do
  void (char '[')
  xs <- V.fromList <$> sonValue `sepBy` char ','
  void (char ']')
  pure xs

sonString :: Parser Text
sonString = do
  void (char '"')
  ts <- mempty <$ char '"'
    <|> body <* char '"'
  pure (T.concat ts)
  where
    body :: Parser [Text]
    body = many (takeWhile1 (\c -> c /= '"' && c /= '\\') <|> unescape)

    unescape :: Parser Text
    unescape = do
      void (char '\\')
      T.singleton <$> (shortcutEscape <|> unescapeCodePoint)

    shortcutEscape :: Parser Char
    shortcutEscape =
          char '"'
      <|> char '\\'
      -- NOTE: solidus (U+002F) isn't listed here, because even though
      -- JSON allows it to be escaped with @\/@ it doesn't have to be.
      <|> (char 'b' *> pure '\x08') -- backspace
      <|> (char 't' *> pure '\x09') -- tab
      <|> (char 'n' *> pure '\x0a') -- line feed
      <|> (char 'f' *> pure '\x0c') -- form feed
      <|> (char 'r' *> pure '\x0d') -- carriage return

    unescapeCodePoint :: Parser Char
    unescapeCodePoint = do
      void (char 'u')
      void (char '0')
      void (char '0')
      n <- take 2
      case n of
        "00" -> pure '\x00'
        "01" -> pure '\x01'
        "02" -> pure '\x02'
        "03" -> pure '\x03'
        "04" -> pure '\x04'
        "05" -> pure '\x05'
        "06" -> pure '\x06'
        "07" -> pure '\x07'
        --
        "0b" -> pure '\x0b'
        --
        "0e" -> pure '\x0e'
        "0f" -> pure '\x0f'
        "10" -> pure '\x10'
        "11" -> pure '\x11'
        "12" -> pure '\x12'
        "13" -> pure '\x13'
        "14" -> pure '\x14'
        "15" -> pure '\x15'
        "16" -> pure '\x16'
        "17" -> pure '\x17'
        "18" -> pure '\x18'
        "19" -> pure '\x19'
        "1a" -> pure '\x1a'
        "1b" -> pure '\x1b'
        "1c" -> pure '\x1c'
        "1d" -> pure '\x1d'
        "1e" -> pure '\x1e'
        "1f" -> pure '\x1f'
        _ -> fail ("\\u escape followed by invalid sequence: " <> T.unpack n)

-- NOTE: order matters here. @makeScientific@ must be tried before the
-- bare zero; otherwise an input like @0.5@ would match @'0'@ alone and
-- leave the fraction unconsumed.
sonNumber :: Parser Scientific
sonNumber = makeScientific
        <|> 0 <$ char '0'
  where
    -- Use @nonZero@ to make sure the number's a valid Son number,
    -- then use @readMaybe@ to turn it into a @Scientific@.
    --
    -- This would be more efficient if @nonZero@ could create the
    -- @Scientific@ itself. To do so we'll need a function going from
    -- @Bool -> [Char] -> [Char] -> Scientific@, where the @Bool@ is
    -- the sign of the number and the @Char@s are digits.
    makeScientific :: Parser Scientific
    makeScientific = do
      (t, ()) <- match nonZero
      case readMaybe (T.unpack t) of
        Nothing -> fail (unlines [ "readMaybe failed (this should"
                                 , " never happen) on input: "
                                 , show t
                                 ])
        Just s -> pure s

    nonZero :: Parser ()
    nonZero = do
        option () (void (char '-'))
        startsWithNonZero <|> char '0' *> void sonFraction
      where
        startsWithNonZero :: Parser ()
        startsWithNonZero = do
          void sonPositiveInteger
          option () (void sonFraction)

sonPositiveInteger :: Parser Text
sonPositiveInteger = do
  n <- takeTill (not . isDigit)
  when (T.null n) (fail "integer part is empty")
  when (T.head n == '0') (fail ("integer part starts with zero: " <> show n))
  pure n

sonFraction :: Parser Text
sonFraction = do
  void (char '.')
  n <- takeTill (not . isDigit)
  when (T.null n) (fail "no digits after decimal point")
  when (T.last n == '0') (fail ("fractional part ends in zero: " <> show n))
  pure n

sonBoolean :: Parser Bool
sonBoolean = True <$ string "true"
         <|> False <$ string "false"

sonNull :: Parser ()
sonNull = () <$ string "null"
--------------------------------------------------------------------------------
/implementation/stack.yaml:
--------------------------------------------------------------------------------
1 | resolver: lts-14.18
2 | extra-deps:
3 | - aeson-1.4.6.0 # Important enough to lock down
4 |
--------------------------------------------------------------------------------
/implementation/stack.yaml.lock:
--------------------------------------------------------------------------------
# This file was autogenerated by Stack.
# You should not edit this file by hand.
# For more information, please see the documentation at:
#   https://docs.haskellstack.org/en/stable/lock_files

packages:
- completed:
    hackage: aeson-1.4.6.0@sha256:560575b008a23960403a128331f0e59594786b5cd19a35be0cd74b9a7257958e,6980
    pantry-tree:
      size: 40193
      sha256: 5769473440ae594ae8679dde9fe12b6d00a49264a9dd8962a53ff3ae5740d7a5
  original:
    hackage: aeson-1.4.6.0
snapshots:
- completed:
    size: 524789
    url: https://raw.githubusercontent.com/commercialhaskell/stackage-snapshots/master/lts/14/18.yaml
    sha256: 646be71223e08234131c6989912e6011e01b9767bc447b6d466a35e14360bdf2
  original: lts-14.18
--------------------------------------------------------------------------------
/implementation/test/JQ.hs:
--------------------------------------------------------------------------------
1 | module JQ where
2 |
3 | import Data.Aeson
4 | import Data.String (String)
5 | import Protolude
6 | import System.Process (readProcess)
7 |
8 | import qualified Data.ByteString.Lazy as LBS
9 | import qualified Data.Text as T
10 |
11 | encodeJQ :: ToJSON a => a -> IO String
12 | encodeJQ a =
13 | readProcess "jq" ["--compact-output", "--sort-keys", "."] jsonString
14 | where
15 | jsonString :: String
16 | jsonString = T.unpack (decodeUtf8 (LBS.toStrict (encode a)))
17 |
--------------------------------------------------------------------------------
/implementation/test/Test.hs:
--------------------------------------------------------------------------------
1 | module Main where
2 |
3 | import Data.Aeson
4 | import Data.List (unlines)
5 | import Data.String (String)
6 | import JQ
7 | import Protolude
8 | import Son (Son(..))
9 | import Test.QuickCheck.Monadic
10 | import Test.Tasty
11 | import Test.Tasty.HUnit hiding (assert)
12 | import Test.Tasty.QuickCheck
13 |
14 | import qualified Data.Text as T
15 | import qualified Son
16 |
17 | main :: IO ()
18 | main = defaultMain $ testGroup "Son"
19 | [ testProperty "is always valid JSON" isJSON
20 | , testProperty "decodes to the same Aeson Value" isSameJSON
21 | , testProperty "roundtrips through JSON without changing" roundtrip
22 | , testCase "properties must be ordered" orderedProperties
23 | , testGroup "numbers are serialized correctly"
24 | [ testCase "0" (Son.generate (Son (Number 0)) @?= "0")
25 | , testCase "1" (Son.generate (Son (Number 1)) @?= "1")
26 | ]
27 | -- , testProperty "outputs the same JSON as jq with certain arguments" jqTest
28 | ]
29 |
30 | isJSON :: Son -> Property
31 | isJSON a =
32 | let b = Son.generate a
33 | c = eitherDecodeStrict (encodeUtf8 b) :: Either String Value
34 | s = unlines [ "serialized to: " <> T.unpack b
35 | , "parsed to: " <> show c
36 | ]
37 | in counterexample s (isRight c)
38 |
39 | -- | This won't hold for every JSON library. For instance, if a
40 | -- library parses `1.0` and `1` to different values it won't pass.
41 | -- I don't think Aeson encodes any info in ways that are squashed
42 | -- by Son serialization though.
43 | isSameJSON :: Son -> Property
44 | isSameJSON a =
45 | let b = Son.generate a
46 | c = eitherDecodeStrict (encodeUtf8 b)
47 | s = unlines [ "serialized to: " <> T.unpack b
48 | , "parsed to: " <> show c
49 | ]
50 | in counterexample s (Right (_unSon a) == c)
51 |
52 | roundtrip :: Son -> Property
53 | roundtrip a =
54 | let b = Son.generate a
55 | c = Son.decode (encodeUtf8 b)
56 | s = unlines [ "serialized to: " <> T.unpack b
57 | , "parsed to: " <> show c
58 | ]
59 | in counterexample s (Right a == c)
60 |
61 | jqTest :: Son -> Property
62 | jqTest a = monadicIO $ do
63 | let b = Son.generate a
64 | c <- run $ encodeJQ (_unSon a)
65 | let s = unlines [ "serialized to Son: " <> T.unpack b
66 | , "serialized with jq: " <> c
67 | ]
68 | -- The following is useful if QuickCheck produces a huge counterexample:
69 | --
70 | -- when (b <> T.singleton '\n' /= T.pack c) $ do
71 | -- run (writeFile "first" (b <> "\n"))
72 | -- run (writeFile "second" (T.pack c))
73 | monitor (counterexample s)
74 | assert (b <> T.singleton '\n' == T.pack c)
75 |
76 | orderedProperties :: Assertion
77 | orderedProperties = isLeft (Son.parse "{\"b\":null,\"a\":null}") @?= True
78 |
--------------------------------------------------------------------------------
/json.abnf:
--------------------------------------------------------------------------------
1 | JSON-text = ws value ws
2 |
3 | begin-array = ws %x5B ws ; [ left square bracket
4 | begin-object = ws %x7B ws ; { left curly bracket
5 | end-array = ws %x5D ws ; ] right square bracket
6 | end-object = ws %x7D ws ; } right curly bracket
7 | name-separator = ws %x3A ws ; : colon
8 | value-separator = ws %x2C ws ; , comma
9 |
10 | ws = *(
11 | %x20 / ; Space
12 | %x09 / ; Horizontal tab
13 | %x0A / ; Line feed or New line
14 | %x0D ) ; Carriage return
15 |
16 | value = false / null / true / object / array / number / string
17 | false = %x66.61.6c.73.65 ; false
18 | null = %x6e.75.6c.6c ; null
19 | true = %x74.72.75.65 ; true
20 |
21 | object = begin-object [ member *( value-separator member ) ]
22 | end-object
23 | member = string name-separator value
24 |
25 | array = begin-array [ value *( value-separator value ) ] end-array
26 |
27 | number = [ minus ] int [ frac ] [ exp ]
28 | decimal-point = %x2E ; .
29 | digit1-9 = %x31-39 ; 1-9
30 | e = %x65 / %x45 ; e E
31 | exp = e [ minus / plus ] 1*DIGIT
32 | frac = decimal-point 1*DIGIT
33 | int = zero / ( digit1-9 *DIGIT )
34 | minus = %x2D ; -
35 | plus = %x2B ; +
36 | zero = %x30 ; 0
37 |
38 | string = quotation-mark *char quotation-mark
39 | char = unescaped /
40 | escape (
41 | %x22 / ; " quotation mark U+0022
42 | %x5C / ; \ reverse solidus U+005C
43 | %x2F / ; / solidus U+002F
44 | %x62 / ; b backspace U+0008
45 | %x66 / ; f form feed U+000C
46 | %x6E / ; n line feed U+000A
47 | %x72 / ; r carriage return U+000D
48 | %x74 / ; t tab U+0009
49 | %x75 4HEXDIG ) ; uXXXX U+XXXX
50 | escape = %x5C ; \
51 | quotation-mark = %x22 ; "
52 | unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
53 |
--------------------------------------------------------------------------------
/references/rfc7159.txt:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 | Internet Engineering Task Force (IETF) T. Bray, Ed.
8 | Request for Comments: 7159 Google, Inc.
9 | Obsoletes: 4627, 7158 March 2014
10 | Category: Standards Track
11 | ISSN: 2070-1721
12 |
13 |
14 | The JavaScript Object Notation (JSON) Data Interchange Format
15 |
16 | Abstract
17 |
18 | JavaScript Object Notation (JSON) is a lightweight, text-based,
19 | language-independent data interchange format. It was derived from
20 | the ECMAScript Programming Language Standard. JSON defines a small
21 | set of formatting rules for the portable representation of structured
22 | data.
23 |
24 | This document removes inconsistencies with other specifications of
25 | JSON, repairs specification errors, and offers experience-based
26 | interoperability guidance.
27 |
28 | Status of This Memo
29 |
30 | This is an Internet Standards Track document.
31 |
32 | This document is a product of the Internet Engineering Task Force
33 | (IETF). It represents the consensus of the IETF community. It has
34 | received public review and has been approved for publication by the
35 | Internet Engineering Steering Group (IESG). Further information on
36 | Internet Standards is available in Section 2 of RFC 5741.
37 |
38 | Information about the current status of this document, any errata,
39 | and how to provide feedback on it may be obtained at
40 | http://www.rfc-editor.org/info/rfc7159.
41 |
42 |
43 |
44 |
45 |
46 |
47 |
48 |
49 |
50 |
51 |
52 |
53 |
54 |
55 |
56 |
57 |
58 | Bray Standards Track [Page 1]
59 |
60 | RFC 7159 JSON March 2014
61 |
62 |
63 | Copyright Notice
64 |
65 | Copyright (c) 2014 IETF Trust and the persons identified as the
66 | document authors. All rights reserved.
67 |
68 | This document is subject to BCP 78 and the IETF Trust's Legal
69 | Provisions Relating to IETF Documents
70 | (http://trustee.ietf.org/license-info) in effect on the date of
71 | publication of this document. Please review these documents
72 | carefully, as they describe your rights and restrictions with respect
73 | to this document. Code Components extracted from this document must
74 | include Simplified BSD License text as described in Section 4.e of
75 | the Trust Legal Provisions and are provided without warranty as
76 | described in the Simplified BSD License.
77 |
78 | This document may contain material from IETF Documents or IETF
79 | Contributions published or made publicly available before November
80 | 10, 2008. The person(s) controlling the copyright in some of this
81 | material may not have granted the IETF Trust the right to allow
82 | modifications of such material outside the IETF Standards Process.
83 | Without obtaining an adequate license from the person(s) controlling
84 | the copyright in such materials, this document may not be modified
85 | outside the IETF Standards Process, and derivative works of it may
86 | not be created outside the IETF Standards Process, except to format
87 | it for publication as an RFC or to translate it into languages other
88 | than English.
89 |
90 |
91 |
92 |
93 |
94 |
95 |
96 |
97 |
98 |
99 |
100 |
101 |
102 |
103 |
104 |
105 |
106 |
107 |
108 |
109 |
110 |
111 |
112 |
113 |
114 | Bray Standards Track [Page 2]
115 |
116 | RFC 7159 JSON March 2014
117 |
118 |
119 | Table of Contents
120 |
121 | 1. Introduction ....................................................3
122 | 1.1. Conventions Used in This Document ..........................4
123 | 1.2. Specifications of JSON .....................................4
124 | 1.3. Introduction to This Revision ..............................4
125 | 2. JSON Grammar ....................................................4
126 | 3. Values ..........................................................5
127 | 4. Objects .........................................................6
128 | 5. Arrays ..........................................................6
129 | 6. Numbers .........................................................6
130 | 7. Strings .........................................................8
131 | 8. String and Character Issues .....................................9
132 | 8.1. Character Encoding .........................................9
133 | 8.2. Unicode Characters .........................................9
134 | 8.3. String Comparison ..........................................9
135 | 9. Parsers ........................................................10
136 | 10. Generators ....................................................10
137 | 11. IANA Considerations ...........................................10
138 | 12. Security Considerations .......................................11
139 | 13. Examples ......................................................12
140 | 14. Contributors ..................................................13
141 | 15. References ....................................................13
142 | 15.1. Normative References .....................................13
143 | 15.2. Informative References ...................................13
144 | Appendix A. Changes from RFC 4627 .................................15
145 |
146 | 1. Introduction
147 |
148 | JavaScript Object Notation (JSON) is a text format for the
149 | serialization of structured data. It is derived from the object
150 | literals of JavaScript, as defined in the ECMAScript Programming
151 | Language Standard, Third Edition [ECMA-262].
152 |
153 | JSON can represent four primitive types (strings, numbers, booleans,
154 | and null) and two structured types (objects and arrays).
155 |
156 | A string is a sequence of zero or more Unicode characters [UNICODE].
157 | Note that this citation references the latest version of Unicode
158 | rather than a specific release. It is not expected that future
159 | changes in the UNICODE specification will impact the syntax of JSON.
160 |
161 | An object is an unordered collection of zero or more name/value
162 | pairs, where a name is a string and a value is a string, number,
163 | boolean, null, object, or array.
164 |
165 | An array is an ordered sequence of zero or more values.
166 |
175 | The terms "object" and "array" come from the conventions of
176 | JavaScript.
177 |
178 | JSON's design goals were for it to be minimal, portable, textual, and
179 | a subset of JavaScript.
180 |
181 | 1.1. Conventions Used in This Document
182 |
183 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
184 | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
185 | document are to be interpreted as described in [RFC2119].
186 |
187 | The grammatical rules in this document are to be interpreted as
188 | described in [RFC5234].
189 |
190 | 1.2. Specifications of JSON
191 |
192 | This document updates [RFC4627], which describes JSON and registers
193 | the media type "application/json".
194 |
195 | A description of JSON in ECMAScript terms appears in Version 5.1 of
196 | the ECMAScript specification [ECMA-262], Section 15.12. JSON is also
197 | described in [ECMA-404].
198 |
199 | All of the specifications of JSON syntax agree on the syntactic
200 | elements of the language.
201 |
202 | 1.3. Introduction to This Revision
203 |
204 | In the years since the publication of RFC 4627, JSON has found very
205 | wide use. This experience has revealed certain patterns, which,
206 | while allowed by its specifications, have caused interoperability
207 | problems.
208 |
209 | Also, a small number of errata have been reported (see RFC Errata IDs
210 | 607 [Err607] and 3607 [Err3607]).
211 |
212 | This document's goal is to apply the errata, remove inconsistencies
213 | with other specifications of JSON, and highlight practices that can
214 | lead to interoperability problems.
215 |
216 | 2. JSON Grammar
217 |
218 | A JSON text is a sequence of tokens. The set of tokens includes six
219 | structural characters, strings, numbers, and three literal names.
220 |
221 | A JSON text is a serialized value. Note that certain previous
222 | specifications of JSON constrained a JSON text to be an object or an
231 | array. Implementations that generate only objects or arrays where a
232 | JSON text is called for will be interoperable in the sense that all
233 | implementations will accept these as conforming JSON texts.
234 |
235 | JSON-text = ws value ws
236 |
237 | These are the six structural characters:
238 |
239 | begin-array = ws %x5B ws ; [ left square bracket
240 |
241 | begin-object = ws %x7B ws ; { left curly bracket
242 |
243 | end-array = ws %x5D ws ; ] right square bracket
244 |
245 | end-object = ws %x7D ws ; } right curly bracket
246 |
247 | name-separator = ws %x3A ws ; : colon
248 |
249 | value-separator = ws %x2C ws ; , comma
250 |
251 | Insignificant whitespace is allowed before or after any of the six
252 | structural characters.
253 |
254 | ws = *(
255 | %x20 / ; Space
256 | %x09 / ; Horizontal tab
257 | %x0A / ; Line feed or New line
258 | %x0D ) ; Carriage return
259 |
260 | 3. Values
261 |
262 | A JSON value MUST be an object, array, number, or string, or one of
263 | the following three literal names:
264 |
265 | false null true
266 |
267 | The literal names MUST be lowercase. No other literal names are
268 | allowed.
269 |
270 | value = false / null / true / object / array / number / string
271 |
272 | false = %x66.61.6c.73.65 ; false
273 |
274 | null = %x6e.75.6c.6c ; null
275 |
276 | true = %x74.72.75.65 ; true
277 |
286 |
287 | 4. Objects
288 |
289 | An object structure is represented as a pair of curly brackets
290 | surrounding zero or more name/value pairs (or members). A name is a
291 | string. A single colon comes after each name, separating the name
292 | from the value. A single comma separates a value from a following
293 | name. The names within an object SHOULD be unique.
294 |
295 | object = begin-object [ member *( value-separator member ) ]
296 | end-object
297 |
298 | member = string name-separator value
299 |
300 | An object whose names are all unique is interoperable in the sense
301 | that all software implementations receiving that object will agree on
302 | the name-value mappings. When the names within an object are not
303 | unique, the behavior of software that receives such an object is
304 | unpredictable. Many implementations report the last name/value pair
305 | only. Other implementations report an error or fail to parse the
306 | object, and some implementations report all of the name/value pairs,
307 | including duplicates.
308 |
309 | JSON parsing libraries have been observed to differ as to whether or
310 | not they make the ordering of object members visible to calling
311 | software. Implementations whose behavior does not depend on member
312 | ordering will be interoperable in the sense that they will not be
313 | affected by these differences.
314 |
315 | 5. Arrays
316 |
317 | An array structure is represented as square brackets surrounding zero
318 | or more values (or elements). Elements are separated by commas.
319 |
320 | array = begin-array [ value *( value-separator value ) ] end-array
321 |
322 | There is no requirement that the values in an array be of the same
323 | type.
324 |
325 | 6. Numbers
326 |
327 | The representation of numbers is similar to that used in most
328 | programming languages. A number is represented in base 10 using
329 | decimal digits. It contains an integer component that may be
330 | prefixed with an optional minus sign, which may be followed by a
331 | fraction part and/or an exponent part. Leading zeros are not
332 | allowed.
333 |
334 | A fraction part is a decimal point followed by one or more digits.
335 |
343 | An exponent part begins with the letter E in upper or lower case,
344 | which may be followed by a plus or minus sign. The E and optional
345 | sign are followed by one or more digits.
346 |
347 | Numeric values that cannot be represented in the grammar below (such
348 | as Infinity and NaN) are not permitted.
349 |
350 | number = [ minus ] int [ frac ] [ exp ]
351 |
352 | decimal-point = %x2E ; .
353 |
354 | digit1-9 = %x31-39 ; 1-9
355 |
356 | e = %x65 / %x45 ; e E
357 |
358 | exp = e [ minus / plus ] 1*DIGIT
359 |
360 | frac = decimal-point 1*DIGIT
361 |
362 | int = zero / ( digit1-9 *DIGIT )
363 |
364 | minus = %x2D ; -
365 |
366 | plus = %x2B ; +
367 |
368 | zero = %x30 ; 0
369 |
370 | This specification allows implementations to set limits on the range
371 | and precision of numbers accepted. Since software that implements
372 | IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
373 | generally available and widely used, good interoperability can be
374 | achieved by implementations that expect no more precision or range
375 | than these provide, in the sense that implementations will
376 | approximate JSON numbers within the expected precision. A JSON
377 | number such as 1E400 or 3.141592653589793238462643383279 may indicate
378 | potential interoperability problems, since it suggests that the
379 | software that created it expects receiving software to have greater
380 | capabilities for numeric magnitude and precision than is widely
381 | available.
382 |
383 | Note that when such software is used, numbers that are integers and
384 | are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
385 | sense that implementations will agree exactly on their numeric
386 | values.
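
The integer range above can be checked mechanically. The following is a minimal sketch, assuming a binary64 `Double` as the receiving representation; the names `isInteroperableInteger` and `survivesDouble` are illustrative and do not come from this specification.

```haskell
-- Sketch only: function names are hypothetical, not defined by RFC 7159.

-- | True when an integer lies in [-(2^53)+1, (2^53)-1], the range in
-- which binary64 implementations agree exactly on the value.
isInteroperableInteger :: Integer -> Bool
isInteroperableInteger n = abs n <= 2 ^ (53 :: Int) - 1

-- | True when converting to a binary64 Double and back returns the
-- same integer, i.e. no precision was lost on the way.
survivesDouble :: Integer -> Bool
survivesDouble n = truncate (fromInteger n :: Double) == n
```

Membership in the range is a sufficient condition for exact agreement, not a necessary one: some larger integers, such as 2**53 itself, also convert without loss, but their neighbours do not.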
387 |
398 |
399 | 7. Strings
400 |
401 | The representation of strings is similar to conventions used in the C
402 | family of programming languages. A string begins and ends with
403 | quotation marks. All Unicode characters may be placed within the
404 | quotation marks, except for the characters that must be escaped:
405 | quotation mark, reverse solidus, and the control characters (U+0000
406 | through U+001F).
407 |
408 | Any character may be escaped. If the character is in the Basic
409 | Multilingual Plane (U+0000 through U+FFFF), then it may be
410 | represented as a six-character sequence: a reverse solidus, followed
411 | by the lowercase letter u, followed by four hexadecimal digits that
412 | encode the character's code point. The hexadecimal letters A through
413 | F can be upper or lower case. So, for example, a string containing
414 | only a single reverse solidus character may be represented as
415 | "\u005C".
416 |
417 | Alternatively, there are two-character sequence escape
418 | representations of some popular characters. So, for example, a
419 | string containing only a single reverse solidus character may be
420 | represented more compactly as "\\".
421 |
422 | To escape an extended character that is not in the Basic Multilingual
423 | Plane, the character is represented as a 12-character sequence,
424 | encoding the UTF-16 surrogate pair. So, for example, a string
425 | containing only the G clef character (U+1D11E) may be represented as
426 | "\uD834\uDD1E".
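
The surrogate-pair arithmetic behind the G clef example can be sketched as follows. This is an illustration using only the Haskell `base` library; `surrogatePair` and `escapeNonBmp` are hypothetical names, not part of any JSON library.

```haskell
import Data.Char (ord, toUpper)
import Numeric (showHex)

-- | Split a code point above U+FFFF into its UTF-16 high and low
-- surrogates. Returns Nothing for characters inside the BMP.
surrogatePair :: Char -> Maybe (Int, Int)
surrogatePair c
  | cp < 0x10000 = Nothing
  | otherwise    = Just (0xD800 + v `div` 0x400, 0xDC00 + v `mod` 0x400)
  where
    cp = ord c
    v  = cp - 0x10000  -- 20-bit offset: top 10 bits go high, bottom 10 go low

-- | Render a non-BMP character as the 12-character JSON escape.
escapeNonBmp :: Char -> Maybe String
escapeNonBmp c = do
  (hi, lo) <- surrogatePair c
  pure (unit hi ++ unit lo)
  where
    -- Surrogates are always >= 0xD800, so showHex yields four digits.
    unit n = "\\u" ++ map toUpper (showHex n "")
```

For U+1D11E the offset is 0xD11E, whose upper ten bits are 0x34 and lower ten bits are 0x11E, giving the pair D834 and DD1E shown in the text.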
427 |
428 | string = quotation-mark *char quotation-mark
429 |
430 | char = unescaped /
431 | escape (
432 | %x22 / ; " quotation mark U+0022
433 | %x5C / ; \ reverse solidus U+005C
434 | %x2F / ; / solidus U+002F
435 | %x62 / ; b backspace U+0008
436 | %x66 / ; f form feed U+000C
437 | %x6E / ; n line feed U+000A
438 | %x72 / ; r carriage return U+000D
439 | %x74 / ; t tab U+0009
440 | %x75 4HEXDIG ) ; uXXXX U+XXXX
441 |
442 | escape = %x5C ; \
443 |
444 | quotation-mark = %x22 ; "
445 |
446 | unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
447 |
454 |
455 | 8. String and Character Issues
456 |
457 | 8.1. Character Encoding
458 |
459 | JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32. The default
460 | encoding is UTF-8, and JSON texts that are encoded in UTF-8 are
461 | interoperable in the sense that they will be read successfully by the
462 | maximum number of implementations; there are many implementations
463 | that cannot successfully read texts in other encodings (such as
464 | UTF-16 and UTF-32).
465 |
466 | Implementations MUST NOT add a byte order mark to the beginning of a
467 | JSON text. In the interests of interoperability, implementations
468 | that parse JSON texts MAY ignore the presence of a byte order mark
469 | rather than treating it as an error.
470 |
471 | 8.2. Unicode Characters
472 |
473 | When all the strings represented in a JSON text are composed entirely
474 | of Unicode characters [UNICODE] (however escaped), then that JSON
475 | text is interoperable in the sense that all software implementations
476 | that parse it will agree on the contents of names and of string
477 | values in objects and arrays.
478 |
479 | However, the ABNF in this specification allows member names and
480 | string values to contain bit sequences that cannot encode Unicode
481 | characters; for example, "\uDEAD" (a single unpaired UTF-16
482 | surrogate). Instances of this have been observed, for example, when
483 | a library truncates a UTF-16 string without checking whether the
484 | truncation split a surrogate pair. The behavior of software that
485 | receives JSON texts containing such values is unpredictable; for
486 | example, implementations might return different values for the length
487 | of a string value or even suffer fatal runtime exceptions.
488 |
489 | 8.3. String Comparison
490 |
491 | Software implementations are typically required to test names of
492 | object members for equality. Implementations that transform the
493 | textual representation into sequences of Unicode code units and then
494 | perform the comparison numerically, code unit by code unit, are
495 | interoperable in the sense that implementations will agree in all
496 | cases on equality or inequality of two strings. For example,
497 | implementations that compare strings with escaped characters
498 | unconverted may incorrectly find that "a\\b" and "a\u005Cb" are not
499 | equal.
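
The comparison rule can be illustrated with a small decoder. This is a simplified sketch, not a complete JSON string parser: it assumes the surrounding quotation marks are already removed, handles the two-character escapes and `\uXXXX` escapes, and does not combine surrogate pairs. The names `unescape` and `sameString` are hypothetical.

```haskell
import Data.Char (chr, isHexDigit)
import Numeric (readHex)

-- | Decode the escapes in the body of a JSON string.
-- Returns Nothing on a malformed escape.
unescape :: String -> Maybe String
unescape [] = Just []
unescape ('\\' : 'u' : rest)
  | length hex == 4, all isHexDigit hex
  , [(n, "")] <- readHex hex = (chr n :) <$> unescape rest'
  | otherwise = Nothing
  where
    (hex, rest') = splitAt 4 rest
unescape ('\\' : c : rest) = do
  x <- lookup c shortcuts
  (x :) <$> unescape rest
  where
    shortcuts =
      [ ('"', '"'), ('\\', '\\'), ('/', '/'), ('b', '\b')
      , ('f', '\f'), ('n', '\n'), ('r', '\r'), ('t', '\t') ]
unescape "\\" = Nothing  -- a lone trailing backslash is malformed
unescape (c : rest) = (c :) <$> unescape rest

-- | Compare two string bodies after decoding, as Section 8.3 suggests.
sameString :: String -> String -> Bool
sameString a b = case (unescape a, unescape b) of
  (Just x, Just y) -> x == y
  _                -> False
```

Comparing the raw, still-escaped bodies `a\\b` and `a\u005Cb` character by character reports them as different, while `sameString` decodes both to the same three characters first.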
500 |
510 |
511 | 9. Parsers
512 |
513 | A JSON parser transforms a JSON text into another representation. A
514 | JSON parser MUST accept all texts that conform to the JSON grammar.
515 | A JSON parser MAY accept non-JSON forms or extensions.
516 |
517 | An implementation may set limits on the size of texts that it
518 | accepts. An implementation may set limits on the maximum depth of
519 | nesting. An implementation may set limits on the range and precision
520 | of numbers. An implementation may set limits on the length and
521 | character contents of strings.
522 |
523 | 10. Generators
524 |
525 | A JSON generator produces JSON text. The resulting text MUST
526 | strictly conform to the JSON grammar.
527 |
528 | 11. IANA Considerations
529 |
530 | The MIME media type for JSON text is application/json.
531 |
532 | Type name: application
533 |
534 | Subtype name: json
535 |
536 | Required parameters: n/a
537 |
538 | Optional parameters: n/a
539 |
540 | Encoding considerations: binary
541 |
542 | Security considerations: See [RFC7159], Section 12.
543 |
544 | Interoperability considerations: Described in [RFC7159]
545 |
546 | Published specification: [RFC7159]
547 |
548 | Applications that use this media type:
549 | JSON has been used to exchange data between applications written
550 | in all of these programming languages: ActionScript, C, C#,
551 | Clojure, ColdFusion, Common Lisp, E, Erlang, Go, Java, JavaScript,
552 | Lua, Objective CAML, Perl, PHP, Python, Rebol, Ruby, Scala, and
553 | Scheme.
554 |
566 |
567 | Additional information:
568 | Magic number(s): n/a
569 | File extension(s): .json
570 | Macintosh file type code(s): TEXT
571 |
572 | Person & email address to contact for further information:
573 | IESG
574 |
575 |
576 | Intended usage: COMMON
577 |
578 | Restrictions on usage: none
579 |
580 | Author:
581 | Douglas Crockford
582 |
583 |
584 | Change controller:
585 | IESG
586 |
587 |
588 | Note: No "charset" parameter is defined for this registration.
589 | Adding one really has no effect on compliant recipients.
590 |
591 | 12. Security Considerations
592 |
593 | Generally, there are security issues with scripting languages. JSON
594 | is a subset of JavaScript but excludes assignment and invocation.
595 |
596 | Since JSON's syntax is borrowed from JavaScript, it is possible to
597 | use that language's "eval()" function to parse JSON texts. This
598 | generally constitutes an unacceptable security risk, since the text
599 | could contain executable code along with data declarations. The same
600 | consideration applies to the use of eval()-like functions in any
601 | other programming language in which JSON texts conform to that
602 | language's syntax.
603 |
622 |
623 | 13. Examples
624 |
625 | This is a JSON object:
626 |
627 | {
628 | "Image": {
629 | "Width": 800,
630 | "Height": 600,
631 | "Title": "View from 15th Floor",
632 | "Thumbnail": {
633 | "Url": "http://www.example.com/image/481989943",
634 | "Height": 125,
635 | "Width": 100
636 | },
637 | "Animated" : false,
638 | "IDs": [116, 943, 234, 38793]
639 | }
640 | }
641 |
642 | Its Image member is an object whose Thumbnail member is an object and
643 | whose IDs member is an array of numbers.
644 |
645 | This is a JSON array containing two objects:
646 |
647 | [
648 | {
649 | "precision": "zip",
650 | "Latitude": 37.7668,
651 | "Longitude": -122.3959,
652 | "Address": "",
653 | "City": "SAN FRANCISCO",
654 | "State": "CA",
655 | "Zip": "94107",
656 | "Country": "US"
657 | },
658 | {
659 | "precision": "zip",
660 | "Latitude": 37.371991,
661 | "Longitude": -122.026020,
662 | "Address": "",
663 | "City": "SUNNYVALE",
664 | "State": "CA",
665 | "Zip": "94085",
666 | "Country": "US"
667 | }
668 | ]
669 |
678 |
679 | Here are three small JSON texts containing only values:
680 |
681 | "Hello world!"
682 |
683 | 42
684 |
685 | true
686 |
687 | 14. Contributors
688 |
689 | RFC 4627 was written by Douglas Crockford. This document was
690 | constructed by making a relatively small number of changes to that
691 | document; thus, the vast majority of the text here is his.
692 |
693 | 15. References
694 |
695 | 15.1. Normative References
696 |
697 | [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE
698 | Standard 754, August 2008.
700 |
701 | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
702 | Requirement Levels", BCP 14, RFC 2119, March 1997.
703 |
704 | [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
705 | Specifications: ABNF", STD 68, RFC 5234, January 2008.
706 |
707 | [UNICODE] The Unicode Consortium, "The Unicode Standard".
709 |
710 | 15.2. Informative References
711 |
712 | [ECMA-262] Ecma International, "ECMAScript Language Specification
713 | Edition 5.1", Standard ECMA-262, June 2011.
716 |
717 | [ECMA-404] Ecma International, "The JSON Data Interchange Format",
718 | Standard ECMA-404, October 2013.
721 |
722 | [Err3607] RFC Errata, Errata ID 3607, RFC 3607.
724 |
734 |
735 | [Err607] RFC Errata, Errata ID 607, RFC 607.
737 |
738 | [RFC4627] Crockford, D., "The application/json Media Type for
739 | JavaScript Object Notation (JSON)", RFC 4627, July 2006.
740 |
790 |
791 | Appendix A. Changes from RFC 4627
792 |
793 | This section lists changes between this document and the text in RFC
794 | 4627.
795 |
796 | o Changed the title and abstract of the document.
797 |
798 | o Changed the reference to [UNICODE] to be not version specific.
799 |
800 | o Added a "Specifications of JSON" section.
801 |
802 | o Added an "Introduction to This Revision" section.
803 |
804 | o Changed the definition of "JSON text" so that it can be any JSON
805 | value, removing the constraint that it be an object or array.
806 |
807 | o Added language about duplicate object member names, member
808 | ordering, and interoperability.
809 |
810 | o Clarified the absence of a requirement that values in an array be
811 | of the same JSON type.
812 |
813 | o Applied erratum #607 from RFC 4627 to correctly align the artwork
814 | for the definition of "object".
815 |
816 | o Changed "as sequences of digits" to "in the grammar below" in the
817 | "Numbers" section, and made base-10-ness explicit.
818 |
819 | o Added language about number interoperability as a function of
820 | IEEE754, and added an IEEE754 reference.
821 |
822 | o Added language about interoperability and Unicode characters and
823 | about string comparisons. To do this, turned the old "Encoding"
824 | section into a "String and Character Issues" section, with three
825 | subsections: "Character Encoding", "Unicode Characters", and
826 | "String Comparison".
827 |
828 | o Changed guidance in the "Parsers" section to point out that
829 | implementations may set limits on the range "and precision" of
830 | numbers.
831 |
832 | o Updated and tidied the "IANA Considerations" section.
833 |
834 | o Made a real "Security Considerations" section and lifted the text
835 | out of the previous "IANA Considerations" section.
836 |
847 | o Applied erratum #3607 from RFC 4627 by removing the security
848 | consideration that begins "A JSON text can be safely passed" and
849 | the JavaScript code that went with that consideration.
850 |
851 | o Added a note to the "Security Considerations" section pointing out
852 | the risks of using the "eval()" function in JavaScript or any
853 | other language in which JSON texts conform to that language's
854 | syntax.
855 |
856 | o Added a note to the "IANA Considerations" clarifying the absence
857 | of a "charset" parameter for the application/json media type.
858 |
859 | o Changed "100" to 100 and added a boolean field, both in the first
860 | example.
861 |
862 | o Added examples of JSON texts with simple values, neither objects
863 | nor arrays.
864 |
865 | o Added a "Contributors" section crediting Douglas Crockford.
866 |
867 | o Added a reference to RFC 4627.
868 |
869 | o Moved the ECMAScript reference from Normative to Informative and
870 | updated it to reference ECMAScript 5.1, and added a reference to
871 | ECMA 404.
872 |
873 | Author's Address
874 |
875 | Tim Bray (editor)
876 | Google, Inc.
877 |
878 | EMail: tbray@textuality.com
879 |
--------------------------------------------------------------------------------
/son.ebnf:
--------------------------------------------------------------------------------
1 | value ::= object | array | string | number | boolean | "null"
2 |
3 | boolean ::= "true" | "false"
4 |
5 | object ::= "{" ( member ( "," member )* )? "}"
6 | member ::= string ":" value
7 |
8 | array ::= "[" ( value ( "," value )* )? "]"
9 |
10 | number ::= "-"? (positiveInteger fraction? | "0" fraction)
11 | | "0"
12 |
13 | fraction ::= "." digit* nonZeroDigit
14 | positiveInteger ::= nonZeroDigit digit*
15 |
16 | digit ::= [#x30 - #x39]
17 | nonZeroDigit ::= [#x31 - #x39]
18 |
19 | string ::= '"'
20 | ( unescaped
21 | | "\" (shortcutEscape | codepointEscape)
22 | )*
23 | '"'
24 | unescaped ::= ( [#x20 - #x21]
25 | | [#x23 - #x5B]
26 | | [#x5D - #x10FFFF]
27 | )
28 | shortcutEscape ::= '"'
29 | | "\"
30 | | "b"
31 | | "t"
32 | | "n"
33 | | "f"
34 | | "r"
35 | codepointEscape ::= "u00" ( "0" ([#x0 - #x7] | #xB | [#xE - #xF])
36 | | [#x10 - #x1F]
37 | )
38 |
--------------------------------------------------------------------------------