├── .gitignore
├── .travis.yml
├── LICENSE
├── MANIFEST.in
├── Makefile
├── README.md
├── jsonlike
│   ├── __init__.py
│   └── api.py
├── requirements.txt
├── setup.py
└── tests
    ├── __init__.py
    └── jsonlike_test.py
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | *.py~
10 |
11 | # Distribution / packaging
12 | .Python
13 | env/
14 | build/
15 | develop-eggs/
16 | dist/
17 | downloads/
18 | eggs/
19 | .eggs/
20 | lib/
21 | lib64/
22 | parts/
23 | sdist/
24 | var/
25 | *.egg-info/
26 | .installed.cfg
27 | *.egg
28 |
29 | # PyInstaller
30 | # Usually these files are written by a python script from a template
31 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
32 | *.manifest
33 | *.spec
34 |
35 | # Installer logs
36 | pip-log.txt
37 | pip-delete-this-directory.txt
38 |
39 | # Unit test / coverage reports
40 | htmlcov/
41 | .tox/
42 | .coverage
43 | .coverage.*
44 | .cache
45 | nosetests.xml
46 | coverage.xml
47 | *,cover
48 | .hypothesis/
49 |
50 | # Translations
51 | *.mo
52 | *.pot
53 |
54 | # Django stuff:
55 | *.log
56 |
57 | # Sphinx documentation
58 | docs/_build/
59 |
60 | # PyBuilder
61 | target/
62 |
63 | #Ipython Notebook
64 | .ipynb_checkpoints
65 |
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
1 | language: python
2 | python:
3 | - "2.7"
4 | # - "3.2"
5 | - "3.3"
6 | - "3.4"
7 |
8 | script:
9 | - make test
10 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 |
3 | Copyright (c) 2016 Shaun Viguerie
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include README.md LICENSE
2 | recursive-include tests *
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | SHELL := /bin/bash
2 |
3 | init:
4 | 	@pip install -r requirements.txt
5 |
6 | test:
7 | 	@nosetests ./tests/
8 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # jsonlike [Build Status](https://travis-ci.org/shaunvxc/jsonlike) [PyPI version](https://badge.fury.io/py/jsonlike)
2 | ### Why?
3 | Sometimes, especially when working with `JSON` data from the web, you will find that the data is not quite valid JSON, and you have to fight with it before `json.loads()` will accept it.
4 |
5 | ### Goal
6 | The goal of this package is to **try** to provide the same functionality as `json.loads()` for data that **looks** like JSON but doesn't play nicely with `json.loads()` or other common solutions.
7 |
8 | In its current state, it simply applies a few heuristics that solve some of the common cases I've run into while working with not-quite-`json` structured data. Over time, I'd like to see it turn into something a bit more robust.
9 |
10 | ### Usage
11 | ```python
12 | import jsonlike
13 | jsonlike.loads(invalid_json_string)
14 | ```
15 |
16 | Currently, `jsonlike.loads` will:
17 | * strip out bad escape characters
18 | * strip out HTML content within JSON values
19 | * add missing commas
20 | * correct errors due to nested `"`'s
21 |
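As a rough illustration of two of the heuristics listed above (HTML stripping and nested-quote repair), here is a minimal standard-library sketch. The regular expressions follow the ones in `jsonlike/api.py`, but `lenient_loads` is a hypothetical helper written for this example, not part of the package, and it leaves out the `demjson` and YAML fallbacks.

```python
import json
import re


def remove_html(text):
    # Drop anything that looks like an HTML tag, such as <b> or </b>.
    return re.sub(r'<[^<]+?>', '', text)


def remove_bad_double_quotes(text):
    # Collapse `""value""` into `"value"`: opening pair first, then closing pair.
    opened = re.sub(r'""([^,])', lambda m: '"' + m.group(1), text)
    return re.sub(r'([^:])""', lambda m: m.group(1) + '"', opened)


def lenient_loads(text):
    # Hypothetical helper: try strict parsing first, then retry on cleaned text.
    try:
        return json.loads(text)
    except ValueError:
        return json.loads(remove_bad_double_quotes(remove_html(text)))


print(lenient_loads('{"a": 1, "c": ""<b>hey</b>""}'))
```

Input that is already valid JSON goes through the strict path untouched; the cleanup only runs after `json.loads` has rejected the string.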
22 | ##### Strip response callback wrappers
23 | ```python
24 | import jsonlike
25 | jsonlike.unwrap_and_load('callback({"a": ""hello""})')  # yields {"a": "hello"}
26 | ```
27 | For JSON surrounded by a callback wrapper, `unwrap_and_load` uses the `unwrapper` library to strip away the callback, then returns the result of `loads()` on the remaining content.
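
To make the unwrapping step concrete, the sketch below strips a JSONP-style wrapper with a plain regular expression. It is an assumption-laden stand-in: `strip_callback` and `_WRAPPER` are hypothetical names, and this is not how the `unwrapper` library is implemented.

```python
import json
import re

# Matches `name( ... )` with an optional trailing semicolon; the payload is captured.
_WRAPPER = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def strip_callback(text):
    # Return the payload inside a callback wrapper, or the text unchanged
    # when no wrapper is found.
    match = _WRAPPER.match(text)
    return match.group(1) if match else text


payload = strip_callback('callback({"a": "hello"})')
print(json.loads(payload))
```

Text without a wrapper is returned as-is, so the helper can be applied before parsing without checking the input first.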
28 |
29 | ### Installation
30 | `$ pip install jsonlike`
31 |
32 | ## Contributing
33 | 1. Fork it ( https://github.com/shaunvxc/jsonlike/fork )
34 | 1. Create your feature branch (`git checkout -b new-feature`)
35 | 1. Commit your changes (`git commit -am 'Add some feature'`)
36 | 1. Run the tests (`make test`)
37 | 1. Push change to the branch (`git push origin new-feature`)
38 | 1. Create a Pull Request
39 |
40 |
--------------------------------------------------------------------------------
/jsonlike/__init__.py:
--------------------------------------------------------------------------------
1 | from .api import loads, unwrap_and_load
2 |
--------------------------------------------------------------------------------
/jsonlike/api.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | from __future__ import unicode_literals
3 |
4 | import json
5 | import demjson
6 | import yaml
7 | import re
8 | import unwrapper
9 |
10 |
11 | def unwrap_and_load(content):
12 |     cleaned = clean_json(unwrapper.unwrap_raw(content))
13 |     return loads(cleaned)
14 |
15 |
16 | def loads(content, try_yaml=False):
17 |     try:
18 |         return json.loads(content)
19 |     except Exception:
20 |         cleaned = clean_json(content)
21 |         try:
22 |             # retry on the cleaned string (HTML and invalid escapes stripped)
23 |             return json.loads(cleaned)
24 |         except Exception:
25 |             # try using demjson to decode a non-strict json string
26 |             try:
27 |                 return demjson.decode(cleaned)
28 |             except Exception:
29 |                 if try_yaml:
30 |                     # YAML is a superset of JSON, so try it last; safe_load avoids building arbitrary Python objects
31 |                     return yaml.safe_load(cleaned)
32 |                 raise
33 |
34 | def clean_json(content):
35 |     return remove_html(remove_bad_double_quotes(remove_invalid_escapes(add_missing_commas(content))))
36 |
37 |
38 | def process_repl(match):
39 |     return match.group(2)
40 |
41 |
42 | def remove_bad_blocks(block_pairs, content):
43 |     x = content
44 |     for pair in block_pairs:
45 |         x = remove_bad_block(pair[0], pair[1], x)
46 |
47 |     return x
48 |
49 |
50 | def remove_bad_block(key_name_to_rem, key_to_stop_rem_at, content):
51 |     if '"{}"'.format(key_name_to_rem) in content:
52 |         x = re.sub('(\"{}\":.{{1,}})(\"{}\")'.format(key_name_to_rem, key_to_stop_rem_at), process_repl, content, flags=re.DOTALL)
53 |         return x
54 |
55 |     return content
56 |
57 |
58 | def sub_first(match):
59 |     return match.group(1) + '"'
60 |
61 |
62 | def sub_last(match):
63 |     return '"' + match.group(1)
64 |
65 |
66 | def remove_bad_double_quotes(content):
67 |     # JSON requires string values to be surrounded by double quotes, i.e. `{"foo": "bar"}`. This handles
68 |     # cases where the JSON looks like `{"foo":""bar""}`, which json.loads() rejects.
69 |     return re.sub(r'([^\:])\"\"', sub_first, re.sub(r'\"\"([^,])', sub_last, content))
70 |
71 |
72 | def remove_html(content):
73 |     return re.sub('<[^<]+?>', '', content)
74 |
75 |
76 | def remove_invalid_escapes(content):
77 |     return content.replace("\\", "")
78 |
79 |
80 | def add_missing_commas(content):
81 |     return content.replace('"}', '",}')
82 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | demjson==2.2.4
2 | unwrapper==1.0.0
3 | nose
4 | sure
5 | pyyaml
6 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | import os
4 | from setuptools import setup, find_packages
5 |
6 | here = os.path.abspath(os.path.dirname(__file__))
7 |
8 | required = [
9 |     'demjson', 'unwrapper', 'pyyaml',
10 | ]
11 |
12 | setup(
13 |     name='jsonlike',
14 |     version='0.0.2',
15 |     packages=['jsonlike'],
16 |     url='https://github.com/shaunvxc/jsonlike',
17 |     license='MIT',
18 |     author='Shaun Viguerie',
19 |     author_email='shaunvig114@gmail.com',
20 |     description='repair and parse invalid but jsonlike content',
21 |     install_requires=required,
22 |     classifiers=[
23 |         # How mature is this project? Common values are
24 |         # 3 - Alpha
25 |         # 4 - Beta
26 |         # 5 - Production/Stable
27 |         'Development Status :: 3 - Alpha',
28 |
29 |         # Indicate who your project is intended for
30 |         'Intended Audience :: Developers',
31 |         'Intended Audience :: System Administrators',
32 |         'Intended Audience :: End Users/Desktop',
33 |         'Topic :: System :: Shells',
34 |         'Topic :: System :: System Shells',
35 |
36 |         # Pick your license as you wish (should match "license" above)
37 |         'License :: OSI Approved :: MIT License',
38 |
39 |         # Specify the Python versions you support here. In particular, ensure
40 |         # that you indicate whether you support Python 2, Python 3 or both.
41 |         'Programming Language :: Python :: 2.7',
42 |         'Programming Language :: Python :: 3.3',
43 |         'Programming Language :: Python :: 3.4',
44 |         'Programming Language :: Python :: 3.5',
45 |     ]
46 | )
47 |
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shaunvxc/jsonlike/7d933a46da348853b26f9717a875494bb964f642/tests/__init__.py
--------------------------------------------------------------------------------
/tests/jsonlike_test.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | from __future__ import unicode_literals
3 |
4 | import sure
5 | import json
6 |
7 | from jsonlike import unwrap_and_load, loads
8 |
9 | def test_loads_w_bad_double_quotes():
10 |     loads('{"a":1, "b": 2, "c":""shaun""}').should.equal(json.loads('{"a":1, "b": 2, "c": "shaun"}'))
11 |
12 |
13 | def test_unwrap_and_loads():
14 |     unwrap_and_load('json13123({"a":1, "b": 2, "c":""shaun""})').should.equal(json.loads('{"a":1, "b": 2, "c": "shaun"}'))
15 |
16 |
17 | def test_loads_w_html():
18 |     loads('{"a":1, "b": 2, "c": "hey"}').should.equal(json.loads('{"a":1, "b": 2, "c": "hey"}'))
19 |
20 |
21 | def test_loads_w_html2():
22 |     loads('{"a":1, "b": 2, "c": "hey"}').should.equal(json.loads('{"a":1, "b": 2, "c": "hey"}'))
23 |
--------------------------------------------------------------------------------