├── .gitignore
├── .travis.yml
├── LICENSE
├── MANIFEST.in
├── Makefile
├── README.md
├── jsonlike
    ├── __init__.py
    └── api.py
├── requirements.txt
├── setup.py
└── tests
    ├── __init__.py
    └── jsonlike_test.py

/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

*.py~

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

#Ipython Notebook
.ipynb_checkpoints

--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
language: python
python:
  - "2.7"
  # - "3.2"
  - "3.3"
  - "3.4"

script:
  - make test

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2016 Shaun Viguerie

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
include README.md LICENSE
recursive-include tests *
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
SHELL := /bin/bash

init:
	@pip install -r requirements.txt

test:
	@nosetests ./tests/

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# jsonlike [![Build Status](https://travis-ci.org/shaunvxc/jsonlike.svg?branch=master)](https://travis-ci.org/shaunvxc/jsonlike) [![PyPI version](https://badge.fury.io/py/jsonlike.svg)](https://badge.fury.io/py/jsonlike)
### Why?
Sometimes, especially when working with `JSON` data from the web, you will find that the data format is not quite JSON, and you have to do a bit of fighting with it before you can successfully call `json.loads()`.

### Goal
The goal of this package is to **try** to provide the same functionality as `json.loads()` for data that **looks** like JSON but doesn't play nicely with `json.loads()` or other common solutions.

In its current state, it simply applies some heuristics that solve some of the common cases I've run into while working with not-quite-`json` structured data. Over time, I'd like to see it turn into something a bit more robust.

### Usage
```python
import jsonlike
jsonlike.loads(invalid_json_string)
```

Currently, `jsonlike.loads` will
* strip out bad escape characters
* strip out HTML content within JSON values
* add missing commas
* correct errors due to nested `"`'s

##### Strip response callback wrappers
```python
import jsonlike
jsonlike.unwrap_and_load('callback({"a": ""hello""})')  # yields {"a": "hello"}
```
For JSON surrounded by a callback wrapper, calling `unwrap_and_load` will use the `unwrapper` library to strip away the callback before returning `loads()` on the remaining content.

### Installation
`$ pip install jsonlike`

## Contributing
1. Fork it ( https://github.com/shaunvxc/jsonlike/fork )
1. Create your feature branch (`git checkout -b new-feature`)
1. Commit your changes (`git commit -am 'Add some feature'`)
1. Run the tests (`make test`)
1. Push change to the branch (`git push origin new-feature`)
1. Create a Pull Request


--------------------------------------------------------------------------------
/jsonlike/__init__.py:
--------------------------------------------------------------------------------
from .api import loads, unwrap_and_load

--------------------------------------------------------------------------------
/jsonlike/api.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import json
import demjson
import yaml
import re
import unwrapper


def unwrap_and_load(content):
    cleaned = clean_json(unwrapper.unwrap_raw(content))
    return loads(cleaned)


def loads(content, try_yaml=False):
    try:
        # valid JSON needs no repair, so return the parsed result directly
        return json.loads(content)
    except Exception:
        cleaned = clean_json(content)
        try:
            # strip out HTML content and unescaped chars
            return json.loads(cleaned)
        except Exception:
            # try using demjson to decode a non-strict json string
            try:
                return demjson.decode(cleaned)
            except Exception:
                if try_yaml:
                    # try loading as yaml-- yaml is a superset of json. safe_load
                    # avoids constructing arbitrary Python objects from the input
                    return yaml.safe_load(cleaned)
                raise


def clean_json(content):
    return remove_html(remove_bad_double_quotes(remove_invalid_escapes(add_missing_commas(content))))


def process_repl(match):
    return match.group(2)


def remove_bad_blocks(block_pairs, content):
    x = content
    for pair in block_pairs:
        x = remove_bad_block(pair[0], pair[1], x)

    return x


def remove_bad_block(key_name_to_rem, key_to_stop_rem_at, content):
    if '"{}"'.format(key_name_to_rem) in content:
        x = re.sub(r'("{}":.{{1,}})("{}")'.format(key_name_to_rem, key_to_stop_rem_at), process_repl, content, flags=re.DOTALL)
        return x

    return content


def sub_first(match):
    return match.group(1) + '"'


def sub_last(match):
    return '"' + match.group(1)


def remove_bad_double_quotes(content):
    # JSON requires values to be surrounded in " 's, ie `{"foo": "bar"}`. This handles
    # cases where the JSON is like `{"foo":""bar""}`--- json.loads() won't like this case !
    return re.sub(r'([^:])""', sub_first, re.sub(r'""([^,])', sub_last, content))


def remove_html(content):
    return re.sub('<[^<]+?>', '', content)


def remove_invalid_escapes(content):
    return content.replace("\\", "")


def add_missing_commas(content):
    return content.replace('"}', '",}')

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
demjson==2.2.4
unwrapper==1.0.0
nose
sure
pyyaml

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python

import os
from setuptools import setup, find_packages

here = os.path.abspath(os.path.dirname(__file__))

required = [
    'future'
]

setup(
    name='jsonlike',
    version='0.0.2',
    packages=['jsonlike'],
    url='https://github.com/shaunvxc/jsonlike',
    license='MIT',
    author='Shaun Viguerie',
    author_email='shaunvig114@gmail.com',
    description='repair and parse invalid but jsonlike content',
    install_requires=required,
    classifiers=[
        # How mature is this project? Common values are
        #   3 - Alpha
        #   4 - Beta
        #   5 - Production/Stable
        'Development Status :: 3 - Alpha',

        # Indicate who your project is intended for
        'Intended Audience :: Developers',
        'Intended Audience :: System Administrators',
        'Intended Audience :: End Users/Desktop',
        'Topic :: System :: Shells',
        'Topic :: System :: System Shells',

        # Pick your license as you wish (should match "license" above)
        'License :: OSI Approved :: MIT License',

        # Specify the Python versions you support here. In particular, ensure
        # that you indicate whether you support Python 2, Python 3 or both.
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3.3',
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
    ]
)

--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shaunvxc/jsonlike/7d933a46da348853b26f9717a875494bb964f642/tests/__init__.py
--------------------------------------------------------------------------------
/tests/jsonlike_test.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import sure
import json

from jsonlike import unwrap_and_load, loads

def test_loads_w_bad_double_quotes():
    loads('{"a":1, "b": 2, "c":""shaun""}').should.equal(json.loads('{"a":1, "b": 2, "c": "shaun"}'))


def test_unwrap_and_loads():
    unwrap_and_load('json13123({"a":1, "b": 2, "c":""shaun""})').should.equal(json.loads('{"a":1, "b": 2, "c": "shaun"}'))


def test_loads_w_html():
    loads('{"a":1, "b": 2, "c": "hey"}').should.equal(json.loads('{"a":1, "b": 2, "c": "hey"}'))


def test_loads_w_html2():
    loads('{"a":1, "b": 2, "c": "hey"}').should.equal(json.loads('{"a":1, "b": 2, "c": "hey"}'))

--------------------------------------------------------------------------------
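The doubled-quote repair in `jsonlike/api.py` is easier to follow when traced on the input used by the first test. The sketch below is a standard-library-only illustration, not the package's code: it reuses the same two substitutions, shows that the cleaned string still carries a trailing comma that strict `json.loads` rejects (which is presumably why `loads` falls back to `demjson.decode`), and uses a hypothetical `strip_trailing_commas` helper as a rough stand-in for that lenient decoder.

```python
import json
import re


def add_missing_commas(content):
    # Same substitution as api.py: insert a comma between a quote and a
    # closing brace, so a doubled closing quote ends up followed by ','.
    return content.replace('"}', '",}')


def remove_bad_double_quotes(content):
    # First collapse a doubled opening quote (not followed by a comma),
    # then collapse a doubled closing quote (not preceded by a colon).
    content = re.sub(r'""([^,])', lambda m: '"' + m.group(1), content)
    return re.sub(r'([^:])""', lambda m: m.group(1) + '"', content)


def strip_trailing_commas(content):
    # Hypothetical helper, NOT part of jsonlike: drops a comma that sits
    # directly before '}' or ']'. Naive: it does not skip string contents.
    return re.sub(r',\s*([}\]])', r'\1', content)


raw = '{"a":1, "b": 2, "c":""shaun""}'
cleaned = remove_bad_double_quotes(add_missing_commas(raw))
print(cleaned)

# Strict JSON does not allow the trailing comma left by add_missing_commas.
try:
    json.loads(cleaned)
    strict_ok = True
except ValueError:
    strict_ok = False
print(strict_ok)

parsed = json.loads(strip_trailing_commas(cleaned))
print(parsed)
```

The comma inserted by `add_missing_commas` is what lets the first regular expression skip the closing `""`, so the two passes each remove exactly one surplus quote.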