├── .gitignore
├── LICENSE
├── MANIFEST.in
├── README.md
├── setup.py
└── webzio
    └── __init__.py


/.gitignore:
--------------------------------------------------------------------------------
1 | /dist/webhoseio-0.2-py2-none-any.whl
2 | /dist/webhoseio-0.2.tar.gz
3 | /dist/webhoseio-0.4.tar.gz
4 | /PKG-INFO
5 | /PKG-INFO~
6 | /build/lib.linux-x86_64-2.7/webhoseio/__init__.py
7 | /dist/webhoseio-0.1-py2-none-any.whl
8 | /dist/webhoseio-0.1-py2.7.egg
9 | /dist/webhoseio-0.1.tar.gz
10 | /webhoseio.egg-info/PKG-INFO
11 | /webhoseio.egg-info/SOURCES.txt
12 | /webhoseio.egg-info/dependency_links.txt
13 | /webhoseio.egg-info/requires.txt
14 | /webhoseio.egg-info/top_level.txt
15 | *.pyc
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License
2 |
3 | Copyright (c) 2022 Webz.io (https://webz.io)
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining
6 | a copy of this software and associated documentation files (the
7 | "Software"), to deal in the Software without restriction, including
8 | without limitation the rights to use, copy, modify, merge, publish,
9 | distribute, sublicense, and/or sell copies of the Software, and to
10 | permit persons to whom the Software is furnished to do so, subject to
11 | the following conditions:
12 |
13 | The above copyright notice and this permission notice shall be
14 | included in all copies or substantial portions of the Software.
15 |
16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include *.md
2 | include LICENSE
3 | recursive-include webzio *.pyc
4 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | webz.io client for Python
2 | ============================
3 | A simple way to access the [Webz.io](https://webz.io) API from your Python code:
4 | ```python
5 |
6 | import webzio
7 |
8 | webzio.config(token=YOUR_API_KEY)
9 | output = webzio.query("filterWebContent", {"q":"github"})
10 | print(output['posts'][0]['text'])  # Print the text of the first post
11 | print(output['posts'][0]['published'])  # Print the publication date of the first post
12 |
13 | # Get the next batch of posts
14 | output = webzio.get_next()
15 | print(output['posts'][0]['thread']['site'])  # Print the site of the first post
16 |
17 |
18 | ```
19 |
20 | API Key
21 | -------
22 |
23 | To make use of the webz.io API, you need to obtain a token that is
24 | sent with every request. To obtain an API key, create an account at
25 | https://webz.io/auth/signup, and then go to
26 | https://webz.io/dashboard to see your token.
27 |
28 |
29 | Installing
30 | ----------
31 | You can install from source:
32 |
33 | ``` bash
34 |
35 | $ git clone https://github.com/webhose/webzio-python
36 | $ cd webzio-python
37 | $ python setup.py install
38 |
39 | ```
40 | Or use pip install:
41 |
42 | ``` bash
43 | $ sudo pip install webzio
44 | ```
45 |
46 | Use the API
47 | -----------
48 |
49 | To get started, import the library and set your access token.
50 | (Replace YOUR_API_KEY with your actual API key.)
51 |
52 | ```python
53 |
54 | >>> import webzio
55 | >>> webzio.config(token=YOUR_API_KEY)
56 | ```
57 |
58 | **API Endpoints**
59 |
60 | The first parameter the query() function accepts is the API endpoint string. Available endpoints:
61 | * filterWebContent - access to the news/blogs/forums/reviews API
62 |
63 | Now you can make a request and inspect the results:
64 |
65 | ```python
66 |
67 | >>> output = webzio.query("filterWebContent", {"q":"github"})
68 | >>> output['totalResults']
69 | 15565094
70 | >>> len(output['posts'])
71 | 100
72 | >>> output['posts'][0]['language']
73 | u'english'
74 | >>> output['posts'][0]['title']
75 | u'Putting quotes around dictionary keys in JS'
76 | ```
77 |
78 |
79 | The ``posts`` entry of the output is a list, so you can loop over it
80 | and get all the results of this batch (up to 100).
81 |
82 | ```python
83 |
84 | >>> total_words = 0
85 | >>> for post in output['posts']:
86 | ...     total_words += len(post['text'].split(" "))
87 | ...
88 | >>> print(total_words)
89 | 8822
90 | ```
91 | Full documentation
92 | ------------------
93 |
94 | * ``config(token)``
95 |
96 |   * token - your API key
97 |
98 | * ``query(end_point_str, param_dict)``
99 |
100 |   * end_point_str:
101 |     * filterWebContent - access to the news/blogs/forums/reviews API
102 |
103 |   * param_dict: A key-value dictionary. The most common key is the "q" parameter, which holds the Boolean filter query. [Read about the available filters](https://webz.io/documentation).
104 |
105 | * ``get_next()`` - a method to fetch the next page of results.
106 |
107 |
108 | Polling
109 | -------
110 |
111 | If you want to make repeated searches, performing an action whenever there are
112 | new results, use code like this:
113 |
114 | ``` python
115 | import time
116 | r = webzio.query("filterWebContent", {"q":"skyrim"})
117 | while True:
118 |     for post in r['posts']:
119 |         perform_action(post)
120 |     time.sleep(300)
121 |     r = webzio.get_next()
122 | ```
123 |
124 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from codecs import open
2 | from pathlib import Path
3 |
4 | try:
5 |     from setuptools import setup
6 | except ImportError:
7 |     from distutils.core import setup
8 |
9 | long_description = '''
10 | ============================
11 | webz.io client for Python
12 | ============================
13 | A simple way to access the `Webz.io <https://webz.io>`_ API from your Python code::
14 | .. code-block:: python
15 |
16 |     import webzio
17 |
18 |     webzio.config(token=YOUR_API_KEY)
19 |     output = webzio.query("filterWebContent", {"q":"github"})
20 |     print(output['posts'][0]['text'])  # Print the text of the first post
21 |     print(output['posts'][0]['published'])  # Print the publication date of the first post
22 |
23 |     # Get the next batch of posts
24 |     output = webzio.get_next()
25 |     print(output['posts'][0]['thread']['site'])  # Print the site of the first post
26 |
27 | API Key
28 | -------
29 |
30 | To make use of the webz.io API, you need to obtain a token that is
31 | sent with every request. To obtain an API key, create an account at
32 | https://webz.io/auth/signup, and then go to
33 | https://webz.io/dashboard to see your token.
34 |
35 |
36 | Installing
37 | ----------
38 | You can install from source:
39 |
40 | .. code-block:: bash
41 |
42 |     $ git clone https://github.com/webhose/webzio-python
43 |     $ cd webzio-python
44 |     $ python setup.py install
45 |
46 | Or use pip install:
47 |
48 | .. code-block:: bash
49 |
50 |     $ sudo pip install webzio
51 |
52 | Use the API
53 | -----------
54 |
55 | To get started, import the library and set your access token.
56 | (Replace YOUR_API_KEY with your actual API key.)
57 |
58 | .. code-block:: python
59 |
60 |     >>> import webzio
61 |     >>> webzio.config(token=YOUR_API_KEY)
62 |
63 | **API Endpoints**
64 |
65 | The first parameter the query() function accepts is the API endpoint string. Available endpoints:
66 | * filterWebContent - access to the news/blogs/forums/reviews API
67 |
68 | Now you can make a request and inspect the results:
69 |
70 | .. code-block:: python
71 |
72 |     >>> output = webzio.query("filterWebContent", {"q":"github"})
73 |     >>> output['totalResults']
74 |     15565094
75 |     >>> len(output['posts'])
76 |     100
77 |     >>> output['posts'][0]['language']
78 |     u'english'
79 |     >>> output['posts'][0]['title']
80 |     u'Putting quotes around dictionary keys in JS'
81 |
82 | The ``posts`` entry of the output is a list, so you can loop over it
83 | and get all the results of this batch (up to 100).
84 |
85 | .. code-block:: python
86 |
87 |     >>> total_words = 0
88 |     >>> for post in output['posts']:
89 |     ...     total_words += len(post['text'].split(" "))
90 |     ...
91 |     >>> print(total_words)
92 |     8822
93 |
94 | Full documentation
95 | ------------------
96 |
97 | * ``config(token)``
98 |
99 |   * token - your API key
100 |
101 | * ``query(end_point_str, param_dict)``
102 |
103 |   * end_point_str:
104 |     * filterWebContent - access to the news/blogs/forums/reviews API
105 |
106 |   * param_dict: A key-value dictionary. The most common key is the "q" parameter, which holds the Boolean filter query. `Read about the available filters <https://webz.io/documentation>`_.
107 |
108 | * ``get_next()`` - a method to fetch the next page of results.
109 |
110 |
111 | Polling
112 | -------
113 |
114 | If you want to make repeated searches, performing an action whenever there are
115 | new results, use code like this:
116 |
117 | .. code-block:: python
118 |
119 |     r = webzio.query("filterWebContent", {"q":"skyrim"})
120 |     while True:
121 |         for post in r['posts']:
122 |             perform_action(post)
123 |         time.sleep(300)
124 |         r = webzio.get_next()
125 | '''
126 |
127 | setup(
128 |     name='webzio',
129 |     packages=['webzio'],
130 |     version='1.0.2',
131 |     author='Ran Geva',
132 |     author_email='ran@webz.io',
133 |     url='https://github.com/webhose/webzio-python',
134 |     license='MIT',
135 |     description='Simple client library for the webz.io REST API',
136 |     long_description=long_description,
137 |     install_requires=[
138 |         "requests >= 2.0.0"
139 |     ],
140 |     classifiers=[
141 |         'Development Status :: 4 - Beta',
142 |         'Intended Audience :: Developers',
143 |         'Natural Language :: English',
144 |         'License :: OSI Approved :: MIT License',
145 |         'Programming Language :: Python',
146 |         'Programming Language :: Python :: 2.6',
147 |         'Programming Language :: Python :: 2.7',
148 |         'Programming Language :: Python :: 3',
149 |         'Programming Language :: Python :: 3.3',
150 |         'Programming Language :: Python :: 3.4',
151 |         'Programming Language :: Python :: 3.5',
152 |         'Programming Language :: Python :: 3.6'
153 |     ]
154 | )
155 |
--------------------------------------------------------------------------------
/webzio/__init__.py:
--------------------------------------------------------------------------------
1 | import requests
2 |
3 |
4 | class Session(object):
5 |
6 |     def __init__(self, token=None):
7 |         self.next_call = None  # value of 'next' from the last response
8 |         self.session = requests.Session()
9 |         self.token = token
10 |
11 |     def query(self, end_point_str, param_dict=None):
12 |
13 |         if param_dict is not None:
14 |             param_dict.update({"token": self.token})  # updates the caller's dict in place
15 |             param_dict.update({"format": "json"})
16 |
17 |         response = self.session.get("https://api.webz.io/" + end_point_str, params=param_dict)
18 |         if response.status_code != 200:
19 |             raise Exception(response.text)
20 |
21 |         _output = response.json()
22 |         self.next_call = _output['next']
23 |         return _output
24 |
25 |     def get_next(self):
26 |         return self.query(self.next_call[1:])  # drop the first character of 'next'
27 |
28 |
29 | __session = Session()  # shared session used by the module-level helpers
30 |
31 |
32 | def config(token):
33 |     __session.token = token
34 |
35 |
36 | def query(end_point_str, param_dict=None):
37 |     return __session.query(end_point_str, param_dict)
38 |
39 |
40 | def get_next():
41 |     return __session.get_next()
42 |
43 |
44 |
--------------------------------------------------------------------------------
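A pagination helper built on the `query()` / `get_next()` pair might look like the following. This is a minimal sketch and not part of the package: `iter_posts`, `first_batch`, `fetch_next`, and `max_batches` are hypothetical names, and the only thing assumed about a batch is that it is a dict with a `posts` list, as in the README examples. Treating an empty batch as the end of the results is an assumption, not documented API behaviour.

```python
def iter_posts(first_batch, fetch_next, max_batches=3):
    """Yield posts from first_batch and from up to max_batches - 1 later batches.

    first_batch -- dict shaped like the return value of query()
    fetch_next  -- zero-argument callable returning the next batch,
                   for example webzio.get_next
    """
    batch = first_batch
    for batch_number in range(max_batches):
        if batch_number > 0:
            batch = fetch_next()  # only fetch when another batch is wanted
        posts = batch.get("posts", [])
        if not posts:
            return  # assumption: an empty batch means nothing is left
        for post in posts:
            yield post


# Offline demonstration: fake batches stand in for API responses.
fake_pages = iter([
    {"posts": [{"title": "second"}]},
    {"posts": []},
])
first = {"posts": [{"title": "first"}]}
titles = [post["title"] for post in iter_posts(first, lambda: next(fake_pages))]
print(titles)
```

With the real client, the same helper would be called as `iter_posts(webzio.query("filterWebContent", {"q": "github"}), webzio.get_next)`. Capping the number of batches keeps a broad query from paging indefinitely.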