├── .gitignore ├── LICENSE.txt ├── MANIFEST.in ├── Makefile ├── README.md ├── pydatastream ├── __init__.py ├── _version.py └── pydatastream.py ├── pytest.ini ├── setup.cfg ├── setup.py └── tests ├── test_basic.py └── test_fetch.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.ipynb 2 | *.pyc 3 | .pytest_cache/* 4 | .pytest_cache 5 | runtest.sh 6 | *.egg-info 7 | tasks.todo 8 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2013-2020 Vladimir Filimonov 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include README.md 2 | include LICENSE.txt 3 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | 2 | .PHONY: help release build clean 3 | 4 | LIBRARY := pydatastream 5 | 6 | # taken from: https://marmelab.com/blog/2016/02/29/auto-documented-makefile.html 7 | .DEFAULT_GOAL := help 8 | help: 9 | @grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' 10 | 11 | 12 | build: ## Make distribution package 13 | python setup.py sdist 14 | 15 | 16 | release: ## Make package; upload to pypi and clean after` 17 | @echo Do you really want to release the package to PyPi? 18 | @echo "Type library name for release ($(LIBRARY)):" 19 | @read -p"> " line; if [[ $$line != "$(LIBRARY)" ]]; then echo "Aborting. We'll release that later"; exit 1 ; fi 20 | @echo "Starting the release..." 21 | python setup.py sdist 22 | twine upload dist/* 23 | rm MANIFEST 24 | rm -rf dist 25 | 26 | 27 | clean: ## Erase distribution package 28 | rm MANIFEST 29 | rm -rf dist 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # PyDatastream 2 | 3 | PyDatastream is a Python interface to the Refinitiv Datastream (former Thomson Reuters Datastream) API via Datastream Web Services (DSWS) (non free), with some extra convenience functions. 
This package requires valid credentials for this API.
4 | 
5 | **Note**: Up until version 0.5.1 the library used the SOAP API of DataWorks Enterprise (DWE). As of July 1, 2019 this interface was discontinued by Thomson Reuters, and Datastream content is now delivered through Datastream Web Services (DSWS). Starting with version 0.6, the pydatastream library uses the REST API of DSWS. Earlier versions of pydatastream no longer work.
6 | 
7 | ## Installation
8 | 
9 | The latest version of PyDatastream is always available on [GitHub](https://github.com/vfilimonov/pydatastream) in the `master` branch. The latest release can also be installed from [PyPI](https://pypi.python.org/pypi/PyDatastream) using `pip`:
10 | ```
11 | pip install pydatastream
12 | ```
13 | 
14 | The two external dependencies are [pandas](http://pandas.pydata.org/pandas-docs/stable/install.html) and [requests](https://2.python-requests.org/en/master/).
15 | 
16 | ## Basic use
17 | 
18 | All methods for working with Datastream are organized in a class, so first you need to create an object with your valid credentials:
19 | ```python
20 | from pydatastream import Datastream
21 | DS = Datastream(username="ZXXX000", password="XXX000")
22 | ```
23 | 
24 | If necessary, a proxy server can be specified here via the extra `proxy` parameter (e.g. `proxy='proxyLocation:portNumber'`).
25 | 
26 | The following command requests daily closing price data for the Apple asset (Datastream mnemonic `"@AAPL"`) in 2008:
27 | ```python
28 | data = DS.get_price('@AAPL', date_from='2008', date_to='2009')
29 | ```
30 | 
31 | The data is retrieved as a `pandas.DataFrame` object, which can be easily plotted (`data.plot()`), aggregated into monthly data (`data.resample('M').last()`) or [manipulated in a variety of ways](http://nbviewer.ipython.org/urls/raw.github.com/changhiskhan/talks/master/pydata2012/pandas_timeseries.ipynb). Since resampling data in pandas is straightforward (including, for example, handling of business calendars), I would recommend requesting daily data (unless the requests are huge or the daily scale is not applicable) and performing all transformations locally. Also note that, thanks to pandas, the format of the date string is very flexible.
32 | 
33 | 
34 | For fetching Open-High-Low-Close (OHLC) data there exist two methods: `get_OHLC` to fetch only price data and `get_OHLCV` to fetch both price and volume data. This separation is required because volume data is not available for financial indices.
35 | 
36 | Request daily OHLC and Volume data for Apple in 2008:
37 | ```python
38 | data = DS.get_OHLCV('@AAPL', date_from='2008', date_to='2009')
39 | ```
40 | 
41 | Request daily OHLC data for the S&P 500 Index from May 6, 2010 until the present date:
42 | ```python
43 | data = DS.get_OHLC('S&PCOMP', date_from='May 6, 2010')
44 | ```
45 | 
46 | ### Requesting specific fields for data
47 | 
48 | If mnemonics for specific fields are known, a more general method can be used. The following request
49 | ```python
50 | data = DS.fetch('@AAPL', ['P', 'MV', 'VO'], date_from='2000-01-01')
51 | ```
52 | fetches the closing price, market valuation and daily volume for Apple Inc.
53 | 
54 | **Note**: several fields should be passed as a list of strings, and not as a comma-separated string (i.e. `'P,MV,VO'` as a field name will raise an exception).
55 | 
56 | If called without any mnemonics for the fields (e.g. `DS.fetch('@AAPL')`) it will fetch one series for the "default" field, which is usually `P`.
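For instance, the following call (a sketch relying on the default-field behaviour described above) returns just the default series for Apple:
```python
data = DS.fetch('@AAPL', date_from='2008', date_to='2009')   # no fields: the default field (usually P) is fetched
```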
In this case the single column might not have a name due to an inconsistency in the API.
57 | 
58 | `fetch` can be used for requesting data for several tickers at once. In this case a MultiIndex DataFrame will be returned (see [Pandas: MultiIndex / Advanced Indexing](https://pandas.pydata.org/pandas-docs/stable/advanced.html)).
59 | ```python
60 | res = DS.fetch(['@AAPL','U:MMM'], fields=['P','MV','VO','PH'], date_from='2000-05-03')
61 | ```
62 | 
63 | The resulting data frame can be sliced in order to select all fields for a given ticker (`res.loc['U:MMM']`) or data for a specific field for all tickers (`res['MV'].unstack(level=0)`).
64 | 
65 | **Note**: in order to force the resulting dataframe to have a MultiIndex even when a single ticker is requested, set the argument `always_multiindex=True`:
66 | ```python
67 | res = DS.fetch('@AAPL', fields=['P','MV','VO','PH'], date_from='2000-05-03', always_multiindex=True)
68 | ```
69 | 
70 | Starting and ending dates can be provided as strings or `pandas.Timestamp` objects. One useful option is to fetch the data using `BDATE` as a starting date: in this case the whole series will be fetched from the first available date. For several series, the earliest starting date among them will be used:
71 | ```python
72 | res = DS.fetch(['@AAPL','U:MMM'], 'P', date_from='BDATE')
73 | ```
74 | **Note**: In some cases, especially for many mis-aligned series requested with `BDATE`, the Datastream server returns a mis-formatted response (series for individual tickers have different lengths and do not match the common date index), which cannot be parsed. In this case it is suggested either to use an explicit `date_from` or to split the request into several smaller ones.
75 | 
76 | 
77 | #### Note 1: Default field
78 | 
79 | **Important**: The "default field" is the most likely reason for the "E100, INVALID CODE OR EXPRESSION ENTERED" error message when fetching multiple symbols at once, even if they can be fetched one by one. This has been observed so far in [exchange rates](https://github.com/vfilimonov/pydatastream/issues/14) and [economic](https://github.com/vfilimonov/pydatastream/issues/16) [series](https://github.com/vfilimonov/pydatastream/issues/11).
80 | 
81 | There is a slight ambiguity in what "P" stands for.
82 | For cash equities there is a datatype "P" which corresponds to the adjusted price.
83 | However, in an API request "P" also stands for the default field, and if no fields (datatypes) are supplied, the API will assume that the default field "P" is requested.
84 | 
85 | It looks like when several symbols are requested, the API treats `"P"` (even if it is only implied, i.e. no datatypes are specified) strictly as a datatype. So if the datatype "P" does not exist (e.g. for exchange rates such as "EUDOLLR", or equity indices such as "S&PCOMP"), the request results in an error, e.g. `$$"ER", E100, INVALID CODE OR EXPRESSION ENTERED, USDOLLR(P)`.
86 | 
87 | So in order to retrieve several symbols at once, proper datatypes should be specified. For example:
88 | 
89 | ```python
90 | res = DS.fetch(['EUDOLLR','USDOLLR'], fields=['EB','EO'])
91 | ```
92 | 
93 | #### Note 2: error catching
94 | 
95 | As will be discussed below, it may be convenient to set the property `raise_on_error` to `False` when fetching data for several tickers. In this case, if one or several tickers are misspecified, no error will be raised, but they will not be present in the output data frame.
Further, if some field does not exist for some of the tickers, but exists for others, the missing data will be replaced with NaNs:
96 | ```python
97 | DS.raise_on_error = False
98 | res = DS.fetch(['@AAPL','U:MMM','xxxxxx','S&PCOMP'], fields=['P','MV','VO','PH'], date_from='2000-05-03')
99 | 
100 | print(res['MV'].unstack(level=0).head())
101 | print(res['P'].unstack(level=0).head())
102 | ```
103 | 
104 | Please note that in the last example the closing price (`P`) for the `S&PCOMP` ticker was not retrieved. In Thomson Reuters Datastream mnemonics, the field `P` is not available for indices and the field `PI` should be used instead. For the same reason, calling the method `get_OHLC()` to get the open, high, low and close values of an index would not work. These can instead be fetched with a direct call to `fetch`:
105 | ```python
106 | data = DS.fetch('S&PCOMP', ['PO', 'PH', 'PL', 'PI'], date_from='May 6, 2010')
107 | ```
108 | 
109 | #### Note 3: Currencies and names of symbols and fields
110 | 
111 | The `fetch` method also collects the currencies for the given mnemonics and fields. This information is not returned by `fetch` by default, but is kept in the property `last_metadata` of the main class:
112 | ```python
113 | res = DS.fetch(['@AAPL','U:MMM'], fields=['P','MV','VO','PH'])
114 | print(DS.last_metadata['Currencies'])
115 | ```
116 | **Note**: The currency is not the same as the unit of the given field (data type). For example, the field `VO` has a currency of `U$` here; however, the volume is reported as a number of shares.
117 | 
118 | The metadata can also be returned explicitly by the `fetch` method if the argument `return_metadata` is set to True:
119 | ```python
120 | res, meta = DS.fetch(['@AAPL','U:MMM'], fields=['P','MV','VO','PH'], return_metadata=True)
121 | print(meta['Currencies'])
122 | ```
123 | In this case `fetch` also collects the names of the symbols and fields (data types):
124 | ```python
125 | print(meta['SymbolNames'])
126 | print(meta['DataTypeNames'])
127 | ```
128 | 
129 | ### Index constituents, listings of stocks and mnemonics
130 | 
131 | PyDatastream also has an interface for retrieving the list of constituents of indices:
132 | ```python
133 | DS.get_constituents('S&PCOMP')
134 | ```
135 | Note: In contrast to the retired DWE interface, DSWS does not support fetching historical lists of constituents.
136 | 
137 | By default the method retrieves various mnemonics and codes for the constituents,
138 | which might pose a problem for large indices like the Russell 3000 (sometimes the request is killed
139 | on timeout). In this case one can request only symbols and company names using:
140 | ```python
141 | DS.get_constituents('FRUSS3L', only_list=True)
142 | ```
143 | 
144 | Another useful method is `get_all_listings()`, which retrieves mnemonics (columns MJ and CD), countries (column GL) and exchanges (column MK) for different listings of the same stocks, as well as an indicator of the primary listing (column PQ). For example, for IBM and Vodafone:
145 | ```python
146 | DS.get_all_listings(['VOD', 'U:IBM'])
147 | ```
148 | 
149 | Finally, various symbols, codes and mnemonics can be fetched using the following method:
150 | ```python
151 | DS.get_codes(['U:IBM', '@MSFT'])
152 | ```
153 | 
154 | 
155 | ### Futures markets
156 | 
157 | Futures contracts have mnemonics of the form `XXXMMYY`, where `XXX` is a code for the futures market and `MM` and `YY` encode the month and year of the contract. For example, `LLC0118` denotes the market `LLC` (Brent Crude Oil) expiring in January 2018.
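As an illustration, such a mnemonic could be assembled from the market code and the expiry (a sketch; the `contract_mnemonic` helper below is hypothetical and not part of pydatastream):
```python
# Hypothetical helper (not part of pydatastream): build a contract mnemonic
# from a market code and an expiry, following the XXXMMYY convention above.
def contract_mnemonic(market, year, month):
    return f'{market}{month:02d}{year % 100:02d}'

contract_mnemonic('LLC', 2018, 1)   # -> 'LLC0118' (Brent Crude Oil, January 2018)
```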
158 | 
159 | For a given market code it is possible to get a list of all active contracts (together with some extra information, such as the exchange it is traded on, tick size, contract size, last trading date, etc.):
160 | ```python
161 | DS.get_futures_contracts('LLC')
162 | ```
163 | By default only active contracts are retrieved. In order to get the full list of contracts, use the `include_dead=True` argument:
164 | ```python
165 | DS.get_futures_contracts('LLC', include_dead=True)
166 | ```
167 | Similar to the index constituents, it is possible to set the argument `only_list=True` in order to limit the output to mnemonics and names only. This argument is suggested if the server returns the error "Index was outside the bounds of the array." on the full request, such as in the following case:
168 | ```python
169 | DS.get_futures_contracts('NCT', include_dead=True, only_list=True)
170 | ```
171 | 
172 | **Note**: Futures contracts do not operate with the "Price" field (`P`); instead they use the "Settlement Price" (`PS`):
173 | ```python
174 | DS.fetch('LLC0118', 'PS', date_from='BDATE')
175 | ```
176 | 
177 | 
178 | ### Static requests
179 | 
180 | The list of index constituents considered above is an example of a static request, i.e. a request that does not retrieve a time series, but a single snapshot of the data.
181 | 
182 | Within the PyDatastream library static requests can be made using the same `fetch()` method by providing the argument `static=True`.
183 | 
184 | For example, this request retrieves names, ISINs and identifiers of the primary exchange (ISINID) for various tickers of the BASF corporation (part of the `get_codes()` method):
185 | 
186 | ```python
187 | DS.fetch(['D:BAS','D:BASX','HN:BAS','I:BAF','BFA','@BFFAF','S:BAS'], ['ISIN', 'ISINID', 'NAME'], static=True)
188 | ```
189 | 
190 | Another use of static requests is a cross-section of time series. The following example retrieves the current price, market capitalization and daily volume for the same companies:
191 | 
192 | ```python
193 | DS.fetch(['D:BAS','D:BASX','HN:BAS','I:BAF','BFA','@BFFAF','S:BAS'], ['P', 'MV', 'VO'], static=True)
194 | ```
195 | 
196 | ## Advanced use
197 | 
198 | ### Datastream Functions
199 | 
200 | Datastream allows applying a number of functions to the series, which are calculated on the server side. Given the flexibility of the pandas library in all types of data transformation, this functionality is not really needed for Python users. However, I will briefly describe it for the sake of completeness.
201 | 
202 | Functions have the format `FUNC#(mnemonic,parameter)`. For example, calculating a 20-day moving average of the prices of IBM:
203 | ```python
204 | DS.fetch('MAV#(U:IBM,20D)', date_from='2013-09-01')
205 | ```
206 | Functions can be combined, e.g. calculating a moving 3-day percentage change of the 20-day moving average:
207 | ```python
208 | DS.fetch('PCH#(MAV#(U:IBM,20D),3D)', date_from='2013-09-01')
209 | ```
210 | Calculate the percentage quarter-on-quarter change of the US real GDP (constant prices, seasonally adjusted):
211 | ```python
212 | DS.fetch('PCH#(USGDP...D,1Q)', date_from='1990-01-01')
213 | ```
214 | Calculate the year-on-year difference (actual change) of the UK real GDP (constant prices, seasonally adjusted):
215 | ```python
216 | DS.fetch('ACH#(UKGDP...D,1Y)', date_from='1990-01-01')
217 | ```
218 | Documentation on the available functions is available on the [Thomson Reuters webhelp](http://product.datastream.com/navigator/advancehelpfiles/functions/webhelp/hfunc.htm).
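For comparison, the first two server-side examples above could be reproduced locally with pandas (a rough sketch; results may differ slightly from the server-side calculation due to calendar and rounding conventions):
```python
# Fetch daily prices first, then compute the transformations locally.
prices = DS.fetch('U:IBM', 'P', date_from='2013-09-01')
mav20 = prices['P'].rolling(20).mean()    # roughly MAV#(U:IBM,20D)
pch3 = mav20.pct_change(3) * 100          # roughly PCH#(MAV#(U:IBM,20D),3D), in per cent
```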
219 | 
220 | ### Trading calendar
221 | 
222 | By default Datastream pads prices on holidays (i.e. on non-trading days it returns the price of the previous day). In order to remove the padding one needs to know which days were holidays in the past.
223 | 
224 | The history of trading and non-trading days for each country can be retrieved using the method `get_trading_days`:
225 | ```python
226 | DS.get_trading_days(['US', 'UK', 'RS'], date_from='2010-01-01')
227 | ```
228 | This method returns a dataframe that contains values 1 and NaN, where 1 identifies a business day. So multiplying it with the price time series will remove padded values on non-trading days:
229 | ```python
230 | DS.fetch('@AAPL', 'P', date_from='2010').mul(DS.get_trading_days('US', date_from='2010')['US'], axis=0)
231 | ```
232 | 
233 | The full list of available calendars (countries) is contained in the property `DS.vacations_list`.
234 | 
235 | ### Performing several requests at once
236 | 
237 | The [Thomson Dataworks Enterprise User Guide](http://dataworks.thomson.com/Dataworks/Enterprise/1.0/documentation/user%20guide.pdf) suggests optimizing requests: very often it is quicker to make one bigger request than several smaller requests because of the relatively high transport overhead of web services.
238 | 
239 | PyDatastream allows fetching several requests in a single API call (a so-called "bundle request"):
240 | 
241 | ```python
242 | r1 = DS.construct_request('@AAPL', ['PO', 'PH', 'PL', 'P'], date_from='2013-11-26')
243 | r2 = DS.construct_request('U:MMM', ['P', 'MV', 'PO'], date_from='2013-11-26')
244 | res = DS.request_many([r1, r2])
245 | dfs = DS.parse_response(res)
246 | 
247 | print(dfs[0].head())
248 | print(dfs[1].head())
249 | ```
250 | 
251 | 
252 | ### Debugging and error handling
253 | 
254 | If a request contains errors then normally a `DatastreamException` will be raised and the error message from DSWS will be printed. To alter this behavior, one can use the `raise_on_error` property of the Datastream class. When it is set to `False`, the parser will ignore error messages and return an empty pandas.DataFrame. For instance, since the field "P" does not exist for indices the following request would have raised an error, but instead it will only fill the first 3 fields:
255 | ```python
256 | DS.raise_on_error = False
257 | DS.fetch('S&PCOMP', ['PO', 'PH', 'PL', 'P'])
258 | ```
259 | 
260 | `raise_on_error` can be useful for requests that contain several tickers. In this case data fields and/or tickers that cannot be fetched will not be present in the output. However, please be aware that this is not a silver bullet against all types of errors. In all complicated cases it is suggested to go with one symbol at a time and check the raw response from the server to spot invalid ticker-field combinations.
261 | 
262 | For debugging purposes, the Datastream class has a `last_request` property, which contains the URL, raw JSON request and raw JSON response from the last call to the API.
263 | ```python
264 | print(DS.last_request['url'])
265 | print(DS.last_request['request'])
266 | print(DS.last_request['response'])
267 | ```
268 | In many cases the errors occur at the stage of parsing the raw response from the server.
269 | For example, this could happen if some fields requested are not supported for all symbols
270 | in the request.
Because the data is returned from the server in an unstructured form and the
271 | parser tries to map it to a structured pandas.DataFrame, the exceptions can be
272 | relatively uninformative. In this case it is highly recommended to check the raw response from the server in order to spot the problematic field.
273 | 
274 | ### Identifying asset class
275 | 
276 | Many data types (fields) in Datastream are specific to the asset type. In order to check which types your symbols have, call `get_asset_types`:
277 | ```python
278 | DS.get_asset_types(['MMM', '@AAPL', 'S&PCOMP', 'FRUSS3L', 'EUDOLLR', 'USGDP...D', 'FRFEDFD'])
279 | ```
280 | 
281 | ### Usage statistics
282 | 
283 | Statistics about the number of requests, datatypes and returned data points for a given account can be requested using the `usage_statistics` method. By default it returns stats for the present month, but it can also return the usage statistics for a specified month in the past, or even a range of months. For instance, the monthly usage statistics for the last year can be fetched using:
284 | ```python
285 | DS.usage_statistics(months=12)
286 | ```
287 | 
288 | ## Thomson Reuters Economic Point-in-Time (EPiT) functionality
289 | 
290 | PyDatastream has two useful methods for working with the Thomson Reuters Economic Point-in-Time (EPiT) concept (usually a separate subscription is required for this content). Most economic time series, such as GDP, employment or inflation figures, undergo many revisions. For example, the US GDP value undergoes two consecutive revisions after the initial release, after which the number is revised periodically when, e.g., either the base year of calculation or the seasonal adjustment is changed. For predictive exercises it is important to obtain the values as they were known at the time of release (otherwise the data will be contaminated by look-ahead bias).
291 | 
292 | EPiT allows the user to request such historical data. For example, the following method retrieves the initial estimate and first revisions of the 2010-Q1 US GDP, as well as the dates when the data was published:
293 | ```python
294 | DS.get_epit_revisions('USGDP...D', period='2010-02-15')
295 | ```
296 | The period here should contain a date which falls within the time period of interest (so any date from '2010-01-01' to '2010-03-31' will result in the same output).
297 | 
298 | Another useful concept is the "vintage" of the data, which defines when the particular values were released. This concept is widely used in economics; see for example the [ALFRED database](https://alfred.stlouisfed.org/) of the St. Louis Fed and their discussion of [capturing data as it happens](https://alfred.stlouisfed.org/docs/alfred_capturing_data.pdf).
299 | 
300 | All vintages of an economic indicator can be summarized in a vintage matrix: a DataFrame where columns correspond to a particular period (quarter or month) of the reported statistic and the index represents the timestamps at which these values were released by the respective official agency. I.e. every line corresponds to all reported values available by the given date.
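Such a matrix makes point-in-time selection straightforward: the row with the latest release date on or before a given day contains everything that was known at that moment. A minimal sketch, assuming `vm` is the DataFrame returned by `get_epit_vintage_matrix` (the call itself is shown in the example below):
```python
vm = DS.get_epit_vintage_matrix('USGDP...D', date_from='2015-01-01')
as_of = '2015-10-01'
known_then = vm.loc[:as_of].iloc[-1].dropna()   # values as they were known on `as_of`
```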
301 | 302 | For example for the US GDP: 303 | ```python 304 | DS.get_epit_vintage_matrix('USGDP...D', date_from='2015-01-01') 305 | ``` 306 | The response is: 307 | ``` 308 | 2015-02-15 2015-05-15 2015-08-15 2015-11-15 \ 309 | 2015-04-29 16304.80 NaN NaN NaN 310 | 2015-05-29 16264.10 NaN NaN NaN 311 | 2015-06-24 16287.70 NaN NaN NaN 312 | 2015-07-30 16177.30 16270.400 NaN NaN 313 | 2015-08-27 16177.30 16324.300 NaN NaN 314 | 2015-09-25 16177.30 16333.600 NaN NaN 315 | 2015-10-29 16177.30 16333.600 16394.200 NaN 316 | 2015-11-24 16177.30 16333.600 16417.800 NaN 317 | ... 318 | ``` 319 | From the matrix it is seen for example, that the advance GDP estimate 320 | for 2015-Q1 (corresponding to 2015-02-15) was released on 2015-04-29 321 | and was equal to 16304.80 (B USD). The first revision (16264.10) has happened 322 | on 2015-05-29 and the second (16287.70) - on 2015-06-24. On 2015-07-30 323 | the advance GDP figure for 2015-Q2 was released (16270.400) together 324 | with update on the 2015-Q1 value (16177.30) and so on. 325 | 326 | Finally, when using economic data in any real-life analysis, it is important to know when the next value will be released by the reporting body. For example the following method returns dates of next 4 releases for US GDP and US Nonfarm Payroll figures: 327 | ```python 328 | DS.get_next_release_dates(['USGDP...D', 'USEMPALLO'], n_releases=4) 329 | ``` 330 | In the response release number is counted from the present day to the future. "DATE_FLAG" indicates whether the dates are given by the official agency or estimated by Thomson Reuters; and "TYPE" field indicates whether the new value is going to be provided or an old value is going to be updated. 331 | 332 | ## Resources 333 | 334 | - [Datastream Navigator](http://product.datastream.com/navigator/) could be used to look up codes and data types. 335 | 336 | - [Official support webpage](https://customers.reuters.com/sc/Contactus/simple?product=Datastream&env=PU&TP=Y). 337 | 338 | - [Official webpage for testing REST API requests](http://product.datastream.com/dswsclient/Docs/TestRestV1.aspx) 339 | 340 | - [Documentation for DSWS API calls](http://product.datastream.com/dswsclient/Docs/Default.aspx) 341 | 342 | - [Datastream Web Service Developer community](https://developers.refinitiv.com/eikon-apis/datastream-web-service) 343 | 344 | All these links could be printed in your terminal or iPython notebook by calling 345 | ```python 346 | DS.info() 347 | ``` 348 | 349 | Finally, for quick start with pandas have a look on [tutorial notebook](http://nbviewer.ipython.org/urls/gist.github.com/fonnesbeck/5850375/raw/c18cfcd9580d382cb6d14e4708aab33a0916ff3e/1.+Introduction+to+Pandas.ipynb) and [10-minutes introduction](http://pandas.pydata.org/pandas-docs/stable/10min.html). 350 | 351 | ### Datastream Navigator 352 | 353 | [Datastream Navigator](http://product.datastream.com/navigator/) is the best place to search for mnemonics for a particular security or a list of available datatypes/fields. 354 | 355 | Once a necessary security is located (either by search or via "Explore" menu item), a short list of most frequently used mnemonics is located right under the chart with the series. Further the small button ">>" next to them open a longer list. 356 | 357 | The complete list of datatypes (including static) for a particular asset class is located in the "Datatype search" menu item. Here the proper asset class should be selected. 
358 | 359 | Help for Datastream Navigator is available [here](http://product.datastream.com/WebHelp/Navigator/4.5/HelpFiles/Nav_help.htm). 360 | 361 | ### Limits of the API 362 | 363 | - Datastream limits single call request to 100 series, i.e. (# of tickers) x (# of fields) should not exceed 100 (e.g. 10x10 or 20x5, etc). 364 | - Number of time-series data point is not limited as long as the monthly limit of 10 million data point is not exceeded. 365 | 366 | 367 | ## Version history 368 | 369 | - 0.1 (2013-11-12) First release 370 | - 0.2 (2013-12-01) Release as a python package; Added `get_constituents()` and other useful methods 371 | - 0.3 (2015-03-19) Improvements of static requests 372 | - 0.4 (2016-03-16) Support of Python 3 373 | - 0.5 (2017-07-01) Backward-incompatible change: method `fetch()` for many tickers returns MultiIndex Dataframe instead of former Panel. This follows the development of the Pandas library where the Panels are deprecated starting version 0.20.0 (see [here](http://pandas.pydata.org/pandas-docs/version/0.20/whatsnew.html#whatsnew-0200-api-breaking-deprecate-panel)). 374 | - 0.5.1 (2017-11-17) Added Economic Point-in-Time (EPiT) functionality 375 | - 0.6 (2019-08-27) The library is rewritten to use the new REST-based Datastream Web Services (DSWS) interfaces instead of old SOAP-based DataWorks Enterprise (DWE), which was discontinued on July 1, 2019. Some methods (such as `system_info()` or `sources()`) have been removed as they're not supported by a new API. 376 | - 0.6.1 (2019-10-10) Fixes, performance improvements. Added `get_futures_contracts()` and `get_next_release_dates()`. 377 | - 0.6.2 (2020-03-08) Added trading calendar. 378 | - 0.6.3 (2020-08-04) Minor bug fixes 379 | - 0.6.4 (2020-10-30) Fix token expiration time 380 | - 0.6.5 (2021-11-05) Add `always_multiindex` argument to `fetch()` 381 | 382 | Note 1: any versions of pydatastream prior to 0.6 will not work anymore. 383 | 384 | Note 2: Starting version 0.6 Python 2 is no longer supported - in line with many critical projects (including numpy, pandas and matplotlib) [dropping support of Python 2 by 2020](https://python3statement.org/). 385 | 386 | 387 | ## Acknowledgements 388 | 389 | A special thanks to: 390 | 391 | * François Cocquemas ([@fcocquemas](https://github.com/fcocquemas)) for [RDatastream](https://github.com/fcocquemas/rdatastream) that has inspired this library 392 | * Charles Allderman ([@ceaza](https://github.com/ceaza)) for added support of Python 3 393 | 394 | 395 | ## License 396 | 397 | PyDatastream library is released under the MIT license. 398 | 399 | The license for the library is not extended in any sense to any of the content of the Refinitiv (former Thomson Reuters) Dataworks Enterprise, Datastream, Datastream Web Services, Datastream Navigator or related services. Appropriate contract with the data vendor and valid credentials are required in order to use the API. 400 | 401 | Author of the library ([@vfilimonov](https://github.com/vfilimonov)) is not affiliated, associated, authorized, sponsored, endorsed by, or in any way officially connected with Thomson Reuters, or any of its subsidiaries or its affiliates. The names "Refinitiv", "Thomson Reuters" as well as related names are registered trademarks of respective owners. 
402 | -------------------------------------------------------------------------------- /pydatastream/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | from .pydatastream import Datastream, DatastreamException 3 | from ._version import * 4 | -------------------------------------------------------------------------------- /pydatastream/_version.py: -------------------------------------------------------------------------------- 1 | """ Current version of the package 2 | """ 3 | 4 | __version__ = '0.6.6' 5 | __author__ = 'Vladimir Filimonov' 6 | __email__ = 'vladimir.a.filimonov@gmail.com' 7 | -------------------------------------------------------------------------------- /pydatastream/pydatastream.py: -------------------------------------------------------------------------------- 1 | """ pydatastream main module 2 | 3 | (c) Vladimir Filimonov, 2013 - 2021 4 | """ 5 | import warnings 6 | import json 7 | import math 8 | from functools import wraps 9 | import requests 10 | import pandas as pd 11 | 12 | ############################################################################### 13 | _URL = 'https://product.datastream.com/dswsclient/V1/DSService.svc/rest/' 14 | 15 | _FLDS_XREF = ('DSCD,EXMNEM,GEOGC,GEOGN,IBTKR,INDC,INDG,INDM,INDX,INDXEG,' 16 | 'INDXFS,INDXL,INDXS,ISIN,ISINID,LOC,MNEM,NAME,SECD,TYPE'.split(',')) 17 | 18 | _FLDS_XREF_FUT = ('MNEM,NAME,FLOT,FEX,GEOGC,GEOGN,EXCODE,LTDT,FUTBDATE,PCUR,ISOCUR,' 19 | 'TICKS,TICKV,TCYCLE,TPLAT'.split(',')) 20 | 21 | _ASSET_TYPE_CODES = {'BD': 'Bonds & Convertibles', 22 | 'BDIND': 'Bond Indices & Credit Default Swaps', 23 | 'CMD': 'Commodities', 24 | 'EC': 'Economics', 25 | 'EQ': 'Equities', 26 | 'EQIND': 'Equity Indices', 27 | 'EX': 'Exchange Rates', 28 | 'FT': 'Futures', 29 | 'INT': 'Interest Rates', 30 | 'INVT': 'Investment Trusts', 31 | 'OP': 'Options', 32 | 'UT': 'Unit Trusts', 33 | 'EWT': 'Warrants', 34 | 'NA': 'Not available'} 35 | 36 | ############################################################################### 37 | _INFO = """PyDatastream documentation (GitHub): 38 | https://github.com/vfilimonov/pydatastream 39 | 40 | Datastream Navigator: 41 | http://product.datastream.com/navigator/ 42 | 43 | Official support 44 | https://customers.reuters.com/sc/Contactus/simple?product=Datastream&env=PU&TP=Y 45 | 46 | Webpage for testing REST API requests 47 | http://product.datastream.com/dswsclient/Docs/TestRestV1.aspx 48 | 49 | Documentation for DSWS API 50 | http://product.datastream.com/dswsclient/Docs/Default.aspx 51 | 52 | Datastream Web Service Developer community 53 | https://developers.refinitiv.com/eikon-apis/datastream-web-service 54 | """ 55 | 56 | 57 | ############################################################################### 58 | ############################################################################### 59 | def _convert_date(date): 60 | """ Convert date to YYYY-MM-DD """ 61 | if date is None: 62 | return '' 63 | if isinstance(date, str) and (date.upper() == 'BDATE'): 64 | return 'BDATE' 65 | return pd.Timestamp(date).strftime('%Y-%m-%d') 66 | 67 | 68 | def _parse_dates(dates): 69 | """ Parse dates 70 | Example: 71 | /Date(1565817068486) -> 2019-08-14T21:11:08.486000000 72 | /Date(1565568000000+0000) -> 2019-08-12T00:00:00.000000000 73 | """ 74 | if dates is None: 75 | return None 76 | if isinstance(dates, str): 77 | return pd.Timestamp(_parse_dates([dates])[0]) 78 | res = [int(_[6:(-7 if '+' in _ else -2)]) for _ in dates] 79 | return pd.to_datetime(res, unit='ms').values 
80 | 81 | 82 | class DatastreamException(Exception): 83 | """ Exception class for Datastream """ 84 | 85 | 86 | ############################################################################### 87 | def lazy_property(fn): 88 | """ Lazy-evaluated property of an object """ 89 | attr_name = '__lazy__' + fn.__name__ 90 | 91 | @property 92 | @wraps(fn) 93 | def _lazy_property(self): 94 | if not hasattr(self, attr_name): 95 | setattr(self, attr_name, fn(self)) 96 | return getattr(self, attr_name) 97 | return _lazy_property 98 | 99 | 100 | ############################################################################### 101 | # Main Datastream class 102 | ############################################################################### 103 | class Datastream(): 104 | """ Python interface to the Refinitiv Datastream API via Datastream Web 105 | Services (DSWS). 106 | """ 107 | def __init__(self, username, password, raise_on_error=True, proxy=None, **kwargs): 108 | """Establish a connection to the Python interface to the Refinitiv Datastream 109 | (former Thomson Reuters Datastream) API via Datastream Web Services (DSWS). 110 | 111 | username / password - credentials for the DSWS account. 112 | raise_on_error - If True then error request will raise a "DatastreamException", 113 | otherwise either empty dataframe or partially 114 | retrieved data will be returned 115 | proxy - URL for the proxy server. Valid values: 116 | (a) None: no proxy is used 117 | (b) string of format "host:port" or "username:password@host:port" 118 | 119 | Note: credentials will be saved in memory. In case if this is not 120 | desirable for security reasons, call the constructor having None 121 | instead of values and manually call renew_token(username, password) 122 | when needed. 123 | 124 | A custom REST API url (if necessary for some reasons) could be provided 125 | via "url" parameter. 
126 | """ 127 | self.raise_on_error = raise_on_error 128 | self.last_request = None 129 | self.last_metadata = None 130 | self._last_response_raw = None 131 | 132 | # Setting up proxy parameters if necessary 133 | if isinstance(proxy, str): 134 | self._proxy = {'http': proxy, 'https': proxy} 135 | elif proxy is None: 136 | self._proxy = None 137 | else: 138 | raise ValueError('Proxy parameter should be either None or string') 139 | 140 | self._url = kwargs.pop('url', _URL) 141 | self._username = username 142 | self._password = password 143 | # request new token 144 | self.renew_token(username, password) 145 | 146 | ########################################################################### 147 | @staticmethod 148 | def info(): 149 | """ Some useful links """ 150 | print(_INFO) 151 | 152 | ########################################################################### 153 | def _api_post(self, method, request): 154 | """ Call to the POST method of DSWS API """ 155 | url = self._url + method 156 | self.last_request = {'url': url, 'request': request, 'error': None} 157 | self.last_metadata = None 158 | try: 159 | res = requests.post(url, json=request, proxies=self._proxy) 160 | self.last_request['response'] = res.text 161 | except Exception as e: 162 | self.last_request['error'] = str(e) 163 | raise 164 | 165 | try: 166 | response = self.last_request['response'] = json.loads(self.last_request['response']) 167 | except json.JSONDecodeError as e: 168 | raise DatastreamException('Server response could not be parsed') from e 169 | 170 | if 'Code' in response: 171 | code = response['Code'] 172 | if response['SubCode'] is not None: 173 | code += '/' + response['SubCode'] 174 | errormsg = f'{code}: {response["Message"]}' 175 | self.last_request['error'] = errormsg 176 | raise DatastreamException(errormsg) 177 | return self.last_request['response'] 178 | 179 | ########################################################################### 180 | def renew_token(self, username=None, password=None): 181 | """ Request new token from the server """ 182 | if username is None or password is None: 183 | warnings.warn('Username or password is not provided - could not renew token') 184 | return 185 | data = {"UserName": username, "Password": password} 186 | self._token = dict(self._api_post('GetToken', data)) 187 | self._token['TokenExpiry'] = _parse_dates(self._token['TokenExpiry']).tz_localize('UTC') 188 | 189 | # Token is invalidated 15 minutes before exporation time 190 | # Note: According to https://github.com/vfilimonov/pydatastream/issues/27 191 | # tokens do not always respect the (as of now 24 hours) expiry time 192 | # So for this reason I limit the token life at 6 hours. 193 | self._token['RenewTokenAt'] = min(self._token['TokenExpiry'] - pd.Timedelta('15m'), 194 | pd.Timestamp.utcnow() + pd.Timedelta('6H')) 195 | 196 | @property 197 | def _token_is_expired(self): 198 | if self._token is None: 199 | return True 200 | if pd.Timestamp.utcnow() > self._token['RenewTokenAt']: 201 | return True 202 | return False 203 | 204 | @property 205 | def token(self): 206 | """ Return actual token and renew it if necessary. """ 207 | if self._token_is_expired: 208 | self.renew_token(self._username, self._password) 209 | return self._token['TokenValue'] 210 | 211 | ########################################################################### 212 | def request(self, request): 213 | """ Generic wrapper to request data in raw format. Request should be 214 | properly formatted dictionary (see construct_request() method). 
215 | """ 216 | data = {'DataRequest': request, 'TokenValue': self.token} 217 | return self._api_post('GetData', data) 218 | 219 | def request_many(self, list_of_requests): 220 | """ Generic wrapper to request multiple requests in raw format. 221 | list_of_requests should be a list of properly formatted dictionaries 222 | (see construct_request() method). 223 | """ 224 | data = {'DataRequests': list_of_requests, 'TokenValue': self.token} 225 | return self._api_post('GetDataBundle', data) 226 | 227 | ########################################################################### 228 | @staticmethod 229 | def construct_request(ticker, fields=None, date_from=None, date_to=None, 230 | freq=None, static=False, IsExpression=None, 231 | return_names=True): 232 | """Construct a request string for querying TR DSWS. 233 | 234 | tickers - ticker or symbol, or list of symbols 235 | fields - field or list of fields. 236 | date_from, date_to - date range (used only if "date" is not specified) 237 | freq - frequency of data: daily('D'), weekly('W') or monthly('M') 238 | static - True for static (snapshot) requests 239 | IsExpression - if True, it will explicitly assume that list of tickers 240 | contain expressions. Otherwise it will try to infer it. 241 | 242 | Some of available fields: 243 | P - adjusted closing price 244 | PO - opening price 245 | PH - high price 246 | PL - low price 247 | VO - volume, which is expressed in 1000's of shares. 248 | UP - unadjusted price 249 | OI - open interest 250 | 251 | MV - market value 252 | EPS - earnings per share 253 | DI - dividend index 254 | MTVB - market to book value 255 | PTVB - price to book value 256 | ... 257 | 258 | The full list of data fields is available at http://dtg.tfn.com/. 259 | """ 260 | req = {'Instrument': {}, 'Date': {}, 'DataTypes': []} 261 | 262 | # Instruments 263 | if isinstance(ticker, str): 264 | # ticker = ticker ## Nothing to change 265 | is_list = None 266 | elif hasattr(ticker, '__len__'): 267 | ticker = ','.join(ticker) 268 | is_list = True 269 | else: 270 | raise ValueError('ticker should be either string or list/array of strings') 271 | # Properties of instruments 272 | props = [] 273 | if is_list or (is_list is None and ',' in ticker): 274 | props.append({'Key': 'IsList', 'Value': True}) 275 | if IsExpression or (IsExpression is None and 276 | ('#' in ticker or '(' in ticker or ')' in ticker)): 277 | props.append({'Key': 'IsExpression', 'Value': True}) 278 | if return_names: 279 | props.append({'Key': 'ReturnName', 'Value': True}) 280 | req['Instrument'] = {'Value': ticker, 'Properties': props} 281 | 282 | # DataTypes 283 | props = [{'Key': 'ReturnName', 'Value': True}] if return_names else [] 284 | if fields is not None: 285 | if isinstance(fields, str): 286 | req['DataTypes'].append({'Value': fields, 'Properties': props}) 287 | elif isinstance(fields, list) and fields: 288 | for f in fields: 289 | req['DataTypes'].append({'Value': f, 'Properties': props}) 290 | else: 291 | raise ValueError('fields should be either string or list/array of strings') 292 | 293 | # Dates 294 | req['Date'] = {'Start': _convert_date(date_from), 295 | 'End': _convert_date(date_to), 296 | 'Frequency': freq if freq is not None else '', 297 | 'Kind': 0 if static else 1} 298 | return req 299 | 300 | ########################################################################### 301 | @staticmethod 302 | def _parse_meta(meta): 303 | """ Parse SymbolNames, DataTypeNames and Currencies """ 304 | res = {} 305 | for key in meta: 306 | if key in ('DataTypeNames', 
'SymbolNames'): 307 | if not meta[key]: # None or empty list 308 | res[key] = None 309 | else: 310 | names = pd.DataFrame(meta[key]).set_index('Key')['Value'] 311 | names.index.name = key.replace('Names', '') 312 | names.name = 'Name' 313 | res[key] = names 314 | elif key == 'Currencies': 315 | cur = pd.concat({key: pd.DataFrame(meta['Currencies'][key]) 316 | for key in meta['Currencies']}) 317 | res[key] = cur.xs('Currency', level=1).T 318 | else: 319 | res[key] = meta[key] 320 | return res 321 | 322 | def _parse_one(self, res): 323 | """ Parse one response (either 'DataResponse' or one of 'DataResponses')""" 324 | data = res['DataTypeValues'] 325 | dates = _parse_dates(res['Dates']) 326 | res_meta = {_: res[_] for _ in res if _ not in ['DataTypeValues', 'Dates']} 327 | 328 | # Parse values 329 | meta = {} 330 | res = {} 331 | for d in data: 332 | data_type = d['DataType'] 333 | res[data_type] = {} 334 | meta[data_type] = {} 335 | 336 | for v in d['SymbolValues']: 337 | value = v['Value'] 338 | if v['Type'] == 0: # Error 339 | if self.raise_on_error: 340 | raise DatastreamException(f'"{v["Symbol"]}"("{data_type}"): {value}') 341 | res[data_type][v['Symbol']] = math.nan 342 | elif v['Type'] == 4: # Date 343 | res[data_type][v['Symbol']] = _parse_dates(value) 344 | else: 345 | res[data_type][v['Symbol']] = value 346 | meta[data_type][v['Symbol']] = {_: v[_] for _ in v if _ != 'Value'} 347 | 348 | if dates is None: 349 | # Fix - if dates are not returned, then simply use integer index 350 | dates = [0] 351 | res[data_type] = pd.DataFrame(res[data_type], index=dates) 352 | 353 | res = pd.concat(res).unstack(level=1).T.sort_index() 354 | res_meta['Currencies'] = meta 355 | return res, self._parse_meta(res_meta) 356 | 357 | def parse_response(self, response, return_metadata=False): 358 | """ Parse raw JSON response 359 | 360 | If return_metadata is True, then result is tuple (dataframe, metadata), 361 | where metadata is a dictionary. Otherwise only dataframe is returned. 362 | 363 | In case of response being constructed from several requests (method 364 | request_many()), then the result is a list of parsed responses. Here 365 | again, if return_metadata is True then each element is a tuple 366 | (dataframe, metadata), otherwise each element is a dataframe. 367 | """ 368 | if 'DataResponse' in response: # Single request 369 | res, meta = self._parse_one(response['DataResponse']) 370 | self.last_metadata = meta 371 | return (res, meta) if return_metadata else res 372 | if 'DataResponses' in response: # Multiple requests 373 | results = [self._parse_one(r) for r in response['DataResponses']] 374 | self.last_metadata = [_[1] for _ in results] 375 | return results if return_metadata else [_[0] for _ in results] 376 | raise DatastreamException('Neither DataResponse nor DataResponses are found') 377 | 378 | ########################################################################### 379 | def usage_statistics(self, date=None, months=1): 380 | """ Request usage statistics 381 | 382 | date - if None, stats for the current month will be fetched, 383 | otherwise for the month which contains the specified date. 384 | months - number of consecutive months prior to "date" for which 385 | stats should be retrieved. 
386 | """ 387 | if date is None: 388 | date = pd.Timestamp('now').normalize() - pd.offsets.MonthBegin() 389 | req = [self.construct_request('STATS', 'DS.USERSTATS', 390 | date-pd.offsets.MonthBegin(n), static=True) 391 | for n in range(months)][::-1] 392 | res = self.parse_response(self.request_many(req)) 393 | res = pd.concat(res) 394 | res.index = res['Start Date'].dt.strftime('%B %Y') 395 | res.index.name = None 396 | return res 397 | 398 | ########################################################################### 399 | def fetch(self, tickers, fields=None, date_from=None, date_to=None, 400 | freq=None, static=False, IsExpression=None, return_metadata=False, 401 | always_multiindex=False): 402 | """Fetch the data from Datastream for a set of tickers and parse results. 403 | 404 | tickers - ticker or symbol, or list of symbols 405 | fields - field or list of fields 406 | date_from, date_to - date range (used only if "date" is not specified) 407 | freq - frequency of data: daily('D'), weekly('W') or monthly('M') 408 | static - True for static (snapshot) requests 409 | IsExpression - if True, it will explicitly assume that list of tickers 410 | contain expressions. Otherwise it will try to infer it. 411 | return_metadata - if True, then tuple (data, metadata) will be returned, 412 | otherwise only data 413 | always_multiindex - if True, then even for 1 ticker requested, resulting 414 | dataframe will have multiindex (ticker, date). 415 | If False (default), then request of single ticker will 416 | result in a dataframe indexed by date only. 417 | 418 | Notes: - several fields should be passed as a list, and not as a 419 | comma-separated string! 420 | - if no fields are provided, then the default field will be 421 | fetched. In this case the column might not have any name 422 | in the resulting dataframe. 423 | 424 | Result format depends on the number of requested tickers and fields: 425 | - 1 ticker - DataFrame with fields in column names 426 | (in order to keep multiindex even for single 427 | ticker, set `always_multiindex` to True) 428 | - several tickers - DataFrame with fields in column names and 429 | MultiIndex (ticker, date) 430 | - static request - DataFrame indexed by tickers and with fields 431 | in column names 432 | 433 | Some of available fields: 434 | P - adjusted closing price 435 | PO - opening price 436 | PH - high price 437 | PL - low price 438 | VO - volume, which is expressed in 1000's of shares. 439 | UP - unadjusted price 440 | OI - open interest 441 | 442 | MV - market value 443 | EPS - earnings per share 444 | DI - dividend index 445 | MTVB - market to book value 446 | PTVB - price to book value 447 | ... 
448 | """ 449 | req = self.construct_request(tickers, fields, date_from, date_to, 450 | freq=freq, static=static, 451 | IsExpression=IsExpression, 452 | return_names=return_metadata) 453 | raw = self.request(req) 454 | self._last_response_raw = raw 455 | data, meta = self.parse_response(raw, return_metadata=True) 456 | 457 | if static: 458 | # Static request - drop date from MultiIndex 459 | data = data.reset_index(level=1, drop=True) 460 | elif len(data.index.levels[0]) == 1: 461 | # Only one ticker - drop tickers from MultiIndex 462 | if not always_multiindex: 463 | data = data.reset_index(level=0, drop=True) 464 | 465 | return (data, meta) if return_metadata else data 466 | 467 | ########################################################################### 468 | # Specific fetching methods 469 | ########################################################################### 470 | def get_OHLCV(self, ticker, date_from=None, date_to=None): 471 | """Get Open, High, Low, Close prices and daily Volume for a given ticker. 472 | 473 | ticker - ticker or symbol 474 | date_from, date_to - date range (used only if "date" is not specified) 475 | """ 476 | return self.fetch(ticker, ['PO', 'PH', 'PL', 'P', 'VO'], date_from, date_to, 477 | freq='D', return_metadata=False) 478 | 479 | def get_OHLC(self, ticker, date_from=None, date_to=None): 480 | """Get Open, High, Low and Close prices for a given ticker. 481 | 482 | ticker - ticker or symbol 483 | date_from, date_to - date range (used only if "date" is not specified) 484 | """ 485 | return self.fetch(ticker, ['PO', 'PH', 'PL', 'P'], date_from, date_to, 486 | freq='D', return_metadata=False) 487 | 488 | def get_price(self, ticker, date_from=None, date_to=None): 489 | """Get Close price for a given ticker. 490 | 491 | ticker - ticker or symbol 492 | date_from, date_to - date range (used only if "date" is not specified) 493 | """ 494 | return self.fetch(ticker, 'P', date_from, date_to, 495 | freq='D', return_metadata=False) 496 | 497 | ########################################################################### 498 | def get_constituents(self, index_ticker, only_list=False): 499 | """ Get a list of all constituents of a given index. 500 | 501 | index_ticker - Datastream ticker for index 502 | only_list - request only list of symbols. By default the method 503 | retrieves many extra fields with information (various 504 | mnemonics and codes). This might pose some problems 505 | for large indices like Russel-3000. If only_list=True, 506 | then only the list of symbols and names are retrieved. 507 | 508 | NOTE: In contrast to retired DWE interface, DSWS does not support 509 | fetching historical lists of constituents. 
510 | """ 511 | if only_list: 512 | fields = ['MNEM', 'NAME'] 513 | else: 514 | fields = _FLDS_XREF 515 | return self.fetch('L' + index_ticker, fields, static=True) 516 | 517 | def get_all_listings(self, ticker): 518 | """ Get all listings and their symbols for the given security 519 | """ 520 | res = self.fetch(ticker, 'QTEALL', static=True) 521 | columns = list({_[:2] for _ in res.columns}) 522 | 523 | # Reformat the output 524 | df = {} 525 | for ind in range(1, 21): 526 | cols = {f'{c}{ind:02d}': c for c in columns} 527 | df[ind] = res[cols.keys()].rename(columns=cols) 528 | df = pd.concat(df).swaplevel(0).sort_index() 529 | df = df[~(df == '').all(axis=1)] 530 | return df 531 | 532 | def get_codes(self, ticker): 533 | """ Get codes and symbols for the given securities 534 | """ 535 | return self.fetch(ticker, _FLDS_XREF, static=True) 536 | 537 | ########################################################################### 538 | def get_asset_types(self, symbols): 539 | """ Get asset types for a given list of symbols 540 | Note: the method does not 541 | """ 542 | res = self.fetch(symbols, 'TYPE', static=True, IsExpression=False) 543 | names = pd.Series(_ASSET_TYPE_CODES).to_frame(name='AssetTypeName') 544 | res = res.join(names, on='TYPE') 545 | # try to preserve the order 546 | if isinstance(symbols, (list, pd.Series)): 547 | try: 548 | res = res.loc[symbols] 549 | except KeyError: 550 | pass # OK, we don't keep the order if not possible 551 | return res 552 | 553 | ########################################################################### 554 | def get_futures_contracts(self, market_code, only_list=False, include_dead=False): 555 | """ Get list of all contracts for a given futures market 556 | 557 | market_code - Datastream mnemonic for a market (e.g. LLC for the 558 | Brent Crude Oil, whose contracts have mnemonics like 559 | LLC0118 for January 2018) 560 | only_list - request only list of symbols. By default the method 561 | retrieves many extra fields with information (currency, 562 | lot size, last trading date, etc). If only_list=True, 563 | then only the list of symbols and names are retrieved. 564 | include_dead - if True, all delisted/expired contracts will be fetched 565 | as well. Otherwise only active contracts will be returned. 566 | """ 567 | if only_list: 568 | fields = ['MNEM', 'NAME'] 569 | else: 570 | fields = _FLDS_XREF_FUT 571 | res = self.fetch(f'LFUT{market_code}L', fields, static=True) 572 | res['active'] = True 573 | if include_dead: 574 | res2 = self.fetch(f'LFUT{market_code}D', fields, static=True) 575 | res2['active'] = False 576 | res = pd.concat([res, res2]) 577 | return res[res.MNEM != 'NA'] # Drop lines with empty mnemonics 578 | 579 | ########################################################################### 580 | def get_epit_vintage_matrix(self, mnemonic, date_from='1951-01-01', date_to=None): 581 | """ Construct the vintage matrix for a given economic series. 582 | Requires subscription to Thomson Reuters Economic Point-in-Time (EPiT). 583 | 584 | Vintage matrix represents a DataFrame where columns correspond to a 585 | particular period (quarter or month) for the reported statistic and 586 | index represents timestamps at which these values were released by 587 | the respective official agency. I.e. every line corresponds to all 588 | available reported values by the given date. 
589 | 590 | For example: 591 | 592 | >> DS.get_epit_vintage_matrix('USGDP...D', date_from='2015-01-01') 593 | 594 | 2015-02-15 2015-05-15 2015-08-15 2015-11-15 \ 595 | 2015-04-29 16304.80 NaN NaN NaN 596 | 2015-05-29 16264.10 NaN NaN NaN 597 | 2015-06-24 16287.70 NaN NaN NaN 598 | 2015-07-30 16177.30 16270.400 NaN NaN 599 | 2015-08-27 16177.30 16324.300 NaN NaN 600 | 2015-09-25 16177.30 16333.600 NaN NaN 601 | 2015-10-29 16177.30 16333.600 16394.200 NaN 602 | 2015-11-24 16177.30 16333.600 16417.800 NaN 603 | 604 | From the matrix it is seen for example, that the advance GDP estimate 605 | for 2015-Q1 (corresponding to 2015-02-15) was released on 2015-04-29 606 | and was equal to 16304.80 (B USD). The first revision (16264.10) has 607 | happened on 2015-05-29 and the second (16287.70) - on 2015-06-24. 608 | On 2015-07-30 the advance GDP figure for 2015-Q2 was released 609 | (16270.400) together with update on the 2015-Q1 value (16177.30) 610 | and so on. 611 | """ 612 | # Get first available date from the REL1 series 613 | rel1 = self.fetch(mnemonic, 'REL1', date_from=date_from, date_to=date_to) 614 | date_0 = rel1.dropna().index[0] 615 | 616 | # All release dates 617 | reld123 = self.fetch(mnemonic, ['RELD1', 'RELD2', 'RELD3'], 618 | date_from=date_0, date_to=date_to).dropna(how='all') 619 | 620 | # Fetch all vintages 621 | res = {} 622 | for date in reld123.index: 623 | try: 624 | _tmp = self.fetch(mnemonic, 'RELV', date_from=date_0, date_to=date).dropna() 625 | except DatastreamException: 626 | continue 627 | res[date] = _tmp 628 | return pd.concat(res).RELV.unstack() 629 | 630 | ########################################################################### 631 | def get_epit_revisions(self, mnemonic, period, relh50=False): 632 | """ Return initial estimate and first revisions of a given economic time 633 | series and a given period. 634 | Requires subscription to Thomson Reuters Economic Point-in-Time (EPiT). 635 | 636 | "Period" parameter should represent a date which falls within a time 637 | period of interest, e.g. 2016 Q4 could be requested with the 638 | period='2016-11-15' for example. 639 | 640 | By default up to 20 values is returned unless argument "relh50" is 641 | set to True (in which case up to 50 values is returned). 642 | """ 643 | if relh50: 644 | data = self.fetch(mnemonic, 'RELH50', date_from=period, static=True) 645 | else: 646 | data = self.fetch(mnemonic, 'RELH', date_from=period, static=True) 647 | data = data.iloc[0] 648 | 649 | # Parse the response 650 | res = {data.loc['RELHD%02d' % i]: data.loc['RELHV%02d' % i] 651 | for i in range(1, 51 if relh50 else 21) 652 | if data.loc['RELHD%02d' % i] != ''} 653 | res = pd.Series(res, name=data.loc['RELHP ']).sort_index() 654 | return res 655 | 656 | ########################################################################### 657 | def get_next_release_dates(self, mnemonics, n_releases=1): 658 | """ Return the next date of release (NDoR) for a given economic series. 659 | Could return results for up to 12 releases in advance. 
660 | 
661 |             Returned fields:
662 |             DATE        - Date of release (or start of range where the exact
663 |                           date is not available)
664 |             DATE_LATEST - End of range when a range is given
665 |             TIME_GMT    - Expected time of release (for "official" dates only)
666 |             DATE_FLAG   - Indicates whether the dates are "official" ones from
667 |                           the source, or estimated by Thomson Reuters where
668 |                           "official" dates are not available
669 |             REF_PERIOD  - Corresponding reference period for the release
670 |             TYPE        - Indicates whether the release is for a new reference
671 |                           period ("NewValue") or an update to a period for which
672 |                           there has already been a release ("ValueUpdate")
673 |         """
674 |         if n_releases > 12:
675 |             raise Exception('Only up to 12 releases in advance could be requested')
676 |         if n_releases < 1:
677 |             raise Exception('n_releases smaller than 1 does not make sense')
678 | 
679 |         # Fetch and parse
680 |         reqs = [self.construct_request(mnemonics, f'DS.NDOR{i+1}', static=True)
681 |                 for i in range(n_releases)]
682 |         res_parsed = self.parse_response(self.request_many(reqs))
683 | 
684 |         # Rearrange the output
685 |         res = []
686 |         for r in res_parsed:
687 |             x = r.reset_index(level=1, drop=True)
688 |             x.index.name = 'Mnemonic'
689 |             # Index of the release counting from now
690 |             ndor_idx = [_.split('_')[0] for _ in x.columns][0]
691 |             x['ReleaseNo'] = int(ndor_idx.replace('DS.NDOR', ''))
692 |             x = x.set_index('ReleaseNo', append=True)
693 |             x.columns = [_.replace(ndor_idx+'_', '') for _ in x.columns]
694 |             res.append(x)
695 | 
696 |         res = pd.concat(res).sort_index()
697 |         for col in ['DATE', 'DATE_LATEST', 'REF_PERIOD']:
698 |             res[col] = pd.to_datetime(res[col], errors='coerce')
699 |         return res
700 | 
701 |     ###########################################################################
702 |     @lazy_property
703 |     def vacations_list(self):
704 |         """ List of mnemonics for holidays in different countries """
705 |         res = self.fetch('HOLIDAYS', ['MNEM', 'ENAME', 'GEOGN', 'GEOGC'], static=True)
706 |         return res[res['MNEM'] != 'NA'].sort_values('GEOGC')
707 | 
708 |     def get_trading_days(self, countries, date_from=None, date_to=None):
709 |         """ Get list of trading days for the given countries (specified by ISO-2c)
710 |             The returned dataframe will contain values 1 and NaN, where 1
711 |             identifies a business day.
712 | 
713 |             Multiplying this dataframe with a price time series will thus remove
714 |             values padded on non-trading days.
715 | 
716 |             Example:
717 |                 DS.get_trading_days(['US', 'UK', 'RS'], date_from='2010-01-01')
718 |         """
719 |         if isinstance(countries, str):
720 |             countries = [countries]
721 | 
722 |         vacs = self.vacations_list
723 |         mnems = vacs[vacs.GEOGC.isin(countries)]
724 |         missing_isos = set(countries).difference(mnems.GEOGC)
725 | 
726 |         if missing_isos:
727 |             raise DatastreamException(f'Unknown ISO codes: {", ".join(missing_isos)}')
728 |         # By default 0 and NaN are returned, so we add 1
729 |         res = self.fetch(mnems.MNEM, date_from=date_from, date_to=date_to) + 1
730 | 
731 |         if len(countries) == 1:
732 |             return res.iloc[:, 0].to_frame(name=countries[0])
733 |         return (res.iloc[:, 0].unstack(level=0)
734 |                 .rename(columns=mnems.set_index('MNEM')['GEOGC'])[countries])
735 | 
--------------------------------------------------------------------------------
/pytest.ini:
--------------------------------------------------------------------------------
1 | [pytest]
2 | addopts = -rs -p no:dash
3 | 
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [metadata]
2 | description-file = README.md
3 | 
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from distutils.core import setup
2 | 
3 | DESCRIPTION = ('Python interface to the Refinitiv Datastream (former Thomson '
4 |                'Reuters Datastream) API via Datastream Web Services (DSWS)')
5 | 
6 | # Long description to be published in PyPI
7 | LONG_DESCRIPTION = """
8 | **PyDatastream** is a Python interface to the Refinitiv Datastream (former Thomson
9 | Reuters Datastream) API via Datastream Web Services (DSWS) (non free),
10 | with some convenience functions. This package requires valid credentials for this
11 | API.
12 | 
13 | For the documentation please refer to README.md inside the package or on
14 | GitHub (https://github.com/vfilimonov/pydatastream/blob/master/README.md).
15 | """ 16 | 17 | _URL = 'http://github.com/vfilimonov/pydatastream' 18 | 19 | __version__ = __author__ = __email__ = None # will be extracted from _version.py 20 | exec(open('pydatastream/_version.py').read()) # defines __version__ pylint: disable=W0122 21 | 22 | setup(name='PyDatastream', 23 | version=__version__, 24 | description=DESCRIPTION, 25 | long_description=LONG_DESCRIPTION, 26 | url=_URL, 27 | download_url=_URL + '/archive/v' + __version__ + '.zip', 28 | author=__author__, 29 | author_email=__email__, 30 | license='MIT License', 31 | packages=['pydatastream'], 32 | install_requires=['requests'], 33 | extras_require={ 34 | 'pandas': ['pandas'], 35 | }, 36 | classifiers=['Programming Language :: Python :: 3'], 37 | ) 38 | -------------------------------------------------------------------------------- /tests/test_basic.py: -------------------------------------------------------------------------------- 1 | """ Tests for basic pydatastream functionality 2 | 3 | (c) Vladimir Filimonov, 2019 4 | """ 5 | # pylint: disable=C0103,C0301 6 | import warnings 7 | import pytest 8 | import pydatastream as pds 9 | 10 | 11 | ############################################################################### 12 | def empty_datastream(): 13 | """ Create an instance of datastream """ 14 | with warnings.catch_warnings(): 15 | warnings.simplefilter("ignore") 16 | return pds.Datastream(None, None) 17 | 18 | 19 | ############################################################################### 20 | ############################################################################### 21 | def test_create(): 22 | """ Test init without username """ 23 | empty_datastream() 24 | 25 | 26 | ############################################################################### 27 | def test_create_wrong_credentials(): 28 | """ Test init with wrong credentials - should raise an exception """ 29 | with pytest.raises(pds.DatastreamException): 30 | assert pds.Datastream("USERNAME", "PASSWORD") 31 | 32 | 33 | ############################################################################### 34 | def test_convert_date(): 35 | """ Test _convert_date() """ 36 | from pydatastream.pydatastream import _convert_date 37 | assert _convert_date('5 January 2019') == '2019-01-05' 38 | assert _convert_date('04/02/2017') == '2017-04-02' 39 | assert _convert_date('2019') == '2019-01-01' 40 | assert _convert_date('2019-04') == '2019-04-01' 41 | assert _convert_date(None) == '' 42 | assert _convert_date('bdate') == 'BDATE' 43 | 44 | 45 | ############################################################################### 46 | def test_parse_dates(): 47 | """ Test _parse_dates() """ 48 | from pydatastream.pydatastream import _parse_dates 49 | import pandas as pd 50 | assert _parse_dates('/Date(1565817068486)/') == pd.Timestamp('2019-08-14 21:11:08.486000') 51 | assert _parse_dates('/Date(1565568000000+0000)/') == pd.Timestamp('2019-08-12 00:00:00') 52 | 53 | # Array of dates 54 | dates = ['/Date(1217548800000+0000)/', '/Date(1217808000000+0000)/', 55 | '/Date(1217894400000+0000)/', '/Date(1217980800000+0000)/', 56 | '/Date(1218067200000+0000)/', '/Date(1218153600000+0000)/', 57 | '/Date(1218412800000+0000)/', '/Date(1218499200000+0000)/', 58 | '/Date(1218585600000+0000)/', '/Date(1218672000000+0000)/', 59 | '/Date(1218758400000+0000)/', '/Date(1219017600000+0000)/', 60 | '/Date(1219104000000+0000)/', '/Date(1219190400000+0000)/', 61 | '/Date(1219276800000+0000)/', '/Date(1219363200000+0000)/', 62 | '/Date(1219622400000+0000)/', '/Date(1219708800000+0000)/', 
63 | '/Date(1219795200000+0000)/', '/Date(1219881600000+0000)/', 64 | '/Date(1219968000000+0000)/', '/Date(1220227200000+0000)/'] 65 | res = ['2008-08-01', '2008-08-04', '2008-08-05', '2008-08-06', '2008-08-07', 66 | '2008-08-08', '2008-08-11', '2008-08-12', '2008-08-13', '2008-08-14', 67 | '2008-08-15', '2008-08-18', '2008-08-19', '2008-08-20', '2008-08-21', 68 | '2008-08-22', '2008-08-25', '2008-08-26', '2008-08-27', '2008-08-28', 69 | '2008-08-29', '2008-09-01'] 70 | assert pd.np.all(_parse_dates(dates) == pd.np.array(res, dtype='datetime64[ns]')) 71 | 72 | 73 | ############################################################################### 74 | # Construct requests 75 | ############################################################################### 76 | def test_construct_request_series_1(): 77 | """ Construct request for series """ 78 | req = {'Instrument': {'Value': '@AAPL', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 79 | 'Date': {'Start': '2008-01-01', 'End': '2009-01-01', 'Frequency': '', 'Kind': 1}, 80 | 'DataTypes': []} 81 | DS = empty_datastream() 82 | assert DS.construct_request('@AAPL', date_from='2008', date_to='2009') == req 83 | 84 | 85 | ############################################################################### 86 | def test_construct_request_series_2(): 87 | """ Construct request for series """ 88 | req = {'Instrument': {'Value': '@AAPL', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 89 | 'Date': {'Start': 'BDATE', 'End': '', 'Frequency': '', 'Kind': 1}, 90 | 'DataTypes': [{'Value': 'P', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 91 | {'Value': 'MV', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 92 | {'Value': 'VO', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}]} 93 | DS = empty_datastream() 94 | assert DS.construct_request('@AAPL', ['P', 'MV', 'VO'], date_from='bdate') == req 95 | 96 | 97 | ############################################################################### 98 | def test_construct_request_static_1(): 99 | """ Construct static request """ 100 | req = {'Instrument': {'Value': 'D:BAS,D:BASX,HN:BAS', 101 | 'Properties': [{'Key': 'IsList', 'Value': True}, 102 | {'Key': 'ReturnName', 'Value': True}]}, 103 | 'Date': {'Start': '', 'End': '', 'Frequency': '', 'Kind': 0}, 104 | 'DataTypes': [{'Value': 'ISIN', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 105 | {'Value': 'ISINID', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 106 | {'Value': 'NAME', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}]} 107 | DS = empty_datastream() 108 | assert DS.construct_request(['D:BAS', 'D:BASX', 'HN:BAS'], ['ISIN', 'ISINID', 'NAME'], static=True) == req 109 | 110 | 111 | ############################################################################### 112 | # Parse responses 113 | ############################################################################### 114 | def test_parse_response_static_1(): 115 | """ Parse static request """ 116 | import pandas as pd 117 | from io import StringIO 118 | 119 | s = (',,ISIN,ISINID,NAME\nD:BAS,2019-09-27,DE000BASF111,P,BASF\n' 120 | 'D:BASX,2019-09-27,DE000BASF111,S,BASF (XET)\n' 121 | 'HN:BAS,2019-09-27,DE000BASF111,S,BASF (BUD)\n') 122 | df = pd.read_csv(StringIO(s), index_col=[0, 1], parse_dates=[1]) 123 | 124 | res = {'AdditionalResponses': None, 125 | 'DataTypeNames': [{'Key': 'ISIN', 'Value': 'ISIN CODE'}, 126 | {'Key': 'ISINID', 'Value': 'QUOTE INDICATOR'}, 127 | {'Key': 'NAME', 'Value': 'NAME'}], 128 | 'DataTypeValues': [{'DataType': 'ISIN', 129 | 
'SymbolValues': [{'Currency': 'E ', 'Symbol': 'D:BAS', 'Type': 6, 'Value': 'DE000BASF111'}, 130 | {'Currency': 'E ', 'Symbol': 'D:BASX', 'Type': 6, 'Value': 'DE000BASF111'}, 131 | {'Currency': 'HF', 'Symbol': 'HN:BAS', 'Type': 6, 'Value': 'DE000BASF111'}]}, 132 | {'DataType': 'ISINID', 133 | 'SymbolValues': [{'Currency': 'E ', 'Symbol': 'D:BAS', 'Type': 6, 'Value': 'P'}, 134 | {'Currency': 'E ', 'Symbol': 'D:BASX', 'Type': 6, 'Value': 'S'}, 135 | {'Currency': 'HF', 'Symbol': 'HN:BAS', 'Type': 6, 'Value': 'S'}]}, 136 | {'DataType': 'NAME', 137 | 'SymbolValues': [{'Currency': 'E ', 'Symbol': 'D:BAS', 'Type': 6, 'Value': 'BASF'}, 138 | {'Currency': 'E ', 'Symbol': 'D:BASX', 'Type': 6, 'Value': 'BASF (XET)'}, 139 | {'Currency': 'HF', 'Symbol': 'HN:BAS', 'Type': 6, 'Value': 'BASF (BUD)'}]}], 140 | 'Dates': ['/Date(1569542400000+0000)/'], 141 | 'SymbolNames': [{'Key': 'D:BAS', 'Value': 'D:BAS'}, 142 | {'Key': 'D:BASX', 'Value': 'D:BASX'}, 143 | {'Key': 'HN:BAS', 'Value': 'HN:BAS'}], 'Tag': None} 144 | res = {'DataResponse': res, 'Properties': None} 145 | 146 | DS = empty_datastream() 147 | assert DS.parse_response(res).equals(df) 148 | 149 | 150 | ############################################################################### 151 | def test_parse_response_series_1(): 152 | """ Parse response as a series """ 153 | import pandas as pd 154 | from io import StringIO 155 | 156 | s = (',,P,MV,VO\n@AAPL,1999-12-31,3.6719,16540.47,40952.8\n@AAPL,2000-01-03,3.9978,18008.5,133932.3\n' 157 | '@AAPL,2000-01-04,3.6607,16490.2,127909.5\n@AAPL,2000-01-05,3.7143,16760.53,194426.3\n' 158 | '@AAPL,2000-01-06,3.3929,15310.1,191981.9\n@AAPL,2000-01-07,3.5536,16035.32,115180.7\n' 159 | '@AAPL,2000-01-10,3.4911,15753.29,126265.9\n') 160 | df = pd.read_csv(StringIO(s), index_col=[0, 1], parse_dates=[1]) 161 | 162 | res = {'AdditionalResponses': [{'Key': 'Frequency', 'Value': 'D'}], 163 | 'DataTypeNames': None, 164 | 'DataTypeValues': [{'DataType': 'P', 165 | 'SymbolValues': [{'Currency': 'U$', 'Symbol': '@AAPL', 'Type': 10, 166 | 'Value': [3.6719, 3.9978, 3.6607, 3.7143, 3.3929, 3.5536, 3.4911]}]}, 167 | {'DataType': 'MV', 168 | 'SymbolValues': [{'Currency': 'U$', 'Symbol': '@AAPL', 'Type': 10, 169 | 'Value': [16540.47, 18008.5, 16490.2, 16760.53, 15310.1, 16035.32, 15753.29]}]}, 170 | {'DataType': 'VO', 171 | 'SymbolValues': [{'Currency': 'U$', 'Symbol': '@AAPL', 'Type': 10, 172 | 'Value': [40952.8, 133932.3, 127909.5, 194426.3, 191981.9, 115180.7, 126265.9]}]}], 173 | 'Dates': ['/Date(946598400000+0000)/', '/Date(946857600000+0000)/', 174 | '/Date(946944000000+0000)/', '/Date(947030400000+0000)/', 175 | '/Date(947116800000+0000)/', '/Date(947203200000+0000)/', 176 | '/Date(947462400000+0000)/'], 177 | 'SymbolNames': None, 'Tag': None} 178 | res = {'DataResponse': res, 'Properties': None} 179 | 180 | DS = empty_datastream() 181 | assert DS.parse_response(res).equals(df) 182 | -------------------------------------------------------------------------------- /tests/test_fetch.py: -------------------------------------------------------------------------------- 1 | """ Tests for fetching the data using pydatastream 2 | 3 | Note: Proper credentials should be stored in environment variables: 4 | DSWS_USER and DSWS_PASS 5 | 6 | (c) Vladimir Filimonov, 2019 7 | """ 8 | import os 9 | import pytest 10 | import pydatastream as pds 11 | 12 | 13 | ############################################################################### 14 | ############################################################################### 15 | def 
has_credentials(): 16 | """ Credentials are in the environment variables """ 17 | return ('DSWS_USER' in os.environ) and ('DSWS_PASS' in os.environ) 18 | 19 | 20 | def init_datastream(): 21 | """ Create an instance of datastream """ 22 | return pds.Datastream(os.environ['DSWS_USER'], os.environ['DSWS_PASS']) 23 | 24 | 25 | ############################################################################### 26 | # Tests will be skipped if credentials are not passed 27 | ############################################################################### 28 | @pytest.mark.skipif(not has_credentials(), 29 | reason="credentials are not passed") 30 | def test_create_with_login(): 31 | """ Test init with credentials """ 32 | init_datastream() 33 | --------------------------------------------------------------------------------