├── .gitignore ├── LICENSE.txt ├── MANIFEST.in ├── Makefile ├── README.md ├── pydatastream ├── __init__.py ├── _version.py └── pydatastream.py ├── pytest.ini ├── setup.cfg ├── setup.py └── tests ├── test_basic.py └── test_fetch.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.ipynb 2 | *.pyc 3 | .pytest_cache/* 4 | .pytest_cache 5 | runtest.sh 6 | *.egg-info 7 | tasks.todo 8 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2013-2020 Vladimir Filimonov 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include README.md 2 | include LICENSE.txt 3 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | 2 | .PHONY: help release build clean 3 | 4 | LIBRARY := pydatastream 5 | 6 | # taken from: https://marmelab.com/blog/2016/02/29/auto-documented-makefile.html 7 | .DEFAULT_GOAL := help 8 | help: 9 | @grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' 10 | 11 | 12 | build: ## Make distribution package 13 | python setup.py sdist 14 | 15 | 16 | release: ## Make package; upload to pypi and clean after` 17 | @echo Do you really want to release the package to PyPi? 18 | @echo "Type library name for release ($(LIBRARY)):" 19 | @read -p"> " line; if [[ $$line != "$(LIBRARY)" ]]; then echo "Aborting. We'll release that later"; exit 1 ; fi 20 | @echo "Starting the release..." 21 | python setup.py sdist 22 | twine upload dist/* 23 | rm MANIFEST 24 | rm -rf dist 25 | 26 | 27 | clean: ## Erase distribution package 28 | rm MANIFEST 29 | rm -rf dist 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # PyDatastream 2 | 3 | PyDatastream is a Python interface to the Refinitiv Datastream (former Thomson Reuters Datastream) API via Datastream Web Services (DSWS) (non free), with some extra convenience functions. 
This package requires valid credentials for this API.
4 | 
5 | **Note**: Up until version 0.5.1 the library used the SOAP API of DataWorks Enterprise (DWE). As of July 1, 2019 this interface was discontinued by Thomson Reuters, and Datastream content is now delivered through Datastream Web Services (DSWS). Starting with version 0.6, the pydatastream library uses the REST API of DSWS. Earlier versions of pydatastream no longer work.
6 | 
7 | ## Installation
8 | 
9 | The latest version of PyDatastream is always available on [GitHub](https://github.com/vfilimonov/pydatastream) in the `master` branch. The latest release can also be installed from [PyPI](https://pypi.python.org/pypi/PyDatastream) using `pip`:
10 | ```
11 | pip install pydatastream
12 | ```
13 | 
14 | The two external dependencies are [pandas](http://pandas.pydata.org/pandas-docs/stable/install.html) and [requests](https://2.python-requests.org/en/master/).
15 | 
16 | ## Basic use
17 | 
18 | All methods for working with Datastream are organized in a class, so first you need to create an object with your valid credentials:
19 | ```python
20 | from pydatastream import Datastream
21 | DS = Datastream(username="ZXXX000", password="XXX000")
22 | ```
23 | 
24 | If necessary, a proxy server can be specified here via the extra `proxy` parameter (e.g. `proxy='proxyLocation:portNumber'`).
25 | 
26 | The following command requests daily closing price data for the Apple asset (Datastream mnemonic `"@AAPL"`) in 2008:
27 | ```python
28 | data = DS.get_price('@AAPL', date_from='2008', date_to='2009')
29 | ```
30 | 
31 | The data is retrieved as a `pandas.DataFrame` object, which can be easily plotted (`data.plot()`), aggregated into monthly data (`data.resample('M').last()`) or [manipulated in a variety of ways](http://nbviewer.ipython.org/urls/raw.github.com/changhiskhan/talks/master/pydata2012/pandas_timeseries.ipynb). Since resampling data in pandas is straightforward (including, for example, handling of business calendars), I would recommend requesting daily data (unless the requests are huge or the daily scale is not applicable) and performing all transformations locally. Also note that, thanks to pandas, the format of the date string is very flexible.
32 | 
33 | 
34 | For fetching Open-High-Low-Close (OHLC) data there exist two methods: `get_OHLC` to fetch only price data and `get_OHLCV` to fetch both price and volume data. This separation is required because volume data is not available for financial indices.
35 | 
36 | Request daily OHLC and Volume data for Apple in 2008:
37 | ```python
38 | data = DS.get_OHLCV('@AAPL', date_from='2008', date_to='2009')
39 | ```
40 | 
41 | Request daily OHLC data for the S&P 500 Index from May 6, 2010 until the present date:
42 | ```python
43 | data = DS.get_OHLC('S&PCOMP', date_from='May 6, 2010')
44 | ```
45 | 
46 | ### Requesting specific fields for data
47 | 
48 | If mnemonics for specific fields are known, a more general method can be used. The following request
49 | ```python
50 | data = DS.fetch('@AAPL', ['P', 'MV', 'VO'], date_from='2000-01-01')
51 | ```
52 | fetches the closing price, market valuation and daily volume for Apple Inc.
53 | 
54 | **Note**: several fields should be passed as a list of strings, and not as a comma-separated string (i.e. `'P,MV,VO'` as a field name will raise an exception).
55 | 
56 | If called without any mnemonics for the fields (e.g. `DS.fetch('@AAPL')`) it will fetch one series for the "default" field, which is usually `P`.
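For instance, the following call (a sketch relying on the default-field behaviour described above) returns just the default series for Apple:
```python
data = DS.fetch('@AAPL', date_from='2008', date_to='2009')   # no fields: the default field (usually P) is fetched
```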
In this case the single column might not have a name due to an inconsistency in the API.
57 | 
58 | `fetch` can be used for requesting data for several tickers at once. In this case a MultiIndex DataFrame will be returned (see [Pandas: MultiIndex / Advanced Indexing](https://pandas.pydata.org/pandas-docs/stable/advanced.html)).
59 | ```python
60 | res = DS.fetch(['@AAPL','U:MMM'], fields=['P','MV','VO','PH'], date_from='2000-05-03')
61 | ```
62 | 
63 | The resulting data frame can be sliced in order to select all fields for a given ticker (`res.loc['U:MMM']`) or data for a specific field for all tickers (`res['MV'].unstack(level=0)`).
64 | 
65 | **Note**: in order to force the resulting dataframe to have a MultiIndex even when a single ticker is requested, set the argument `always_multiindex=True`:
66 | ```python
67 | res = DS.fetch('@AAPL', fields=['P','MV','VO','PH'], date_from='2000-05-03', always_multiindex=True)
68 | ```
69 | 
70 | Starting and ending dates can be provided as strings or `pandas.Timestamp` objects. One useful option is to fetch the data using `BDATE` as a starting date: in this case the whole series will be fetched from the first available date. For several series, the earliest starting date among them will be used:
71 | ```python
72 | res = DS.fetch(['@AAPL','U:MMM'], 'P', date_from='BDATE')
73 | ```
74 | **Note**: In some cases, especially for many mis-aligned series requested with `BDATE`, the Datastream server returns a mis-formatted response (series for individual tickers have different lengths and do not match the common date index), which cannot be parsed. In this case it is suggested either to use an explicit `date_from` or to split the request into several smaller ones.
75 | 
76 | 
77 | #### Note 1: Default field
78 | 
79 | **Important**: The "default field" is the most likely reason for the "E100, INVALID CODE OR EXPRESSION ENTERED" error message when fetching multiple symbols at once, even if they can be fetched one by one. This has been observed so far in [exchange rates](https://github.com/vfilimonov/pydatastream/issues/14) and [economic](https://github.com/vfilimonov/pydatastream/issues/16) [series](https://github.com/vfilimonov/pydatastream/issues/11).
80 | 
81 | There is a slight ambiguity in what "P" stands for.
82 | For cash equities there is a datatype "P" which corresponds to the adjusted price.
83 | However, in an API request "P" also stands for the default field, and if no fields (datatypes) are supplied, the API will assume that the default field "P" is requested.
84 | 
85 | It looks like when several symbols are requested, the API treats `"P"` (even if it is only implied, i.e. no datatypes are specified) strictly as a datatype. So if the datatype "P" does not exist (e.g. for exchange rates such as "EUDOLLR", or equity indices such as "S&PCOMP"), the request results in an error, e.g. `$$"ER", E100, INVALID CODE OR EXPRESSION ENTERED, USDOLLR(P)`.
86 | 
87 | So in order to retrieve several symbols at once, proper datatypes should be specified. For example:
88 | 
89 | ```python
90 | res = DS.fetch(['EUDOLLR','USDOLLR'], fields=['EB','EO'])
91 | ```
92 | 
93 | #### Note 2: error catching
94 | 
95 | As will be discussed below, it may be convenient to set the property `raise_on_error` to `False` when fetching data for several tickers. In this case, if one or several tickers are misspecified, no error will be raised, but they will not be present in the output data frame.
Further, if some field does not exist for some of the tickers, but exists for others, the missing data will be replaced with NaNs:
96 | ```python
97 | DS.raise_on_error = False
98 | res = DS.fetch(['@AAPL','U:MMM','xxxxxx','S&PCOMP'], fields=['P','MV','VO','PH'], date_from='2000-05-03')
99 | 
100 | print(res['MV'].unstack(level=0).head())
101 | print(res['P'].unstack(level=0).head())
102 | ```
103 | 
104 | Please note that in the last example the closing price (`P`) for the `S&PCOMP` ticker was not retrieved. In Thomson Reuters Datastream mnemonics, the field `P` is not available for indices and the field `PI` should be used instead. For the same reason, calling the method `get_OHLC()` to get the open, high, low and close values of an index would not work. These can instead be fetched with a direct call to `fetch`:
105 | ```python
106 | data = DS.fetch('S&PCOMP', ['PO', 'PH', 'PL', 'PI'], date_from='May 6, 2010')
107 | ```
108 | 
109 | #### Note 3: Currencies and names of symbols and fields
110 | 
111 | The `fetch` method also collects the currencies for the given mnemonics and fields. This information is not returned by `fetch` by default, but is kept in the property `last_metadata` of the main class:
112 | ```python
113 | res = DS.fetch(['@AAPL','U:MMM'], fields=['P','MV','VO','PH'])
114 | print(DS.last_metadata['Currencies'])
115 | ```
116 | **Note**: The currency is not the same as the unit of the given field (data type). For example, the field `VO` has a currency of `U$` here; however, the volume is reported as a number of shares.
117 | 
118 | The metadata can also be returned explicitly by the `fetch` method if the argument `return_metadata` is set to True:
119 | ```python
120 | res, meta = DS.fetch(['@AAPL','U:MMM'], fields=['P','MV','VO','PH'], return_metadata=True)
121 | print(meta['Currencies'])
122 | ```
123 | In this case `fetch` also collects the names of the symbols and fields (data types):
124 | ```python
125 | print(meta['SymbolNames'])
126 | print(meta['DataTypeNames'])
127 | ```
128 | 
129 | ### Index constituents, listings of stocks and mnemonics
130 | 
131 | PyDatastream also has an interface for retrieving the list of constituents of indices:
132 | ```python
133 | DS.get_constituents('S&PCOMP')
134 | ```
135 | Note: In contrast to the retired DWE interface, DSWS does not support fetching historical lists of constituents.
136 | 
137 | By default the method retrieves various mnemonics and codes for the constituents,
138 | which might pose a problem for large indices like the Russell 3000 (sometimes the request is killed
139 | on timeout). In this case one can request only symbols and company names using:
140 | ```python
141 | DS.get_constituents('FRUSS3L', only_list=True)
142 | ```
143 | 
144 | Another useful method is `get_all_listings()`, which retrieves mnemonics (columns MJ and CD), countries (column GL) and exchanges (column MK) for different listings of the same stocks, as well as an indicator of the primary listing (column PQ). For example, for IBM and Vodafone:
145 | ```python
146 | DS.get_all_listings(['VOD', 'U:IBM'])
147 | ```
148 | 
149 | Finally, various symbols, codes and mnemonics can be fetched using the following method:
150 | ```python
151 | DS.get_codes(['U:IBM', '@MSFT'])
152 | ```
153 | 
154 | 
155 | ### Futures markets
156 | 
157 | Futures contracts have mnemonics of the form `XXXMMYY`, where `XXX` is a code for the futures market and `MM` and `YY` encode the month and year of the contract. For example, `LLC0118` denotes the market `LLC` (Brent Crude Oil) expiring in January 2018.
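As an illustration, such a mnemonic could be assembled from the market code and the expiry (a sketch; the `contract_mnemonic` helper below is hypothetical and not part of pydatastream):
```python
# Hypothetical helper (not part of pydatastream): build a contract mnemonic
# from a market code and an expiry, following the XXXMMYY convention above.
def contract_mnemonic(market, year, month):
    return f'{market}{month:02d}{year % 100:02d}'

contract_mnemonic('LLC', 2018, 1)   # -> 'LLC0118' (Brent Crude Oil, January 2018)
```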
158 | 
159 | For a given market code it is possible to get a list of all active contracts (together with some extra information, such as the exchange it is traded on, tick size, contract size, last trading date, etc.):
160 | ```python
161 | DS.get_futures_contracts('LLC')
162 | ```
163 | By default only active contracts are retrieved. In order to get the full list of contracts, use the `include_dead=True` argument:
164 | ```python
165 | DS.get_futures_contracts('LLC', include_dead=True)
166 | ```
167 | Similar to the index constituents, it is possible to set the argument `only_list=True` in order to limit the output to mnemonics and names only. This argument is suggested if the server returns the error "Index was outside the bounds of the array." on the full request, such as in the following case:
168 | ```python
169 | DS.get_futures_contracts('NCT', include_dead=True, only_list=True)
170 | ```
171 | 
172 | **Note**: Futures contracts do not operate with the "Price" field (`P`); instead they use the "Settlement Price" (`PS`):
173 | ```python
174 | DS.fetch('LLC0118', 'PS', date_from='BDATE')
175 | ```
176 | 
177 | 
178 | ### Static requests
179 | 
180 | The list of index constituents considered above is an example of a static request, i.e. a request that does not retrieve a time series, but a single snapshot of the data.
181 | 
182 | Within the PyDatastream library static requests can be made using the same `fetch()` method by providing the argument `static=True`.
183 | 
184 | For example, this request retrieves names, ISINs and identifiers of the primary exchange (ISINID) for various tickers of the BASF corporation (part of the `get_codes()` method):
185 | 
186 | ```python
187 | DS.fetch(['D:BAS','D:BASX','HN:BAS','I:BAF','BFA','@BFFAF','S:BAS'], ['ISIN', 'ISINID', 'NAME'], static=True)
188 | ```
189 | 
190 | Another use of static requests is a cross-section of time series. The following example retrieves the current price, market capitalization and daily volume for the same companies:
191 | 
192 | ```python
193 | DS.fetch(['D:BAS','D:BASX','HN:BAS','I:BAF','BFA','@BFFAF','S:BAS'], ['P', 'MV', 'VO'], static=True)
194 | ```
195 | 
196 | ## Advanced use
197 | 
198 | ### Datastream Functions
199 | 
200 | Datastream allows applying a number of functions to the series, which are calculated on the server side. Given the flexibility of the pandas library in all types of data transformation, this functionality is not really needed for Python users. However, I will briefly describe it for the sake of completeness.
201 | 
202 | Functions have the format `FUNC#(mnemonic,parameter)`. For example, calculating a 20-day moving average of the prices of IBM:
203 | ```python
204 | DS.fetch('MAV#(U:IBM,20D)', date_from='2013-09-01')
205 | ```
206 | Functions can be combined, e.g. calculating a moving 3-day percentage change of the 20-day moving average:
207 | ```python
208 | DS.fetch('PCH#(MAV#(U:IBM,20D),3D)', date_from='2013-09-01')
209 | ```
210 | Calculate the percentage quarter-on-quarter change of the US real GDP (constant prices, seasonally adjusted):
211 | ```python
212 | DS.fetch('PCH#(USGDP...D,1Q)', date_from='1990-01-01')
213 | ```
214 | Calculate the year-on-year difference (actual change) of the UK real GDP (constant prices, seasonally adjusted):
215 | ```python
216 | DS.fetch('ACH#(UKGDP...D,1Y)', date_from='1990-01-01')
217 | ```
218 | Documentation on the available functions is available on the [Thomson Reuters webhelp](http://product.datastream.com/navigator/advancehelpfiles/functions/webhelp/hfunc.htm).
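For comparison, the first two server-side examples above could be reproduced locally with pandas (a rough sketch; results may differ slightly from the server-side calculation due to calendar and rounding conventions):
```python
# Fetch daily prices first, then compute the transformations locally.
prices = DS.fetch('U:IBM', 'P', date_from='2013-09-01')
mav20 = prices['P'].rolling(20).mean()    # roughly MAV#(U:IBM,20D)
pch3 = mav20.pct_change(3) * 100          # roughly PCH#(MAV#(U:IBM,20D),3D), in per cent
```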
219 | 
220 | ### Trading calendar
221 | 
222 | By default Datastream pads prices on holidays (i.e. on non-trading days it returns the price of the previous day). In order to remove the padding one needs to know which days were holidays in the past.
223 | 
224 | The history of trading and non-trading days for each country can be retrieved using the method `get_trading_days`:
225 | ```python
226 | DS.get_trading_days(['US', 'UK', 'RS'], date_from='2010-01-01')
227 | ```
228 | This method returns a dataframe that contains values 1 and NaN, where 1 identifies a business day. So multiplying it with the price time series will remove padded values on non-trading days:
229 | ```python
230 | DS.fetch('@AAPL', 'P', date_from='2010').mul(DS.get_trading_days('US', date_from='2010')['US'], axis=0)
231 | ```
232 | 
233 | The full list of available calendars (countries) is contained in the property `DS.vacations_list`.
234 | 
235 | ### Performing several requests at once
236 | 
237 | The [Thomson Dataworks Enterprise User Guide](http://dataworks.thomson.com/Dataworks/Enterprise/1.0/documentation/user%20guide.pdf) suggests optimizing requests: very often it is quicker to make one bigger request than several smaller requests because of the relatively high transport overhead of web services.
238 | 
239 | PyDatastream allows fetching several requests in a single API call (a so-called "bundle request"):
240 | 
241 | ```python
242 | r1 = DS.construct_request('@AAPL', ['PO', 'PH', 'PL', 'P'], date_from='2013-11-26')
243 | r2 = DS.construct_request('U:MMM', ['P', 'MV', 'PO'], date_from='2013-11-26')
244 | res = DS.request_many([r1, r2])
245 | dfs = DS.parse_response(res)
246 | 
247 | print(dfs[0].head())
248 | print(dfs[1].head())
249 | ```
250 | 
251 | 
252 | ### Debugging and error handling
253 | 
254 | If a request contains errors then normally a `DatastreamException` will be raised and the error message from DSWS will be printed. To alter this behavior, one can use the `raise_on_error` property of the Datastream class. When it is set to `False`, the parser will ignore error messages and return an empty pandas.DataFrame. For instance, since the field "P" does not exist for indices the following request would have raised an error, but instead it will only fill the first 3 fields:
255 | ```python
256 | DS.raise_on_error = False
257 | DS.fetch('S&PCOMP', ['PO', 'PH', 'PL', 'P'])
258 | ```
259 | 
260 | `raise_on_error` can be useful for requests that contain several tickers. In this case data fields and/or tickers that cannot be fetched will not be present in the output. However, please be aware that this is not a silver bullet against all types of errors. In all complicated cases it is suggested to go with one symbol at a time and check the raw response from the server to spot invalid ticker-field combinations.
261 | 
262 | For debugging purposes, the Datastream class has a `last_request` property, which contains the URL, raw JSON request and raw JSON response from the last call to the API.
263 | ```python
264 | print(DS.last_request['url'])
265 | print(DS.last_request['request'])
266 | print(DS.last_request['response'])
267 | ```
268 | In many cases the errors occur at the stage of parsing the raw response from the server.
269 | For example, this could happen if some fields requested are not supported for all symbols
270 | in the request.
Because the data is returned from the server in an unstructured form and the
271 | parser tries to map it to a structured pandas.DataFrame, the exceptions can be
272 | relatively uninformative. In this case it is highly recommended to check the raw response from the server in order to spot the problematic field.
273 | 
274 | ### Identifying asset class
275 | 
276 | Many data types (fields) in Datastream are specific to the asset type. In order to check which types your symbols have, call `get_asset_types`:
277 | ```python
278 | DS.get_asset_types(['MMM', '@AAPL', 'S&PCOMP', 'FRUSS3L', 'EUDOLLR', 'USGDP...D', 'FRFEDFD'])
279 | ```
280 | 
281 | ### Usage statistics
282 | 
283 | Statistics about the number of requests, datatypes and returned data points for a given account can be requested using the `usage_statistics` method. By default it returns stats for the present month, but it can also return the usage statistics for a specified month in the past, or even a range of months. For instance, the monthly usage statistics for the last year can be fetched using:
284 | ```python
285 | DS.usage_statistics(months=12)
286 | ```
287 | 
288 | ## Thomson Reuters Economic Point-in-Time (EPiT) functionality
289 | 
290 | PyDatastream has two useful methods for working with the Thomson Reuters Economic Point-in-Time (EPiT) concept (usually a separate subscription is required for this content). Most economic time series, such as GDP, employment or inflation figures, undergo many revisions. For example, the US GDP value undergoes two consecutive revisions after the initial release, after which the number is revised periodically when, e.g., either the base year of calculation or the seasonal adjustment is changed. For predictive exercises it is important to obtain the values as they were known at the time of release (otherwise the data will be contaminated by look-ahead bias).
291 | 
292 | EPiT allows the user to request such historical data. For example, the following method retrieves the initial estimate and first revisions of the 2010-Q1 US GDP, as well as the dates when the data was published:
293 | ```python
294 | DS.get_epit_revisions('USGDP...D', period='2010-02-15')
295 | ```
296 | The period here should contain a date which falls within the time period of interest (so any date from '2010-01-01' to '2010-03-31' will result in the same output).
297 | 
298 | Another useful concept is the "vintage" of the data, which defines when the particular values were released. This concept is widely used in economics; see for example the [ALFRED database](https://alfred.stlouisfed.org/) of the St. Louis Fed and their discussion of [capturing data as it happens](https://alfred.stlouisfed.org/docs/alfred_capturing_data.pdf).
299 | 
300 | All vintages of an economic indicator can be summarized in a vintage matrix: a DataFrame where columns correspond to a particular period (quarter or month) of the reported statistic and the index represents the timestamps at which these values were released by the respective official agency. I.e. every line corresponds to all reported values available by the given date.
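Such a matrix makes point-in-time selection straightforward: the row with the latest release date on or before a given day contains everything that was known at that moment. A minimal sketch, assuming `vm` is the DataFrame returned by `get_epit_vintage_matrix` (the call itself is shown in the example below):
```python
vm = DS.get_epit_vintage_matrix('USGDP...D', date_from='2015-01-01')
as_of = '2015-10-01'
known_then = vm.loc[:as_of].iloc[-1].dropna()   # values as they were known on `as_of`
```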
301 | 302 | For example for the US GDP: 303 | ```python 304 | DS.get_epit_vintage_matrix('USGDP...D', date_from='2015-01-01') 305 | ``` 306 | The response is: 307 | ``` 308 | 2015-02-15 2015-05-15 2015-08-15 2015-11-15 \ 309 | 2015-04-29 16304.80 NaN NaN NaN 310 | 2015-05-29 16264.10 NaN NaN NaN 311 | 2015-06-24 16287.70 NaN NaN NaN 312 | 2015-07-30 16177.30 16270.400 NaN NaN 313 | 2015-08-27 16177.30 16324.300 NaN NaN 314 | 2015-09-25 16177.30 16333.600 NaN NaN 315 | 2015-10-29 16177.30 16333.600 16394.200 NaN 316 | 2015-11-24 16177.30 16333.600 16417.800 NaN 317 | ... 318 | ``` 319 | From the matrix it is seen for example, that the advance GDP estimate 320 | for 2015-Q1 (corresponding to 2015-02-15) was released on 2015-04-29 321 | and was equal to 16304.80 (B USD). The first revision (16264.10) has happened 322 | on 2015-05-29 and the second (16287.70) - on 2015-06-24. On 2015-07-30 323 | the advance GDP figure for 2015-Q2 was released (16270.400) together 324 | with update on the 2015-Q1 value (16177.30) and so on. 325 | 326 | Finally, when using economic data in any real-life analysis, it is important to know when the next value will be released by the reporting body. For example the following method returns dates of next 4 releases for US GDP and US Nonfarm Payroll figures: 327 | ```python 328 | DS.get_next_release_dates(['USGDP...D', 'USEMPALLO'], n_releases=4) 329 | ``` 330 | In the response release number is counted from the present day to the future. "DATE_FLAG" indicates whether the dates are given by the official agency or estimated by Thomson Reuters; and "TYPE" field indicates whether the new value is going to be provided or an old value is going to be updated. 331 | 332 | ## Resources 333 | 334 | - [Datastream Navigator](http://product.datastream.com/navigator/) could be used to look up codes and data types. 335 | 336 | - [Official support webpage](https://customers.reuters.com/sc/Contactus/simple?product=Datastream&env=PU&TP=Y). 337 | 338 | - [Official webpage for testing REST API requests](http://product.datastream.com/dswsclient/Docs/TestRestV1.aspx) 339 | 340 | - [Documentation for DSWS API calls](http://product.datastream.com/dswsclient/Docs/Default.aspx) 341 | 342 | - [Datastream Web Service Developer community](https://developers.refinitiv.com/eikon-apis/datastream-web-service) 343 | 344 | All these links could be printed in your terminal or iPython notebook by calling 345 | ```python 346 | DS.info() 347 | ``` 348 | 349 | Finally, for quick start with pandas have a look on [tutorial notebook](http://nbviewer.ipython.org/urls/gist.github.com/fonnesbeck/5850375/raw/c18cfcd9580d382cb6d14e4708aab33a0916ff3e/1.+Introduction+to+Pandas.ipynb) and [10-minutes introduction](http://pandas.pydata.org/pandas-docs/stable/10min.html). 350 | 351 | ### Datastream Navigator 352 | 353 | [Datastream Navigator](http://product.datastream.com/navigator/) is the best place to search for mnemonics for a particular security or a list of available datatypes/fields. 354 | 355 | Once a necessary security is located (either by search or via "Explore" menu item), a short list of most frequently used mnemonics is located right under the chart with the series. Further the small button ">>" next to them open a longer list. 356 | 357 | The complete list of datatypes (including static) for a particular asset class is located in the "Datatype search" menu item. Here the proper asset class should be selected. 
358 | 359 | Help for Datastream Navigator is available [here](http://product.datastream.com/WebHelp/Navigator/4.5/HelpFiles/Nav_help.htm). 360 | 361 | ### Limits of the API 362 | 363 | - Datastream limits single call request to 100 series, i.e. (# of tickers) x (# of fields) should not exceed 100 (e.g. 10x10 or 20x5, etc). 364 | - Number of time-series data point is not limited as long as the monthly limit of 10 million data point is not exceeded. 365 | 366 | 367 | ## Version history 368 | 369 | - 0.1 (2013-11-12) First release 370 | - 0.2 (2013-12-01) Release as a python package; Added `get_constituents()` and other useful methods 371 | - 0.3 (2015-03-19) Improvements of static requests 372 | - 0.4 (2016-03-16) Support of Python 3 373 | - 0.5 (2017-07-01) Backward-incompatible change: method `fetch()` for many tickers returns MultiIndex Dataframe instead of former Panel. This follows the development of the Pandas library where the Panels are deprecated starting version 0.20.0 (see [here](http://pandas.pydata.org/pandas-docs/version/0.20/whatsnew.html#whatsnew-0200-api-breaking-deprecate-panel)). 374 | - 0.5.1 (2017-11-17) Added Economic Point-in-Time (EPiT) functionality 375 | - 0.6 (2019-08-27) The library is rewritten to use the new REST-based Datastream Web Services (DSWS) interfaces instead of old SOAP-based DataWorks Enterprise (DWE), which was discontinued on July 1, 2019. Some methods (such as `system_info()` or `sources()`) have been removed as they're not supported by a new API. 376 | - 0.6.1 (2019-10-10) Fixes, performance improvements. Added `get_futures_contracts()` and `get_next_release_dates()`. 377 | - 0.6.2 (2020-03-08) Added trading calendar. 378 | - 0.6.3 (2020-08-04) Minor bug fixes 379 | - 0.6.4 (2020-10-30) Fix token expiration time 380 | - 0.6.5 (2021-11-05) Add `always_multiindex` argument to `fetch()` 381 | 382 | Note 1: any versions of pydatastream prior to 0.6 will not work anymore. 383 | 384 | Note 2: Starting version 0.6 Python 2 is no longer supported - in line with many critical projects (including numpy, pandas and matplotlib) [dropping support of Python 2 by 2020](https://python3statement.org/). 385 | 386 | 387 | ## Acknowledgements 388 | 389 | A special thanks to: 390 | 391 | * François Cocquemas ([@fcocquemas](https://github.com/fcocquemas)) for [RDatastream](https://github.com/fcocquemas/rdatastream) that has inspired this library 392 | * Charles Allderman ([@ceaza](https://github.com/ceaza)) for added support of Python 3 393 | 394 | 395 | ## License 396 | 397 | PyDatastream library is released under the MIT license. 398 | 399 | The license for the library is not extended in any sense to any of the content of the Refinitiv (former Thomson Reuters) Dataworks Enterprise, Datastream, Datastream Web Services, Datastream Navigator or related services. Appropriate contract with the data vendor and valid credentials are required in order to use the API. 400 | 401 | Author of the library ([@vfilimonov](https://github.com/vfilimonov)) is not affiliated, associated, authorized, sponsored, endorsed by, or in any way officially connected with Thomson Reuters, or any of its subsidiaries or its affiliates. The names "Refinitiv", "Thomson Reuters" as well as related names are registered trademarks of respective owners. 
402 | -------------------------------------------------------------------------------- /pydatastream/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | from .pydatastream import Datastream, DatastreamException 3 | from ._version import * 4 | -------------------------------------------------------------------------------- /pydatastream/_version.py: -------------------------------------------------------------------------------- 1 | """ Current version of the package 2 | """ 3 | 4 | __version__ = '0.6.6' 5 | __author__ = 'Vladimir Filimonov' 6 | __email__ = 'vladimir.a.filimonov@gmail.com' 7 | -------------------------------------------------------------------------------- /pydatastream/pydatastream.py: -------------------------------------------------------------------------------- 1 | """ pydatastream main module 2 | 3 | (c) Vladimir Filimonov, 2013 - 2021 4 | """ 5 | import warnings 6 | import json 7 | import math 8 | from functools import wraps 9 | import requests 10 | import pandas as pd 11 | 12 | ############################################################################### 13 | _URL = 'https://product.datastream.com/dswsclient/V1/DSService.svc/rest/' 14 | 15 | _FLDS_XREF = ('DSCD,EXMNEM,GEOGC,GEOGN,IBTKR,INDC,INDG,INDM,INDX,INDXEG,' 16 | 'INDXFS,INDXL,INDXS,ISIN,ISINID,LOC,MNEM,NAME,SECD,TYPE'.split(',')) 17 | 18 | _FLDS_XREF_FUT = ('MNEM,NAME,FLOT,FEX,GEOGC,GEOGN,EXCODE,LTDT,FUTBDATE,PCUR,ISOCUR,' 19 | 'TICKS,TICKV,TCYCLE,TPLAT'.split(',')) 20 | 21 | _ASSET_TYPE_CODES = {'BD': 'Bonds & Convertibles', 22 | 'BDIND': 'Bond Indices & Credit Default Swaps', 23 | 'CMD': 'Commodities', 24 | 'EC': 'Economics', 25 | 'EQ': 'Equities', 26 | 'EQIND': 'Equity Indices', 27 | 'EX': 'Exchange Rates', 28 | 'FT': 'Futures', 29 | 'INT': 'Interest Rates', 30 | 'INVT': 'Investment Trusts', 31 | 'OP': 'Options', 32 | 'UT': 'Unit Trusts', 33 | 'EWT': 'Warrants', 34 | 'NA': 'Not available'} 35 | 36 | ############################################################################### 37 | _INFO = """PyDatastream documentation (GitHub): 38 | https://github.com/vfilimonov/pydatastream 39 | 40 | Datastream Navigator: 41 | http://product.datastream.com/navigator/ 42 | 43 | Official support 44 | https://customers.reuters.com/sc/Contactus/simple?product=Datastream&env=PU&TP=Y 45 | 46 | Webpage for testing REST API requests 47 | http://product.datastream.com/dswsclient/Docs/TestRestV1.aspx 48 | 49 | Documentation for DSWS API 50 | http://product.datastream.com/dswsclient/Docs/Default.aspx 51 | 52 | Datastream Web Service Developer community 53 | https://developers.refinitiv.com/eikon-apis/datastream-web-service 54 | """ 55 | 56 | 57 | ############################################################################### 58 | ############################################################################### 59 | def _convert_date(date): 60 | """ Convert date to YYYY-MM-DD """ 61 | if date is None: 62 | return '' 63 | if isinstance(date, str) and (date.upper() == 'BDATE'): 64 | return 'BDATE' 65 | return pd.Timestamp(date).strftime('%Y-%m-%d') 66 | 67 | 68 | def _parse_dates(dates): 69 | """ Parse dates 70 | Example: 71 | /Date(1565817068486) -> 2019-08-14T21:11:08.486000000 72 | /Date(1565568000000+0000) -> 2019-08-12T00:00:00.000000000 73 | """ 74 | if dates is None: 75 | return None 76 | if isinstance(dates, str): 77 | return pd.Timestamp(_parse_dates([dates])[0]) 78 | res = [int(_[6:(-7 if '+' in _ else -2)]) for _ in dates] 79 | return pd.to_datetime(res, unit='ms').values 
80 | 81 | 82 | class DatastreamException(Exception): 83 | """ Exception class for Datastream """ 84 | 85 | 86 | ############################################################################### 87 | def lazy_property(fn): 88 | """ Lazy-evaluated property of an object """ 89 | attr_name = '__lazy__' + fn.__name__ 90 | 91 | @property 92 | @wraps(fn) 93 | def _lazy_property(self): 94 | if not hasattr(self, attr_name): 95 | setattr(self, attr_name, fn(self)) 96 | return getattr(self, attr_name) 97 | return _lazy_property 98 | 99 | 100 | ############################################################################### 101 | # Main Datastream class 102 | ############################################################################### 103 | class Datastream(): 104 | """ Python interface to the Refinitiv Datastream API via Datastream Web 105 | Services (DSWS). 106 | """ 107 | def __init__(self, username, password, raise_on_error=True, proxy=None, **kwargs): 108 | """Establish a connection to the Python interface to the Refinitiv Datastream 109 | (former Thomson Reuters Datastream) API via Datastream Web Services (DSWS). 110 | 111 | username / password - credentials for the DSWS account. 112 | raise_on_error - If True then error request will raise a "DatastreamException", 113 | otherwise either empty dataframe or partially 114 | retrieved data will be returned 115 | proxy - URL for the proxy server. Valid values: 116 | (a) None: no proxy is used 117 | (b) string of format "host:port" or "username:password@host:port" 118 | 119 | Note: credentials will be saved in memory. In case if this is not 120 | desirable for security reasons, call the constructor having None 121 | instead of values and manually call renew_token(username, password) 122 | when needed. 123 | 124 | A custom REST API url (if necessary for some reasons) could be provided 125 | via "url" parameter. 
126 | """ 127 | self.raise_on_error = raise_on_error 128 | self.last_request = None 129 | self.last_metadata = None 130 | self._last_response_raw = None 131 | 132 | # Setting up proxy parameters if necessary 133 | if isinstance(proxy, str): 134 | self._proxy = {'http': proxy, 'https': proxy} 135 | elif proxy is None: 136 | self._proxy = None 137 | else: 138 | raise ValueError('Proxy parameter should be either None or string') 139 | 140 | self._url = kwargs.pop('url', _URL) 141 | self._username = username 142 | self._password = password 143 | # request new token 144 | self.renew_token(username, password) 145 | 146 | ########################################################################### 147 | @staticmethod 148 | def info(): 149 | """ Some useful links """ 150 | print(_INFO) 151 | 152 | ########################################################################### 153 | def _api_post(self, method, request): 154 | """ Call to the POST method of DSWS API """ 155 | url = self._url + method 156 | self.last_request = {'url': url, 'request': request, 'error': None} 157 | self.last_metadata = None 158 | try: 159 | res = requests.post(url, json=request, proxies=self._proxy) 160 | self.last_request['response'] = res.text 161 | except Exception as e: 162 | self.last_request['error'] = str(e) 163 | raise 164 | 165 | try: 166 | response = self.last_request['response'] = json.loads(self.last_request['response']) 167 | except json.JSONDecodeError as e: 168 | raise DatastreamException('Server response could not be parsed') from e 169 | 170 | if 'Code' in response: 171 | code = response['Code'] 172 | if response['SubCode'] is not None: 173 | code += '/' + response['SubCode'] 174 | errormsg = f'{code}: {response["Message"]}' 175 | self.last_request['error'] = errormsg 176 | raise DatastreamException(errormsg) 177 | return self.last_request['response'] 178 | 179 | ########################################################################### 180 | def renew_token(self, username=None, password=None): 181 | """ Request new token from the server """ 182 | if username is None or password is None: 183 | warnings.warn('Username or password is not provided - could not renew token') 184 | return 185 | data = {"UserName": username, "Password": password} 186 | self._token = dict(self._api_post('GetToken', data)) 187 | self._token['TokenExpiry'] = _parse_dates(self._token['TokenExpiry']).tz_localize('UTC') 188 | 189 | # Token is invalidated 15 minutes before exporation time 190 | # Note: According to https://github.com/vfilimonov/pydatastream/issues/27 191 | # tokens do not always respect the (as of now 24 hours) expiry time 192 | # So for this reason I limit the token life at 6 hours. 193 | self._token['RenewTokenAt'] = min(self._token['TokenExpiry'] - pd.Timedelta('15m'), 194 | pd.Timestamp.utcnow() + pd.Timedelta('6H')) 195 | 196 | @property 197 | def _token_is_expired(self): 198 | if self._token is None: 199 | return True 200 | if pd.Timestamp.utcnow() > self._token['RenewTokenAt']: 201 | return True 202 | return False 203 | 204 | @property 205 | def token(self): 206 | """ Return actual token and renew it if necessary. """ 207 | if self._token_is_expired: 208 | self.renew_token(self._username, self._password) 209 | return self._token['TokenValue'] 210 | 211 | ########################################################################### 212 | def request(self, request): 213 | """ Generic wrapper to request data in raw format. Request should be 214 | properly formatted dictionary (see construct_request() method). 
215 | """ 216 | data = {'DataRequest': request, 'TokenValue': self.token} 217 | return self._api_post('GetData', data) 218 | 219 | def request_many(self, list_of_requests): 220 | """ Generic wrapper to request multiple requests in raw format. 221 | list_of_requests should be a list of properly formatted dictionaries 222 | (see construct_request() method). 223 | """ 224 | data = {'DataRequests': list_of_requests, 'TokenValue': self.token} 225 | return self._api_post('GetDataBundle', data) 226 | 227 | ########################################################################### 228 | @staticmethod 229 | def construct_request(ticker, fields=None, date_from=None, date_to=None, 230 | freq=None, static=False, IsExpression=None, 231 | return_names=True): 232 | """Construct a request string for querying TR DSWS. 233 | 234 | tickers - ticker or symbol, or list of symbols 235 | fields - field or list of fields. 236 | date_from, date_to - date range (used only if "date" is not specified) 237 | freq - frequency of data: daily('D'), weekly('W') or monthly('M') 238 | static - True for static (snapshot) requests 239 | IsExpression - if True, it will explicitly assume that list of tickers 240 | contain expressions. Otherwise it will try to infer it. 241 | 242 | Some of available fields: 243 | P - adjusted closing price 244 | PO - opening price 245 | PH - high price 246 | PL - low price 247 | VO - volume, which is expressed in 1000's of shares. 248 | UP - unadjusted price 249 | OI - open interest 250 | 251 | MV - market value 252 | EPS - earnings per share 253 | DI - dividend index 254 | MTVB - market to book value 255 | PTVB - price to book value 256 | ... 257 | 258 | The full list of data fields is available at http://dtg.tfn.com/. 259 | """ 260 | req = {'Instrument': {}, 'Date': {}, 'DataTypes': []} 261 | 262 | # Instruments 263 | if isinstance(ticker, str): 264 | # ticker = ticker ## Nothing to change 265 | is_list = None 266 | elif hasattr(ticker, '__len__'): 267 | ticker = ','.join(ticker) 268 | is_list = True 269 | else: 270 | raise ValueError('ticker should be either string or list/array of strings') 271 | # Properties of instruments 272 | props = [] 273 | if is_list or (is_list is None and ',' in ticker): 274 | props.append({'Key': 'IsList', 'Value': True}) 275 | if IsExpression or (IsExpression is None and 276 | ('#' in ticker or '(' in ticker or ')' in ticker)): 277 | props.append({'Key': 'IsExpression', 'Value': True}) 278 | if return_names: 279 | props.append({'Key': 'ReturnName', 'Value': True}) 280 | req['Instrument'] = {'Value': ticker, 'Properties': props} 281 | 282 | # DataTypes 283 | props = [{'Key': 'ReturnName', 'Value': True}] if return_names else [] 284 | if fields is not None: 285 | if isinstance(fields, str): 286 | req['DataTypes'].append({'Value': fields, 'Properties': props}) 287 | elif isinstance(fields, list) and fields: 288 | for f in fields: 289 | req['DataTypes'].append({'Value': f, 'Properties': props}) 290 | else: 291 | raise ValueError('fields should be either string or list/array of strings') 292 | 293 | # Dates 294 | req['Date'] = {'Start': _convert_date(date_from), 295 | 'End': _convert_date(date_to), 296 | 'Frequency': freq if freq is not None else '', 297 | 'Kind': 0 if static else 1} 298 | return req 299 | 300 | ########################################################################### 301 | @staticmethod 302 | def _parse_meta(meta): 303 | """ Parse SymbolNames, DataTypeNames and Currencies """ 304 | res = {} 305 | for key in meta: 306 | if key in ('DataTypeNames', 
'SymbolNames'): 307 | if not meta[key]: # None or empty list 308 | res[key] = None 309 | else: 310 | names = pd.DataFrame(meta[key]).set_index('Key')['Value'] 311 | names.index.name = key.replace('Names', '') 312 | names.name = 'Name' 313 | res[key] = names 314 | elif key == 'Currencies': 315 | cur = pd.concat({key: pd.DataFrame(meta['Currencies'][key]) 316 | for key in meta['Currencies']}) 317 | res[key] = cur.xs('Currency', level=1).T 318 | else: 319 | res[key] = meta[key] 320 | return res 321 | 322 | def _parse_one(self, res): 323 | """ Parse one response (either 'DataResponse' or one of 'DataResponses')""" 324 | data = res['DataTypeValues'] 325 | dates = _parse_dates(res['Dates']) 326 | res_meta = {_: res[_] for _ in res if _ not in ['DataTypeValues', 'Dates']} 327 | 328 | # Parse values 329 | meta = {} 330 | res = {} 331 | for d in data: 332 | data_type = d['DataType'] 333 | res[data_type] = {} 334 | meta[data_type] = {} 335 | 336 | for v in d['SymbolValues']: 337 | value = v['Value'] 338 | if v['Type'] == 0: # Error 339 | if self.raise_on_error: 340 | raise DatastreamException(f'"{v["Symbol"]}"("{data_type}"): {value}') 341 | res[data_type][v['Symbol']] = math.nan 342 | elif v['Type'] == 4: # Date 343 | res[data_type][v['Symbol']] = _parse_dates(value) 344 | else: 345 | res[data_type][v['Symbol']] = value 346 | meta[data_type][v['Symbol']] = {_: v[_] for _ in v if _ != 'Value'} 347 | 348 | if dates is None: 349 | # Fix - if dates are not returned, then simply use integer index 350 | dates = [0] 351 | res[data_type] = pd.DataFrame(res[data_type], index=dates) 352 | 353 | res = pd.concat(res).unstack(level=1).T.sort_index() 354 | res_meta['Currencies'] = meta 355 | return res, self._parse_meta(res_meta) 356 | 357 | def parse_response(self, response, return_metadata=False): 358 | """ Parse raw JSON response 359 | 360 | If return_metadata is True, then result is tuple (dataframe, metadata), 361 | where metadata is a dictionary. Otherwise only dataframe is returned. 362 | 363 | In case of response being constructed from several requests (method 364 | request_many()), then the result is a list of parsed responses. Here 365 | again, if return_metadata is True then each element is a tuple 366 | (dataframe, metadata), otherwise each element is a dataframe. 367 | """ 368 | if 'DataResponse' in response: # Single request 369 | res, meta = self._parse_one(response['DataResponse']) 370 | self.last_metadata = meta 371 | return (res, meta) if return_metadata else res 372 | if 'DataResponses' in response: # Multiple requests 373 | results = [self._parse_one(r) for r in response['DataResponses']] 374 | self.last_metadata = [_[1] for _ in results] 375 | return results if return_metadata else [_[0] for _ in results] 376 | raise DatastreamException('Neither DataResponse nor DataResponses are found') 377 | 378 | ########################################################################### 379 | def usage_statistics(self, date=None, months=1): 380 | """ Request usage statistics 381 | 382 | date - if None, stats for the current month will be fetched, 383 | otherwise for the month which contains the specified date. 384 | months - number of consecutive months prior to "date" for which 385 | stats should be retrieved. 
386 | """ 387 | if date is None: 388 | date = pd.Timestamp('now').normalize() - pd.offsets.MonthBegin() 389 | req = [self.construct_request('STATS', 'DS.USERSTATS', 390 | date-pd.offsets.MonthBegin(n), static=True) 391 | for n in range(months)][::-1] 392 | res = self.parse_response(self.request_many(req)) 393 | res = pd.concat(res) 394 | res.index = res['Start Date'].dt.strftime('%B %Y') 395 | res.index.name = None 396 | return res 397 | 398 | ########################################################################### 399 | def fetch(self, tickers, fields=None, date_from=None, date_to=None, 400 | freq=None, static=False, IsExpression=None, return_metadata=False, 401 | always_multiindex=False): 402 | """Fetch the data from Datastream for a set of tickers and parse results. 403 | 404 | tickers - ticker or symbol, or list of symbols 405 | fields - field or list of fields 406 | date_from, date_to - date range (used only if "date" is not specified) 407 | freq - frequency of data: daily('D'), weekly('W') or monthly('M') 408 | static - True for static (snapshot) requests 409 | IsExpression - if True, it will explicitly assume that list of tickers 410 | contain expressions. Otherwise it will try to infer it. 411 | return_metadata - if True, then tuple (data, metadata) will be returned, 412 | otherwise only data 413 | always_multiindex - if True, then even for 1 ticker requested, resulting 414 | dataframe will have multiindex (ticker, date). 415 | If False (default), then request of single ticker will 416 | result in a dataframe indexed by date only. 417 | 418 | Notes: - several fields should be passed as a list, and not as a 419 | comma-separated string! 420 | - if no fields are provided, then the default field will be 421 | fetched. In this case the column might not have any name 422 | in the resulting dataframe. 423 | 424 | Result format depends on the number of requested tickers and fields: 425 | - 1 ticker - DataFrame with fields in column names 426 | (in order to keep multiindex even for single 427 | ticker, set `always_multiindex` to True) 428 | - several tickers - DataFrame with fields in column names and 429 | MultiIndex (ticker, date) 430 | - static request - DataFrame indexed by tickers and with fields 431 | in column names 432 | 433 | Some of available fields: 434 | P - adjusted closing price 435 | PO - opening price 436 | PH - high price 437 | PL - low price 438 | VO - volume, which is expressed in 1000's of shares. 439 | UP - unadjusted price 440 | OI - open interest 441 | 442 | MV - market value 443 | EPS - earnings per share 444 | DI - dividend index 445 | MTVB - market to book value 446 | PTVB - price to book value 447 | ... 
448 | """ 449 | req = self.construct_request(tickers, fields, date_from, date_to, 450 | freq=freq, static=static, 451 | IsExpression=IsExpression, 452 | return_names=return_metadata) 453 | raw = self.request(req) 454 | self._last_response_raw = raw 455 | data, meta = self.parse_response(raw, return_metadata=True) 456 | 457 | if static: 458 | # Static request - drop date from MultiIndex 459 | data = data.reset_index(level=1, drop=True) 460 | elif len(data.index.levels[0]) == 1: 461 | # Only one ticker - drop tickers from MultiIndex 462 | if not always_multiindex: 463 | data = data.reset_index(level=0, drop=True) 464 | 465 | return (data, meta) if return_metadata else data 466 | 467 | ########################################################################### 468 | # Specific fetching methods 469 | ########################################################################### 470 | def get_OHLCV(self, ticker, date_from=None, date_to=None): 471 | """Get Open, High, Low, Close prices and daily Volume for a given ticker. 472 | 473 | ticker - ticker or symbol 474 | date_from, date_to - date range (used only if "date" is not specified) 475 | """ 476 | return self.fetch(ticker, ['PO', 'PH', 'PL', 'P', 'VO'], date_from, date_to, 477 | freq='D', return_metadata=False) 478 | 479 | def get_OHLC(self, ticker, date_from=None, date_to=None): 480 | """Get Open, High, Low and Close prices for a given ticker. 481 | 482 | ticker - ticker or symbol 483 | date_from, date_to - date range (used only if "date" is not specified) 484 | """ 485 | return self.fetch(ticker, ['PO', 'PH', 'PL', 'P'], date_from, date_to, 486 | freq='D', return_metadata=False) 487 | 488 | def get_price(self, ticker, date_from=None, date_to=None): 489 | """Get Close price for a given ticker. 490 | 491 | ticker - ticker or symbol 492 | date_from, date_to - date range (used only if "date" is not specified) 493 | """ 494 | return self.fetch(ticker, 'P', date_from, date_to, 495 | freq='D', return_metadata=False) 496 | 497 | ########################################################################### 498 | def get_constituents(self, index_ticker, only_list=False): 499 | """ Get a list of all constituents of a given index. 500 | 501 | index_ticker - Datastream ticker for index 502 | only_list - request only list of symbols. By default the method 503 | retrieves many extra fields with information (various 504 | mnemonics and codes). This might pose some problems 505 | for large indices like Russel-3000. If only_list=True, 506 | then only the list of symbols and names are retrieved. 507 | 508 | NOTE: In contrast to retired DWE interface, DSWS does not support 509 | fetching historical lists of constituents. 
510 | """ 511 | if only_list: 512 | fields = ['MNEM', 'NAME'] 513 | else: 514 | fields = _FLDS_XREF 515 | return self.fetch('L' + index_ticker, fields, static=True) 516 | 517 | def get_all_listings(self, ticker): 518 | """ Get all listings and their symbols for the given security 519 | """ 520 | res = self.fetch(ticker, 'QTEALL', static=True) 521 | columns = list({_[:2] for _ in res.columns}) 522 | 523 | # Reformat the output 524 | df = {} 525 | for ind in range(1, 21): 526 | cols = {f'{c}{ind:02d}': c for c in columns} 527 | df[ind] = res[cols.keys()].rename(columns=cols) 528 | df = pd.concat(df).swaplevel(0).sort_index() 529 | df = df[~(df == '').all(axis=1)] 530 | return df 531 | 532 | def get_codes(self, ticker): 533 | """ Get codes and symbols for the given securities 534 | """ 535 | return self.fetch(ticker, _FLDS_XREF, static=True) 536 | 537 | ########################################################################### 538 | def get_asset_types(self, symbols): 539 | """ Get asset types for a given list of symbols 540 | Note: the method does not 541 | """ 542 | res = self.fetch(symbols, 'TYPE', static=True, IsExpression=False) 543 | names = pd.Series(_ASSET_TYPE_CODES).to_frame(name='AssetTypeName') 544 | res = res.join(names, on='TYPE') 545 | # try to preserve the order 546 | if isinstance(symbols, (list, pd.Series)): 547 | try: 548 | res = res.loc[symbols] 549 | except KeyError: 550 | pass # OK, we don't keep the order if not possible 551 | return res 552 | 553 | ########################################################################### 554 | def get_futures_contracts(self, market_code, only_list=False, include_dead=False): 555 | """ Get list of all contracts for a given futures market 556 | 557 | market_code - Datastream mnemonic for a market (e.g. LLC for the 558 | Brent Crude Oil, whose contracts have mnemonics like 559 | LLC0118 for January 2018) 560 | only_list - request only list of symbols. By default the method 561 | retrieves many extra fields with information (currency, 562 | lot size, last trading date, etc). If only_list=True, 563 | then only the list of symbols and names are retrieved. 564 | include_dead - if True, all delisted/expired contracts will be fetched 565 | as well. Otherwise only active contracts will be returned. 566 | """ 567 | if only_list: 568 | fields = ['MNEM', 'NAME'] 569 | else: 570 | fields = _FLDS_XREF_FUT 571 | res = self.fetch(f'LFUT{market_code}L', fields, static=True) 572 | res['active'] = True 573 | if include_dead: 574 | res2 = self.fetch(f'LFUT{market_code}D', fields, static=True) 575 | res2['active'] = False 576 | res = pd.concat([res, res2]) 577 | return res[res.MNEM != 'NA'] # Drop lines with empty mnemonics 578 | 579 | ########################################################################### 580 | def get_epit_vintage_matrix(self, mnemonic, date_from='1951-01-01', date_to=None): 581 | """ Construct the vintage matrix for a given economic series. 582 | Requires subscription to Thomson Reuters Economic Point-in-Time (EPiT). 583 | 584 | Vintage matrix represents a DataFrame where columns correspond to a 585 | particular period (quarter or month) for the reported statistic and 586 | index represents timestamps at which these values were released by 587 | the respective official agency. I.e. every line corresponds to all 588 | available reported values by the given date. 
589 | 590 | For example: 591 | 592 | >> DS.get_epit_vintage_matrix('USGDP...D', date_from='2015-01-01') 593 | 594 | 2015-02-15 2015-05-15 2015-08-15 2015-11-15 \ 595 | 2015-04-29 16304.80 NaN NaN NaN 596 | 2015-05-29 16264.10 NaN NaN NaN 597 | 2015-06-24 16287.70 NaN NaN NaN 598 | 2015-07-30 16177.30 16270.400 NaN NaN 599 | 2015-08-27 16177.30 16324.300 NaN NaN 600 | 2015-09-25 16177.30 16333.600 NaN NaN 601 | 2015-10-29 16177.30 16333.600 16394.200 NaN 602 | 2015-11-24 16177.30 16333.600 16417.800 NaN 603 | 604 | From the matrix it is seen for example, that the advance GDP estimate 605 | for 2015-Q1 (corresponding to 2015-02-15) was released on 2015-04-29 606 | and was equal to 16304.80 (B USD). The first revision (16264.10) has 607 | happened on 2015-05-29 and the second (16287.70) - on 2015-06-24. 608 | On 2015-07-30 the advance GDP figure for 2015-Q2 was released 609 | (16270.400) together with update on the 2015-Q1 value (16177.30) 610 | and so on. 611 | """ 612 | # Get first available date from the REL1 series 613 | rel1 = self.fetch(mnemonic, 'REL1', date_from=date_from, date_to=date_to) 614 | date_0 = rel1.dropna().index[0] 615 | 616 | # All release dates 617 | reld123 = self.fetch(mnemonic, ['RELD1', 'RELD2', 'RELD3'], 618 | date_from=date_0, date_to=date_to).dropna(how='all') 619 | 620 | # Fetch all vintages 621 | res = {} 622 | for date in reld123.index: 623 | try: 624 | _tmp = self.fetch(mnemonic, 'RELV', date_from=date_0, date_to=date).dropna() 625 | except DatastreamException: 626 | continue 627 | res[date] = _tmp 628 | return pd.concat(res).RELV.unstack() 629 | 630 | ########################################################################### 631 | def get_epit_revisions(self, mnemonic, period, relh50=False): 632 | """ Return initial estimate and first revisions of a given economic time 633 | series and a given period. 634 | Requires subscription to Thomson Reuters Economic Point-in-Time (EPiT). 635 | 636 | "Period" parameter should represent a date which falls within a time 637 | period of interest, e.g. 2016 Q4 could be requested with the 638 | period='2016-11-15' for example. 639 | 640 | By default up to 20 values is returned unless argument "relh50" is 641 | set to True (in which case up to 50 values is returned). 642 | """ 643 | if relh50: 644 | data = self.fetch(mnemonic, 'RELH50', date_from=period, static=True) 645 | else: 646 | data = self.fetch(mnemonic, 'RELH', date_from=period, static=True) 647 | data = data.iloc[0] 648 | 649 | # Parse the response 650 | res = {data.loc['RELHD%02d' % i]: data.loc['RELHV%02d' % i] 651 | for i in range(1, 51 if relh50 else 21) 652 | if data.loc['RELHD%02d' % i] != ''} 653 | res = pd.Series(res, name=data.loc['RELHP ']).sort_index() 654 | return res 655 | 656 | ########################################################################### 657 | def get_next_release_dates(self, mnemonics, n_releases=1): 658 | """ Return the next date of release (NDoR) for a given economic series. 659 | Could return results for up to 12 releases in advance. 
660 | 
661 |             Returned fields:
662 |             DATE        - Date of release (or start of range where the exact
663 |                           date is not available)
664 |             DATE_LATEST - End of range when a range is given
665 |             TIME_GMT    - Expected time of release (for "official" dates only)
666 |             DATE_FLAG   - Indicates whether the dates are "official" ones from
667 |                           the source, or estimated by Thomson Reuters where
668 |                           "official" dates are not available
669 |             REF_PERIOD  - Corresponding reference period for the release
670 |             TYPE        - Indicates whether the release is for a new reference
671 |                           period ("NewValue") or an update to a period for which
672 |                           there has already been a release ("ValueUpdate")
673 |         """
674 |         if n_releases > 12:
675 |             raise Exception('Only up to 12 releases in advance could be requested')
676 |         if n_releases < 1:
677 |             raise Exception('n_releases smaller than 1 does not make sense')
678 | 
679 |         # Fetch and parse
680 |         reqs = [self.construct_request(mnemonics, f'DS.NDOR{i+1}', static=True)
681 |                 for i in range(n_releases)]
682 |         res_parsed = self.parse_response(self.request_many(reqs))
683 | 
684 |         # Rearrange the output
685 |         res = []
686 |         for r in res_parsed:
687 |             x = r.reset_index(level=1, drop=True)
688 |             x.index.name = 'Mnemonic'
689 |             # Index of the release counting from now
690 |             ndor_idx = [_.split('_')[0] for _ in x.columns][0]
691 |             x['ReleaseNo'] = int(ndor_idx.replace('DS.NDOR', ''))
692 |             x = x.set_index('ReleaseNo', append=True)
693 |             x.columns = [_.replace(ndor_idx+'_', '') for _ in x.columns]
694 |             res.append(x)
695 | 
696 |         res = pd.concat(res).sort_index()
697 |         for col in ['DATE', 'DATE_LATEST', 'REF_PERIOD']:
698 |             res[col] = pd.to_datetime(res[col], errors='coerce')
699 |         return res
700 | 
701 |     ###########################################################################
702 |     @lazy_property
703 |     def vacations_list(self):
704 |         """ List of mnemonics for holidays in different countries """
705 |         res = self.fetch('HOLIDAYS', ['MNEM', 'ENAME', 'GEOGN', 'GEOGC'], static=True)
706 |         return res[res['MNEM'] != 'NA'].sort_values('GEOGC')
707 | 
708 |     def get_trading_days(self, countries, date_from=None, date_to=None):
709 |         """ Get list of trading days for the given countries (specified by ISO-2c)
710 |             The returned dataframe will contain values 1 and NaN, where 1
711 |             identifies a business day.
712 | 
713 |             Multiplying this dataframe with a price time series will thus remove
714 |             values padded on non-trading days.
715 | 
716 |             Example:
717 |                 DS.get_trading_days(['US', 'UK', 'RS'], date_from='2010-01-01')
718 |         """
719 |         if isinstance(countries, str):
720 |             countries = [countries]
721 | 
722 |         vacs = self.vacations_list
723 |         mnems = vacs[vacs.GEOGC.isin(countries)]
724 |         missing_isos = set(countries).difference(mnems.GEOGC)
725 | 
726 |         if missing_isos:
727 |             raise DatastreamException(f'Unknown ISO codes: {", ".join(missing_isos)}')
728 |         # By default 0 and NaN are returned, so we add 1
729 |         res = self.fetch(mnems.MNEM, date_from=date_from, date_to=date_to) + 1
730 | 
731 |         if len(countries) == 1:
732 |             return res.iloc[:, 0].to_frame(name=countries[0])
733 |         return (res.iloc[:, 0].unstack(level=0)
734 |                 .rename(columns=mnems.set_index('MNEM')['GEOGC'])[countries])
735 | 
--------------------------------------------------------------------------------
/pytest.ini:
--------------------------------------------------------------------------------
1 | [pytest]
2 | addopts = -rs -p no:dash
3 | 
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [metadata]
2 | description-file = README.md
3 | 
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from distutils.core import setup
2 | 
3 | DESCRIPTION = ('Python interface to the Refinitiv Datastream (former Thomson '
4 |                'Reuters Datastream) API via Datastream Web Services (DSWS)')
5 | 
6 | # Long description to be published in PyPI
7 | LONG_DESCRIPTION = """
8 | **PyDatastream** is a Python interface to the Refinitiv Datastream (former Thomson
9 | Reuters Datastream) API via Datastream Web Services (DSWS) (non free),
10 | with some convenience functions. This package requires valid credentials for this
11 | API.
12 | 
13 | For the documentation please refer to README.md inside the package or on
14 | GitHub (https://github.com/vfilimonov/pydatastream/blob/master/README.md).
15 | """ 16 | 17 | _URL = 'http://github.com/vfilimonov/pydatastream' 18 | 19 | __version__ = __author__ = __email__ = None # will be extracted from _version.py 20 | exec(open('pydatastream/_version.py').read()) # defines __version__ pylint: disable=W0122 21 | 22 | setup(name='PyDatastream', 23 | version=__version__, 24 | description=DESCRIPTION, 25 | long_description=LONG_DESCRIPTION, 26 | url=_URL, 27 | download_url=_URL + '/archive/v' + __version__ + '.zip', 28 | author=__author__, 29 | author_email=__email__, 30 | license='MIT License', 31 | packages=['pydatastream'], 32 | install_requires=['requests'], 33 | extras_require={ 34 | 'pandas': ['pandas'], 35 | }, 36 | classifiers=['Programming Language :: Python :: 3'], 37 | ) 38 | -------------------------------------------------------------------------------- /tests/test_basic.py: -------------------------------------------------------------------------------- 1 | """ Tests for basic pydatastream functionality 2 | 3 | (c) Vladimir Filimonov, 2019 4 | """ 5 | # pylint: disable=C0103,C0301 6 | import warnings 7 | import pytest 8 | import pydatastream as pds 9 | 10 | 11 | ############################################################################### 12 | def empty_datastream(): 13 | """ Create an instance of datastream """ 14 | with warnings.catch_warnings(): 15 | warnings.simplefilter("ignore") 16 | return pds.Datastream(None, None) 17 | 18 | 19 | ############################################################################### 20 | ############################################################################### 21 | def test_create(): 22 | """ Test init without username """ 23 | empty_datastream() 24 | 25 | 26 | ############################################################################### 27 | def test_create_wrong_credentials(): 28 | """ Test init with wrong credentials - should raise an exception """ 29 | with pytest.raises(pds.DatastreamException): 30 | assert pds.Datastream("USERNAME", "PASSWORD") 31 | 32 | 33 | ############################################################################### 34 | def test_convert_date(): 35 | """ Test _convert_date() """ 36 | from pydatastream.pydatastream import _convert_date 37 | assert _convert_date('5 January 2019') == '2019-01-05' 38 | assert _convert_date('04/02/2017') == '2017-04-02' 39 | assert _convert_date('2019') == '2019-01-01' 40 | assert _convert_date('2019-04') == '2019-04-01' 41 | assert _convert_date(None) == '' 42 | assert _convert_date('bdate') == 'BDATE' 43 | 44 | 45 | ############################################################################### 46 | def test_parse_dates(): 47 | """ Test _parse_dates() """ 48 | from pydatastream.pydatastream import _parse_dates 49 | import pandas as pd 50 | assert _parse_dates('/Date(1565817068486)/') == pd.Timestamp('2019-08-14 21:11:08.486000') 51 | assert _parse_dates('/Date(1565568000000+0000)/') == pd.Timestamp('2019-08-12 00:00:00') 52 | 53 | # Array of dates 54 | dates = ['/Date(1217548800000+0000)/', '/Date(1217808000000+0000)/', 55 | '/Date(1217894400000+0000)/', '/Date(1217980800000+0000)/', 56 | '/Date(1218067200000+0000)/', '/Date(1218153600000+0000)/', 57 | '/Date(1218412800000+0000)/', '/Date(1218499200000+0000)/', 58 | '/Date(1218585600000+0000)/', '/Date(1218672000000+0000)/', 59 | '/Date(1218758400000+0000)/', '/Date(1219017600000+0000)/', 60 | '/Date(1219104000000+0000)/', '/Date(1219190400000+0000)/', 61 | '/Date(1219276800000+0000)/', '/Date(1219363200000+0000)/', 62 | '/Date(1219622400000+0000)/', '/Date(1219708800000+0000)/', 
63 | '/Date(1219795200000+0000)/', '/Date(1219881600000+0000)/', 64 | '/Date(1219968000000+0000)/', '/Date(1220227200000+0000)/'] 65 | res = ['2008-08-01', '2008-08-04', '2008-08-05', '2008-08-06', '2008-08-07', 66 | '2008-08-08', '2008-08-11', '2008-08-12', '2008-08-13', '2008-08-14', 67 | '2008-08-15', '2008-08-18', '2008-08-19', '2008-08-20', '2008-08-21', 68 | '2008-08-22', '2008-08-25', '2008-08-26', '2008-08-27', '2008-08-28', 69 | '2008-08-29', '2008-09-01'] 70 | assert pd.np.all(_parse_dates(dates) == pd.np.array(res, dtype='datetime64[ns]')) 71 | 72 | 73 | ############################################################################### 74 | # Construct requests 75 | ############################################################################### 76 | def test_construct_request_series_1(): 77 | """ Construct request for series """ 78 | req = {'Instrument': {'Value': '@AAPL', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 79 | 'Date': {'Start': '2008-01-01', 'End': '2009-01-01', 'Frequency': '', 'Kind': 1}, 80 | 'DataTypes': []} 81 | DS = empty_datastream() 82 | assert DS.construct_request('@AAPL', date_from='2008', date_to='2009') == req 83 | 84 | 85 | ############################################################################### 86 | def test_construct_request_series_2(): 87 | """ Construct request for series """ 88 | req = {'Instrument': {'Value': '@AAPL', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 89 | 'Date': {'Start': 'BDATE', 'End': '', 'Frequency': '', 'Kind': 1}, 90 | 'DataTypes': [{'Value': 'P', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 91 | {'Value': 'MV', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 92 | {'Value': 'VO', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}]} 93 | DS = empty_datastream() 94 | assert DS.construct_request('@AAPL', ['P', 'MV', 'VO'], date_from='bdate') == req 95 | 96 | 97 | ############################################################################### 98 | def test_construct_request_static_1(): 99 | """ Construct static request """ 100 | req = {'Instrument': {'Value': 'D:BAS,D:BASX,HN:BAS', 101 | 'Properties': [{'Key': 'IsList', 'Value': True}, 102 | {'Key': 'ReturnName', 'Value': True}]}, 103 | 'Date': {'Start': '', 'End': '', 'Frequency': '', 'Kind': 0}, 104 | 'DataTypes': [{'Value': 'ISIN', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 105 | {'Value': 'ISINID', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}, 106 | {'Value': 'NAME', 'Properties': [{'Key': 'ReturnName', 'Value': True}]}]} 107 | DS = empty_datastream() 108 | assert DS.construct_request(['D:BAS', 'D:BASX', 'HN:BAS'], ['ISIN', 'ISINID', 'NAME'], static=True) == req 109 | 110 | 111 | ############################################################################### 112 | # Parse responses 113 | ############################################################################### 114 | def test_parse_response_static_1(): 115 | """ Parse static request """ 116 | import pandas as pd 117 | from io import StringIO 118 | 119 | s = (',,ISIN,ISINID,NAME\nD:BAS,2019-09-27,DE000BASF111,P,BASF\n' 120 | 'D:BASX,2019-09-27,DE000BASF111,S,BASF (XET)\n' 121 | 'HN:BAS,2019-09-27,DE000BASF111,S,BASF (BUD)\n') 122 | df = pd.read_csv(StringIO(s), index_col=[0, 1], parse_dates=[1]) 123 | 124 | res = {'AdditionalResponses': None, 125 | 'DataTypeNames': [{'Key': 'ISIN', 'Value': 'ISIN CODE'}, 126 | {'Key': 'ISINID', 'Value': 'QUOTE INDICATOR'}, 127 | {'Key': 'NAME', 'Value': 'NAME'}], 128 | 'DataTypeValues': [{'DataType': 'ISIN', 129 | 
'SymbolValues': [{'Currency': 'E ', 'Symbol': 'D:BAS', 'Type': 6, 'Value': 'DE000BASF111'}, 130 | {'Currency': 'E ', 'Symbol': 'D:BASX', 'Type': 6, 'Value': 'DE000BASF111'}, 131 | {'Currency': 'HF', 'Symbol': 'HN:BAS', 'Type': 6, 'Value': 'DE000BASF111'}]}, 132 | {'DataType': 'ISINID', 133 | 'SymbolValues': [{'Currency': 'E ', 'Symbol': 'D:BAS', 'Type': 6, 'Value': 'P'}, 134 | {'Currency': 'E ', 'Symbol': 'D:BASX', 'Type': 6, 'Value': 'S'}, 135 | {'Currency': 'HF', 'Symbol': 'HN:BAS', 'Type': 6, 'Value': 'S'}]}, 136 | {'DataType': 'NAME', 137 | 'SymbolValues': [{'Currency': 'E ', 'Symbol': 'D:BAS', 'Type': 6, 'Value': 'BASF'}, 138 | {'Currency': 'E ', 'Symbol': 'D:BASX', 'Type': 6, 'Value': 'BASF (XET)'}, 139 | {'Currency': 'HF', 'Symbol': 'HN:BAS', 'Type': 6, 'Value': 'BASF (BUD)'}]}], 140 | 'Dates': ['/Date(1569542400000+0000)/'], 141 | 'SymbolNames': [{'Key': 'D:BAS', 'Value': 'D:BAS'}, 142 | {'Key': 'D:BASX', 'Value': 'D:BASX'}, 143 | {'Key': 'HN:BAS', 'Value': 'HN:BAS'}], 'Tag': None} 144 | res = {'DataResponse': res, 'Properties': None} 145 | 146 | DS = empty_datastream() 147 | assert DS.parse_response(res).equals(df) 148 | 149 | 150 | ############################################################################### 151 | def test_parse_response_series_1(): 152 | """ Parse response as a series """ 153 | import pandas as pd 154 | from io import StringIO 155 | 156 | s = (',,P,MV,VO\n@AAPL,1999-12-31,3.6719,16540.47,40952.8\n@AAPL,2000-01-03,3.9978,18008.5,133932.3\n' 157 | '@AAPL,2000-01-04,3.6607,16490.2,127909.5\n@AAPL,2000-01-05,3.7143,16760.53,194426.3\n' 158 | '@AAPL,2000-01-06,3.3929,15310.1,191981.9\n@AAPL,2000-01-07,3.5536,16035.32,115180.7\n' 159 | '@AAPL,2000-01-10,3.4911,15753.29,126265.9\n') 160 | df = pd.read_csv(StringIO(s), index_col=[0, 1], parse_dates=[1]) 161 | 162 | res = {'AdditionalResponses': [{'Key': 'Frequency', 'Value': 'D'}], 163 | 'DataTypeNames': None, 164 | 'DataTypeValues': [{'DataType': 'P', 165 | 'SymbolValues': [{'Currency': 'U$', 'Symbol': '@AAPL', 'Type': 10, 166 | 'Value': [3.6719, 3.9978, 3.6607, 3.7143, 3.3929, 3.5536, 3.4911]}]}, 167 | {'DataType': 'MV', 168 | 'SymbolValues': [{'Currency': 'U$', 'Symbol': '@AAPL', 'Type': 10, 169 | 'Value': [16540.47, 18008.5, 16490.2, 16760.53, 15310.1, 16035.32, 15753.29]}]}, 170 | {'DataType': 'VO', 171 | 'SymbolValues': [{'Currency': 'U$', 'Symbol': '@AAPL', 'Type': 10, 172 | 'Value': [40952.8, 133932.3, 127909.5, 194426.3, 191981.9, 115180.7, 126265.9]}]}], 173 | 'Dates': ['/Date(946598400000+0000)/', '/Date(946857600000+0000)/', 174 | '/Date(946944000000+0000)/', '/Date(947030400000+0000)/', 175 | '/Date(947116800000+0000)/', '/Date(947203200000+0000)/', 176 | '/Date(947462400000+0000)/'], 177 | 'SymbolNames': None, 'Tag': None} 178 | res = {'DataResponse': res, 'Properties': None} 179 | 180 | DS = empty_datastream() 181 | assert DS.parse_response(res).equals(df) 182 | -------------------------------------------------------------------------------- /tests/test_fetch.py: -------------------------------------------------------------------------------- 1 | """ Tests for fetching the data using pydatastream 2 | 3 | Note: Proper credentials should be stored in environment variables: 4 | DSWS_USER and DSWS_PASS 5 | 6 | (c) Vladimir Filimonov, 2019 7 | """ 8 | import os 9 | import pytest 10 | import pydatastream as pds 11 | 12 | 13 | ############################################################################### 14 | ############################################################################### 15 | def 
has_credentials(): 16 | """ Credentials are in the environment variables """ 17 | return ('DSWS_USER' in os.environ) and ('DSWS_PASS' in os.environ) 18 | 19 | 20 | def init_datastream(): 21 | """ Create an instance of datastream """ 22 | return pds.Datastream(os.environ['DSWS_USER'], os.environ['DSWS_PASS']) 23 | 24 | 25 | ############################################################################### 26 | # Tests will be skipped if credentials are not passed 27 | ############################################################################### 28 | @pytest.mark.skipif(not has_credentials(), 29 | reason="credentials are not passed") 30 | def test_create_with_login(): 31 | """ Test init with credentials """ 32 | init_datastream() 33 | --------------------------------------------------------------------------------