├── .gitignore
├── CHANGELOG.md
├── LICENSE.md
├── README.md
├── config
│   ├── __init__.py
│   └── config_request.py
├── constants
│   ├── __init__.py
│   └── feed_constants.py
├── enums
│   ├── __init__.py
│   ├── config_enums.py
│   ├── feed_enums.py
│   └── file_enums.py
├── errors
│   ├── __init__.py
│   └── custom_exceptions.py
├── examples
│   ├── __init__.py
│   └── config_examples.py
├── feed
│   ├── __init__.py
│   └── feed_request.py
├── feed_cli.py
├── filter
│   ├── __init__.py
│   └── feed_filter.py
├── requirements.txt
├── sample-config
│   ├── config-file-download
│   ├── config-file-download-filter
│   ├── config-file-filter
│   └── config-file-query-only
├── tests
│   ├── __init__.py
│   ├── test-data
│   │   ├── test_config
│   │   └── test_json
│   ├── test_config_request.py
│   ├── test_date_utils.py
│   ├── test_feed_filter.py
│   ├── test_feed_request.py
│   ├── test_file_utils.py
│   ├── test_filter_utils.py
│   └── test_logging_utils.py
└── utils
    ├── __init__.py
    ├── date_utils.py
    ├── file_utils.py
    ├── filter_utils.py
    └── logging_utils.py
/.gitignore:
--------------------------------------------------------------------------------
1 | .idea/
2 | venv/
3 | ## File-based project format:
4 | *.iws
5 |
6 | # IntelliJ
7 | out/
8 |
9 | # Python
10 | # Byte-compiled / optimized / DLL files
11 | *.py[cod]
12 | *$py.class
13 |
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | Feed SDK Python CHANGE LOG
2 | ==========================
3 | # 1.0.1-RELEASE (2022/04/12)
4 | Enhancement Requests:
5 | - added support for Python 3
6 |
7 |
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.
10 |
11 | "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.
12 |
13 | "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
14 |
15 | "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.
16 |
17 | "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
18 |
19 | "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.
20 |
21 | "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).
22 |
23 | "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.
24 |
25 | "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."
26 |
27 | "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.
28 |
29 | 2. Grant of Copyright License.
30 |
31 | Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
32 |
33 | 3. Grant of Patent License.
34 |
35 | Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
36 |
37 | 4. Redistribution.
38 |
39 | You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
40 |
41 | You must give any other recipients of the Work or Derivative Works a copy of this License; and
42 | You must cause any modified files to carry prominent notices stating that You changed the files; and
43 | You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
44 | If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.
45 | You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
46 |
47 | 5. Submission of Contributions.
48 |
49 | Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
50 |
51 | 6. Trademarks.
52 |
53 | This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
54 |
55 | 7. Disclaimer of Warranty.
56 |
57 | Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
58 |
59 | 8. Limitation of Liability.
60 |
61 | In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
62 |
63 | 9. Accepting Warranty or Additional Liability.
64 |
65 | While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
66 |
67 | END OF TERMS AND CONDITIONS
68 |
69 | APPENDIX: How to apply the Apache License to your work
70 |
71 | To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives.
72 |
73 | Copyright [yyyy] [name of copyright owner]
74 |
75 | Licensed under the Apache License, Version 2.0 (the "License");
76 | you may not use this file except in compliance with the License.
77 | You may obtain a copy of the License at
78 |
79 | http://www.apache.org/licenses/LICENSE-2.0
80 |
81 | Unless required by applicable law or agreed to in writing, software
82 | distributed under the License is distributed on an "AS IS" BASIS,
83 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
84 | See the License for the specific language governing permissions and
85 | limitations under the License.
86 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Feed SDK
2 | ==========
3 | Python SDK for downloading and filtering item feed files
4 |
5 | Table of contents
6 | ==========
7 | * [Summary](#summary)
8 | * [Setup](#setup)
9 | - [Setting up in the local environment](#setting-up-in-the-local-environment)
10 | * [Downloading feed files](#downloading-feed-files)
11 | - [Customizing download location](#customizing-download-location)
12 | * [Filtering feed files](#filtering-feed-files)
13 | - [Available filters](#available-filters)
14 | - [Combining filter criteria](#combining-filter-criteria)
15 | - [Additional filter arguments](#additional-filter-arguments)
16 | * [Schemas](#schemas)
17 | - [GetFeedResponse](#getfeedresponse)
18 | - [Response](#response)
19 | * [Logging](#logging)
20 | * [Usage](#usage)
21 | - [Using command line options](#using-command-line-options)
22 | - [Using config file driven approach](#using-config-file-driven-approach)
23 | - [Using function calls](#using-function-calls)
24 | - [Code samples](#examples)
25 | * [Performance](#performance)
26 | * [Important notes](#important-notes)
27 |
28 | # Summary
29 |
30 | Similar to the [Java Feed SDK](https://github.com/eBay/FeedSDK), this Python SDK facilitates downloading and filtering of eBay's item feed files provided through the public [Feed API](https://developer.ebay.com/api-docs/buy/feed/overview.html).
31 |
32 | The feed SDK provides a simple interface to -
33 | * [Download](#downloading-feed-files)
34 | * [Filter](#filtering-feed-files)
35 |
36 | # Setup
37 |
38 | The entire repository can be cloned or forked and changes can be made. You are most welcome to collaborate on and enhance the existing code base.
39 |
40 | ## Setting up in the local environment
41 |
42 | To set up the project in your local environment:
43 | * Clone or download the repository
44 | * Install the requirements
45 | The dependencies are listed in [requirements.txt](https://github.com/eBay/FeedSDK-Python/blob/master/requirements.txt). You can run `pip install -r requirements.txt` to install all of them.
46 |
47 |
48 | ## Downloading feed files
49 | The feed files can be as big as several gigabytes. The Feed API supports downloading such large feed files in chunks. The chunk size is 100 MB in the production environment and 10 MB in the sandbox environment.
50 |
51 | The SDK abstracts the complexity involved in calculating the request header '__range__' based on the response header '__content-range__' and downloads and appends all the chunks until the whole feed file is downloaded.
52 |
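A minimal sketch of this byte-range arithmetic is shown below; the helper name and the exact header parsing are illustrative assumptions (inside the SDK this logic lives in utils/file_utils.find_next_range):

```
# Illustrative only: compute the next 'Range' request header value from the
# 'Content-Range' response header, e.g. 'bytes 0-104857600/4294967296'.
PROD_CHUNK_SIZE = 100 * 1024 * 1024  # 100 MB in production, 10 MB in sandbox


def next_byte_range(content_range, chunk_size=PROD_CHUNK_SIZE):
    byte_span, total_size = content_range.split(' ')[1].split('/')
    last_byte = int(byte_span.split('-')[1])
    if last_byte >= int(total_size) - 1:
        return None  # the whole feed file has been downloaded
    # request the next chunk, e.g. 'bytes=104857601-209715201'
    return 'bytes=%d-%d' % (last_byte + 1, last_byte + 1 + chunk_size)
```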
53 | To download a feed file in production which is -
54 | * __bootstrap__ : (feed_scope = ALL_ACTIVE)
55 | * __L1 category 1__ : (category_id = 220)
56 | * __marketplace US__ : (X-EBAY-C-MARKETPLACE-ID: EBAY_US)
57 | instantiate a Feed object and call the get() function -
58 |
59 | ```
60 | feed_obj = Feed(feed_type='item', feed_scope='ALL_ACTIVE', category_id='220',
61 | marketplace_id='EBAY_US', token=, environment='PRODUCTION')
62 | result_code, api_status_code, file_path = feed_obj.get()
63 |
64 | ```
65 | The __file_path__ denotes the location where the file was downloaded.
66 |
67 | ### Customizing download location
68 |
69 | The default download location is the ~/Desktop/feed-sdk directory.
70 | The download location can be changed by specifying the optional 'download_location' argument when instantiating Feed.
71 | The download location should point to a directory. If the directory does not exist, it will be created.
72 | For example, to download to the location __/tmp/feed__ -
73 |
74 | ```
75 | feed_obj = Feed(feed_type='item', feed_scope='ALL_ACTIVE', category_id='220',
76 | marketplace_id='EBAY_US', token=, environment='PRODUCTION',
77 | download_location='/tmp/feed')
78 | ```
79 | ---
80 |
81 | ## Filtering feed files
82 |
83 | ### Available filters
84 | The SDK provides the capability to filter the feed files based on:
85 | * List of leaf category ids
86 | * List of seller usernames
87 | * List of item locations
88 | * List of item IDs
89 | * List of EPIDs
90 | * List of inferred EPIDs
91 | * List of GTINs
92 | * Price range
93 | * Any other SQL query
94 |
95 | On successful completion of a filter operation, a new __filtered__ file is created in the same directory as the feed file.
96 |
97 | To filter a feed file on leaf category IDs, create a FeedFilterRequest object and call the filter() function -
98 | ```
99 | feed_filter_obj = FeedFilterRequest(input_file_path=,
100 | leaf_category_ids=)
101 | file_path = feed_filter_obj.filter()
102 |
103 | ```
104 |
105 | To filter on availability threshold type and availability threshold via the any_query parameter -
106 | ```
107 | feed_filter_obj = FeedFilterRequest(input_file_path=,
108 |                                     any_query='AvailabilityThresholdType=\'MORE_THAN\' AND AvailabilityThreshold=10')
109 | file_path = feed_filter_obj.filter()
110 |
111 | ```
112 |
113 | The __file_path__ denotes the location of the filtered file. The same value can also be read from the request object's filtered_file_path attribute.
114 |
115 | ### Combining filter criteria
116 |
117 | The SDK provides the freedom to combine the filter criteria.
118 |
119 | To filter on leaf category IDs and seller usernames for listings in the price range of 1 to 100 -
120 |
121 | ```
122 | feed_filter_obj = FeedFilterRequest(input_file_path=,
123 | leaf_category_ids=,
124 | seller_names=,
125 | price_lower_limit=1, price_upper_limit=100)
126 | file_path = feed_filter_obj.filter()
127 |
128 | ```
129 |
130 | To filter on item location countries for listings that have more than 10 items available -
131 |
132 | ```
133 | feed_filter_obj = FeedFilterRequest(input_file_path=,
134 | item_location_countries=,
135 | any_query='AvailabilityThresholdType=\'MORE_THAN\' AND AvailabilityThreshold=10')
136 | file_path = feed_filter_obj.filter()
137 |
138 | ```
139 |
140 | ### Additional filter arguments
141 | When the filter function is called, the feed data is loaded into a SQLite DB.
142 | If the keep_db=True argument is passed to the filter function, the SQLite DB file is kept in the current directory as sqlite_feed_sdk.db; otherwise it is deleted after the program finishes.
143 |
144 | By default, all columns except Title, ImageUrl, and AdditionalImageUrls are processed. This behaviour can be changed by passing the column_name_list argument to the filter function and changing the IGNORE_COLUMNS set in feed_filter.py.
145 |
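For example, a call along these lines keeps the intermediate database and adjusts the processed columns (the argument names come from the description above; the input file path is a placeholder, and the exact signature should be treated as an assumption):

```
feed_filter_obj = FeedFilterRequest(input_file_path='item_bootstrap_feed.gz',
                                    price_lower_limit=1, price_upper_limit=100)
# keep sqlite_feed_sdk.db in the current directory after the run and
# override the default set of columns that are processed
response = feed_filter_obj.filter(column_name_list=['Title'], keep_db=True)
```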
146 | ---
147 | ## Schemas
148 | This section provides more detail on what information is contained within the objects returned from the SDK function calls.
149 |
150 | ### GetFeedResponse
151 |
152 | An instance of the GetFeedResponse named tuple is returned from the feed_obj.get() function.
153 |
154 | ```
155 | int status_code
156 | String message
157 | String file_path
158 | List errors
159 |
160 | ```
161 |
162 | | Field name | Description |
163 | |---|---|
164 | | status_code | int: 0 indicates a successful response. Any non-zero value indicates an error |
165 | | message | String: Detailed information on the status |
166 | | file_path | String: Absolute path of the location of the resulting file |
167 | | errors | List: Detailed error information |
168 |
169 |
170 | ### Response
171 |
172 | An instance of the Response named tuple is returned from the feed_filter_obj.filter() function.
173 |
174 | ```
175 | int status_code
176 | String message
177 | String file_path
178 | List applied_filters
179 | ```
180 | | Field name | Description |
181 | |---|---|
182 | | status_code | int: 0 indicates a successful response. Any non-zero value indicates an error |
183 | | message | String: Detailed information on the status |
184 | | file_path | String: Absolute path of the location of the resulting file |
185 | | applied_filters | List: List of queries applied |
186 |
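A minimal sketch of consuming these return values, using the field names listed in the tables above:

```
get_response = feed_obj.get()
if get_response.status_code == 0:
    print('Feed downloaded to %s' % get_response.file_path)
else:
    print('Download failed: %s %s' % (get_response.message, get_response.errors))

filter_response = feed_filter_obj.filter()
if filter_response.status_code == 0:
    print('Filtered file: %s' % filter_response.file_path)
    print('Applied filters: %s' % filter_response.applied_filters)
```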
187 | ---
188 | ## Logging
189 |
190 | Log files are created in the current directory.
191 |
192 | __Ensure that appropriate permissions are present to write to the directory__
193 |
194 | * The current log file name is: feed-sdk-log.log
195 | * Rolling log files are created per day with the pattern: feed-sdk-log.{yyyy-MM-dd}.log
196 |
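The SDK modules initialize logging through the helper in utils/logging_utils.py (see, for example, config/config_request.py); scripts built on top of the SDK can reuse the same pattern:

```
import logging
from utils.logging_utils import setup_logging

setup_logging()
logger = logging.getLogger(__name__)
```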
197 | ---
198 | ## Usage
199 |
200 | The following sections describe the different ways in which the SDK can be used.
201 |
202 | ### Using command line options
203 |
204 | All the capabilities of the SDK can be invoked using the command line.
205 |
206 | To see the available options and filters, use '--help'
207 | ```
208 | usage: FeedSDK [-h] [-dt DT] -c1 C1 [-scope {ALL_ACTIVE,NEWLY_LISTED}]
209 | [-mkt MKT] [-token TOKEN] [-env {SANDBOX,PRODUCTION}]
210 | [-lf LF [LF ...]] [-sellerf SELLERF [SELLERF ...]]
211 | [-locf LOCF [LOCF ...]] [-pricelf PRICELF] [-priceuf PRICEUF]
212 | [-epidf EPIDF [EPIDF ...]] [-iepidf IEPIDF [IEPIDF ...]]
213 | [-gtinf GTINF [GTINF ...]] [-itemf ITEMF [ITEMF ...]]
214 | [-dl DOWNLOADLOCATION] [--filteronly] [-format FORMAT] [-qf QF]
215 |
216 | Feed SDK CLI
217 |
218 | optional arguments:
219 | -h, --help show this help message and exit
220 | -dt DT the date when feed file was generated
221 | -c1 C1 the l1 category id of the feed file
222 | -scope {ALL_ACTIVE,NEWLY_LISTED}
223 | the feed scope. Available scopes are ALL_ACTIVE or
224 | NEWLY_LISTED
225 | -mkt MKT the marketplace id for which feed is being requested.
226 | For example - EBAY_US
227 | -token TOKEN the oauth token for the consumer. Omit the word
228 | 'Bearer'
229 | -env {SANDBOX,PRODUCTION}
230 | environment type. Supported Environments are SANDBOX
231 | and PRODUCTION
232 | -lf LF [LF ...] list of leaf categories which are used to filter the
233 | feed
234 | -sellerf SELLERF [SELLERF ...]
235 | list of seller names which are used to filter the feed
236 | -locf LOCF [LOCF ...]
237 | list of item locations which are used to filter the
238 | feed
239 | -pricelf PRICELF lower limit of the price range for items in the feed
240 | -priceuf PRICEUF upper limit of the price range for items in the feed
241 | -epidf EPIDF [EPIDF ...]
242 | list of epids which are used to filter the feed
243 | -iepidf IEPIDF [IEPIDF ...]
244 | list of inferred epids which are used to filter the
245 | feed
246 | -gtinf GTINF [GTINF ...]
247 | list of gtins which are used to filter the feed
248 | -itemf ITEMF [ITEMF ...]
249 | list of item IDs which are used to filter the feed
250 | -dl DOWNLOADLOCATION, --downloadlocation DOWNLOADLOCATION
251 | override for changing the directory where files are
252 | downloaded
253 | --filteronly filter the feed file that already exists in the
254 | default path or the path specified by -dl,
255 | --downloadlocation option. If --filteronly option is
256 | not specified, the feed file will be downloaded again
257 | -format FORMAT feed and filter file format. Default is gzip
258 | -qf QF any other query to filter the feed file. See Python
259 | dataframe query format
260 | ```
261 | For example, to use the command line options:
262 |
263 | Download and filter feed files using a token -
264 | ```
265 | python feed_cli.py -c1 3252 -scope ALL_ACTIVE -mkt EBAY_DE -env PRODUCTION -qf "AvailabilityThreshold=10" -locf IT GB -dl DIR -token xxx
266 | ```
267 |
268 | Filter feed files only; no token is needed -
269 | ```
270 | python feed_cli.py --filteronly -c1 260 -pricelf 5 -priceuf 20 -dl FILE_PATH
271 | ```
272 |
273 | ### Using config file driven approach
274 |
275 | All the capabilities of the SDK can be leveraged via a config file.
276 | The feed file download and filter parameters can be specified in the config file for multiple files, and the SDK will process them sequentially.
277 |
278 | The structure of the config file:
279 |
280 | ```
281 | {
282 | "requests": [
283 | {
284 | "feedRequest": {
285 | "categoryId": "260",
286 | "marketplaceId": "EBAY_US",
287 | "feedScope": "ALL_ACTIVE",
288 | "type": "ITEM"
289 | },
290 | "filterRequest": {
291 | "itemLocationCountries": [
292 | "US",
293 | "HK",
294 | "CA"
295 | ],
296 | "priceLowerLimit": 10.0,
297 | "priceUpperLimit": 100.0
298 | }
299 | },
300 | {
301 | "feedRequest": {
302 | "categoryId": "220",
303 | "marketplaceId": "EBAY_US",
304 | "date": "20190127",
305 | "feedScope": "NEWLY_LISTED",
306 | "type": "ITEM"
307 | }
308 | },
309 | {
310 | "filterRequest": {
311 | "inputFilePath": "",
312 | "leafCategoryIds": [
313 | "112529",
314 | "64619",
315 | "111694"
316 | ],
317 | "itemLocationCountries": [
318 | "DE",
319 | "GB",
320 | "ES"
321 | ],
322 | "anyQuery": "AvailabilityThresholdType='MORE_THAN' AND AvailabilityThreshold=10",
323 | "fileFormat" : "gzip"
324 | }
325 | }
326 | ]
327 | }
328 | ```
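Processing such a config file from Python follows the pattern in [examples/config_examples.py](https://github.com/eBay/FeedSDK-Python/blob/master/examples/config_examples.py):

```
from config.config_request import ConfigFileRequest

config_request = ConfigFileRequest('sample-config/config-file-download-filter')
# a token is only needed when the config contains feedRequest entries
config_request.parse_requests(token='v^1.1#i...')
config_request.process_requests()
```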
329 | Example config files are located at:
330 |
331 | [Example config file - 1](https://github.com/eBay/FeedSDK-Python/blob/master/sample-config/config-file-download)
332 |
333 | [Example config file - 2](https://github.com/eBay/FeedSDK-Python/blob/master/sample-config/config-file-download-filter)
334 |
335 | [Example config file - 3](https://github.com/eBay/FeedSDK-Python/blob/master/sample-config/config-file-filter)
336 |
337 | [Example config file - 4](https://github.com/eBay/FeedSDK-Python/blob/master/sample-config/config-file-query-only)
338 |
339 | ### Using function calls
340 |
341 | The following samples show the usage of the available operations and filters.
342 |
343 | #### Examples
344 | All the examples are located [__here__](https://github.com/eBay/FeedSDK-Python/tree/master/examples).
345 | * [Download and filter by config request](https://github.com/eBay/FeedSDK-Python/blob/master/examples/config_examples.py)
346 |
347 |
348 | ---
349 | ## Performance
350 | | Category | Type | Size (gz) | Size (unzipped) | Records | Applied Filters | Filter Time | Loading Time | Save Time |
351 | |---|---|---|---|---|---|---|---|---|
352 | | 11450 | BOOTSTRAP | 4.66 GB | 89.51 GB | 63.2 Million | PriceValue, AvailabilityThresholdType, AvailabilityThreshold | ~ 7 min | ~ 98 min | ~ 2 min |
353 | | 220 | BOOTSTRAP | 867.8 MB | 4.26 GB | 3.3 Million | price, AvailabilityThresholdType, AvailabilityThreshold | ~ 18 sec | ~ 5 min | ~ 37 sec |
354 | | 1281 | BOOTSTRAP | 118.4 MB | 1.06 GB | 812558 | item locations, AcceptedPaymentMethods | ~ 24 sec | ~ 1.2 min | ~ 1.8 min |
355 | | 11232 | BOOTSTRAP | 102.5 MB | 499.9 MB | 405268 | epids, inferredEpids | ~ 0.3 sec | ~ 37 sec | ~ 0.003 sec |
356 | | 550 | BOOTSTRAP | 60.7 MB | 986.5 MB | 1000795 | price, sellers, item locations | ~ 4 sec | ~ 1.4 min | ~ 0.1 sec |
357 | | 260 | BOOTSTRAP | 2.3 MB | 15.6 MB | 24100 | price, AvailabilityThresholdType, AvailabilityThreshold | ~ 0.01 sec | ~ 2 sec | ~ 0.4 sec |
358 | | 220 | DAILY | 13.5 MB | 60.4 MB | 55047 | price, leaf categories, item locations | ~ 0.08 sec | ~ 4 sec | ~ 0.007 sec |
359 |
360 |
361 | ---
362 | ## Important notes
363 |
364 | * Ensure there is enough storage for feed files.
365 | * Ensure that the file storage directories have appropriate write permissions.
366 | * If a download fails due to network issues, the process needs to start again; there is currently no capability to resume.
367 |
368 | # License
369 | Copyright (c) 2018-2022 eBay Inc.
370 |
371 | Use of this source code is governed by an Apache 2.0 license that can be found in the LICENSE file or at https://opensource.org/licenses/Apache-2.0.
372 |
--------------------------------------------------------------------------------
/config/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = [
2 | 'config_request'
3 | ]
4 |
--------------------------------------------------------------------------------
/config/config_request.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | import logging
19 | from os import path
20 | from utils.file_utils import read_json
21 | from feed.feed_request import Feed
22 | from filter.feed_filter import FeedFilterRequest
23 | from constants.feed_constants import SUCCESS_CODE
24 | from enums.config_enums import ConfigField, FeedField, FilterField
25 | from errors.custom_exceptions import ConfigError
26 | from utils.logging_utils import setup_logging
27 |
28 | setup_logging()
29 | logger = logging.getLogger(__name__)
30 |
31 |
32 | class ConfigRequest(object):
33 | def __init__(self, feed_obj, filter_request_obj):
34 | self.feed_obj = feed_obj
35 | self.filter_request_obj = filter_request_obj
36 |
37 | def __str__(self):
38 | return '[feed= %s, filter_request= %s]' % (self.feed_obj, self.filter_request_obj)
39 |
40 |
41 | class ConfigFileRequest(object):
42 | def __init__(self, config_file_path):
43 | self.file_path = config_file_path
44 | self.__token = None
45 | self.__config_json_obj = None
46 | self.__requests = []
47 |
48 | @property
49 | def requests(self):
50 | return self.__requests
51 |
52 | def parse_requests(self, token=None):
53 | self.__load_config()
54 | # populate requests list
55 | self.__create_requests(token)
56 |
57 | def process_requests(self):
58 | if not self.requests:
59 | logger.error('No requests to process')
60 | return False
61 | for config_request_obj in self.requests:
62 | get_response = None
63 | if config_request_obj.feed_obj:
64 | feed_req = config_request_obj.feed_obj
65 | get_response = feed_req.get()
66 | if get_response.status_code != SUCCESS_CODE:
67 | logger.error('Exception in downloading feed. Cannot proceed, continue to the next request\n'
68 | 'File Path: %s | Error message: %s\nFeed Request: %s\n', get_response.file_path,
69 | get_response.message, feed_req)
70 | continue
71 | if config_request_obj.filter_request_obj:
72 | filter_req = config_request_obj.filter_request_obj
73 | if get_response and get_response.file_path:
74 | # override input file path if set
75 | filter_req.input_file_path = get_response.file_path
76 | filter_response = filter_req.filter()
77 | if filter_response.status_code != SUCCESS_CODE:
78 | print(filter_response.message)
79 | return True
80 |
81 | def __load_config(self):
82 | # check the path
83 | if not self.file_path or not path.exists(self.file_path) or path.getsize(self.file_path) == 0:
84 | raise ConfigError('Config file %s does not exist or is empty' % self.file_path)
85 | # load the config file
86 | self.__config_json_obj = read_json(self.file_path)
87 | # check the config object
88 | if not self.__config_json_obj:
89 | raise ConfigError('Could not read config file %s' % self.file_path)
90 |
91 | def __create_requests(self, token):
92 | if ConfigField.REQUESTS.value not in self.__config_json_obj:
93 | raise ConfigError('No \"%s\" field exists in the config file %s' % (str(ConfigField.REQUESTS),
94 | self.file_path))
95 | for req in self.__config_json_obj[ConfigField.REQUESTS.value]:
96 | feed_obj = None
97 | feed_field = req.get(ConfigField.FEED_REQUEST.value)
98 | if feed_field:
99 | feed_obj = Feed(feed_field.get(FeedField.TYPE.value),
100 | feed_field.get(FeedField.SCOPE.value),
101 | feed_field.get(FeedField.CATEGORY_ID.value),
102 | feed_field.get(FeedField.MARKETPLACE_ID.value),
103 | token,
104 | feed_field.get(FeedField.DATE.value),
105 | feed_field.get(FeedField.ENVIRONMENT.value),
106 | feed_field.get(FeedField.DOWNLOAD_LOCATION.value),
107 | feed_field.get(FeedField.FILE_FORMAT.value))
108 | filter_request_obj = None
109 | filter_field = req.get(ConfigField.FILTER_REQUEST.value)
110 | if filter_field:
111 | filter_request_obj = FeedFilterRequest(str(filter_field.get(FilterField.INPUT_FILE_PATH.value)),
112 | filter_field.get(FilterField.ITEM_IDS.value),
113 | filter_field.get(FilterField.LEAF_CATEGORY_IDS.value),
114 | filter_field.get(FilterField.SELLER_NAMES.value),
115 | filter_field.get(FilterField.GTINS.value),
116 | filter_field.get(FilterField.EPIDS.value),
117 | filter_field.get(FilterField.PRICE_LOWER_LIMIT.value),
118 | filter_field.get(FilterField.PRICE_UPPER_LIMIT.value),
119 | filter_field.get(FilterField.ITEM_LOCATION_COUNTRIES.value),
120 | filter_field.get(FilterField.INFERRED_EPIDS.value),
121 | filter_field.get(FilterField.ANY_QUERY.value),
122 | filter_field.get(FilterField.FILE_FORMAT.value))
123 | config_request_obj = ConfigRequest(feed_obj, filter_request_obj)
124 | self.requests.append(config_request_obj)
125 |
--------------------------------------------------------------------------------
/constants/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = [
2 | 'feed_constants'
3 | ]
4 |
--------------------------------------------------------------------------------
/constants/feed_constants.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | REQUEST_TIMEOUT = 60
19 | REQUEST_RETRIES = 3
20 | BACK_OFF_TIME = 2
21 |
22 | FEED_API_PROD_URL = 'https://api.ebay.com/buy/feed/v1_beta/'
23 | FEED_API_SANDBOX_URL = 'https://api.sandbox.ebay.com/buy/feed/v1_beta/'
24 |
25 | # max content that can be downloaded in one request, in bytes
26 | PROD_CHUNK_SIZE = 104857600
27 | SANDBOX_CHUNK_SIZE = 10485760
28 |
29 | TOKEN_BEARER_PREFIX = 'Bearer '
30 |
31 | AUTHORIZATION_HEADER = 'Authorization'
32 | MARKETPLACE_HEADER = 'X-EBAY-C-MARKETPLACE-ID'
33 | CONTENT_TYPE_HEADER = 'Content-type'
34 | ACCEPT_HEADER = 'Accept'
35 | RANGE_HEADER = 'Range'
36 |
37 | CONTENT_RANGE_HEADER = 'Content-Range'
38 |
39 | RANGE_PREFIX = 'bytes='
40 |
41 | APPLICATION_JSON = 'application/json'
42 |
43 | QUERY_SCOPE = 'feed_scope'
44 | QUERY_CATEGORY_ID = 'category_id'
45 | QUERY_SNAPSHOT_DATE = 'snapshot_date'
46 | QUERY_DATE = 'date'
47 |
48 |
49 | SUCCESS_CODE = 0
50 | FAILURE_CODE = -1
51 |
52 | SUCCESS_STR = 'Success'
53 | FAILURE_STR = 'Failure'
54 |
55 | DATA_FRAME_CHUNK_SIZE = 2*(10**4) # rows
56 |
--------------------------------------------------------------------------------
/enums/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = [
2 | 'config_enums',
3 | 'feed_enums',
4 | 'file_enums'
5 | ]
6 |
--------------------------------------------------------------------------------
/enums/config_enums.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | from aenum import Enum, unique
19 |
20 |
21 | @unique
22 | class ConfigField(Enum):
23 | FILTER_REQUEST = 'filterRequest'
24 | FEED_REQUEST = 'feedRequest'
25 | REQUESTS = 'requests'
26 |
27 | def __str__(self):
28 | return str(self.value)
29 |
30 |
31 | @unique
32 | class FeedField(Enum):
33 | MARKETPLACE_ID = 'marketplaceId'
34 | CATEGORY_ID = 'categoryId'
35 | DATE = 'date'
36 | SCOPE = 'feedScope'
37 | TYPE = 'type'
38 | ENVIRONMENT = 'environment'
39 | DOWNLOAD_LOCATION = 'downloadLocation'
40 | FILE_FORMAT = 'fileFormat'
41 |
42 | def __str__(self):
43 | return str(self.value)
44 |
45 |
46 | @unique
47 | class FilterField(Enum):
48 | INPUT_FILE_PATH = 'inputFilePath'
49 | ITEM_IDS = 'itemIds'
50 | LEAF_CATEGORY_IDS = 'leafCategoryIds'
51 | SELLER_NAMES = 'sellerNames'
52 | GTINS = 'gtins'
53 | EPIDS = 'epids'
54 | PRICE_LOWER_LIMIT = 'priceLowerLimit'
55 | PRICE_UPPER_LIMIT = 'priceUpperLimit'
56 | ITEM_LOCATION_COUNTRIES = 'itemLocationCountries'
57 | INFERRED_EPIDS = 'inferredEpids'
58 | ANY_QUERY = 'anyQuery'
59 | FILE_FORMAT = 'fileFormat'
60 |
61 | def __str__(self):
62 | return str(self.value)
63 |
--------------------------------------------------------------------------------
/enums/feed_enums.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | from aenum import Enum, unique
19 |
20 |
21 | @unique
22 | class FeedColumn(Enum):
23 | ITEM_ID = 'ItemId' # column 0
24 | CATEGORY_ID = 'CategoryId' # column 4
25 | SELLER_USERNAME = 'SellerUsername' # column 6
26 | GTIN = 'GTIN' # column 9
27 | EPID = 'EPID' # column 12
28 | PRICE_VALUE = 'PriceValue' # column 15
29 | ITEM_LOCATION_COUNTRIES = 'ItemLocationCountry' # column 21
30 | INFERRED_EPID = 'InferredEPID' # column 40
31 |
32 | def __str__(self):
33 | return str(self.value)
34 |
35 |
36 | @unique
37 | class Environment(Enum):
38 | PRODUCTION = 'production'
39 | SANDBOX = 'sandbox'
40 |
41 | def __str__(self):
42 | return str(self.value)
43 |
44 |
45 | @unique
46 | class FeedPrefix(Enum):
47 | DAILY = 'daily'
48 | BOOTSTRAP = 'bootstrap'
49 |
50 | def __str__(self):
51 | return str(self.value)
52 |
53 |
54 | @unique
55 | class FeedScope(Enum):
56 | DAILY = 'NEWLY_LISTED'
57 | BOOTSTRAP = 'ALL_ACTIVE'
58 |
59 | def __str__(self):
60 | return str(self.value)
61 |
62 |
63 | @unique
64 | class FeedType(Enum):
65 | ITEM = 'item'
66 | SNAPSHOT = 'item_snapshot'
67 |
68 | def __str__(self):
69 | return str(self.value)
70 |
--------------------------------------------------------------------------------
/enums/file_enums.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | from aenum import Enum, unique
19 |
20 |
21 | @unique
22 | class FileEncoding(Enum):
23 | UTF8 = 'UTF-8'
24 |
25 | def __str__(self):
26 | return str(self.value)
27 |
28 |
29 | @unique
30 | class FileFormat(Enum):
31 | GZIP = 'gzip'
32 |
33 | def __str__(self):
34 | return str(self.value)
35 |
--------------------------------------------------------------------------------
/errors/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = [
2 | 'custom_exceptions'
3 | ]
4 |
--------------------------------------------------------------------------------
/errors/custom_exceptions.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | class Error(Exception):
19 | """Base class for errors in this module."""
20 | pass
21 |
22 |
23 | class AuthorizationError(Error):
24 | def __init__(self, msg):
25 | self.msg = msg
26 |
27 |
28 | class ConfigError(Error):
29 | def __init__(self, msg, mark=None):
30 | self.msg = msg
31 | self.mark = mark
32 |
33 |
34 | class FileCreationError(Error):
35 | def __init__(self, msg, path):
36 | self.msg = msg
37 | self.path = path
38 |
39 |
40 | class FilterError(Error):
41 | def __init__(self, msg, filter_query=None):
42 | self.msg = msg
43 | self.input_data = filter_query
44 |
45 |
46 | class InputDataError(Error):
47 | def __init__(self, msg, input_data=None):
48 | self.msg = msg
49 | self.input_data = input_data
50 |
--------------------------------------------------------------------------------
/examples/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/eBay/FeedSDK-Python/a225421b691803f027067721ad779e44d7647580/examples/__init__.py
--------------------------------------------------------------------------------
/examples/config_examples.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | from config.config_request import ConfigFileRequest
19 |
20 |
21 | def filter_feed(config_path):
22 | cr = ConfigFileRequest(config_path)
23 | cr.parse_requests()
24 | cr.process_requests()
25 |
26 |
27 | def download_filter_feed(config_path, token):
28 | cr = ConfigFileRequest(config_path)
29 | cr.parse_requests(token)
30 | cr.process_requests()
31 |
32 |
33 | if __name__ == '__main__':
34 | filter_feed('../sample-config/config-file-filter')
35 | download_filter_feed('../sample-config/config-file-download-filter', 'v^1.1#i...')
36 |
--------------------------------------------------------------------------------
/feed/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = [
2 | 'feed_request'
3 | ]
4 |
--------------------------------------------------------------------------------
/feed/feed_request.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | import certifi
19 | import urllib3
20 | import json
21 | import logging
22 | from os import path
23 | from utils import file_utils, date_utils
24 | import constants.feed_constants as const
25 | from filter.feed_filter import GetFeedResponse
26 | from enums.file_enums import FileFormat
27 | from enums.feed_enums import FeedType, FeedScope, FeedPrefix, Environment
28 | from errors.custom_exceptions import InputDataError, FileCreationError
29 | from utils.logging_utils import setup_logging
30 |
31 | setup_logging()
32 | logger = logging.getLogger(__name__)
33 |
34 | DEFAULT_DOWNLOAD_LOCATION = path.expanduser('~/Desktop/feed-sdk')
35 |
36 |
37 | class Feed(object):
38 | def __init__(self, feed_type, feed_scope, category_id, marketplace_id, token, feed_date=None,
39 | environment=Environment.PRODUCTION.value, download_location=None, file_format=FileFormat.GZIP.value):
40 | self.token = const.TOKEN_BEARER_PREFIX + token if (token and not token.startswith('Bearer')) else token
41 | self.feed_type = feed_type.lower() if feed_type else FeedType.ITEM.value
42 | self.feed_scope = feed_scope.upper() if feed_scope else FeedScope.DAILY.value
43 | self.category_id = category_id
44 | self.marketplace_id = marketplace_id
45 | self.environment = environment if environment else Environment.PRODUCTION.value
46 | self.download_location = download_location if download_location else DEFAULT_DOWNLOAD_LOCATION
47 | self.file_format = file_format if file_format else FileFormat.GZIP.value
48 | self.feed_date = feed_date if feed_date else date_utils.get_formatted_date(feed_type)
49 |
50 | def __str__(self):
51 | return '[feed_type= %s, feed_scope= %s, category_id= %s, marketplace_id= %s, feed_date= %s, ' \
52 | 'environment= %s, download_location= %s, file_format= %s, token= %s]' % (self.feed_type,
53 | self.feed_scope,
54 | self.category_id,
55 | self.marketplace_id,
56 | self.feed_date,
57 | self.environment,
58 | self.download_location,
59 | self.file_format,
60 | self.token)
61 |
62 | def get(self):
63 | """
64 | :return: GetFeedResponse
65 | """
66 | logger.info(
67 | 'Downloading... \ncategoryId: %s | marketplace: %s | date: %s | feed_scope: %s | environment: %s \n',
68 | self.category_id, self.marketplace_id, self.feed_date, self.feed_scope, self.environment)
69 | if not self.token:
70 | return GetFeedResponse(const.FAILURE_CODE, 'No token has been provided', None, None, None)
71 | if path.exists(self.download_location) and not path.isdir(self.download_location):
72 | return GetFeedResponse(const.FAILURE_CODE, 'Download location is not a directory', self.download_location,
73 | None, None)
74 | try:
75 | date_utils.validate_date(self.feed_date, self.feed_type)
76 | except InputDataError as exp:
77 | return GetFeedResponse(const.FAILURE_CODE, exp.msg, self.download_location, None, None)
78 | # generate the absolute file path
79 | file_name = self.__generate_file_name()
80 | file_path = path.join(self.download_location, file_name)
81 | # Create an empty file in the given path
82 | try:
83 | file_utils.create_and_replace_binary_file(file_path)
84 | with open(file_path, 'wb') as file_obj:
85 | # Get the feed file data
86 | result_code, message = self.__invoke_request(file_obj)
87 | return GetFeedResponse(result_code, message, file_path, None, None)
88 | except IOError as exp:
89 | return GetFeedResponse(const.FAILURE_CODE, 'Could not open file %s : %s' % (file_path, repr(exp)),
90 | file_path, None, None)
91 | except (InputDataError, FileCreationError) as exp:
92 | return GetFeedResponse(const.FAILURE_CODE, exp.msg, file_path, None, None)
93 |
94 | def __invoke_request(self, file_handler):
95 | # initialize API call counter
96 | api_call_counter = 0
97 | # Find max chunk size
98 | chunk_size = self.__find_max_chunk_size()
99 | logger.info('Chunk size: %s\n', chunk_size)
100 | # The initial request Range header is bytes=0-CHUNK_SIZE
101 | headers = {const.MARKETPLACE_HEADER: self.marketplace_id,
102 | const.AUTHORIZATION_HEADER: self.token,
103 | const.CONTENT_TYPE_HEADER: const.APPLICATION_JSON,
104 | const.ACCEPT_HEADER: const.APPLICATION_JSON,
105 | const.RANGE_HEADER: const.RANGE_PREFIX + '0-' + str(chunk_size)}
106 | parameters, endpoint = self.__get_query_parameters_and_base_url()
107 | http_manager = urllib3.PoolManager(timeout=const.REQUEST_TIMEOUT,
108 | retries=urllib3.Retry(const.REQUEST_RETRIES,
109 | backoff_factor=const.BACK_OFF_TIME),
110 | cert_reqs='CERT_REQUIRED', ca_certs=certifi.where())
111 | # Initial request
112 | feed_response = http_manager.request('GET', endpoint, parameters, headers)
113 | # increase and print API call counter
114 | api_call_counter = api_call_counter + 1
115 | logger.info('API call #%s\n', api_call_counter)
116 | # Get the status code
117 | status_code = feed_response.status
118 | # Append the data to the file, might raise an exception
119 | if status_code == 200:
120 | file_utils.append_response_to_file(file_handler, feed_response.data)
121 | return const.SUCCESS_CODE, const.SUCCESS_STR
122 | while status_code == 206:
123 | # Append the data to the file, might raise an exception
124 | file_utils.append_response_to_file(file_handler, feed_response.data)
125 | headers[const.RANGE_HEADER] = file_utils.find_next_range(feed_response.headers[const.CONTENT_RANGE_HEADER],
126 | chunk_size)
127 | # check if we have reached the end of the file
128 | if not headers[const.RANGE_HEADER]:
129 | break
130 | # Send another request
131 | feed_response = http_manager.request('GET', endpoint, parameters, headers)
132 | # increase and print API call counter
133 | api_call_counter = api_call_counter+1
134 | logger.info('API call #%s\n', api_call_counter)
135 | # Get the status code
136 | status_code = feed_response.status
137 | if status_code == 206 and not headers[const.RANGE_HEADER]:
138 | return const.SUCCESS_CODE, const.SUCCESS_STR
139 | json_response = json.loads(feed_response.data.decode('utf-8'))
140 | return const.FAILURE_CODE, json_response.get('errors')
141 |
142 | def __get_query_parameters_and_base_url(self):
143 | # Base URL
144 | base_url = self.__find_base_url()
145 | base_url = base_url + str(FeedType.ITEM)
146 | # Common query parameter
147 | fields = {const.QUERY_CATEGORY_ID: self.category_id}
148 | # Snapshot feed
149 | if self.feed_type == str(FeedType.SNAPSHOT):
150 | fields.update({const.QUERY_SNAPSHOT_DATE: self.feed_date})
151 | base_url = const.FEED_API_PROD_URL + str(FeedType.SNAPSHOT)
152 | return fields, base_url
153 | # Daily or bootstrap feed
154 | if self.feed_scope == str(FeedScope.DAILY):
155 | fields.update({const.QUERY_SCOPE: self.feed_scope,
156 | const.QUERY_DATE: self.feed_date})
157 | elif self.feed_scope == str(FeedScope.BOOTSTRAP):
158 | fields.update({const.QUERY_SCOPE: self.feed_scope})
159 | return fields, base_url
160 |
161 | def __find_base_url(self):
162 | if self.environment.lower() == str(Environment.PRODUCTION):
163 | return const.FEED_API_PROD_URL
164 | return const.FEED_API_SANDBOX_URL
165 |
166 | def __find_max_chunk_size(self):
167 | if self.environment.lower() == str(Environment.PRODUCTION):
168 | return const.PROD_CHUNK_SIZE
169 | return const.SANDBOX_CHUNK_SIZE
170 |
171 | def __generate_file_name(self):
172 | if str(FeedScope.BOOTSTRAP) == self.feed_scope:
173 | feed_prefix = str(FeedPrefix.BOOTSTRAP)
174 | elif str(FeedScope.DAILY) == self.feed_scope:
175 | feed_prefix = str(FeedPrefix.DAILY)
176 | else:
177 | raise InputDataError('Unknown feed scope', self.feed_scope)
178 | file_name = str(FeedType.ITEM) + '_' + feed_prefix + '_' + str(self.category_id) + '_' + self.feed_date + \
179 | '_' + self.marketplace_id + file_utils.get_extension(self.file_format)
180 | return file_name
181 |
--------------------------------------------------------------------------------
/feed_cli.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | import time
19 | import logging
20 | import argparse
21 | from enums.feed_enums import FeedType
22 | from feed.feed_request import Feed
23 | from filter.feed_filter import FeedFilterRequest
24 | from constants.feed_constants import SUCCESS_CODE
25 | from utils.logging_utils import setup_logging
26 |
27 | setup_logging()
28 | logger = logging.getLogger(__name__)
29 |
30 | parser = argparse.ArgumentParser(prog='FeedSDK', description='Feed SDK CLI')
31 |
32 | # date
33 | parser.add_argument('-dt', help='the date when feed file was generated')
34 | # l1 category
35 | parser.add_argument('-c1', help='the l1 category id of the feed file', required=True)
36 | # scope
37 | parser.add_argument('-scope', help='the feed scope. Available scopes are ALL_ACTIVE or NEWLY_LISTED',
38 | choices=['ALL_ACTIVE', 'NEWLY_LISTED'], default='NEWLY_LISTED')
39 | # marketplace
40 | parser.add_argument('-mkt', help='the marketplace id for which feed is being requested. For example - EBAY_US',
41 | default='EBAY_US')
42 | # token
43 | parser.add_argument('-token', help='the oauth token for the consumer. Omit the word \'Bearer\'')
44 | # environment
45 | parser.add_argument('-env', help='environment type. Supported Environments are SANDBOX and PRODUCTION',
46 | choices=['SANDBOX', 'PRODUCTION'])
47 |
48 | # options for filtering the files
49 | parser.add_argument('-lf', nargs='+', help='list of leaf categories which are used to filter the feed')
50 | parser.add_argument('-sellerf', nargs='+', help='list of seller names which are used to filter the feed')
51 | parser.add_argument('-locf', nargs='+', help='list of item locations which are used to filter the feed')
52 | parser.add_argument('-pricelf', type=float, help='lower limit of the price range for items in the feed')
53 | parser.add_argument('-priceuf', type=float, help='upper limit of the price range for items in the feed')
54 | parser.add_argument('-epidf', nargs='+', help='list of epids which are used to filter the feed')
55 | parser.add_argument('-iepidf', nargs='+', help='list of inferred epids which are used to filter the feed')
56 | parser.add_argument('-gtinf', nargs='+', help='list of gtins which are used to filter the feed')
57 | parser.add_argument('-itemf', nargs='+', help='list of item IDs which are used to filter the feed')
58 | # file location
59 | parser.add_argument('-dl', '--downloadlocation', help='override for changing the directory where files are downloaded')
60 | parser.add_argument('--filteronly', help='filter the feed file that already exists in the default path or the path '
61 | 'specified by -dl, --downloadlocation option. If --filteronly option is not '
62 | 'specified, the feed file will be downloaded again', action="store_true")
63 | # file format
64 | parser.add_argument('-format', help='feed and filter file format. Default is gzip', default='gzip')
65 |
66 | # any query to filter the feed file
67 | parser.add_argument('-qf', help='any other query to filter the feed file. See Python dataframe query format')
68 |
69 | # parse the arguments
70 | args = parser.parse_args()
71 |
72 |
73 | start = time.time()
74 | if args.filteronly:
75 | # create the filtered file
76 | feed_filter_obj = FeedFilterRequest(args.downloadlocation, args.itemf, args.lf, args.sellerf, args.gtinf,
77 | args.epidf, args.pricelf, args.priceuf, args.locf, args.iepidf, args.qf,
78 | args.format)
79 | filter_response = feed_filter_obj.filter()
80 | if filter_response.status_code != SUCCESS_CODE:
81 | print(filter_response.message)
82 |
83 | else:
84 | # download the feed file if --filteronly option is not set
85 | feed_obj = Feed(FeedType.ITEM.value, args.scope, args.c1, args.mkt, args.token, args.dt, args.env,
86 | args.downloadlocation, args.format)
87 | get_response = feed_obj.get()
88 | if get_response.status_code != SUCCESS_CODE:
89 | logger.error('Exception in downloading feed. Cannot proceed\nFile path: %s\n Error message: %s\n',
90 | get_response.file_path, get_response.message)
91 | else:
92 | # create the filtered file
93 | feed_filter_obj = FeedFilterRequest(get_response.file_path, args.itemf, args.lf, args.sellerf, args.gtinf,
94 | args.epidf, args.pricelf, args.priceuf, args.locf, args.iepidf, args.qf,
95 | args.format)
96 | filter_response = feed_filter_obj.filter()
97 | if filter_response.status_code != SUCCESS_CODE:
98 | print(filter_response.message)
99 | end = time.time()
100 | logger.info('Execution time (s): %s', str(round(end - start, 3)))
101 | print('Execution time (s): %s' % str(round(end - start, 3)))
102 |
103 |
104 |
105 |
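
For reference, a hypothetical invocation of this CLI; the token is a placeholder, and all flags are the argparse options defined above:

    python feed_cli.py -c1 220 -scope ALL_ACTIVE -mkt EBAY_US -env PRODUCTION -token <oauth-token> -pricelf 10.0 -priceuf 100.0 -dl /tmp/feeds
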
--------------------------------------------------------------------------------
/filter/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = [
2 | 'feed_filter'
3 | ]
4 |
--------------------------------------------------------------------------------
/filter/feed_filter.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | import time
19 | import logging
20 | import pandas as pd
21 | from os import remove
22 | from sqlalchemy import create_engine
23 | from collections import namedtuple
24 | from os.path import split, abspath, join, isfile
25 | from utils import filter_utils
26 | from utils.file_utils import get_extension
27 |
28 | from enums.feed_enums import FeedColumn
29 | from enums.file_enums import FileEncoding, FileFormat
30 | import constants.feed_constants as const
31 | from utils.logging_utils import setup_logging
32 |
33 | setup_logging()
34 | logger = logging.getLogger(__name__)
35 |
36 | Response = namedtuple('Response', 'status_code message file_path applied_filters')
37 | GetFeedResponse = namedtuple('GetFeedResponse', Response._fields + ('errors',))
38 |
39 | BOOL_COLUMNS = {'ImageAlteringProhibited', 'ReturnsAccepted'}
40 | # use float64 for integer columns as a workaround for NaN values
41 | FLOAT_COLUMNS = {'AvailabilityThreshold', 'EstimatedAvailableQuantity',
42 | 'PriceValue', 'ReturnPeriodValue'}
43 | IGNORE_COLUMNS = {'AdditionalImageUrls', 'ImageUrl', 'Title'}
44 |
45 | DB_FILE_NAME = 'sqlite_feed_sdk.db'
46 | DB_TABLE_NAME = 'feed'
47 |
48 |
49 | class FeedFilterRequest(object):
50 | def __init__(self, input_file_path, item_ids=None, leaf_category_ids=None, seller_names=None, gtins=None,
51 | epids=None, price_lower_limit=None, price_upper_limit=None, item_location_countries=None,
52 | inferred_epids=None, any_query=None, compression_type=FileFormat.GZIP.value, separator='\t',
53 | encoding=FileEncoding.UTF8.value, rows_chunk_size=const.DATA_FRAME_CHUNK_SIZE):
54 | self.input_file_path = input_file_path
55 | self.item_ids = item_ids
56 | self.leaf_category_ids = leaf_category_ids
57 | self.seller_names = seller_names
58 | self.gtins = gtins
59 | self.epids = epids
60 | self.price_lower_limit = price_lower_limit
61 | self.price_upper_limit = price_upper_limit
62 | self.item_location_countries = item_location_countries
63 | self.inferred_epids = inferred_epids
64 | self.any_query = '(%s)' % any_query if any_query else None
65 | self.compression_type = compression_type if compression_type else FileFormat.GZIP.value
66 | self.separator = separator if separator else '\t'
67 | self.encoding = encoding if encoding else FileEncoding.UTF8.value
68 | self.rows_chunk_size = rows_chunk_size if rows_chunk_size else const.DATA_FRAME_CHUNK_SIZE
69 | self.__filtered_file_path = None
70 | self.__number_of_records = 0
71 | self.__number_of_filtered_records = 0
72 | self.__queries = []
73 |
74 | def __str__(self):
75 | return '[input_file_path= %s, item_ids= %s, leaf_category_ids= %s, seller_names= %s, gtins= %s, ' \
76 | 'epids= %s, price_lower_limit= %s, price_upper_limit= %s, item_location_countries= %s, ' \
77 | 'inferred_epids= %s, any_query= %s, compression_type= %s, separator= %s, encoding= %s]' % \
78 | (self.input_file_path,
79 | self.item_ids,
80 | self.leaf_category_ids,
81 | self.seller_names,
82 | self.gtins,
83 | self.epids,
84 | self.price_lower_limit,
85 | self.price_upper_limit,
86 | self.item_location_countries,
87 | self.inferred_epids,
88 | self.any_query,
89 | self.compression_type,
90 | self.separator,
91 | self.encoding)
92 |
93 | @property
94 | def filtered_file_path(self):
95 | return self.__filtered_file_path
96 |
97 | @property
98 | def number_of_records(self):
99 | return self.__number_of_records
100 |
101 | @property
102 | def number_of_filtered_records(self):
103 | return self.__number_of_filtered_records
104 |
105 | @property
106 | def queries(self):
107 | return self.__queries
108 |
109 | def __append_query(self, query_str):
110 | if query_str:
111 | self.__queries.append(query_str)
112 |
113 | def filter(self, column_name_list=None, keep_db=False):
114 | logger.info('Filtering... \nInput file: %s', self.input_file_path)
115 |
116 | self.__append_query(self.any_query)
117 | self.__append_query(filter_utils.get_list_string_element_query(FeedColumn.ITEM_ID, self.item_ids))
118 | self.__append_query(filter_utils.get_list_string_element_query(FeedColumn.CATEGORY_ID, self.leaf_category_ids))
119 | self.__append_query(filter_utils.get_list_string_element_query(FeedColumn.SELLER_USERNAME, self.seller_names))
120 | self.__append_query(filter_utils.get_list_string_element_query(FeedColumn.GTIN, self.gtins))
121 | self.__append_query(filter_utils.get_list_string_element_query(FeedColumn.EPID, self.epids))
122 | self.__append_query(filter_utils.get_inclusive_greater_query(FeedColumn.PRICE_VALUE, self.price_lower_limit))
123 | self.__append_query(filter_utils.get_inclusive_less_query(FeedColumn.PRICE_VALUE, self.price_upper_limit))
124 | self.__append_query(filter_utils.get_list_string_element_query(FeedColumn.INFERRED_EPID, self.inferred_epids))
125 | self.__append_query(filter_utils.get_list_string_element_query(FeedColumn.ITEM_LOCATION_COUNTRIES,
126 | self.item_location_countries))
127 | query_str = None
128 | if self.__queries:
129 | query_str = ' AND '.join(self.__queries)
130 | if not self.input_file_path or not isfile(self.input_file_path):
131 | return Response(const.FAILURE_CODE,
132 | 'Input file is a directory or does not exist. Cannot filter. Aborting...',
133 | self.filtered_file_path, self.queries)
134 | if not query_str:
135 | return Response(const.FAILURE_CODE, 'No filters have been specified. Cannot filter. Aborting...',
136 | self.filtered_file_path, self.queries)
137 | # create the data frame
138 | filtered_data = self.__read_chunks_gzip_file(query_str, column_name_list, keep_db)
139 | if not filtered_data.empty:
140 | self.__save_filtered_data_frame(filtered_data)
141 | else:
142 | logger.error('No filtered feed file created')
143 | return Response(const.SUCCESS_CODE, const.SUCCESS_STR, self.filtered_file_path, self.queries)
144 |
145 | def __derive_filtered_file_path(self):
146 | file_path, full_file_name = split(abspath(self.input_file_path))
147 | file_name = full_file_name.split('.')[0]
148 | time_milliseconds = int(time.time() * 1000)
149 | filtered_file_path = join(file_path, file_name + '-filtered-' + str(time_milliseconds) +
150 | get_extension(self.compression_type))
151 | return filtered_file_path
152 |
153 | def __read_chunks_gzip_file(self, query_str, column_name_list, keep_db):
154 | disk_engine = create_engine('sqlite:///'+DB_FILE_NAME)
155 | chunk_num = 0
156 | columns_to_process, data_types = self.__get_cols_and_type_dict()
157 | cols = column_name_list if column_name_list else columns_to_process
158 | start = time.time()
159 | for chunk_df in pd.read_csv(self.input_file_path, header=0,
160 | compression=self.compression_type, encoding=self.encoding, usecols=cols,
161 | sep=self.separator, quotechar='"', lineterminator='\n', skip_blank_lines=True,
162 | skipinitialspace=True, error_bad_lines=False, index_col=False,
163 | chunksize=self.rows_chunk_size, dtype=data_types,
164 | converters={'AvailabilityThreshold': filter_utils.convert_to_float_max_int,
165 | 'EstimatedAvailableQuantity': filter_utils.convert_to_float_max_int,
166 | 'PriceValue': filter_utils.convert_to_float_zero,
167 | 'ReturnPeriodValue': filter_utils.convert_to_float_zero,
168 | 'ImageAlteringProhibited': filter_utils.convert_to_bool_false,
169 | 'ReturnsAccepted': filter_utils.convert_to_bool_false}):
170 | self.__number_of_records = self.__number_of_records + len(chunk_df.index)
171 | chunk_num = chunk_num + 1
172 | chunk_df.to_sql(DB_TABLE_NAME, disk_engine, if_exists='append', index=False)
173 | execution_time = time.time() - start
174 | logger.info('Loaded %s records in %s (s) %s (m)', self.__number_of_records, str(round(execution_time, 3)),
175 | str(round(execution_time / 60, 3)))
176 | # apply query
177 |         sql_string = '''SELECT * FROM %s WHERE %s ''' % (DB_TABLE_NAME, query_str)
178 |
179 | start = time.time()
180 | query_result_df = pd.read_sql_query(sql_string, disk_engine)
181 | execution_time = time.time() - start
182 | self.__number_of_filtered_records = len(query_result_df.index)
183 | logger.info('Filtered %s records in %s (s) %s (m)', self.number_of_filtered_records,
184 | str(round(execution_time, 3)),
185 | str(round(execution_time / 60, 3)))
186 | # remove the created db file
187 | if not keep_db:
188 | remove(DB_FILE_NAME)
189 | return query_result_df
190 |
191 | def __save_filtered_data_frame(self, data_frame):
192 | self.__filtered_file_path = self.__derive_filtered_file_path()
193 | start = time.time()
194 | data_frame.to_csv(self.__filtered_file_path, sep=self.separator, na_rep='', header=True, index=False, mode='w',
195 | encoding=self.encoding, compression=self.compression_type, quotechar='"',
196 | line_terminator='\n', doublequote=True, escapechar='\\', decimal='.')
197 | execution_time = time.time() - start
198 | logger.info('Saved %s records in %s (s) %s (m)', self.number_of_filtered_records,
199 | str(round(execution_time, 3)),
200 | str(round(execution_time / 60, 3)))
201 |
202 | def __get_cols_and_type_dict(self):
203 | all_columns = pd.read_csv(self.input_file_path, nrows=1, sep=self.separator,
204 | compression=self.compression_type).columns.tolist()
205 | type_dict = {}
206 | cols = []
207 | for col_name in all_columns:
208 |             # Skip these columns: their values may contain comma characters that break the parser
209 | if col_name in IGNORE_COLUMNS:
210 | continue
211 | else:
212 | cols.append(col_name)
213 | if col_name not in BOOL_COLUMNS and col_name not in FLOAT_COLUMNS:
214 | type_dict[col_name] = 'object'
215 | return cols, type_dict
216 |
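
As a quick illustration, a minimal sketch of using FeedFilterRequest programmatically; the input path is a placeholder, and the keyword arguments follow the constructor defined above:

    from filter.feed_filter import FeedFilterRequest
    from constants.feed_constants import SUCCESS_CODE

    # '/path/to/feed_file.gz' stands in for a previously downloaded feed file
    filter_request = FeedFilterRequest('/path/to/feed_file.gz',
                                       item_location_countries=['US', 'CA'],
                                       price_lower_limit=10.0,
                                       price_upper_limit=100.0)
    filter_response = filter_request.filter()
    if filter_response.status_code == SUCCESS_CODE:
        print('Filtered file: %s' % filter_request.filtered_file_path)
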
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | urllib3==1.26.5
2 | certifi==2019.3.9
3 | aenum==2.1.2
4 | pandas==0.24.2
5 | SQLAlchemy==1.3.3
6 |
--------------------------------------------------------------------------------
/sample-config/config-file-download:
--------------------------------------------------------------------------------
1 | {
2 | "requests": [
3 | {
4 | "feedRequest": {
5 | "categoryId": "220",
6 | "marketplaceId": "EBAY_US",
7 | "feedScope": "ALL_ACTIVE",
8 | "type": "ITEM",
9 | "downloadLocation": "",
10 | "fileFormat": "gzip"
11 | }
12 | },
13 | {
14 | "feedRequest": {
15 | "categoryId": "11450",
16 | "marketplaceId": "EBAY_DE",
17 | "feedScope": "NEWLY_LISTED",
18 | "date": "20190127",
19 | "type": "ITEM"
20 | }
21 | }
22 | ]
23 | }
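
A config file like this one is typically consumed through ConfigFileRequest; the following is a minimal sketch based on the usage shown in tests/test_config_request.py, with the path and token as placeholders:

    from config.config_request import ConfigFileRequest

    cr = ConfigFileRequest('sample-config/config-file-download')
    cr.parse_requests('Bearer v^1...')  # the OAuth token is applied to every feed request
    for request in cr.requests:
        # assuming feed_obj is a Feed instance, as the tests imply, it can be downloaded via get()
        if request.feed_obj:
            request.feed_obj.get()
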
--------------------------------------------------------------------------------
/sample-config/config-file-download-filter:
--------------------------------------------------------------------------------
1 | {
2 | "requests": [
3 | {
4 | "feedRequest": {
5 | "categoryId": "550",
6 | "marketplaceId": "EBAY_US",
7 | "feedScope": "ALL_ACTIVE",
8 | "type": "ITEM"
9 | },
10 | "filterRequest": {
11 | "sellerNames": [
12 | "patikaszop",
13 | "cherp-serge",
14 | "itemtrade"
15 | ],
16 | "itemLocationCountries": [
17 | "US",
18 | "HK",
19 | "CA"
20 | ],
21 | "priceLowerLimit": 10.0,
22 | "priceUpperLimit": 100.0
23 | }
24 | },
25 | {
26 | "feedRequest": {
27 | "categoryId": "260",
28 | "marketplaceId": "EBAY_GB",
29 | "feedScope": "ALL_ACTIVE",
30 | "type": "ITEM"
31 | },
32 | "filterRequest": {
33 | "leafCategoryIds": [
34 | "162057",
35 | "705"
36 | ],
37 | "priceUpperLimit": 10
38 | }
39 | },
40 | {
41 | "feedRequest": {
42 | "categoryId": "1281",
43 | "marketplaceId": "EBAY_US",
44 | "feedScope": "ALL_ACTIVE",
45 | "type": "ITEM"
46 | },
47 | "filterRequest": {
48 | "anyQuery": "AcceptedPaymentMethods='PAYPAL'"
49 | },
50 | "itemLocationCountries": [
51 | "CA"
52 | ]
53 | },
54 | {
55 | "feedRequest": {
56 | "categoryId": "11232",
57 | "marketplaceId": "EBAY_DE",
58 | "date": "20180708",
59 | "feedScope": "ALL_ACTIVE",
60 | "type": "ITEM"
61 | },
62 | "filterRequest": {
63 | "epids": [
64 | "216949221",
65 | "3927841"
66 | ],
67 | "inferredEpids": [
68 | "216949221",
69 | "3927841"
70 | ]
71 | }
72 | },
73 | {
74 | "feedRequest": {
75 | "categoryId": "220",
76 | "marketplaceId": "EBAY_US",
77 | "date": "20190304",
78 | "feedScope": "NEWLY_LISTED",
79 | "type": "ITEM"
80 | },
81 | "filterRequest": {
82 | "leafCategoryIds": [
83 | "122569",
84 | "2537",
85 | "34061",
86 | "2624"
87 | ],
88 | "itemLocationCountries": [
89 | "US"
90 | ],
91 | "priceLowerLimit": 10.0,
92 | "priceUpperLimit": 140.0
93 | }
94 | }
95 | ]
96 | }
--------------------------------------------------------------------------------
/sample-config/config-file-filter:
--------------------------------------------------------------------------------
1 | {
2 | "requests": [
3 | {
4 | "filterRequest": {
5 | "inputFilePath": "",
6 | "leafCategoryIds": [
7 | "112529",
8 | "64619",
9 | "111694"
10 | ],
11 | "itemLocationCountries": [
12 | "DE",
13 | "GB",
14 | "ES"
15 | ],
16 | "anyQuery": "AvailabilityThresholdType='MORE_THAN' AND AvailabilityThreshold=10",
17 | "fileFormat" : "gzip"
18 | }
19 | }
20 | ]
21 | }
--------------------------------------------------------------------------------
/sample-config/config-file-query-only:
--------------------------------------------------------------------------------
1 | {
2 | "requests": [
3 | {
4 | "filterRequest": {
5 | "inputFilePath": "",
6 | "anyQuery": "AvailabilityThresholdType='MORE_THAN' AND AvailabilityThreshold=10"
7 | }
8 | }
9 | ]
10 | }
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/eBay/FeedSDK-Python/a225421b691803f027067721ad779e44d7647580/tests/__init__.py
--------------------------------------------------------------------------------
/tests/test-data/test_config:
--------------------------------------------------------------------------------
1 | {
2 | "requests": [
3 | {
4 | "feedRequest": {
5 | "categoryId": "260",
6 | "marketplaceId": "EBAY_US",
7 | "feedScope": "ALL_ACTIVE",
8 | "type": "ITEM"
9 | },
10 | "filterRequest": {
11 | "itemLocationCountries": [
12 | "US",
13 | "HK",
14 | "CA"
15 | ],
16 | "priceLowerLimit": 10.0,
17 | "priceUpperLimit": 100.0
18 | }
19 | },
20 | {
21 | "feedRequest": {
22 | "categoryId": "220",
23 | "marketplaceId": "EBAY_US",
24 | "date": "20190127",
25 | "feedScope": "NEWLY_LISTED",
26 | "type": "ITEM"
27 | }
28 | },
29 | {
30 | "filterRequest": {
31 | "inputFilePath": "/Users/[USER]/Desktop/sdk/test_bootstrap.gz",
32 | "leafCategoryIds": [
33 | "112529",
34 | "64619",
35 | "111694"
36 | ],
37 | "itemLocationCountries": [
38 | "DE",
39 | "GB",
40 | "ES"
41 | ],
42 | "anyQuery": "AvailabilityThresholdType='MORE_THAN' AND AvailabilityThreshold=10",
43 | "fileFormat" : "gzip"
44 | }
45 | }
46 | ]
47 | }
--------------------------------------------------------------------------------
/tests/test-data/test_json:
--------------------------------------------------------------------------------
1 | {
2 | "requests": [
3 | {
4 | "feedRequest": {
5 | "categoryId": "220",
6 | "marketplaceId": "EBAY_US",
7 | "feedScope": "ALL_ACTIVE",
8 | "type": "ITEM",
9 | "downloadLocation": "/Users/[USER]/Desktop/sdk",
10 | "fileFormat": "gzip"
11 | }
12 | }
13 | ]
14 | }
--------------------------------------------------------------------------------
/tests/test_config_request.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | from enums.file_enums import FileFormat
3 | from enums.feed_enums import FeedScope
4 | from config.config_request import ConfigFileRequest
5 | from feed.feed_request import DEFAULT_DOWNLOAD_LOCATION
6 |
7 |
8 | class TestConfigRequest(unittest.TestCase):
9 | def test_parse_requests(self):
10 | cr = ConfigFileRequest('../tests/test-data/test_config')
11 | cr.parse_requests('Bearer v^1...')
12 | self.assertIsNotNone(cr.requests)
13 | self.assertEqual(len(cr.requests), 3)
14 |
15 | # first request has both feed and filter requests
16 | feed_req = cr.requests[0].feed_obj
17 | filter_req = cr.requests[0].filter_request_obj
18 | self.assertIsNotNone(feed_req)
19 | self.assertIsNotNone(filter_req)
20 | self.assertEqual(feed_req.category_id, u'260')
21 | self.assertEqual(filter_req.price_lower_limit, 10)
22 |
23 | # second request has a feed request only
24 | self.assertIsNone(cr.requests[1].filter_request_obj)
25 | feed_req = cr.requests[1].feed_obj
26 | self.assertIsNotNone(feed_req)
27 | self.assertIsNotNone(feed_req.token)
28 | self.assertEqual(feed_req.category_id, u'220')
29 | self.assertEqual(feed_req.marketplace_id, u'EBAY_US')
30 | self.assertEqual(feed_req.feed_date, '20190127')
31 | self.assertEqual(feed_req.feed_scope, FeedScope.DAILY.value)
32 | self.assertEqual(feed_req.download_location, DEFAULT_DOWNLOAD_LOCATION)
33 |
34 | # third request has a filter request only
35 | self.assertIsNone(cr.requests[2].feed_obj)
36 | filter_req = cr.requests[2].filter_request_obj
37 | self.assertIsNotNone(filter_req)
38 | self.assertEqual(filter_req.input_file_path, '/Users/[USER]/Desktop/sdk/test_bootstrap.gz')
39 | self.assertEqual(filter_req.leaf_category_ids, ['112529', '64619', '111694'])
40 | self.assertEqual(filter_req.item_location_countries, ['DE', 'GB', 'ES'])
41 | self.assertEqual(filter_req.any_query,
42 | '(AvailabilityThresholdType=\'MORE_THAN\' AND AvailabilityThreshold=10)')
43 | self.assertEqual(filter_req.compression_type, FileFormat.GZIP.value)
44 |
45 |
46 | if __name__ == '__main__':
47 | unittest.main()
48 |
--------------------------------------------------------------------------------
/tests/test_date_utils.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | from utils import date_utils
3 | from datetime import datetime
4 | from enums.feed_enums import FeedType
5 | from errors.custom_exceptions import InputDataError
6 |
7 |
8 | class TestDateUtils(unittest.TestCase):
9 | def test_get_formatted_date(self):
10 | today_date = date_utils.get_formatted_date(FeedType.ITEM)
11 | try:
12 | datetime.strptime(today_date, '%Y%m%d')
13 | except ValueError:
14 | self.fail('Invalid date format: %s' % today_date)
15 |
16 | def test_validate_date_exception(self):
17 | with self.assertRaises(InputDataError):
18 | date_utils.validate_date('2019/02/01', FeedType.ITEM)
19 |
20 | def test_validate_date(self):
21 | date_utils.validate_date('20190201', FeedType.ITEM)
22 |
23 |
24 | if __name__ == '__main__':
25 | unittest.main()
26 |
--------------------------------------------------------------------------------
/tests/test_feed_filter.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | from os import remove
3 | from os.path import isfile
4 | from filter.feed_filter import FeedFilterRequest
5 | from enums.file_enums import FileFormat, FileEncoding
6 | from constants.feed_constants import DATA_FRAME_CHUNK_SIZE, SUCCESS_CODE, FAILURE_CODE
7 |
8 |
9 | class TestFeedFilter(unittest.TestCase):
10 | @classmethod
11 | def setUpClass(cls):
12 | cls.test_file_path = '../tests/test-data/test_bootstrap_feed_260_3'
13 | cls.test_any_query = 'AvailabilityThresholdType=\'MORE_THAN\' AND AvailabilityThreshold=10'
14 |
15 | def test_default_values(self):
16 | filter_request = FeedFilterRequest(self.test_file_path)
17 | self.assertIsNone(filter_request.item_ids)
18 | self.assertIsNone(filter_request.leaf_category_ids)
19 | self.assertIsNone(filter_request.seller_names)
20 | self.assertIsNone(filter_request.gtins)
21 | self.assertIsNone(filter_request.epids)
22 | self.assertIsNone(filter_request.price_upper_limit)
23 | self.assertIsNone(filter_request.price_lower_limit)
24 | self.assertIsNone(filter_request.item_location_countries)
25 | self.assertIsNone(filter_request.inferred_epids)
26 | self.assertIsNone(filter_request.item_location_countries)
27 | self.assertIsNone(filter_request.any_query)
28 | self.assertIsNone(filter_request.filtered_file_path)
29 | self.assertEqual(filter_request.compression_type, FileFormat.GZIP.value)
30 | self.assertEqual(filter_request.separator, '\t')
31 | self.assertEqual(filter_request.encoding, FileEncoding.UTF8.value)
32 | self.assertEqual(filter_request.rows_chunk_size, DATA_FRAME_CHUNK_SIZE)
33 | self.assertEqual(filter_request.number_of_records, 0)
34 | self.assertEqual(filter_request.number_of_filtered_records, 0)
35 | self.assertEqual(len(filter_request.queries), 0)
36 |
37 | def test_any_query_format(self):
38 | filter_request = FeedFilterRequest(self.test_file_path, any_query=self.test_any_query)
39 | self.assertEqual(filter_request.any_query, '(' + self.test_any_query + ')')
40 |
41 | def test_none_file_path(self):
42 | filter_request = FeedFilterRequest(None)
43 | filter_response = filter_request.filter()
44 | self.assertEqual(filter_response.status_code, FAILURE_CODE)
45 | self.assertIsNotNone(filter_response.message)
46 | self.assertIsNone(filter_response.file_path)
47 | self.assertEqual(len(filter_response.applied_filters), 0)
48 |
49 | def test_dir_file_path(self):
50 | filter_request = FeedFilterRequest('../tests/test-data')
51 | filter_response = filter_request.filter()
52 | self.assertEqual(filter_response.status_code, FAILURE_CODE)
53 | self.assertIsNotNone(filter_response.message)
54 | self.assertIsNone(filter_response.file_path)
55 | self.assertEqual(len(filter_response.applied_filters), 0)
56 |
57 | def test_no_query(self):
58 | filter_request = FeedFilterRequest(self.test_file_path)
59 | filter_response = filter_request.filter()
60 | self.assertEqual(filter_response.status_code, FAILURE_CODE)
61 | self.assertIsNotNone(filter_response.message)
62 | self.assertIsNone(filter_response.file_path)
63 | self.assertEqual(len(filter_response.applied_filters), 0)
64 |
65 | def test_apply_filters(self):
66 | filter_request = FeedFilterRequest(self.test_file_path, price_upper_limit=10, any_query=self.test_any_query)
67 | filter_response = filter_request.filter(keep_db=False)
68 | self.assertEqual(filter_response.status_code, SUCCESS_CODE)
69 | self.assertIsNotNone(filter_response.message)
70 |
71 | self.assertEqual(len(filter_request.queries), 2)
72 | self.assertEqual(len(filter_response.applied_filters), 2)
73 |
74 | self.assertTrue(filter_request.number_of_records > 0)
75 | self.assertTrue(filter_request.number_of_filtered_records > 0)
76 |
77 | self.assertIsNotNone(filter_request.filtered_file_path)
78 | self.assertTrue(isfile(filter_request.filtered_file_path))
79 | self.assertEqual(filter_request.filtered_file_path, filter_response.file_path)
80 | # clean up
81 | remove(filter_request.filtered_file_path)
82 |
83 |
84 | if __name__ == '__main__':
85 | unittest.main()
86 |
--------------------------------------------------------------------------------
/tests/test_feed_request.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | from os import remove
3 | from os.path import isfile, getsize, split, abspath
4 | from utils.date_utils import get_formatted_date
5 | from enums.file_enums import FileFormat
6 | from enums.feed_enums import FeedType, FeedScope, FeedPrefix, Environment
7 | from feed.feed_request import Feed, DEFAULT_DOWNLOAD_LOCATION
8 | from constants.feed_constants import SUCCESS_CODE, FAILURE_CODE, PROD_CHUNK_SIZE
9 |
10 |
11 | class TestFeed(unittest.TestCase):
12 |
13 | @classmethod
14 | def setUpClass(cls):
15 | cls.test_token = 'Bearer v^1 ...'
16 | cls.test_category_1 = '1'
17 | cls.test_category_2 = '625'
18 | cls.test_marketplace = 'EBAY_US'
19 | cls.file_paths = []
20 |
21 | @classmethod
22 | def tearDownClass(cls):
23 | for file_path in cls.file_paths:
24 | if file_path and isfile(file_path):
25 | remove(file_path)
26 |
27 | def test_none_token(self):
28 | feed_req_obj = Feed(FeedType.ITEM.value, FeedScope.BOOTSTRAP.value, '220', 'EBAY_US', None)
29 | get_response = feed_req_obj.get()
30 | self.assertEqual(get_response.status_code, FAILURE_CODE)
31 | self.assertIsNotNone(get_response.message)
32 | self.assertIsNone(get_response.file_path, 'file_path is not None in the response')
33 |
34 | def test_default_values(self):
35 | feed_req_obj = Feed(None, None, '220', 'EBAY_US', 'v^1 ...')
36 | self.assertEqual(feed_req_obj.feed_type, FeedType.ITEM.value)
37 | self.assertEqual(feed_req_obj.feed_scope, FeedScope.DAILY.value)
38 | self.assertTrue(feed_req_obj.token.startswith('Bearer'), 'Bearer is missing from token')
39 | self.assertEqual(feed_req_obj.feed_date, get_formatted_date(feed_req_obj.feed_type))
40 | self.assertEqual(feed_req_obj.environment, Environment.PRODUCTION.value)
41 | self.assertEqual(feed_req_obj.download_location, DEFAULT_DOWNLOAD_LOCATION)
42 | self.assertEqual(feed_req_obj.file_format, FileFormat.GZIP.value)
43 |
44 | def test_download_feed_invalid_path(self):
45 | feed_req_obj = Feed(FeedType.ITEM.value, FeedScope.BOOTSTRAP.value, '220', 'EBAY_US', 'Bearer v^1 ...',
46 | download_location='../tests/test-data/test_json')
47 | get_response = feed_req_obj.get()
48 | self.assertEqual(get_response.status_code, FAILURE_CODE)
49 | self.assertIsNotNone(get_response.message)
50 | self.assertIsNotNone(get_response.file_path, 'file_path is None in the response')
51 |
52 | def test_download_feed_invalid_date(self):
53 | feed_req_obj = Feed(FeedType.ITEM.value, FeedScope.BOOTSTRAP.value, '220', 'EBAY_US', 'Bearer v^1 ...',
54 | download_location='../tests/test-data/', feed_date='2019-02-01')
55 | get_response = feed_req_obj.get()
56 | self.assertEqual(get_response.status_code, FAILURE_CODE)
57 | self.assertIsNotNone(get_response.message)
58 | self.assertIsNotNone(get_response.file_path, 'file_path is None in the response')
59 |
60 | def test_download_feed_daily(self):
61 | test_date = get_formatted_date(FeedType.ITEM, -4)
62 | feed_req_obj = Feed(FeedType.ITEM.value, FeedScope.DAILY.value, self.test_category_1,
63 | self.test_marketplace, self.test_token, download_location='../tests/test-data/',
64 | feed_date=test_date)
65 | get_response = feed_req_obj.get()
66 | # store the file path for clean up
67 | self.file_paths.append(get_response.file_path)
68 | # assert the result
69 | self.assertEqual(get_response.status_code, SUCCESS_CODE)
70 | self.assertIsNotNone(get_response.message)
71 | self.assertIsNotNone(get_response.file_path, 'file_path is None')
72 | self.assertTrue(isfile(get_response.file_path), 'file_path is not pointing to a file. file_path: %s'
73 | % get_response.file_path)
74 | # check the file size and name
75 | self.assertTrue(getsize(get_response.file_path) > 0, 'feed file is empty. file_path: %s'
76 | % get_response.file_path)
77 | self.assertTrue(FeedPrefix.DAILY.value in get_response.file_path,
78 | 'feed file name does not have %s in it. file_path: %s' %
79 | (FeedPrefix.DAILY.value, get_response.file_path))
80 | file_dir, file_name = split(abspath(get_response.file_path))
81 | self.assertEqual(abspath(feed_req_obj.download_location), file_dir)
82 |
83 | def test_download_feed_daily_bad_request(self):
84 | # ask for a future feed file that does not exist
85 | test_date = get_formatted_date(FeedType.ITEM, 5)
86 | feed_req_obj = Feed(FeedType.ITEM.value, FeedScope.DAILY.value, self.test_category_1,
87 | self.test_marketplace, self.test_token, download_location='../tests/test-data/',
88 | feed_date=test_date)
89 | get_response = feed_req_obj.get()
90 | # store the file path for clean up
91 | self.file_paths.append(get_response.file_path)
92 | # assert the result
93 | self.assertEqual(get_response.status_code, FAILURE_CODE)
94 | self.assertIsNotNone(get_response.message)
95 | self.assertIsNotNone(get_response.file_path, 'file has not been created')
96 | self.assertTrue(isfile(get_response.file_path), 'file_path is not pointing to a file. file_path: %s'
97 | % get_response.file_path)
98 | # check the file size and name
99 |         self.assertTrue(getsize(get_response.file_path) == 0, 'feed file is not empty. file_path: %s'
100 | % get_response.file_path)
101 | self.assertTrue(FeedPrefix.DAILY.value in get_response.file_path,
102 | 'feed file name does not have %s in it. file_path: %s'
103 | % (FeedPrefix.DAILY.value, get_response.file_path))
104 | file_dir, file_name = split(abspath(get_response.file_path))
105 | self.assertEqual(abspath(feed_req_obj.download_location), file_dir)
106 |
107 | def test_download_feed_daily_multiple_calls(self):
108 | feed_req_obj = Feed(FeedType.ITEM.value, FeedScope.BOOTSTRAP.value, self.test_category_2,
109 | self.test_marketplace, self.test_token, download_location='../tests/test-data/')
110 | get_response = feed_req_obj.get()
111 | # store the file path for clean up
112 | self.file_paths.append(get_response.file_path)
113 | # assert the result
114 | self.assertEqual(get_response.status_code, SUCCESS_CODE)
115 | self.assertIsNotNone(get_response.message)
116 | self.assertIsNotNone(get_response.file_path, 'file has not been created')
117 | self.assertTrue(isfile(get_response.file_path), 'file_path is not pointing to a file. file_path: %s'
118 | % get_response.file_path)
119 | # check the file size and name
120 | self.assertTrue(getsize(get_response.file_path) > PROD_CHUNK_SIZE, 'feed file is less than %s. file_path: %s'
121 | % (PROD_CHUNK_SIZE, get_response.file_path))
122 | self.assertTrue(FeedPrefix.BOOTSTRAP.value in get_response.file_path,
123 | 'feed file name does not have %s in it. file_path: %s'
124 | % (FeedPrefix.BOOTSTRAP.value, get_response.file_path))
125 | file_dir, file_name = split(abspath(get_response.file_path))
126 | self.assertEqual(abspath(feed_req_obj.download_location), file_dir)
127 |
128 |
129 | if __name__ == '__main__':
130 | unittest.main()
131 |
--------------------------------------------------------------------------------
/tests/test_file_utils.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | import os
3 | import shutil
4 | from utils import file_utils
5 | from enums.file_enums import FileFormat
6 | from errors.custom_exceptions import FileCreationError, InputDataError
7 | from constants.feed_constants import SANDBOX_CHUNK_SIZE
8 |
9 |
10 | class TestFileUtils(unittest.TestCase):
11 | def test_append_response_to_file(self):
12 | test_binary_data = b'\x01\x02\x03\x04'
13 | test_file_path = '../tests/test-data/testFile1'
14 | try:
15 | with open(test_file_path, 'wb') as file_obj:
16 | # create and append to the file
17 | file_utils.append_response_to_file(file_obj, test_binary_data)
18 | except (IOError, FileCreationError) as exp:
19 | # clean up
20 | if os.path.isfile(test_file_path):
21 | os.remove(test_file_path)
22 | self.fail(repr(exp))
23 | # verify that the file is created
24 | self.assertTrue(os.path.isfile(test_file_path), 'test file has not been created')
25 | # verify that file size is not zero
26 | self.assertTrue(os.path.getsize(test_file_path) > 0, 'the test file is empty')
27 | # clean up
28 | os.remove(test_file_path)
29 |
30 | def test_append_response_to_file_exception(self):
31 | with self.assertRaises(FileCreationError):
32 | file_utils.append_response_to_file(None, b'\x01')
33 |
34 | def test_create_and_replace_binary_file_none(self):
35 | with self.assertRaises(FileCreationError):
36 | file_utils.create_and_replace_binary_file(None)
37 |
38 | def test_create_and_replace_binary_file_dir(self):
39 | with self.assertRaises(FileCreationError):
40 | test_dir = os.path.expanduser('~/Desktop')
41 | file_utils.create_and_replace_binary_file(test_dir)
42 |
43 | def test_create_and_replace_binary_file_exists(self):
44 | test_binary_data = b'\x01\x02\x03\x04'
45 | test_file_path = '../tests/test-data/testFile2'
46 | with open(test_file_path, 'wb') as file_obj:
47 | file_obj.write(test_binary_data)
48 | # verify that the file is created
49 | self.assertTrue(os.path.isfile(test_file_path), 'test file has not been created')
50 | # verify that file size is not zero
51 | self.assertTrue(os.path.getsize(test_file_path) > 0, 'the test file is empty')
52 | # create and replace
53 | file_utils.create_and_replace_binary_file(test_file_path)
54 | # verify that the file is created
55 | self.assertTrue(os.path.isfile(test_file_path), 'test file has not been created')
56 | # verify that file size is zero
57 | self.assertEqual(os.path.getsize(test_file_path), 0)
58 | # clean up
59 | os.remove(test_file_path)
60 |
61 | def test_create_and_replace_binary_file_not_exists(self):
62 | test_dir_to_be_created = '../tests/test-data/testDir'
63 | test_file_path = os.path.join(test_dir_to_be_created, 'testFile3')
64 | self.assertFalse(os.path.isfile(test_file_path), 'test file exists')
65 | # create and replace
66 | file_utils.create_and_replace_binary_file(test_file_path)
67 | # verify that the file is created
68 | self.assertTrue(os.path.isfile(test_file_path), 'test file has not been created')
69 | # verify that file size is zero
70 | self.assertEqual(os.path.getsize(test_file_path), 0)
71 | # clean up
72 | shutil.rmtree(test_dir_to_be_created)
73 |
74 | def test_find_next_range_none_range_header(self):
75 | next_range = file_utils.find_next_range(None, 100)
76 | self.assertEqual(next_range, 'bytes=0-100')
77 |
78 | def test_find_next_range_none_chunk(self):
79 | next_range = file_utils.find_next_range('0-1000/718182376', None)
80 | self.assertEqual(next_range, 'bytes=1001-%s' % (SANDBOX_CHUNK_SIZE + 1001))
81 |
82 | def test_find_next_range(self):
83 | next_range = file_utils.find_next_range('1001-2001/718182376', 1000)
84 | self.assertEqual(next_range, 'bytes=2002-3002')
85 |
86 | def test_get_file_extension_none(self):
87 | ext = file_utils.get_extension(None)
88 | self.assertEqual(ext, '')
89 |
90 | def test_get_file_extension(self):
91 | ext = file_utils.get_extension(FileFormat.GZIP.value)
92 | self.assertEqual(ext, '.gz')
93 |
94 | def test_get_file_name_dir(self):
95 | test_dir = os.path.expanduser('../feed-sdk/tests')
96 | returned_dir_name = file_utils.get_file_name(test_dir)
97 | self.assertEqual(returned_dir_name, 'tests')
98 |
99 | def test_get_file_name(self):
100 | test_dir = os.path.expanduser('../feed-sdk/tests/test_json')
101 | returned_file_name = file_utils.get_file_name(test_dir)
102 | self.assertEqual(returned_file_name, 'test_json')
103 |
104 | def test_get_file_name_none(self):
105 | with self.assertRaises(InputDataError):
106 | file_utils.get_file_name(None)
107 |
108 | def test_get_file_name_name(self):
109 | test_file_name = 'abc.txt'
110 | self.assertEqual(file_utils.get_file_name(test_file_name), test_file_name)
111 |
112 | def test_read_json(self):
113 | json_obj = file_utils.read_json('../tests/test-data/test_json')
114 | self.assertIsNotNone(json_obj)
115 | self.assertIsNotNone(json_obj.get('requests'))
116 |
117 |
118 | if __name__ == '__main__':
119 | unittest.main()
120 |
--------------------------------------------------------------------------------
/tests/test_filter_utils.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | import sys
3 | from utils import filter_utils
4 | from enums.feed_enums import FeedColumn
5 |
6 |
7 | class TestFilterUtils(unittest.TestCase):
8 | @classmethod
9 | def setUpClass(cls):
10 | cls.test_column_1 = FeedColumn.PRICE_VALUE
11 | cls.test_column_2 = FeedColumn.ITEM_LOCATION_COUNTRIES
12 |
13 | def test_get_inclusive_less_query_none(self):
14 | query_str = filter_utils.get_inclusive_less_query(self.test_column_1, None)
15 | self.assertEqual('', query_str, 'query is not an empty string')
16 |
17 | def test_get_inclusive_less_query_empty(self):
18 | query_str = filter_utils.get_inclusive_less_query(self.test_column_1, '')
19 | self.assertEqual('', query_str, 'query is not an empty string')
20 |
21 | def test_get_inclusive_less_query(self):
22 | query_str = filter_utils.get_inclusive_less_query(self.test_column_1, 10)
23 | expected_query = '%s <= 10' % self.test_column_1
24 | self.assertEqual(expected_query, query_str)
25 |
26 | def test_get_inclusive_greater_query_none(self):
27 | query_str = filter_utils.get_inclusive_greater_query(self.test_column_1, None)
28 | self.assertEqual('', query_str, 'query is not an empty string')
29 |
30 | def test_get_inclusive_greater_query_empty(self):
31 | query_str = filter_utils.get_inclusive_greater_query(self.test_column_1, '')
32 | self.assertEqual('', query_str, 'query is not an empty string')
33 |
34 | def test_get_inclusive_greater_query(self):
35 | query_str = filter_utils.get_inclusive_greater_query(self.test_column_1, 10)
36 | expected_query = '%s >= 10' % self.test_column_1
37 | self.assertEqual(expected_query, query_str)
38 |
39 | def test_get_list_number_element_query_none(self):
40 | query_str = filter_utils.get_list_number_element_query(self.test_column_2, None)
41 | self.assertEqual('', query_str, 'query is not an empty string')
42 |
43 | def test_get_list_number_element_query_empty(self):
44 | query_str = filter_utils.get_list_number_element_query(self.test_column_2, '')
45 | self.assertEqual('', query_str, 'query is not an empty string')
46 |
47 | def test_get_list_string_element_query_none(self):
48 | query_str = filter_utils.get_list_string_element_query(self.test_column_2, None)
49 | self.assertEqual('', query_str, 'query is not an empty string')
50 |
51 | def test_get_list_string_element_query_empty(self):
52 | query_str = filter_utils.get_list_string_element_query(self.test_column_2, '')
53 | self.assertEqual('', query_str, 'query is not an empty string')
54 |
55 | def test_get_list_number_element_query(self):
56 | query_str = filter_utils.get_list_number_element_query(self.test_column_2, [1, 2])
57 | expected_query = '%s IN (1,2)' % self.test_column_2
58 | self.assertEqual(expected_query, query_str)
59 |
60 | def test_get_list_string_element_query(self):
61 | query_str = filter_utils.get_list_string_element_query(self.test_column_2, ['CA', 'US'])
62 | expected_query = '%s IN (\'CA\',\'US\')' % self.test_column_2
63 | self.assertEqual(expected_query, query_str)
64 |
65 | def test_convert_to_bool_false_invalid(self):
66 | converted_bool = filter_utils.convert_to_bool_false('invalid')
67 | self.assertEqual(False, converted_bool)
68 |
69 | def test_convert_to_bool_false_true(self):
70 | converted_bool = filter_utils.convert_to_bool_false('True')
71 | self.assertEqual(True, converted_bool)
72 |
73 | def test_convert_to_bool_false_false(self):
74 | converted_bool = filter_utils.convert_to_bool_false('False')
75 | self.assertEqual(False, converted_bool)
76 |
77 |     def test_convert_to_float_max_int_invalid(self):
78 | converted_float = filter_utils.convert_to_float_max_int('invalid')
79 | self.assertEqual(sys.maxsize, converted_float)
80 |
81 |     def test_convert_to_float_max_int(self):
82 | converted_float = filter_utils.convert_to_float_max_int('1.2')
83 | self.assertEqual(1.2, converted_float)
84 |
85 |     def test_convert_to_float_zero_invalid(self):
86 | converted_float = filter_utils.convert_to_float_zero('invalid')
87 | self.assertEqual(0, converted_float)
88 |
89 |     def test_convert_to_float_zero(self):
90 | converted_float = filter_utils.convert_to_float_zero('1.2')
91 | self.assertEqual(1.2, converted_float)
92 |
93 |
94 | if __name__ == '__main__':
95 | unittest.main()
96 |
--------------------------------------------------------------------------------
/tests/test_logging_utils.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | import re
3 | import utils.logging_utils as logging_utils
4 |
5 |
6 | class TestLoggingUtils(unittest.TestCase):
7 | def test_log_file_name(self):
8 | self.assertIsNotNone(logging_utils.log_file_name)
9 | pattern = re.compile(logging_utils.LOG_FILE_NAME + '.\\d{4}-\\d{2}-\\d{2}' + logging_utils.LOG_FILE_EXTENSION)
10 | self.assertTrue(pattern.match(logging_utils.log_file_name),
11 | 'logging file name %s does not match the format' % logging_utils.log_file_name)
12 |
13 |
14 | if __name__ == '__main__':
15 | unittest.main()
16 |
--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = [
2 | 'date_utils',
3 | 'file_utils',
4 | 'filter_utils',
5 | 'logging_utils'
6 | ]
7 |
--------------------------------------------------------------------------------
/utils/date_utils.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | from datetime import datetime, timedelta
19 | from enums.feed_enums import FeedType
20 | from errors.custom_exceptions import InputDataError
21 |
22 |
23 | def get_formatted_date(feed_type, day_delta=None):
24 | """
25 |     :param feed_type: item or item_snapshot
26 |     :param day_delta: the day offset from today; defaults to 0
27 |     :return: date string, today plus day_delta, in the format required by feed_type
28 | """
29 | delta = day_delta if day_delta else 0
30 | date_obj = datetime.now() + timedelta(days=delta)
31 | if feed_type == str(FeedType.SNAPSHOT):
32 | # TODO: Fix the date format
33 | return date_obj.strftime('%Y-%m-%dT%H:%M:%SZ')
34 | else:
35 | return date_obj.strftime('%Y%m%d')
36 |
37 |
38 | def validate_date(feed_date, feed_type):
39 | """
40 | Validates the feed_date string format according to feed_type.
41 | :param feed_date: the date string feed is requested for
42 | :param feed_type: item or item_snapshot
43 |     :raise InputDataError: if the date string format is not correct
44 | """
45 | if feed_type == str(FeedType.SNAPSHOT):
46 | try:
47 | datetime.strptime(feed_date, '%Y-%m-%dT%H:%M:%SZ')
48 | except ValueError:
49 |             raise InputDataError('Bad feed date format. Date should be in UTC format (yyyy-MM-ddTHH:mm:ssZ)',
50 | feed_date)
51 | else:
52 | try:
53 | datetime.strptime(feed_date, '%Y%m%d')
54 | except ValueError:
55 | raise InputDataError('Bad feed date format. Date should be in yyyyMMdd format', feed_date)
56 |
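
A small usage sketch of these helpers; the exact output strings depend on the current date:

    from utils import date_utils
    from enums.feed_enums import FeedType

    today = date_utils.get_formatted_date(FeedType.ITEM)          # e.g. '20190201'
    yesterday = date_utils.get_formatted_date(FeedType.ITEM, -1)
    date_utils.validate_date('20190201', FeedType.ITEM)           # passes silently
    # date_utils.validate_date('2019/02/01', FeedType.ITEM)       # would raise InputDataError
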
--------------------------------------------------------------------------------
/utils/file_utils.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | import json
19 | from os import makedirs
20 | from os.path import isdir, basename, splitext, exists, dirname
21 | from errors import custom_exceptions
22 | import constants.feed_constants as const
23 |
24 |
25 | def append_response_to_file(file_handler, data):
26 | """
27 | Appends the given data to the existing file
28 | :param file_handler: the existing and open file object
29 | :param data: the data to be appended to the file
30 |     :raise FileCreationError: if there are any IO errors while writing to the file
31 | """
32 | try:
33 | file_handler.write(data)
34 | except (IOError, AttributeError) as exp:
35 | if file_handler:
36 | file_handler.close()
37 | raise custom_exceptions.FileCreationError('Error while writing in the file: %s' % repr(exp), data)
38 |
39 |
40 | def create_and_replace_binary_file(file_path):
41 | """
42 | Creates a binary file in the given path including the file name and extension
43 | If the file exists, it will be replaced
44 | :param file_path: The path to the file including the file name and extension
45 |     :raise FileCreationError: if the file cannot be created
46 | """
47 | try:
48 | if not exists(dirname(file_path)):
49 | makedirs(dirname(file_path))
50 | with open(file_path, 'wb'):
51 | pass
52 | except (IOError, OSError, AttributeError) as exp:
53 | raise custom_exceptions.FileCreationError('IO error in creating file %s: %s' % (file_path, repr(exp)),
54 | file_path)
55 |
56 |
57 | def find_next_range(content_range_header, chunk_size=const.SANDBOX_CHUNK_SIZE):
58 | """
59 | Finds the next value of the Range header
60 | :param content_range_header: The content-range header value returned in the response, ex. 0-1000/7181823761
61 | If None, the default Range header that is bytes=0-CHUNK_SIZE is returned
62 | :param chunk_size: The chunk size in bytes. If not provided, the default chunk size is used
63 | :return: The next value of the Range header in the format of bytes=lower-upper or empty string if no data is left
64 | :raise: If the input content-range value is not correct an InputDataError exception is raised
65 | """
66 | chunk = chunk_size if chunk_size else const.SANDBOX_CHUNK_SIZE
67 | if content_range_header is None:
68 | return const.RANGE_PREFIX + '0-' + str(chunk)
69 | else:
70 | try:
71 | # ex. content-range : 0-1000/7181823761
72 | range_components = content_range_header.split('/')
73 | total_size = int(range_components[1])
74 | bounds = range_components[0].split('-')
75 | upper_bound = int(bounds[1]) + 1
76 | if upper_bound > total_size:
77 | return ''
78 | return const.RANGE_PREFIX + str(upper_bound) + '-' + str(upper_bound + chunk)
79 | except Exception:
80 | raise custom_exceptions.InputDataError('Bad content-range header format: %s' % content_range_header,
81 | content_range_header)
82 |
83 |
84 | def get_extension(file_type):
85 | """
86 | Returns file extension including '.' according to the given file type
87 | :param file_type: format of the file such as gzip
88 | :return: extension of the file such as '.gz'
89 | """
90 | if not file_type:
91 | return ''
92 | if file_type.lower() == 'gz' or file_type.lower() == 'gzip':
93 | return '.gz'
94 |
95 |
96 | def get_file_name(name_or_path):
97 | """
98 | Finds name of the file from the given file path or name
99 | :param name_or_path: name or path to the file
100 | :return: file name
101 | """
102 | if not name_or_path:
103 | raise custom_exceptions.InputDataError('Bad file name or directory %s' % name_or_path, name_or_path)
104 | if isdir(name_or_path):
105 | base = basename(name_or_path)
106 |         return splitext(base)[0]  # return the name string, not the (root, ext) tuple
107 | elif '/' in name_or_path:
108 | return name_or_path[name_or_path.rfind('/') + 1:]
109 | else:
110 | return name_or_path
111 |
112 |
113 | def read_json(file_path):
114 | """
115 | Reads json from a file and returns a json object
116 | :param file_path: the path to the file
117 | :return: a json object
118 | """
119 | with open(file_path) as config_file:
120 | json_obj = json.load(config_file)
121 | return json_obj
122 |
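
To make the Range arithmetic concrete, a short sketch of find_next_range with illustrative values; the content-range strings are made-up examples:

    from utils import file_utils

    # no previous response: start at byte 0 with the requested chunk size
    file_utils.find_next_range(None, 1000)                              # -> 'bytes=0-1000'
    # previous response covered bytes 0-1000 of 718182376: continue from 1001
    file_utils.find_next_range('0-1000/718182376', 1000)                # -> 'bytes=1001-2001'
    # previous response already reached the end of the file
    file_utils.find_next_range('718181000-718182376/718182376', 1000)   # -> ''
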
--------------------------------------------------------------------------------
/utils/filter_utils.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | import sys
19 | import pandas as pd
20 | from distutils.util import strtobool
21 |
22 |
23 | def convert_to_bool_false(data):
24 | try:
25 | bool_value = strtobool(data)
26 | return pd.np.bool(bool_value)
27 | except (ValueError, TypeError, AttributeError):
28 | return pd.np.bool(False)
29 |
30 |
31 | def convert_to_float_max_int(data):
32 | try:
33 | return pd.np.float(data)
34 | except (ValueError, TypeError, AttributeError):
35 | return pd.np.float(sys.maxsize)
36 |
37 |
38 | def convert_to_float_zero(data):
39 | try:
40 | return pd.np.float(data)
41 | except (ValueError, TypeError, AttributeError):
42 | return pd.np.float(0)
43 |
44 |
45 | def get_inclusive_less_query(column_name, upper_limit):
46 | if not upper_limit:
47 | return ''
48 | return '%s <= %s' % (column_name, upper_limit)
49 |
50 |
51 | def get_inclusive_greater_query(column_name, lower_limit):
52 | if not lower_limit:
53 | return ''
54 | return '%s >= %s' % (column_name, lower_limit)
55 |
56 |
57 | def get_list_number_element_query(column_name, value_list):
58 | if not value_list:
59 | return ''
60 | list_str = ','.join(str(element) for element in value_list)
61 | return '%s IN (%s)' % (column_name, list_str)
62 |
63 |
64 | def get_list_string_element_query(column_name, value_list):
65 | if not value_list:
66 | return ''
67 | list_str = (','.join('\'' + item + '\'' for item in value_list))
68 | return '%s IN (%s)' % (column_name, list_str)
69 |
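
To show how these helpers combine into the SQL predicate used by FeedFilterRequest, a small sketch; the FeedColumn members are assumed to render as the feed's column names, as exercised in tests/test_filter_utils.py:

    from utils import filter_utils
    from enums.feed_enums import FeedColumn

    queries = [
        filter_utils.get_list_string_element_query(FeedColumn.ITEM_LOCATION_COUNTRIES, ['US', 'CA']),
        filter_utils.get_inclusive_greater_query(FeedColumn.PRICE_VALUE, 10.0),
        filter_utils.get_inclusive_less_query(FeedColumn.PRICE_VALUE, 100.0),
    ]
    # empty strings are returned for unused filters and should be dropped before joining
    query_str = ' AND '.join(q for q in queries if q)
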
--------------------------------------------------------------------------------
/utils/logging_utils.py:
--------------------------------------------------------------------------------
1 | # **************************************************************************
2 | # Copyright 2018-2019 eBay Inc.
3 | # Author/Developers: --
4 |
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 |
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 |
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | # **************************************************************************/
17 |
18 | import logging
19 | from datetime import datetime
20 |
21 | LOG_FILE_NAME = 'feed-sdk-log'
22 | LOG_FILE_EXTENSION = '.log'
23 | LOGGING_FORMAT = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
24 |
25 | log_file_name = LOG_FILE_NAME + '.' + datetime.now().strftime('%Y-%m-%d') + LOG_FILE_EXTENSION
26 |
27 |
28 | def setup_logging():
29 | logging.basicConfig(filename=log_file_name, filemode='a', level=logging.DEBUG, format=LOGGING_FORMAT)
30 |
--------------------------------------------------------------------------------