├── .github └── FUNDING.yml ├── .gitignore ├── .travis.yml ├── AUTHORS.md ├── CONTRIBUTING.md ├── HISTORY.md ├── LICENSE ├── MANIFEST.in ├── Makefile ├── Pipfile ├── Pipfile.lock ├── README.md ├── dataclass_csv ├── __init__.py ├── __init__.pyi ├── dataclass_reader.py ├── dataclass_reader.pyi ├── dataclass_writer.py ├── dataclass_writer.pyi ├── decorators.py ├── decorators.pyi ├── exceptions.py ├── exceptions.pyi ├── field_mapper.py ├── field_mapper.pyi ├── header_mapper.py ├── header_mapper.pyi └── py.typed ├── setup.cfg ├── setup.py └── tests ├── __init__.py ├── conftest.py ├── mocks.py ├── test_csv_data_validation.py ├── test_dataclass_reader.py ├── test_dataclass_writer.py └── test_decorators.py /.github/FUNDING.yml: -------------------------------------------------------------------------------- 1 | # These are supported funding model platforms 2 | 3 | github: [dfurtado] 4 | patreon: # Replace with a single Patreon username 5 | open_collective: # Replace with a single Open Collective username 6 | ko_fi: # Replace with a single Ko-fi username 7 | tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel 8 | community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry 9 | liberapay: # Replace with a single Liberapay username 10 | issuehunt: # Replace with a single IssueHunt username 11 | otechie: # Replace with a single Otechie username 12 | custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2'] 13 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | __pycache__ 2 | *.pyc 3 | .idea 4 | env/ 5 | *.swo 6 | *.swp 7 | *.*~ 8 | *.egg 9 | *.egg-info 10 | .#*.* 11 | TAGS 12 | docs 13 | .mypy_cache 14 | .pytest_cache 15 | build 16 | dist 17 | .eggs 18 | .vscode 19 | gmon.out 20 | .vim 21 | pyproject.toml 22 | 
-------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | python: 3 | - "3.7-dev" 4 | # command to install dependencies 5 | install: 6 | - "pip install pipenv" 7 | - "pipenv install" 8 | # command to run tests 9 | script: pytest 10 | -------------------------------------------------------------------------------- /AUTHORS.md: -------------------------------------------------------------------------------- 1 | # Credits 2 | 3 | ## Development Lead 4 | 5 | * Daniel Furtado 6 | 7 | ## Contributors 8 | 9 | * Nick Schober 10 | * Zoltan Ivanfi 11 | * Alec Benzer 12 | * Clint Byrum 13 | * @johnthangen 14 | 15 | See complete list at: https://github.com/dfurtado/dataclass-csv/graphs/contributors 16 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | I love working together with people, so if you have an excellent idea for a feature, an improvement, or maybe you found 4 | a bug, please don't hesitate to open an issue so we can discuss it and find a good solution together. 5 | 6 | If you have never participated in an open-source project before, even better! I can help you through the process. The 7 | important thing is to get more people involved in the Python community. So, don't be shy! 8 | 9 | ## Getting started 10 | 11 | The best way to get started is to look at the issues section and check that the bug or feature you are planning to work on 12 | isn't already in development. 13 | 14 | If the feature or bug is not yet reported in the issues section, you can start by adding the issue.
15 | 16 | ### Reporting a bug 17 | 18 | Reporting a bug is simple; just include a few things: 19 | 20 | - Description of the bug 21 | - Error messages and tracebacks 22 | - Steps to reproduce 23 | - Python version 24 | - System version 25 | 26 | ### Feature proposal 27 | 28 | If you have a great idea for a feature, I would be thrilled to hear it. You can add the following information 29 | to the feature description: 30 | 31 | - Description of the feature 32 | - Some use cases 33 | - Some examples of how the feature could work 34 | 35 | When adding a proposal for a new feature, try to be as detailed as possible. 36 | 37 | ### Documentation 38 | 39 | The project can always use more detailed and complete documentation. Contributions adding docstrings and 40 | improving the current documentation in the README are very welcome. 41 | 42 | ## Start coding 43 | 44 | If you get an issue to work on, then you can: 45 | 46 | - Fork the project 47 | - Create a new branch 48 | - Hack away! 49 | - Submit the pull request when you are ready 50 | 51 | 52 | ## Before you submit a pull request 53 | 54 | - Make sure to add unit tests (if applicable) 55 | - Make sure all tests are passing 56 | - Run a code formatter. This project uses black; you can run the command: `black -l79 -N -S ./dataclass_csv` 57 | - Add docstrings for new functions and classes. 58 | 59 | 60 | 61 | Happy Hacking!!! 62 | -------------------------------------------------------------------------------- /HISTORY.md: -------------------------------------------------------------------------------- 1 | # History 2 | 3 | ### 0.1.0 (2018-11-25) 4 | 5 | * First release on PyPI. 6 | 7 | ### 0.1.1 (2018-11-25) 8 | 9 | * Documentation fixes. 10 | 11 | ### 0.1.2 (2018-11-25) 12 | 13 | * Documentation fixes. 14 | 15 | ### 0.1.3 (2018-11-26) 16 | 17 | * Bug fixes 18 | * Removed the requirement of setting the dataclass init to `True` 19 | 20 | ### 0.1.5 (2018-11-29) 21 | 22 | * Support for parsing datetime values.
23 | * Better handling when default values are set to `None` 24 | 25 | ### 0.1.6 (2018-12-01) 26 | 27 | * Added support for reading default values from the default property of `dataclasses.field`. 28 | * Added support for allowing string values with only white spaces at the class level using the `@accept_whitespaces` decorator or through the `dataclasses.field` metadata. 29 | * Added support for specifying the date format using the `dataclasses.field` metadata. 30 | 31 | ### 0.1.7 (2018-12-01) 32 | 33 | * Added support for default values from `default_factory` in the field's metadata. This allows adding mutable default values to the dataclass properties. 34 | 35 | ### 1.0.0 (2018-12-16) 36 | 37 | * When data does not pass validation, the error now shows the line number in the CSV file where the data contains errors. 38 | * Improved error handling. 39 | * Changed the usage of the `@accept_whitespaces` decorator. 40 | * Updated documentation. 41 | 42 | ### 1.0.1 (2019-01-29) 43 | 44 | * Fixed issue when parsing headers on a CSV file with trailing white spaces. 45 | 46 | ### 1.1.0 (2019-02-17) 47 | 48 | * Added support for boolean values. 49 | * Docstrings 50 | 51 | ### 1.1.1 (2019-02-17) 52 | 53 | * Documentation fixes. 54 | 55 | ### 1.1.2 (2019-02-17) 56 | 57 | * Documentation fixes.
58 | 59 | ### 1.1.3 (2020-03-01) 60 | 61 | * Handle properties with `init` set to `False` 62 | * Handle `Optional` type annotation 63 | 64 | ### 1.2.0 (2021-03-02) 65 | 66 | * Introduction of a DataclassWriter 67 | * Added type hinting to external API 68 | * Documentation updates 69 | * Bug fixes 70 | 71 | ### 1.3.0 (2021-04-10) 72 | 73 | * Included stub files 74 | * Check if the CSV file has duplicated header values 75 | * Fixed issues #22 and #33 76 | * Code cleanup 77 | 78 | ### 1.4.0 (2021-12-13) 79 | 80 | * Bug fixes 81 | * Support for date types -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | 3 | BSD License 4 | 5 | Copyright (c) 2018, Daniel Furtado 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without modification, 9 | are permitted provided that the following conditions are met: 10 | 11 | * Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | * Redistributions in binary form must reproduce the above copyright notice, this 15 | list of conditions and the following disclaimer in the documentation and/or 16 | other materials provided with the distribution. 17 | 18 | * Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from this 20 | software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 23 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 24 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
25 | IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, 26 | INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 27 | BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 28 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY 29 | OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE 30 | OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED 31 | OF THE POSSIBILITY OF SUCH DAMAGE. 32 | 33 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include AUTHORS.md 2 | include CONTRIBUTING.md 3 | include HISTORY.md 4 | include LICENSE 5 | include README.md 6 | 7 | recursive-include tests * 8 | recursive-exclude * __pycache__ 9 | recursive-exclude * *.py[co] 10 | 11 | recursive-include docs *.md conf.py Makefile make.bat *.jpg *.png *.gif 12 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | .PHONY: clean clean-test clean-pyc clean-build docs help 2 | .DEFAULT_GOAL := help 3 | 4 | define BROWSER_PYSCRIPT 5 | import os, webbrowser, sys 6 | 7 | try: 8 | from urllib import pathname2url 9 | except: 10 | from urllib.request import pathname2url 11 | 12 | webbrowser.open("file://" + pathname2url(os.path.abspath(sys.argv[1]))) 13 | endef 14 | export BROWSER_PYSCRIPT 15 | 16 | define PRINT_HELP_PYSCRIPT 17 | import re, sys 18 | 19 | for line in sys.stdin: 20 | match = re.match(r'^([a-zA-Z_-]+):.*?## (.*)$$', line) 21 | if match: 22 | target, help = match.groups() 23 | print("%-20s %s" % (target, help)) 24 | endef 25 | export PRINT_HELP_PYSCRIPT 26 | 27 | BROWSER := python -c "$$BROWSER_PYSCRIPT" 28 | 29 | help: 30 | @python -c "$$PRINT_HELP_PYSCRIPT" < 
$(MAKEFILE_LIST) 31 | 32 | clean: clean-build clean-pyc clean-test ## remove all build, test, coverage and Python artifacts 33 | 34 | clean-build: ## remove build artifacts 35 | rm -fr build/ 36 | rm -fr dist/ 37 | rm -fr .eggs/ 38 | find . -name '*.egg-info' -exec rm -fr {} + 39 | find . -name '*.egg' -exec rm -f {} + 40 | 41 | clean-pyc: ## remove Python file artifacts 42 | find . -name '*.pyc' -exec rm -f {} + 43 | find . -name '*.pyo' -exec rm -f {} + 44 | find . -name '*~' -exec rm -f {} + 45 | find . -name '__pycache__' -exec rm -fr {} + 46 | 47 | clean-test: ## remove test and coverage artifacts 48 | rm -fr .tox/ 49 | rm -f .coverage 50 | rm -fr htmlcov/ 51 | rm -fr .pytest_cache 52 | 53 | lint: ## check style with mypy and flake8 54 | mypy dataclass_csv tests 55 | flake8 dataclass_csv tests 56 | 57 | test: ## run tests quickly with the default Python 58 | py.test 59 | 60 | test-all: ## run tests on every Python version with tox 61 | tox 62 | 63 | coverage: ## check code coverage quickly with the default Python 64 | coverage run --source dataclass_csv -m pytest 65 | coverage report -m 66 | coverage html 67 | $(BROWSER) htmlcov/index.html 68 | 69 | docs: ## generate Sphinx HTML documentation, including API docs 70 | rm -f docs/dataclass_csv.rst 71 | rm -f docs/modules.rst 72 | sphinx-apidoc -o docs/ dataclass_csv 73 | $(MAKE) -C docs clean 74 | $(MAKE) -C docs html 75 | $(BROWSER) docs/_build/html/index.html 76 | 77 | servedocs: docs ## compile the docs watching for changes 78 | watchmedo shell-command -p '*.rst' -c '$(MAKE) -C docs html' -R -D .
79 | 80 | release: dist ## package and upload a release 81 | twine upload dist/* 82 | 83 | dist: clean ## builds source and wheel package 84 | python setup.py sdist 85 | python setup.py bdist_wheel 86 | ls -l dist 87 | 88 | install: clean ## install the package to the active Python's site-packages 89 | python setup.py install 90 | -------------------------------------------------------------------------------- /Pipfile: -------------------------------------------------------------------------------- 1 | [[source]] 2 | url = "https://pypi.org/simple" 3 | verify_ssl = true 4 | name = "pypi" 5 | 6 | [dev-packages] 7 | pytest = "*" 8 | mypy = "*" 9 | flake8 = "*" 10 | 11 | [packages] 12 | 13 | [requires] 14 | python_version = "3.7" 15 | -------------------------------------------------------------------------------- /Pipfile.lock: -------------------------------------------------------------------------------- 1 | { 2 | "_meta": { 3 | "hash": { 4 | "sha256": "ff9642fcfefdd196731283041b11231f54200352dd42071a89fc5dbe84ce128b" 5 | }, 6 | "pipfile-spec": 6, 7 | "requires": { 8 | "python_version": "3.7" 9 | }, 10 | "sources": [ 11 | { 12 | "name": "pypi", 13 | "url": "https://pypi.org/simple", 14 | "verify_ssl": true 15 | } 16 | ] 17 | }, 18 | "default": {}, 19 | "develop": { 20 | "attrs": { 21 | "hashes": [ 22 | "sha256:31b2eced602aa8423c2aea9c76a724617ed67cf9513173fd3a4f03e3a929c7e6", 23 | "sha256:832aa3cde19744e49938b91fea06d69ecb9e649c93ba974535d08ad92164f700" 24 | ], 25 | "markers": "python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'", 26 | "version": "==20.3.0" 27 | }, 28 | "flake8": { 29 | "hashes": [ 30 | "sha256:12d05ab02614b6aee8df7c36b97d1a3b2372761222b19b58621355e82acddcff", 31 | "sha256:78873e372b12b093da7b5e5ed302e8ad9e988b38b063b61ad937f26ca58fc5f0" 32 | ], 33 | "index": "pypi", 34 | "version": "==3.9.0" 35 | }, 36 | "iniconfig": { 37 | "hashes": [ 38 | "sha256:011e24c64b7f47f6ebd835bb12a743f2fbe9a26d4cecaa7f53bc4f35ee9da8b3", 39 | 
"sha256:bc3af051d7d14b2ee5ef9969666def0cd1a000e121eaea580d4a313df4b37f32" 40 | ], 41 | "version": "==1.1.1" 42 | }, 43 | "mccabe": { 44 | "hashes": [ 45 | "sha256:ab8a6258860da4b6677da4bd2fe5dc2c659cff31b3ee4f7f5d64e79735b80d42", 46 | "sha256:dd8d182285a0fe56bace7f45b5e7d1a6ebcbf524e8f3bd87eb0f125271b8831f" 47 | ], 48 | "version": "==0.6.1" 49 | }, 50 | "mypy": { 51 | "hashes": [ 52 | "sha256:0d0a87c0e7e3a9becdfbe936c981d32e5ee0ccda3e0f07e1ef2c3d1a817cf73e", 53 | "sha256:25adde9b862f8f9aac9d2d11971f226bd4c8fbaa89fb76bdadb267ef22d10064", 54 | "sha256:28fb5479c494b1bab244620685e2eb3c3f988d71fd5d64cc753195e8ed53df7c", 55 | "sha256:2f9b3407c58347a452fc0736861593e105139b905cca7d097e413453a1d650b4", 56 | "sha256:33f159443db0829d16f0a8d83d94df3109bb6dd801975fe86bacb9bf71628e97", 57 | "sha256:3f2aca7f68580dc2508289c729bd49ee929a436208d2b2b6aab15745a70a57df", 58 | "sha256:499c798053cdebcaa916eef8cd733e5584b5909f789de856b482cd7d069bdad8", 59 | "sha256:4eec37370483331d13514c3f55f446fc5248d6373e7029a29ecb7b7494851e7a", 60 | "sha256:552a815579aa1e995f39fd05dde6cd378e191b063f031f2acfe73ce9fb7f9e56", 61 | "sha256:5873888fff1c7cf5b71efbe80e0e73153fe9212fafdf8e44adfe4c20ec9f82d7", 62 | "sha256:61a3d5b97955422964be6b3baf05ff2ce7f26f52c85dd88db11d5e03e146a3a6", 63 | "sha256:674e822aa665b9fd75130c6c5f5ed9564a38c6cea6a6432ce47eafb68ee578c5", 64 | "sha256:7ce3175801d0ae5fdfa79b4f0cfed08807af4d075b402b7e294e6aa72af9aa2a", 65 | "sha256:9743c91088d396c1a5a3c9978354b61b0382b4e3c440ce83cf77994a43e8c521", 66 | "sha256:9f94aac67a2045ec719ffe6111df543bac7874cee01f41928f6969756e030564", 67 | "sha256:a26f8ec704e5a7423c8824d425086705e381b4f1dfdef6e3a1edab7ba174ec49", 68 | "sha256:abf7e0c3cf117c44d9285cc6128856106183938c68fd4944763003decdcfeb66", 69 | "sha256:b09669bcda124e83708f34a94606e01b614fa71931d356c1f1a5297ba11f110a", 70 | "sha256:cd07039aa5df222037005b08fbbfd69b3ab0b0bd7a07d7906de75ae52c4e3119", 71 | "sha256:d23e0ea196702d918b60c8288561e722bf437d82cb7ef2edcd98cfa38905d506", 72 | 
"sha256:d65cc1df038ef55a99e617431f0553cd77763869eebdf9042403e16089fe746c", 73 | "sha256:d7da2e1d5f558c37d6e8c1246f1aec1e7349e4913d8fb3cb289a35de573fe2eb" 74 | ], 75 | "index": "pypi", 76 | "version": "==0.812" 77 | }, 78 | "mypy-extensions": { 79 | "hashes": [ 80 | "sha256:090fedd75945a69ae91ce1303b5824f428daf5a028d2f6ab8a299250a846f15d", 81 | "sha256:2d82818f5bb3e369420cb3c4060a7970edba416647068eb4c5343488a6c604a8" 82 | ], 83 | "version": "==0.4.3" 84 | }, 85 | "packaging": { 86 | "hashes": [ 87 | "sha256:5b327ac1320dc863dca72f4514ecc086f31186744b84a230374cc1fd776feae5", 88 | "sha256:67714da7f7bc052e064859c05c595155bd1ee9f69f76557e21f051443c20947a" 89 | ], 90 | "markers": "python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'", 91 | "version": "==20.9" 92 | }, 93 | "pluggy": { 94 | "hashes": [ 95 | "sha256:15b2acde666561e1298d71b523007ed7364de07029219b604cf808bfa1c765b0", 96 | "sha256:966c145cd83c96502c3c3868f50408687b38434af77734af1e9ca461a4081d2d" 97 | ], 98 | "markers": "python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'", 99 | "version": "==0.13.1" 100 | }, 101 | "py": { 102 | "hashes": [ 103 | "sha256:21b81bda15b66ef5e1a777a21c4dcd9c20ad3efd0b3f817e7a809035269e1bd3", 104 | "sha256:3b80836aa6d1feeaa108e046da6423ab8f6ceda6468545ae8d02d9d58d18818a" 105 | ], 106 | "markers": "python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'", 107 | "version": "==1.10.0" 108 | }, 109 | "pycodestyle": { 110 | "hashes": [ 111 | "sha256:514f76d918fcc0b55c6680472f0a37970994e07bbb80725808c17089be302068", 112 | "sha256:c389c1d06bf7904078ca03399a4816f974a1d590090fecea0c63ec26ebaf1cef" 113 | ], 114 | "markers": "python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'", 115 | "version": "==2.7.0" 116 | }, 117 | "pyflakes": { 118 | "hashes": [ 119 | "sha256:910208209dcea632721cb58363d0f72913d9e8cf64dc6f8ae2e02a3609aba40d", 120 | "sha256:e59fd8e750e588358f1b8885e5a4751203a0516e0ee6d34811089ac294c8806f" 121 | ], 122 | 
"markers": "python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'", 123 | "version": "==2.3.0" 124 | }, 125 | "pyparsing": { 126 | "hashes": [ 127 | "sha256:c203ec8783bf771a155b207279b9bccb8dea02d8f0c9e5f8ead507bc3246ecc1", 128 | "sha256:ef9d7589ef3c200abe66653d3f1ab1033c3c419ae9b9bdb1240a85b024efc88b" 129 | ], 130 | "markers": "python_version >= '2.6' and python_version not in '3.0, 3.1, 3.2, 3.3'", 131 | "version": "==2.4.7" 132 | }, 133 | "pytest": { 134 | "hashes": [ 135 | "sha256:9d1edf9e7d0b84d72ea3dbcdfd22b35fb543a5e8f2a60092dd578936bf63d7f9", 136 | "sha256:b574b57423e818210672e07ca1fa90aaf194a4f63f3ab909a2c67ebb22913839" 137 | ], 138 | "index": "pypi", 139 | "version": "==6.2.2" 140 | }, 141 | "toml": { 142 | "hashes": [ 143 | "sha256:806143ae5bfb6a3c6e736a764057db0e6a0e05e338b5630894a5f779cabb4f9b", 144 | "sha256:b3bda1d108d5dd99f4a20d24d9c348e91c4db7ab1b749200bded2f839ccbe68f" 145 | ], 146 | "markers": "python_version >= '2.6' and python_version not in '3.0, 3.1, 3.2, 3.3'", 147 | "version": "==0.10.2" 148 | }, 149 | "typed-ast": { 150 | "hashes": [ 151 | "sha256:07d49388d5bf7e863f7fa2f124b1b1d89d8aa0e2f7812faff0a5658c01c59aa1", 152 | "sha256:14bf1522cdee369e8f5581238edac09150c765ec1cb33615855889cf33dcb92d", 153 | "sha256:240296b27397e4e37874abb1df2a608a92df85cf3e2a04d0d4d61055c8305ba6", 154 | "sha256:36d829b31ab67d6fcb30e185ec996e1f72b892255a745d3a82138c97d21ed1cd", 155 | "sha256:37f48d46d733d57cc70fd5f30572d11ab8ed92da6e6b28e024e4a3edfb456e37", 156 | "sha256:4c790331247081ea7c632a76d5b2a265e6d325ecd3179d06e9cf8d46d90dd151", 157 | "sha256:5dcfc2e264bd8a1db8b11a892bd1647154ce03eeba94b461effe68790d8b8e07", 158 | "sha256:7147e2a76c75f0f64c4319886e7639e490fee87c9d25cb1d4faef1d8cf83a440", 159 | "sha256:7703620125e4fb79b64aa52427ec192822e9f45d37d4b6625ab37ef403e1df70", 160 | "sha256:8368f83e93c7156ccd40e49a783a6a6850ca25b556c0fa0240ed0f659d2fe496", 161 | "sha256:84aa6223d71012c68d577c83f4e7db50d11d6b1399a9c779046d75e24bed74ea", 162 | 
"sha256:85f95aa97a35bdb2f2f7d10ec5bbdac0aeb9dafdaf88e17492da0504de2e6400", 163 | "sha256:8db0e856712f79c45956da0c9a40ca4246abc3485ae0d7ecc86a20f5e4c09abc", 164 | "sha256:9044ef2df88d7f33692ae3f18d3be63dec69c4fb1b5a4a9ac950f9b4ba571606", 165 | "sha256:963c80b583b0661918718b095e02303d8078950b26cc00b5e5ea9ababe0de1fc", 166 | "sha256:987f15737aba2ab5f3928c617ccf1ce412e2e321c77ab16ca5a293e7bbffd581", 167 | "sha256:9ec45db0c766f196ae629e509f059ff05fc3148f9ffd28f3cfe75d4afb485412", 168 | "sha256:9fc0b3cb5d1720e7141d103cf4819aea239f7d136acf9ee4a69b047b7986175a", 169 | "sha256:a2c927c49f2029291fbabd673d51a2180038f8cd5a5b2f290f78c4516be48be2", 170 | "sha256:a38878a223bdd37c9709d07cd357bb79f4c760b29210e14ad0fb395294583787", 171 | "sha256:b4fcdcfa302538f70929eb7b392f536a237cbe2ed9cba88e3bf5027b39f5f77f", 172 | "sha256:c0c74e5579af4b977c8b932f40a5464764b2f86681327410aa028a22d2f54937", 173 | "sha256:c1c876fd795b36126f773db9cbb393f19808edd2637e00fd6caba0e25f2c7b64", 174 | "sha256:c9aadc4924d4b5799112837b226160428524a9a45f830e0d0f184b19e4090487", 175 | "sha256:cc7b98bf58167b7f2db91a4327da24fb93368838eb84a44c472283778fc2446b", 176 | "sha256:cf54cfa843f297991b7388c281cb3855d911137223c6b6d2dd82a47ae5125a41", 177 | "sha256:d003156bb6a59cda9050e983441b7fa2487f7800d76bdc065566b7d728b4581a", 178 | "sha256:d175297e9533d8d37437abc14e8a83cbc68af93cc9c1c59c2c292ec59a0697a3", 179 | "sha256:d746a437cdbca200622385305aedd9aef68e8a645e385cc483bdc5e488f07166", 180 | "sha256:e683e409e5c45d5c9082dc1daf13f6374300806240719f95dc783d1fc942af10" 181 | ], 182 | "version": "==1.4.2" 183 | }, 184 | "typing-extensions": { 185 | "hashes": [ 186 | "sha256:7cb407020f00f7bfc3cb3e7881628838e69d8f3fcab2f64742a5e76b2f841918", 187 | "sha256:99d4073b617d30288f569d3f13d2bd7548c3a7e4c8de87db09a9d29bb3a4a60c", 188 | "sha256:dafc7639cde7f1b6e1acc0f457842a83e722ccca8eef5270af2d74792619a89f" 189 | ], 190 | "markers": "python_version < '3.8'", 191 | "version": "==3.7.4.3" 192 | } 193 | } 194 | } 195 | 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![Build Status](https://travis-ci.org/dfurtado/dataclass-csv.svg?branch=master)](https://travis-ci.org/dfurtado/dataclass-csv) 2 | [![pypi](https://img.shields.io/pypi/v/dataclass-csv.svg)](https://pypi.python.org/pypi/dataclass-csv) 3 | [![Downloads](https://pepy.tech/badge/dataclass-csv)](https://pepy.tech/project/dataclass-csv) 4 | 5 | 6 | 7 | # Dataclass CSV 8 | 9 | Dataclass CSV makes working with CSV files easier and much better than working with plain dicts. It uses Python's dataclasses to store the data of every row of the CSV file and also uses type annotations, which enables proper type checking and validation. 10 | 11 | 12 | ## Main features 13 | 14 | - Use `dataclasses` instead of dictionaries to represent the rows in the CSV file. 15 | - Take advantage of the type annotations on the `dataclass` properties. `DataclassReader` uses the type annotations to perform validation of the data read from the CSV file. 16 | - Automatic type conversion. `DataclassReader` supports `str`, `int`, `float`, `complex`, `datetime` and `bool`, as well as any type whose constructor accepts a string as its single argument. 17 | - Helps you troubleshoot issues with the data in the CSV file. `DataclassReader` will show exactly which line of the CSV file contains errors. 18 | - Extract only the data you need. It will only parse the properties defined in the `dataclass`. 19 | - Familiar syntax. The `DataclassReader` is used almost the same way as the `DictReader` in the standard library. 20 | - It uses `dataclass` features that let you define metadata properties so the data can be parsed exactly the way you want. 21 | - Make the code cleaner. No more extra loops to convert data to the correct type, perform validation, or set default values; the `DataclassReader` will do all this for you.
22 | - In addition to the `DataclassReader`, the library also provides a `DataclassWriter`, which enables creating a CSV file 23 | from a list of instances of a dataclass. 24 | 25 | 26 | ## Installation 27 | 28 | ```shell 29 | pipenv install dataclass-csv 30 | ``` 31 | 32 | ## Getting started 33 | 34 | ## Using the DataclassReader 35 | 36 | First, add the necessary imports: 37 | 38 | ```python 39 | from dataclasses import dataclass 40 | 41 | from dataclass_csv import DataclassReader 42 | ``` 43 | 44 | Assuming that we have a CSV file with the contents below: 45 | ```text 46 | firstname,email,age 47 | Elsa,elsa@test.com, 11 48 | Astor,astor@test.com, 7 49 | Edit,edit@test.com, 3 50 | Ella,ella@test.com, 2 51 | ``` 52 | 53 | Let's create a dataclass that will represent a row in the CSV file above: 54 | ```python 55 | @dataclass 56 | class User: 57 | firstname: str 58 | email: str 59 | age: int 60 | ``` 61 | 62 | The dataclass `User` has 3 properties: `firstname` and `email` are of type `str` and `age` is of type `int`. 63 | 64 | To load and read the contents of the CSV file, we do the same thing as if we were using the `DictReader` from the `csv` module in Python's standard library. After opening the file, we create an instance of the `DataclassReader`, passing two arguments. The first is the `file` and the second is the dataclass that we wish to use to represent the data of every row of the CSV file, like so: 65 | 66 | ```python 67 | with open(filename) as users_csv: 68 | reader = DataclassReader(users_csv, User) 69 | for row in reader: 70 | print(row) 71 | ``` 72 | 73 | The `DataclassReader` internally uses the `DictReader` from the `csv` module to read the CSV file, which means that you can pass the same arguments that you would pass to the `DictReader`.
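Since those arguments are forwarded to the underlying `DictReader`, the standard `csv` format parameters (such as `delimiter`) behave exactly as in the standard library. The sketch below uses plain `csv.DictReader` to illustrate the behavior that `DataclassReader` inherits; the in-memory sample data is hypothetical:

```python
import csv
import io

# A semicolon-delimited sample, kept in memory for illustration.
data = io.StringIO("firstname;email;age\nElsa;elsa@test.com;11\n")

# `delimiter` is a standard `csv` format parameter; `DataclassReader`
# forwards keyword arguments like this one to `DictReader` unchanged.
reader = csv.DictReader(data, delimiter=';')
rows = list(reader)
print(rows[0])  # {'firstname': 'Elsa', 'email': 'elsa@test.com', 'age': '11'}
```

With the `DataclassReader` the call would look the same, for example `DataclassReader(f, User, delimiter=';')`, except that each row comes back as a `User` instance with `age` already converted to `int`.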
The complete argument list is shown below: 74 | 75 | ```python 76 | dataclass_csv.DataclassReader( 77 | f, 78 | cls, 79 | fieldnames=None, 80 | restkey=None, 81 | restval=None, 82 | dialect='excel', 83 | *args, 84 | **kwds 85 | ) 86 | ``` 87 | 88 | All keyword arguments supported by `DictReader` are supported by the `DataclassReader`, with the addition of: 89 | 90 | `validate_header` - The `DataclassReader` will raise a `ValueError` if the CSV file contains columns with the same name. This 91 | validation is performed to avoid data being overwritten. To skip this validation, set `validate_header=False` when creating an 92 | instance of the `DataclassReader`, see an example below: 93 | 94 | ```python 95 | reader = DataclassReader(f, User, validate_header=False) 96 | ``` 97 | 98 | If you run this code you should see an output like this: 99 | 100 | ```python 101 | User(firstname='Elsa', email='elsa@test.com', age=11) 102 | User(firstname='Astor', email='astor@test.com', age=7) 103 | User(firstname='Edit', email='edit@test.com', age=3) 104 | User(firstname='Ella', email='ella@test.com', age=2) 105 | ``` 106 | 107 | ### Error handling 108 | 109 | One of the advantages of using the `DataclassReader` is that it makes it easy to detect when the type of data in the CSV file is not what your application's model is expecting. And the `DataclassReader` shows errors that will help you identify the rows with problems in your CSV file. 110 | 111 | For example, say we change the contents of the CSV file shown in the **Getting started** section and modify the `age` of the user Astor, changing it to a string value: 112 | 113 | ```text 114 | Astor, astor@test.com, test 115 | ``` 116 | 117 | Remember that in the dataclass `User` the `age` property is annotated with `int`. If we run the code again, an exception will be raised with the message below: 118 | 119 | ```text 120 | dataclass_csv.exceptions.CsvValueError: The field `age` is defined as <class 'int'> but 121 | received a value of type <class 'str'>.
[CSV Line number: 3] 122 | ``` 123 | 124 | Note that, apart from telling what the error was, the `DataclassReader` also shows which line of the CSV file contains the data with errors. 125 | 126 | ### Default values 127 | 128 | The `DataclassReader` also handles properties with default values. Let's modify the dataclass `User` and add a default value for the field `email` (note that `email` moves to the end of the class, since dataclass fields with default values must be declared after the fields without them): 129 | 130 | ```python 131 | from dataclasses import dataclass 132 | 133 | 134 | @dataclass 135 | class User: 136 | firstname: str 137 | age: int 138 | email: str = 'Not specified' 139 | ``` 140 | 141 | And we modify the CSV file and remove the email for the user Astor: 142 | 143 | ```text 144 | Astor,, 7 145 | ``` 146 | 147 | If we run the code, we should see the output below: 148 | 149 | ```text 150 | User(firstname='Elsa', age=11, email='elsa@test.com') 151 | User(firstname='Astor', age=7, email='Not specified') 152 | User(firstname='Edit', age=3, email='edit@test.com') 153 | User(firstname='Ella', age=2, email='ella@test.com') 154 | ``` 155 | 156 | Note that now the object for the user Astor has the default value `Not specified` assigned to the email property. 157 | 158 | Default values can also be set using `dataclasses.field`, like so: 159 | 160 | ```python 161 | from dataclasses import dataclass, field 162 | 163 | 164 | @dataclass 165 | class User: 166 | firstname: str 167 | age: int 168 | email: str = field(default='Not specified') 169 | ``` 170 | 171 | ### Mapping dataclass fields to columns 172 | 173 | The mapping between a dataclass property and a column in the CSV file will be done automatically if the names match; however, there are situations where the name of the header for a column is different. We can easily tell the `DataclassReader` how the mapping should be done using the method `map`.
Assuming that we have a CSV file with the contents below: 174 | 175 | ```text 176 | First Name,email,age 177 | Elsa,elsa@test.com, 11 178 | ``` 179 | 180 | Note that now the column is called **First Name** and not **firstname**. 181 | 182 | And we can use the method `map`, like so: 183 | 184 | ```python 185 | reader = DataclassReader(users_csv, User) 186 | reader.map('First Name').to('firstname') 187 | ``` 188 | 189 | Now the `DataclassReader` will know how to extract the data from the column **First Name** and assign it to the dataclass property **firstname**. 190 | 191 | ### Supported type annotations 192 | 193 | At the moment the `DataclassReader` supports `int`, `str`, `float`, `complex`, `datetime`, and `bool`. When defining a `datetime` property, it is necessary to use the `dateformat` decorator, for example: 194 | 195 | ```python 196 | from dataclasses import dataclass 197 | from datetime import datetime 198 | 199 | from dataclass_csv import DataclassReader, dateformat 200 | 201 | 202 | @dataclass 203 | @dateformat('%Y/%m/%d') 204 | class User: 205 | name: str 206 | email: str 207 | birthday: datetime 208 | 209 | 210 | if __name__ == '__main__': 211 | 212 | with open('users.csv') as f: 213 | reader = DataclassReader(f, User) 214 | for row in reader: 215 | print(row) 216 | ``` 217 | 218 | Assuming that the CSV file has the following contents: 219 | 220 | ```text 221 | name,email,birthday 222 | Edit,edit@test.com,2018/11/23 223 | ``` 224 | 225 | The output would look like this: 226 | 227 | ```text 228 | User(name='Edit', email='edit@test.com', birthday=datetime.datetime(2018, 11, 23, 0, 0)) 229 | ``` 230 | 231 | ### Fields metadata 232 | 233 | It is important to note that the `dateformat` decorator defines the date format that will be used to parse dates for all properties 234 | in the class. Now there are situations where the data in a CSV file contains two or more columns with date values in different formats.
It is possible 235 | to set a specific format for every property using `dataclasses.field`. Let's say that we now have a CSV file with the following contents: 236 | 237 | ```text 238 | name,email,birthday,create_date 239 | Edit,edit@test.com,2018/11/23,2018/11/23 10:43 240 | ``` 241 | 242 | As you can see, the `create_date` column contains time information as well. 243 | 244 | The `dataclass` User can be defined like this: 245 | 246 | ```python 247 | from dataclasses import dataclass, field 248 | from datetime import datetime 249 | 250 | from dataclass_csv import DataclassReader, dateformat 251 | 252 | 253 | @dataclass 254 | @dateformat('%Y/%m/%d') 255 | class User: 256 | name: str 257 | email: str 258 | birthday: datetime 259 | create_date: datetime = field(metadata={'dateformat': '%Y/%m/%d %H:%M'}) 260 | ``` 261 | 262 | Note that the format for the `birthday` field was not specified using the `field` metadata. In this case the format specified in the `dateformat` 263 | decorator will be used. 264 | 265 | ### Handling values with empty spaces 266 | 267 | When defining a property of type `str` in the `dataclass`, the `DataclassReader` will treat values with only white spaces as invalid. To change this 268 | behavior, there is a decorator called `@accept_whitespaces`. When decorating the class with `@accept_whitespaces`, all the properties in the class 269 | will accept values with only white spaces.
270 | 271 | For example: 272 | 273 | ```python 274 | from dataclass_csv import DataclassReader, accept_whitespaces 275 | 276 | @accept_whitespaces 277 | @dataclass 278 | class User: 279 | name: str 280 | email: str 281 | birthday: datetime 282 | created_at: datetime 283 | ``` 284 | 285 | If you need a specific field to accept white spaces, you can set the property `accept_whitespaces` in the field's metadata, like so: 286 | 287 | ```python 288 | @dataclass 289 | class User: 290 | name: str 291 | email: str = field(metadata={'accept_whitespaces': True}) 292 | birthday: datetime 293 | created_at: datetime 294 | ``` 295 | 296 | ### User-defined types 297 | 298 | You can use any type for a field as long as its constructor accepts a string: 299 | 300 | ```python 301 | class SSN: 302 | def __init__(self, val): 303 | if re.match(r"\d{9}", val): 304 | self.val = f"{val[0:3]}-{val[3:5]}-{val[5:9]}" 305 | elif re.match(r"\d{3}-\d{2}-\d{4}", val): 306 | self.val = val 307 | else: 308 | raise ValueError(f"Invalid SSN: {val!r}") 309 | 310 | 311 | @dataclasses.dataclass 312 | class User: 313 | name: str 314 | ssn: SSN 315 | ``` 316 | 317 | 318 | ## Using the DataclassWriter 319 | 320 | Reading a CSV file with the `DataclassReader` gives us the type safety of Python's dataclasses and type annotations. However, there are situations where we would like to use dataclasses for creating CSV files; that's where the `DataclassWriter` comes in handy. 321 | 322 | Using the `DataclassWriter` is quite simple.
Given that we have a dataclass `User`: 323 | 324 | ```python 325 | from dataclasses import dataclass 326 | 327 | 328 | @dataclass 329 | class User: 330 | firstname: str 331 | lastname: str 332 | age: int 333 | ``` 334 | 335 | And in our program we have a list of users: 336 | 337 | ```python 338 | 339 | users = [ 340 | User(firstname="John", lastname="Smith", age=40), 341 | User(firstname="Daniel", lastname="Nilsson", age=10), 342 | User(firstname="Ella", lastname="Fralla", age=4) 343 | ] 344 | ``` 345 | 346 | In order to create a CSV using the `DataclassWriter`, import it from `dataclass_csv`: 347 | 348 | ```python 349 | from dataclass_csv import DataclassWriter 350 | ``` 351 | 352 | Initialize it with the required arguments and call the method `write`: 353 | 354 | ```python 355 | with open("users.csv", "w") as f: 356 | w = DataclassWriter(f, users, User) 357 | w.write() 358 | ``` 359 | 360 | That's it! Let's break down the snippet above. 361 | 362 | First, we open a file called `users.csv` for writing. After that, an instance of the `DataclassWriter` is created. To create a `DataclassWriter`, we need to pass the file object, the list of `User` instances, and lastly the type, which in this case is `User`. 363 | 364 | The type is required since the writer uses it when trying to figure out the CSV header. By default, it will use the names of the 365 | properties defined in the dataclass; in the case of the dataclass `User`, the title of each column 366 | will be `firstname`, `lastname`, and `age`. 367 | 368 | Below is the CSV created out of the list of `User` instances: 369 | 370 | ```text 371 | firstname,lastname,age 372 | John,Smith,40 373 | Daniel,Nilsson,10 374 | Ella,Fralla,4 375 | ``` 376 | 377 | The `DataclassWriter` also accepts `**fmtparams`, which takes the same formatting parameters as `csv.writer`; for more 378 | information see: https://docs.python.org/3/library/csv.html#csv-fmt-params 379 | 380 | Now, there are situations where we don't want to write the CSV header.
In this case, the method `write` of 381 | the `DataclassWriter` accepts an extra argument, called `skip_header`. The default value is `False`, and when set to 382 | `True` it will skip the header. 383 | 384 | #### Modifying the CSV header 385 | 386 | As previously mentioned, the `DataclassWriter` uses the names of the properties defined in the dataclass as the CSV header titles; however, 387 | depending on your use case, it may make sense to change them. The `DataclassWriter` has a `map` method just for this purpose. 388 | 389 | Take the `User` dataclass with the properties `firstname`, `lastname`, and `age`. The snippet below shows how to change `firstname` to `First name` and `lastname` to `Last name`: 390 | 391 | ```python 392 | with open("users.csv", "w") as f: 393 | w = DataclassWriter(f, users, User) 394 | 395 | # Add mappings for firstname and lastname 396 | w.map("firstname").to("First name") 397 | w.map("lastname").to("Last name") 398 | 399 | w.write() 400 | ``` 401 | 402 | The CSV output of the snippet above will be: 403 | 404 | ```text 405 | First name,Last name,age 406 | John,Smith,40 407 | Daniel,Nilsson,10 408 | Ella,Fralla,4 409 | ``` 410 | 411 | ## Copyright and License 412 | 413 | Copyright (c) 2018 Daniel Furtado. Code released under the BSD 3-clause license. 414 | 415 | ## Credits 416 | 417 | This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and the [audreyr/cookiecutter-pypackage](https://github.com/audreyr/cookiecutter-pypackage) project template. 418 | -------------------------------------------------------------------------------- /dataclass_csv/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | dataclass_csv 3 | ~~~~~~~~~~~~~ 4 | 5 | dataclass_csv is a library that parses every row of a CSV file into 6 | `dataclasses`. It takes advantage of `dataclasses` features to perform 7 | data validation and type conversion.
8 | 9 | Basic Usage 10 | ~~~~~~~~~~~~~ 11 | 12 | Read data from a CSV file: 13 | 14 | >>> from dataclasses import dataclass 15 | >>> from dataclass_csv import DataclassReader 16 | 17 | 18 | >>> @dataclass 19 | ... class User: 20 | ...     firstname: str 21 | ...     lastname: str 22 | ...     age: int 23 | 24 | >>> with open('users.csv') as f: 25 | ...     reader = DataclassReader(f, User) 26 | ...     users = list(reader) 27 | >>> print(users) 28 | [ 29 | User(firstname='User1', lastname='Test', age=23), 30 | User(firstname='User2', lastname='Test', age=34) 31 | ] 32 | 33 | Write dataclasses to a CSV file: 34 | 35 | >>> from dataclasses import dataclass 36 | >>> from dataclass_csv import DataclassWriter 37 | 38 | >>> @dataclass 39 | ... class User: 40 | ...     firstname: str 41 | ...     lastname: str 42 | ...     age: int 43 | 44 | >>> users = [ 45 | ...     User(firstname='User1', lastname='Test', age=23), 46 | ...     User(firstname='User2', lastname='Test', age=34) 47 | ... ] 48 | 49 | >>> with open('users.csv', 'w') as f: 50 | ...     writer = DataclassWriter(f, users, User) 51 | ...     writer.write() 52 | 53 | 54 | :copyright: (c) 2018 by Daniel Furtado. 55 | :license: BSD, see LICENSE for more details.
56 | """ 57 | 58 | 59 | from .dataclass_reader import DataclassReader 60 | from .dataclass_writer import DataclassWriter 61 | from .decorators import dateformat, accept_whitespaces 62 | from .exceptions import CsvValueError 63 | 64 | 65 | __all__ = [ 66 | "DataclassReader", 67 | "DataclassWriter", 68 | "dateformat", 69 | "accept_whitespaces", 70 | "CsvValueError", 71 | ] 72 | -------------------------------------------------------------------------------- /dataclass_csv/__init__.pyi: -------------------------------------------------------------------------------- 1 | from .dataclass_reader import DataclassReader as DataclassReader 2 | from .dataclass_writer import DataclassWriter as DataclassWriter 3 | from .decorators import ( 4 | accept_whitespaces as accept_whitespaces, 5 | dateformat as dateformat, 6 | ) 7 | from .exceptions import CsvValueError as CsvValueError 8 | -------------------------------------------------------------------------------- /dataclass_csv/dataclass_reader.py: -------------------------------------------------------------------------------- 1 | import dataclasses 2 | import csv 3 | 4 | from datetime import date, datetime 5 | from distutils.util import strtobool 6 | from typing import Union, Type, Optional, Sequence, Dict, Any, List 7 | 8 | import typing 9 | 10 | from .field_mapper import FieldMapper 11 | from .exceptions import CsvValueError 12 | 13 | from collections import Counter 14 | 15 | 16 | def _verify_duplicate_header_items(header): 17 | if header is None or len(header) == 0: 18 | return 19 | 20 | header_counter = Counter(header) 21 | duplicated = [k for k, v in header_counter.items() if v > 1] 22 | 23 | if len(duplicated) > 0: 24 | raise ValueError( 25 | ( 26 | "It seems like the CSV file contains duplicated header " 27 | f"values: {duplicated}. This may cause inconsistent data. " 28 | "Use the kwarg validate_header=False when initializing the " 29 | "DataclassReader to skip the header validation."
30 | ) 31 | ) 32 | 33 | 34 | def is_union_type(t): 35 | if hasattr(t, "__origin__") and t.__origin__ is Union: 36 | return True 37 | 38 | return False 39 | 40 | 41 | def get_args(t): 42 | if hasattr(t, "__args__"): 43 | return t.__args__ 44 | 45 | return tuple() 46 | 47 | 48 | class DataclassReader: 49 | def __init__( 50 | self, 51 | f: Any, 52 | cls: Type[object], 53 | fieldnames: Optional[Sequence[str]] = None, 54 | restkey: Optional[str] = None, 55 | restval: Optional[Any] = None, 56 | dialect: str = "excel", 57 | *args: Any, 58 | **kwds: Any, 59 | ): 60 | 61 | if not f: 62 | raise ValueError("The f argument is required.") 63 | 64 | if cls is None or not dataclasses.is_dataclass(cls): 65 | raise ValueError("cls argument needs to be a dataclass.") 66 | 67 | self._cls = cls 68 | self._optional_fields = self._get_optional_fields() 69 | self._field_mapping: Dict[str, Dict[str, Any]] = {} 70 | 71 | validate_header = kwds.pop("validate_header", True) 72 | 73 | self._reader = csv.DictReader( 74 | f, fieldnames, restkey, restval, dialect, *args, **kwds 75 | ) 76 | 77 | if validate_header: 78 | _verify_duplicate_header_items(self._reader.fieldnames) 79 | 80 | self.type_hints = typing.get_type_hints(cls) 81 | 82 | def _get_optional_fields(self): 83 | return [ 84 | field.name 85 | for field in dataclasses.fields(self._cls) 86 | if not isinstance(field.default, dataclasses._MISSING_TYPE) 87 | or not isinstance(field.default_factory, dataclasses._MISSING_TYPE) 88 | ] 89 | 90 | def _add_to_mapping(self, property_name, csv_fieldname): 91 | self._field_mapping[property_name] = csv_fieldname 92 | 93 | def _get_metadata_option(self, field, key): 94 | option = field.metadata.get(key, getattr(self._cls, f"__{key}__", None)) 95 | return option 96 | 97 | def _get_default_value(self, field): 98 | return ( 99 | field.default 100 | if not isinstance(field.default, dataclasses._MISSING_TYPE) 101 | else field.default_factory() 102 | ) 103 | 104 | def _get_possible_keys(self, fieldname, 
row): 105 | possible_keys = list(filter(lambda x: x.strip() == fieldname, row.keys())) 106 | if possible_keys: 107 | return possible_keys[0] 108 | 109 | def _get_value(self, row, field): 110 | is_field_mapped = False 111 | 112 | try: 113 | if field.name in self._field_mapping.keys(): 114 | is_field_mapped = True 115 | key = self._field_mapping.get(field.name) 116 | else: 117 | key = field.name 118 | 119 | if key in row.keys(): 120 | value = row[key] 121 | else: 122 | possible_key = self._get_possible_keys(field.name, row) 123 | key = possible_key if possible_key else key 124 | value = row[key] 125 | 126 | except KeyError: 127 | if field.name in self._optional_fields: 128 | return self._get_default_value(field) 129 | else: 130 | keyerror_message = f"The value for the column `{field.name}`" 131 | if is_field_mapped: 132 | keyerror_message = f"The value for the mapped column `{key}`" 133 | raise KeyError(f"{keyerror_message} is missing in the CSV file") 134 | else: 135 | if not value and field.name in self._optional_fields: 136 | return self._get_default_value(field) 137 | elif not value and field.name not in self._optional_fields: 138 | raise ValueError(f"The field `{field.name}` is required.") 139 | elif ( 140 | value 141 | and field.type is str 142 | and not len(value.strip()) 143 | and not self._get_metadata_option(field, "accept_whitespaces") 144 | ): 145 | raise ValueError( 146 | ( 147 | f"It seems like the value of `{field.name}` contains " 148 | "only white spaces. To allow white spaces to all " 149 | "string fields, use the @accept_whitespaces " 150 | "decorator. " 151 | "To allow white spaces specifically for the field " 152 | f"`{field.name}` change its definition to: " 153 | f"`{field.name}: str = field(metadata=" 154 | "{'accept_whitespaces': True})`." 
155 | ) 156 | ) 157 | else: 158 | return value 159 | 160 | def _parse_date_value(self, field, date_value, field_type): 161 | dateformat = self._get_metadata_option(field, "dateformat") 162 | 163 | if not isinstance(date_value, str): 164 | return date_value 165 | 166 | if not dateformat: 167 | raise AttributeError( 168 | ( 169 | "Unable to parse the datetime string value. Date format " 170 | "not specified. To specify a date format for all " 171 | "datetime fields in the class, use the @dateformat " 172 | "decorator. To define a date format specifically for this " 173 | "field, change its definition to: " 174 | f"`{field.name}: datetime = field(metadata=" 175 | "{'dateformat': })`." 176 | ) 177 | ) 178 | 179 | datetime_obj = datetime.strptime(date_value, dateformat) 180 | 181 | if field_type == date: 182 | return datetime_obj.date() 183 | else: 184 | return datetime_obj 185 | 186 | def _process_row(self, row): 187 | values = dict() 188 | 189 | for field in dataclasses.fields(self._cls): 190 | if not field.init: 191 | continue 192 | 193 | try: 194 | value = self._get_value(row, field) 195 | except ValueError as ex: 196 | raise CsvValueError(ex, line_number=self._reader.line_num) from None 197 | 198 | if not value and field.default is None: 199 | values[field.name] = None 200 | continue 201 | 202 | field_type = self.type_hints[field.name] 203 | 204 | if is_union_type(field_type): 205 | type_args = [x for x in get_args(field_type) if x is not type(None)] 206 | if len(type_args) == 1: 207 | field_type = type_args[0] 208 | 209 | if field_type is datetime or field_type is date: 210 | try: 211 | transformed_value = self._parse_date_value(field, value, field_type) 212 | except ValueError as ex: 213 | raise CsvValueError(ex, line_number=self._reader.line_num) from None 214 | else: 215 | values[field.name] = transformed_value 216 | continue 217 | 218 | if field_type is bool: 219 | try: 220 | transformed_value = ( 221 | value 222 | if isinstance(value, bool) 223 | else 
strtobool(str(value).strip()) == 1 224 | ) 225 | except ValueError as ex: 226 | raise CsvValueError(ex, line_number=self._reader.line_num) from None 227 | else: 228 | values[field.name] = transformed_value 229 | continue 230 | 231 | try: 232 | transformed_value = field_type(value) 233 | except ValueError as e: 234 | raise CsvValueError( 235 | ( 236 | f"The field `{field.name}` is defined as {field.type} " 237 | f"but received a value of type {type(value)}." 238 | ), 239 | line_number=self._reader.line_num, 240 | ) from e 241 | else: 242 | values[field.name] = transformed_value 243 | return self._cls(**values) 244 | 245 | def __next__(self): 246 | row = next(self._reader) 247 | return self._process_row(row) 248 | 249 | def __iter__(self): 250 | return self 251 | 252 | def map(self, csv_fieldname: str) -> FieldMapper: 253 | """Used to map a field in the CSV file to a `dataclass` field 254 | :param csv_fieldname: The name of the CSV field 255 | """ 256 | return FieldMapper( 257 | lambda property_name: self._add_to_mapping(property_name, csv_fieldname) 258 | ) 259 | -------------------------------------------------------------------------------- /dataclass_csv/dataclass_reader.pyi: -------------------------------------------------------------------------------- 1 | from .field_mapper import FieldMapper as FieldMapper 2 | from typing import Any, Optional, Sequence, Type 3 | 4 | class DataclassReader: 5 | def __init__( 6 | self, 7 | f: Any, 8 | cls: Type[object], 9 | fieldnames: Optional[Sequence[str]] = ..., 10 | restkey: Optional[str] = ..., 11 | restval: Optional[Any] = ..., 12 | dialect: str = ..., 13 | *args: Any, 14 | **kwds: Any 15 | ) -> None: ... 16 | def __next__(self) -> None: ... 17 | def __iter__(self) -> Any: ... 18 | def map(self, csv_fieldname: str) -> FieldMapper: ... 
19 | -------------------------------------------------------------------------------- /dataclass_csv/dataclass_writer.py: -------------------------------------------------------------------------------- 1 | import csv 2 | import dataclasses 3 | from typing import Type, Dict, Any, List 4 | from .header_mapper import HeaderMapper 5 | 6 | 7 | class DataclassWriter: 8 | def __init__( 9 | self, 10 | f: Any, 11 | data: List[Any], 12 | cls: Type[object], 13 | dialect: str = "excel", 14 | **fmtparams: Dict[str, Any], 15 | ): 16 | if not f: 17 | raise ValueError("The f argument is required") 18 | 19 | if not isinstance(data, list): 20 | raise ValueError("Invalid 'data' argument. It must be a list") 21 | 22 | if not dataclasses.is_dataclass(cls): 23 | raise ValueError("Invalid 'cls' argument. It must be a dataclass") 24 | 25 | self._data = data 26 | self._cls = cls 27 | self._field_mapping: Dict[str, str] = dict() 28 | 29 | self._fieldnames = [x.name for x in dataclasses.fields(cls)] 30 | 31 | self._writer = csv.writer(f, dialect=dialect, **fmtparams) 32 | 33 | def _add_to_mapping(self, header: str, propname: str): 34 | self._field_mapping[propname] = header 35 | 36 | def _apply_mapping(self): 37 | mapped_fields = [] 38 | 39 | for field in self._fieldnames: 40 | mapped_item = self._field_mapping.get(field, field) 41 | mapped_fields.append(mapped_item) 42 | 43 | return mapped_fields 44 | 45 | def write(self, skip_header: bool = False): 46 | if not skip_header: 47 | if self._field_mapping: 48 | self._fieldnames = self._apply_mapping() 49 | 50 | self._writer.writerow(self._fieldnames) 51 | 52 | for item in self._data: 53 | if not isinstance(item, self._cls): 54 | raise TypeError( 55 | ( 56 | f"The item [{item}] is not an instance of " 57 | f"{self._cls.__name__}. 
All items on the list must be " 58 | "instances of the same type" 59 | ) 60 | ) 61 | row = dataclasses.astuple(item) 62 | self._writer.writerow(row) 63 | 64 | def map(self, propname: str) -> HeaderMapper: 65 | """Used to map a field in the dataclass to header item in the CSV file 66 | :param propname: The name of the property of the dataclass to be mapped 67 | """ 68 | return HeaderMapper(lambda header: self._add_to_mapping(header, propname)) 69 | -------------------------------------------------------------------------------- /dataclass_csv/dataclass_writer.pyi: -------------------------------------------------------------------------------- 1 | from .header_mapper import HeaderMapper as HeaderMapper 2 | from typing import Any, Dict, List, Type 3 | 4 | class DataclassWriter: 5 | def __init__( 6 | self, 7 | f: Any, 8 | data: List[Any], 9 | cls: Type[object], 10 | dialect: str = ..., 11 | **fmtparams: Dict[str, Any], 12 | ) -> None: ... 13 | def write(self, skip_header: bool = ...) -> Any: ... 14 | def map(self, propname: str) -> HeaderMapper: ... 15 | -------------------------------------------------------------------------------- /dataclass_csv/decorators.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Callable, TypeVar, Type 2 | 3 | F = TypeVar("F", bound=Callable[..., Any]) 4 | 5 | 6 | def dateformat(date_format: str) -> Callable[[F], F]: 7 | """The dateformat decorator is used to specify the format 8 | the `DataclassReader` should use when parsing datetime strings. 
9 | 10 | Usage: 11 | >>> from dataclasses import dataclass 12 | >>> from datetime import datetime 13 | >>> from dataclass_csv import dateformat 14 | 15 | >>> @dataclass 16 | ... @dateformat('%Y-%m-%d') 17 | ... class User: 18 | ...     firstname: str 19 | ...     lastname: str 20 | ...     birthday: datetime 21 | """ 22 | 23 | if not date_format or not isinstance(date_format, str): 24 | raise ValueError("Invalid value for the date_format argument") 25 | 26 | def func(cls): 27 | cls.__dateformat__ = date_format 28 | return cls 29 | 30 | return func 31 | 32 | 33 | def accept_whitespaces(_cls: Type[Any] = None) -> Callable[[F], F]: 34 | """The accept_whitespaces decorator tells the `DataclassReader` 35 | that `str` fields defined in the `dataclass` should accept 36 | values containing only white spaces. 37 | 38 | Usage: 39 | >>> from dataclasses import dataclass 40 | >>> from dataclass_csv import accept_whitespaces 41 | 42 | >>> @dataclass 43 | ... @accept_whitespaces 44 | ... class User: 45 | ...     firstname: str 46 | ...     lastname: str 47 | ...     birthday: datetime 48 | """ 49 | 50 | def func(cls): 51 | cls.__accept_whitespaces__ = True 52 | return cls 53 | 54 | if _cls: 55 | return func(_cls) 56 | 57 | return func 58 | -------------------------------------------------------------------------------- /dataclass_csv/decorators.pyi: -------------------------------------------------------------------------------- 1 | from typing import Any, Callable, Type, TypeVar 2 | 3 | F = TypeVar("F", bound=Callable[..., Any]) 4 | 5 | def dateformat(date_format: str) -> Callable[[F], F]: ... 6 | def accept_whitespaces(_cls: Type[Any] = ...) -> Callable[[F], F]: ...
7 | -------------------------------------------------------------------------------- /dataclass_csv/exceptions.py: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | 3 | 4 | class CsvValueError(Exception): 5 | """Error when a value in the CSV file cannot be parsed.""" 6 | 7 | def __init__(self, error: Any, line_number: int): 8 | self.error: Any = error 9 | self.line_number: int = line_number 10 | 11 | def __str__(self): 12 | return f"{self.error} [CSV Line number: {self.line_number}]" 13 | -------------------------------------------------------------------------------- /dataclass_csv/exceptions.pyi: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | 3 | class CsvValueError(Exception): 4 | error: Any = ... 5 | line_number: Any = ... 6 | def __init__(self, error: Any, line_number: int) -> None: ... 7 | -------------------------------------------------------------------------------- /dataclass_csv/field_mapper.py: -------------------------------------------------------------------------------- 1 | from typing import Callable 2 | 3 | 4 | class FieldMapper: 5 | """The `FieldMapper` class is used to explicitly map a field 6 | in the CSV file to a specific `dataclass` field. 7 | """ 8 | 9 | def __init__(self, callback: Callable[[str], None]): 10 | def to(property_name: str) -> None: 11 | """Specify the dataclass field to receive the value 12 | :param property_name: The dataclass property that 13 | will receive the csv value. 14 | """ 15 | 16 | callback(property_name) 17 | 18 | self.to: Callable[[str], None] = to 19 | -------------------------------------------------------------------------------- /dataclass_csv/field_mapper.pyi: -------------------------------------------------------------------------------- 1 | from typing import Any, Callable 2 | 3 | class FieldMapper: 4 | to: Any = ... 
5 | def __init__(self, callback: Callable[[str], None]) -> None: ... 6 | -------------------------------------------------------------------------------- /dataclass_csv/header_mapper.py: -------------------------------------------------------------------------------- 1 | from typing import Callable 2 | 3 | 4 | class HeaderMapper: 5 | """The `HeaderMapper` class is used to explicitly map property in a 6 | dataclass to a header. Useful when the header on the CSV file needs to 7 | be different from a dataclass property name. 8 | """ 9 | 10 | def __init__(self, callback: Callable[[str], None]): 11 | def to(header: str) -> None: 12 | """Specify how a property in the dataclass will be 13 | displayed in the CSV file 14 | :param header: Specify the CSV title for the dataclass property 15 | """ 16 | 17 | callback(header) 18 | 19 | self.to: Callable[[str], None] = to 20 | -------------------------------------------------------------------------------- /dataclass_csv/header_mapper.pyi: -------------------------------------------------------------------------------- 1 | from typing import Any, Callable 2 | 3 | class HeaderMapper: 4 | to: Any = ... 5 | def __init__(self, callback: Callable[[str], None]) -> None: ... 
6 | -------------------------------------------------------------------------------- /dataclass_csv/py.typed: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dfurtado/dataclass-csv/2dc71be81cb253eb10aba5ba70c6cebe42ab0301/dataclass_csv/py.typed -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [bumpversion] 2 | current_version = 1.4.0 3 | commit = True 4 | tag = True 5 | 6 | [bumpversion:file:setup.py] 7 | search = version='{current_version}' 8 | replace = version='{new_version}' 9 | 10 | [bumpversion:file:easycsv/__init__.py] 11 | search = __version__ = '{current_version}' 12 | replace = __version__ = '{new_version}' 13 | 14 | [bdist_wheel] 15 | universal = 1 16 | 17 | [flake8] 18 | exclude = docs 19 | max-line-length = 88 20 | 21 | [aliases] 22 | # Define setup.py command aliases here 23 | test = pytest 24 | 25 | [tool:pytest] 26 | collect_ignore = ['setup.py'] 27 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | """The setup script.""" 5 | 6 | from setuptools import setup, find_packages 7 | 8 | with open("README.md") as readme_file: 9 | readme = readme_file.read() 10 | 11 | with open("HISTORY.md") as history_file: 12 | history = history_file.read() 13 | 14 | requirements = [] 15 | 16 | setup_requirements = [ 17 | "pytest-runner", 18 | ] 19 | 20 | test_requirements = [ 21 | "pytest", 22 | ] 23 | 24 | setup( 25 | author="Daniel Furtado", 26 | author_email="daniel@dfurtado.com", 27 | classifiers=[ 28 | "Development Status :: 5 - Production/Stable", 29 | "Intended Audience :: Developers", 30 | "License :: OSI Approved :: BSD License", 31 | "Natural Language :: English", 32 | "Programming 
Language :: Python :: 3 :: Only", 33 | "Programming Language :: Python :: 3.7", 34 | "Operating System :: Microsoft :: Windows", 35 | "Operating System :: MacOS :: MacOS X", 36 | "Operating System :: Unix", 37 | "Operating System :: POSIX", 38 | "Environment :: Console", 39 | ], 40 | description="Map CSV data into dataclasses", 41 | install_requires=requirements, 42 | license="BSD license", 43 | long_description=readme + "\n\n" + history, 44 | long_description_content_type="text/markdown", 45 | include_package_data=True, 46 | keywords="dataclass dataclasses csv dataclass-csv", 47 | name="dataclass-csv", 48 | packages=find_packages(include=["dataclass_csv"]), 49 | package_data={"dataclass_csv": ["py.typed", "*.pyi"]}, 50 | setup_requires=setup_requirements, 51 | test_suite="tests", 52 | tests_require=test_requirements, 53 | url="https://github.com/dfurtado/dataclass-csv", 54 | version="1.4.0", 55 | zip_safe=False, 56 | ) 57 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dfurtado/dataclass-csv/2dc71be81cb253eb10aba5ba70c6cebe42ab0301/tests/__init__.py -------------------------------------------------------------------------------- /tests/conftest.py: -------------------------------------------------------------------------------- 1 | from csv import DictWriter 2 | 3 | import pytest 4 | 5 | 6 | @pytest.fixture() 7 | def create_csv(tmpdir_factory): 8 | def func(data, fieldnames=None, filename="user.csv", factory=tmpdir_factory): 9 | 10 | assert data 11 | 12 | file = tmpdir_factory.mktemp("data").join(filename) 13 | 14 | row = data[0] if isinstance(data, list) else data 15 | 16 | header = fieldnames if fieldnames is not None else row.keys() 17 | 18 | with file.open("w") as f: 19 | writer = DictWriter(f, fieldnames=header) 20 | writer.writeheader() 21 | addrow = writer.writerows if isinstance(data, 
list) else writer.writerow 22 | addrow(data) 23 | 24 | return file 25 | 26 | return func 27 | -------------------------------------------------------------------------------- /tests/mocks.py: -------------------------------------------------------------------------------- 1 | import dataclasses 2 | import re 3 | 4 | from datetime import date, datetime 5 | 6 | from dataclass_csv import dateformat, accept_whitespaces 7 | 8 | from typing import Optional 9 | 10 | 11 | @dataclasses.dataclass 12 | class User: 13 | name: str 14 | age: int 15 | 16 | 17 | @dataclasses.dataclass 18 | class SimpleUser: 19 | name: str 20 | 21 | 22 | class NonDataclassUser: 23 | name: str 24 | 25 | 26 | @dataclasses.dataclass 27 | class UserWithoutAcceptWhiteSpacesDecorator: 28 | name: str 29 | 30 | 31 | @accept_whitespaces 32 | @dataclasses.dataclass 33 | class UserWithAcceptWhiteSpacesDecorator: 34 | name: str 35 | 36 | 37 | @dataclasses.dataclass 38 | class UserWithAcceptWhiteSpacesMetadata: 39 | name: str = dataclasses.field(metadata={"accept_whitespaces": True}) 40 | 41 | 42 | @dataclasses.dataclass 43 | class UserWithoutDateFormatDecorator: 44 | name: str 45 | create_date: datetime 46 | 47 | 48 | @dateformat("%Y-%m-%d") 49 | @dataclasses.dataclass 50 | class UserWithDateFormatDecorator: 51 | name: str 52 | create_date: datetime 53 | 54 | 55 | @dateformat("%Y-%m-%d") 56 | @dataclasses.dataclass 57 | class UserWithDateFormatDecoratorAndDateField: 58 | name: str 59 | create_date: date 60 | 61 | 62 | @dataclasses.dataclass 63 | class UserWithDateFormatMetadata: 64 | name: str 65 | create_date: datetime = dataclasses.field(metadata={"dateformat": "%Y-%m-%d"}) 66 | 67 | 68 | @dateformat("%Y-%m-%d") 69 | @dataclasses.dataclass 70 | class UserWithDateFormatDecoratorAndMetadata: 71 | name: str 72 | birthday: datetime 73 | create_date: datetime = dataclasses.field(metadata={"dateformat": "%Y-%m-%d %H:%M"}) 74 | 75 | 76 | @dataclasses.dataclass 77 | class DataclassWithBooleanValue: 78 | boolValue: 
bool 79 | 80 | 81 | @dataclasses.dataclass 82 | class DataclassWithBooleanValueNoneDefault: 83 | boolValue: Optional[bool] = None 84 | 85 | 86 | @dataclasses.dataclass 87 | class UserWithInitFalse: 88 | firstname: str 89 | lastname: str 90 | age: int = dataclasses.field(init=False) 91 | 92 | 93 | @dataclasses.dataclass 94 | class UserWithInitFalseAndDefaultValue: 95 | firstname: str 96 | lastname: str 97 | age: int = dataclasses.field(init=False, default=0) 98 | 99 | 100 | @dataclasses.dataclass 101 | class UserWithOptionalAge: 102 | name: str 103 | age: Optional[int] 104 | 105 | 106 | @dataclasses.dataclass 107 | class UserWithDefaultDatetimeField: 108 | name: str 109 | birthday: datetime = datetime.now() 110 | 111 | 112 | class SSN: 113 | def __init__(self, val): 114 | if re.match(r"\d{9}", val): 115 | self.val = f"{val[0:3]}-{val[3:5]}-{val[5:9]}" 116 | elif re.match(r"\d{3}-\d{2}-\d{4}", val): 117 | self.val = val 118 | else: 119 | raise ValueError(f"Invalid SSN: {val!r}") 120 | 121 | 122 | @dataclasses.dataclass 123 | class UserWithSSN: 124 | name: str 125 | ssn: SSN 126 | 127 | 128 | @dataclasses.dataclass 129 | class UserWithEmail: 130 | name: str 131 | email: str 132 | 133 | 134 | @dataclasses.dataclass 135 | class UserWithOptionalEmail: 136 | name: str 137 | email: str = "not specified" 138 | -------------------------------------------------------------------------------- /tests/test_csv_data_validation.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from dataclass_csv import DataclassReader, CsvValueError 4 | 5 | from .mocks import User, UserWithDateFormatDecorator, UserWithSSN 6 | 7 | 8 | def test_should_raise_error_str_to_int_prop(create_csv): 9 | csv_file = create_csv({"name": "User1", "age": "wrong type"}) 10 | 11 | with csv_file.open() as f: 12 | with pytest.raises(CsvValueError): 13 | reader = DataclassReader(f, User) 14 | list(reader) 15 | 16 | 17 | def 
test_should_raise_error_with_incorrect_dateformat(create_csv): 18 | csv_file = create_csv({"name": "User1", "create_date": "2018-12-07 10:00"}) 19 | 20 | with csv_file.open() as f: 21 | with pytest.raises(CsvValueError): 22 | reader = DataclassReader(f, UserWithDateFormatDecorator) 23 | list(reader) 24 | 25 | 26 | def test_should_raise_error_when_required_value_is_missing(create_csv): 27 | csv_file = create_csv({"name": "User1", "age": None}) 28 | 29 | with csv_file.open() as f: 30 | with pytest.raises(CsvValueError): 31 | reader = DataclassReader(f, User) 32 | list(reader) 33 | 34 | 35 | def test_should_raise_error_when_required_column_is_missing(create_csv): 36 | csv_file = create_csv({"name": "User1"}) 37 | 38 | with csv_file.open() as f: 39 | with pytest.raises(KeyError): 40 | reader = DataclassReader(f, User) 41 | list(reader) 42 | 43 | 44 | def test_should_raise_error_when_required_value_is_emptyspaces(create_csv): 45 | csv_file = create_csv({"name": " ", "age": 40}) 46 | 47 | with csv_file.open() as f: 48 | with pytest.raises(CsvValueError): 49 | reader = DataclassReader(f, User) 50 | list(reader) 51 | 52 | 53 | def test_csv_header_items_with_spaces_with_missing_props_raises_keyerror(create_csv): 54 | csv_file = create_csv({" name": "User1"}) 55 | 56 | with csv_file.open() as f: 57 | with pytest.raises(KeyError): 58 | reader = DataclassReader(f, User) 59 | list(reader) 60 | 61 | 62 | def test_csv_header_items_with_spaces_with_missing_value(create_csv): 63 | csv_file = create_csv({" name": "User1", "age ": None}) 64 | 65 | with csv_file.open() as f: 66 | with pytest.raises(CsvValueError): 67 | reader = DataclassReader(f, User) 68 | list(reader) 69 | 70 | 71 | def test_csv_header_items_with_spaces_with_prop_with_wrong_type(create_csv): 72 | csv_file = create_csv({" name": "User1", "age ": "this should be an int"}) 73 | 74 | with csv_file.open() as f: 75 | with pytest.raises(CsvValueError): 76 | reader = DataclassReader(f, User) 77 | list(reader) 78 | 79 | 80 | 
def test_passes_through_exceptions_from_user_defined_types(create_csv): 81 | csv_file = create_csv({"name": "User1", "ssn": "123-45-678"}) 82 | 83 | with csv_file.open() as f: 84 | with pytest.raises(CsvValueError) as exc_info: 85 | reader = DataclassReader(f, UserWithSSN) 86 | list(reader) 87 | cause = exc_info.value.__cause__ 88 | assert isinstance(cause, ValueError) 89 | assert "Invalid SSN" in str(cause) 90 | -------------------------------------------------------------------------------- /tests/test_dataclass_reader.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | import dataclasses 3 | 4 | from datetime import date, datetime 5 | from dataclass_csv import DataclassReader, CsvValueError 6 | 7 | from .mocks import ( 8 | User, 9 | UserWithOptionalAge, 10 | DataclassWithBooleanValue, 11 | DataclassWithBooleanValueNoneDefault, 12 | UserWithInitFalse, 13 | UserWithInitFalseAndDefaultValue, 14 | UserWithDefaultDatetimeField, 15 | UserWithDateFormatDecoratorAndDateField, 16 | UserWithSSN, 17 | SSN, 18 | UserWithEmail, 19 | UserWithOptionalEmail, 20 | ) 21 | 22 | 23 | def test_reader_with_non_dataclass(create_csv): 24 | csv_file = create_csv({"name": "User1", "age": 40}) 25 | 26 | class DummyUser: 27 | pass 28 | 29 | with csv_file.open() as f: 30 | with pytest.raises(ValueError): 31 | DataclassReader(f, DummyUser) 32 | 33 | 34 | def test_reader_with_none_class(create_csv): 35 | csv_file = create_csv({"name": "User1", "age": 40}) 36 | 37 | with csv_file.open() as f: 38 | with pytest.raises(ValueError): 39 | DataclassReader(f, None) 40 | 41 | 42 | def test_reader_with_none_file(): 43 | with pytest.raises(ValueError): 44 | DataclassReader(None, User) 45 | 46 | 47 | def test_reader_with_correct_values(create_csv): 48 | csv_file = create_csv({"name": "User", "age": 40}) 49 | 50 | with csv_file.open() as f: 51 | reader = DataclassReader(f, User) 52 | list(reader) 53 | 54 | 55 | def test_reader_values(create_csv): 56 | 
csv_file = create_csv([{"name": "User1", "age": 40}, {"name": "User2", "age": 30}]) 57 | 58 | with csv_file.open() as f: 59 | reader = DataclassReader(f, User) 60 | items = list(reader) 61 | 62 | assert items and len(items) == 2 63 | 64 | for item in items: 65 | assert dataclasses.is_dataclass(item) 66 | 67 | user1, user2 = items[0], items[1] 68 | 69 | assert user1.name == "User1" 70 | assert user1.age == 40 71 | 72 | assert user2.name == "User2" 73 | assert user2.age == 30 74 | 75 | 76 | def test_csv_header_items_with_spaces(create_csv): 77 | csv_file = create_csv({" name": "User1", "age ": 40}) 78 | 79 | with csv_file.open() as f: 80 | reader = DataclassReader(f, User) 81 | items = list(reader) 82 | 83 | assert items and len(items) > 0 84 | 85 | user = items[0] 86 | 87 | assert user.name == "User1" 88 | assert user.age == 40 89 | 90 | 91 | def test_csv_header_items_with_spaces_together_with_skipinitialspaces(create_csv): 92 | csv_file = create_csv({" name": "User1", "age ": 40}) 93 | 94 | with csv_file.open() as f: 95 | reader = DataclassReader(f, User, skipinitialspace=True) 96 | items = list(reader) 97 | 98 | assert items and len(items) > 0 99 | 100 | user = items[0] 101 | 102 | assert user.name == "User1" 103 | assert user.age == 40 104 | 105 | 106 | def test_parse_bool_value_true(create_csv): 107 | for true_value in ["yes", "true", "t", "y", "1"]: 108 | csv_file = create_csv({"boolValue": f"{true_value}"}) 109 | with csv_file.open() as f: 110 | reader = DataclassReader(f, DataclassWithBooleanValue) 111 | items = list(reader) 112 | dataclass_instance = items[0] 113 | assert dataclass_instance.boolValue is True 114 | 115 | 116 | def test_parse_bool_value_false(create_csv): 117 | for false_value in ["no", "false", "f", "n", "0"]: 118 | csv_file = create_csv({"boolValue": f"{false_value}"}) 119 | with csv_file.open() as f: 120 | reader = DataclassReader(f, DataclassWithBooleanValue) 121 | items = list(reader) 122 | dataclass_instance = items[0] 123 | assert 
dataclass_instance.boolValue is False 124 | 125 | 126 | def test_parse_bool_value_invalid(create_csv): 127 | csv_file = create_csv({"boolValue": "notValidBoolean"}) 128 | with csv_file.open() as f: 129 | with pytest.raises(CsvValueError): 130 | reader = DataclassReader(f, DataclassWithBooleanValue) 131 | list(reader) 132 | 133 | 134 | def test_parse_bool_value_none_default(create_csv): 135 | csv_file = create_csv({"boolValue": ""}) 136 | with csv_file.open() as f: 137 | reader = DataclassReader(f, DataclassWithBooleanValueNoneDefault) 138 | items = list(reader) 139 | dataclass_instance = items[0] 140 | assert dataclass_instance.boolValue is None 141 | 142 | 143 | def test_skip_dataclass_field_when_init_is_false(create_csv): 144 | csv_file = create_csv({"firstname": "User1", "lastname": "TestUser"}) 145 | with csv_file.open() as f: 146 | reader = DataclassReader(f, UserWithInitFalse) 147 | list(reader) 148 | 149 | 150 | def test_try_to_access_not_initialized_prop_raise_attr_error(create_csv): 151 | csv_file = create_csv({"firstname": "User1", "lastname": "TestUser"}) 152 | with csv_file.open() as f: 153 | reader = DataclassReader(f, UserWithInitFalse) 154 | items = list(reader) 155 | with pytest.raises(AttributeError): 156 | user = items[0] 157 | assert user.age is not None 158 | 159 | 160 | def test_try_to_access_not_initialized_prop_with_default_value(create_csv): 161 | csv_file = create_csv({"firstname": "User1", "lastname": "TestUser"}) 162 | with csv_file.open() as f: 163 | reader = DataclassReader(f, UserWithInitFalseAndDefaultValue) 164 | items = list(reader) 165 | user = items[0] 166 | assert user.age == 0 167 | 168 | 169 | def test_reader_with_optional_types(create_csv): 170 | csv_file = create_csv({"name": "User", "age": 40}) 171 | 172 | with csv_file.open() as f: 173 | reader = DataclassReader(f, UserWithOptionalAge) 174 | list(reader) 175 | 176 | 177 | def test_reader_with_datetime_default_value(create_csv): 178 | csv_file = create_csv({"name": "User", 
"birthday": ""}) 179 | 180 | with csv_file.open() as f: 181 | reader = DataclassReader(f, UserWithDefaultDatetimeField) 182 | items = list(reader) 183 | assert len(items) > 0 184 | assert isinstance(items[0].birthday, datetime) 185 | 186 | 187 | def test_reader_with_date(create_csv): 188 | csv_file = create_csv({"name": "User", "create_date": "2019-01-01"}) 189 | 190 | with csv_file.open() as f: 191 | reader = DataclassReader(f, UserWithDateFormatDecoratorAndDateField) 192 | items = list(reader) 193 | assert len(items) > 0 194 | assert isinstance(items[0].create_date, date) 195 | assert items[0].create_date == date(2019, 1, 1) 196 | 197 | 198 | def test_should_parse_user_defined_types(create_csv): 199 | csv_file = create_csv( 200 | [ 201 | {"name": "User1", "ssn": "123-45-6789"}, 202 | {"name": "User1", "ssn": "123456789"}, 203 | ] 204 | ) 205 | 206 | with csv_file.open() as f: 207 | reader = DataclassReader(f, UserWithSSN) 208 | items = list(reader) 209 | assert len(items) == 2 210 | 211 | assert isinstance(items[0].ssn, SSN) 212 | assert items[0].ssn.val == "123-45-6789" 213 | 214 | assert isinstance(items[1].ssn, SSN) 215 | assert items[1].ssn.val == "123-45-6789" 216 | 217 | 218 | def test_raise_error_when_mapped_column_not_found(create_csv): 219 | csv_file = create_csv({"name": "User1", "e-mail": "test@test.com"}) 220 | 221 | with csv_file.open() as f: 222 | with pytest.raises( 223 | KeyError, 224 | match="The value for the mapped column `e_mail` is missing in the CSV file", 225 | ): 226 | reader = DataclassReader(f, UserWithEmail) 227 | reader.map("e_mail").to("email") 228 | list(reader) 229 | 230 | 231 | def test_raise_error_when_field_not_found(create_csv): 232 | csv_file = create_csv({"name": "User1", "e-mail": "test@test.com"}) 233 | 234 | with csv_file.open() as f: 235 | with pytest.raises( 236 | KeyError, 237 | match="The value for the column `email` is missing in the CSV file.", 238 | ): 239 | reader = DataclassReader(f, UserWithEmail) 240 | 
list(reader) 241 | 242 | 243 | def test_raise_error_when_duplicate_header_items(create_csv): 244 | csv_file = create_csv( 245 | {"name": "User1", "email": "test@test.com"}, 246 | fieldnames=["name", "email", "name"], 247 | ) 248 | 249 | with csv_file.open() as f: 250 | with pytest.raises(ValueError): 251 | reader = DataclassReader(f, UserWithEmail) 252 | list(reader) 253 | 254 | 255 | def test_skip_header_validation(create_csv): 256 | csv_file = create_csv( 257 | {"name": "User1", "email": "test@test.com"}, 258 | fieldnames=["name", "email", "name"], 259 | ) 260 | 261 | with csv_file.open() as f: 262 | reader = DataclassReader(f, UserWithEmail, validate_header=False) 263 | list(reader) 264 | 265 | 266 | def test_dt_different_order_as_csv(create_csv): 267 | csv_file = create_csv( 268 | {"email": "test@test.com", "name": "User1"}, 269 | fieldnames=[ 270 | "email", 271 | "name", 272 | ], 273 | ) 274 | 275 | with csv_file.open() as f: 276 | reader = DataclassReader(f, UserWithEmail) 277 | list(reader) 278 | 279 | 280 | def test_dt_different_order_as_csv_and_option_field(create_csv): 281 | data = [ 282 | {"email": "test@test.com", "name": "User1"}, 283 | {"name": "User1"}, 284 | ] 285 | 286 | csv_file = create_csv( 287 | data, 288 | fieldnames=[ 289 | "email", 290 | "name", 291 | ], 292 | ) 293 | 294 | with csv_file.open() as f: 295 | reader = DataclassReader(f, UserWithOptionalEmail) 296 | list(reader) 297 | -------------------------------------------------------------------------------- /tests/test_dataclass_writer.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from dataclass_csv import DataclassWriter, DataclassReader 4 | 5 | from .mocks import User, SimpleUser, NonDataclassUser 6 | 7 | 8 | def test_create_csv_file(tmpdir_factory): 9 | tempfile = tmpdir_factory.mktemp("data").join("user_001.csv") 10 | 11 | users = [User(name="test", age=40)] 12 | 13 | with tempfile.open("w") as f: 14 | w = 
DataclassWriter(f, users, User) 15 | w.write() 16 | 17 | with tempfile.open() as f: 18 | reader = DataclassReader(f, User) 19 | saved_users = list(reader) 20 | 21 | assert len(saved_users) > 0 22 | assert saved_users[0].name == users[0].name 23 | 24 | 25 | def test_wrong_type_items(tmpdir_factory): 26 | tempfile = tmpdir_factory.mktemp("data").join("user_001.csv") 27 | 28 | users = [User(name="test", age=40)] 29 | 30 | with tempfile.open("w") as f: 31 | with pytest.raises(TypeError): 32 | w = DataclassWriter(f, users, SimpleUser) 33 | w.write() 34 | 35 | 36 | def test_with_a_non_dataclass(tmpdir_factory): 37 | tempfile = tmpdir_factory.mktemp("data").join("user_001.csv") 38 | 39 | users = [User(name="test", age=40)] 40 | 41 | with tempfile.open("w") as f: 42 | with pytest.raises(ValueError): 43 | DataclassWriter(f, users, NonDataclassUser) 44 | 45 | 46 | def test_with_an_empty_cls_value(tmpdir_factory): 47 | tempfile = tmpdir_factory.mktemp("data").join("user_001.csv") 48 | 49 | users = [User(name="test", age=40)] 50 | 51 | with tempfile.open("w") as f: 52 | with pytest.raises(ValueError): 53 | DataclassWriter(f, users, None) 54 | 55 | 56 | def test_invalid_file_value(tmpdir_factory): 57 | tmpdir_factory.mktemp("data").join("user_001.csv") 58 | 59 | users = [User(name="test", age=40)] 60 | 61 | with pytest.raises(ValueError): 62 | DataclassWriter(None, users, User) 63 | 64 | 65 | def test_with_data_not_a_list(tmpdir_factory): 66 | tempfile = tmpdir_factory.mktemp("data").join("user_001.csv") 67 | 68 | users = User(name="test", age=40) 69 | 70 | with tempfile.open("w") as f: 71 | with pytest.raises(ValueError): 72 | DataclassWriter(f, users, User) 73 | -------------------------------------------------------------------------------- /tests/test_decorators.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from dataclass_csv import DataclassReader, CsvValueError 4 | 5 | from .mocks import ( 6 |
UserWithoutDateFormatDecorator, 7 | UserWithDateFormatDecorator, 8 | UserWithDateFormatMetadata, 9 | UserWithDateFormatDecoratorAndMetadata, 10 | UserWithoutAcceptWhiteSpacesDecorator, 11 | UserWithAcceptWhiteSpacesDecorator, 12 | UserWithAcceptWhiteSpacesMetadata, 13 | ) 14 | 15 | 16 | def test_should_raise_error_without_dateformat(create_csv): 17 | csv_file = create_csv({"name": "Test", "create_date": "2018-12-09"}) 18 | 19 | with csv_file.open("r") as f: 20 | with pytest.raises(AttributeError): 21 | reader = DataclassReader(f, UserWithoutDateFormatDecorator) 22 | list(reader) 23 | 24 | 25 | def test_should_not_raise_error_when_using_dateformat_decorator(create_csv): 26 | csv_file = create_csv({"name": "Test", "create_date": "2018-12-09"}) 27 | 28 | with csv_file.open("r") as f: 29 | reader = DataclassReader(f, UserWithDateFormatDecorator) 30 | list(reader) 31 | 32 | 33 | def test_should_not_raise_error_when_dateformat_metadata(create_csv): 34 | csv_file = create_csv({"name": "Test", "create_date": "2018-12-09"}) 35 | 36 | with csv_file.open("r") as f: 37 | reader = DataclassReader(f, UserWithDateFormatMetadata) 38 | list(reader) 39 | 40 | 41 | def test_use_decorator_when_metadata_is_not_defined(create_csv): 42 | csv_file = create_csv( 43 | { 44 | "name": "Test", 45 | "birthday": "1977-08-26", 46 | "create_date": "2018-12-09 11:11", 47 | } 48 | ) 49 | 50 | with csv_file.open("r") as f: 51 | reader = DataclassReader(f, UserWithDateFormatDecoratorAndMetadata) 52 | list(reader) 53 | 54 | 55 | def test_should_raise_error_when_value_is_whitespaces(create_csv): 56 | csv_file = create_csv({"name": " "}) 57 | 58 | with csv_file.open("r") as f: 59 | with pytest.raises(CsvValueError): 60 | reader = DataclassReader(f, UserWithoutAcceptWhiteSpacesDecorator) 61 | list(reader) 62 | 63 | 64 | def test_should_not_raise_error_when_value_is_whitespaces(create_csv): 65 | csv_file = create_csv({"name": " "}) 66 | 67 | with csv_file.open("r") as f: 68 | reader = DataclassReader(f,
UserWithAcceptWhiteSpacesDecorator) 69 | data = list(reader) 70 | 71 | user = data[0] 72 | assert user.name == " " 73 | 74 | 75 | def test_should_not_raise_error_when_using_meta_accept_whitespaces(create_csv): 76 | csv_file = create_csv({"name": " "}) 77 | 78 | with csv_file.open("r") as f: 79 | reader = DataclassReader(f, UserWithAcceptWhiteSpacesMetadata) 80 | data = list(reader) 81 | 82 | user = data[0] 83 | assert user.name == " " 84 | --------------------------------------------------------------------------------