├── .gitignore ├── CHANGELOG.md ├── CONTRIBUTING.md ├── LICENSE.md ├── MANIFEST.in ├── README.md ├── pdf_importer ├── Importers.py ├── __init__.py ├── __main__.py ├── __project__.py └── pdf_importer.py ├── pyproject.toml ├── setup.cfg └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | /.idea/ 2 | /pdf_importer.egg-info/ 3 | /.eggs/ 4 | /build/ 5 | /dist/ 6 | /docs/build/ 7 | /docs/source/api/ 8 | .vscode/ 9 | __pycache__/ 10 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | # Changelog 2 | 3 | All notable changes to this project will be documented in this file. 4 | 5 | The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), 6 | and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). 7 | 8 | ## [0.5] - 2021-12-26 9 | 10 | - Fix [issue #6](https://github.com/c-vigo/StatementPDFImporter/issues/6): statements were silently skipped if the 11 | amount contains an apostrophe 12 | 13 | - Fix [issue #4](https://github.com/c-vigo/StatementPDFImporter/issues/4): last transactions got omitted in long 14 | statements (over two pages) 15 | 16 | ## [0.4] - 2021-03-27 17 | 18 | - Fix bug: amounts above 1000 with comma-separated thousands 19 | 20 | ## [0.3] - 2021-02-24 21 | 22 | - Fix [issue #2](https://github.com/c-vigo/StatementPDFImporter/issues/2): newline in transaction description messing with CSV output 23 | 24 | ## [0.2] - 2021-02-07 25 | 26 | - Hotfix 27 | 28 | ## [0.1] - 2021-02-03 29 | 30 | - First version of the package 31 | - Supported statements: 32 | - [Cembra & Cumulus](https://www.cembra.ch/en/cards/cembra-mastercard/) MasterCard 33 | - [SwissCard Cashback](https://www.swisscard.ch/en/private-customers/products) (AMEX / VISA / MasterCard) 34 | 35 | [Unreleased]: 
https://github.com/c-vigo/StatementPDFImporter/compare/v0.2...HEAD 36 | [0.5]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.5 37 | [0.4]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.4 38 | [0.3]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.3 39 | [0.2]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.2 40 | [0.1]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.1 41 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | When contributing to this repository, please first discuss the change you wish to make via issue, 4 | email, or any other method with the owners of this repository before making a change. 5 | 6 | Please note we have a code of conduct; please follow it in all your interactions with the project. 7 | 8 | ## Pull Request Process 9 | 10 | 1. Ensure any install or build dependencies are removed before the end of the layer when doing a 11 | build. 12 | 2. Update the README.md with details of changes to the interface, this includes new environment 13 | variables, exposed ports, useful file locations and container parameters. 14 | 3. Increase the version numbers in any example files and the README.md to the new version that this 15 | Pull Request would represent. The versioning scheme we use is [SemVer](http://semver.org/). 16 | 4. You may merge the Pull Request in once you have the sign-off of two other developers, or if you 17 | do not have permission to do that, you may request the second reviewer to merge it for you. 
18 | 19 | ## Code of Conduct 20 | 21 | ### Our Pledge 22 | 23 | In the interest of fostering an open and welcoming environment, we as 24 | contributors and maintainers pledge to making participation in our project and 25 | our community a harassment-free experience for everyone, regardless of age, body 26 | size, disability, ethnicity, gender identity and expression, level of experience, 27 | nationality, personal appearance, race, religion, or sexual identity and 28 | orientation. 29 | 30 | ### Our Standards 31 | 32 | Examples of behavior that contributes to creating a positive environment 33 | include: 34 | 35 | * Using welcoming and inclusive language 36 | * Being respectful of differing viewpoints and experiences 37 | * Gracefully accepting constructive criticism 38 | * Focusing on what is best for the community 39 | * Showing empathy towards other community members 40 | 41 | Examples of unacceptable behavior by participants include: 42 | 43 | * The use of sexualized language or imagery and unwelcome sexual attention or 44 | advances 45 | * Trolling, insulting/derogatory comments, and personal or political attacks 46 | * Public or private harassment 47 | * Publishing others' private information, such as a physical or electronic 48 | address, without explicit permission 49 | * Other conduct which could reasonably be considered inappropriate in a 50 | professional setting 51 | 52 | ### Our Responsibilities 53 | 54 | Project maintainers are responsible for clarifying the standards of acceptable 55 | behavior and are expected to take appropriate and fair corrective action in 56 | response to any instances of unacceptable behavior. 
57 | 58 | Project maintainers have the right and responsibility to remove, edit, or 59 | reject comments, commits, code, wiki edits, issues, and other contributions 60 | that are not aligned to this Code of Conduct, or to ban temporarily or 61 | permanently any contributor for other behaviors that they deem inappropriate, 62 | threatening, offensive, or harmful. 63 | 64 | ### Scope 65 | 66 | This Code of Conduct applies both within project spaces and in public spaces 67 | when an individual is representing the project or its community. Examples of 68 | representing a project or community include using an official project e-mail 69 | address, posting via an official social media account, or acting as an appointed 70 | representative at an online or offline event. Representation of a project may be 71 | further defined and clarified by project maintainers. 72 | 73 | ### Enforcement 74 | 75 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 76 | reported by contacting the project team at [INSERT EMAIL ADDRESS]. All 77 | complaints will be reviewed and investigated and will result in a response that 78 | is deemed necessary and appropriate to the circumstances. The project team is 79 | obligated to maintain confidentiality with regard to the reporter of an incident. 80 | Further details of specific enforcement policies may be posted separately. 81 | 82 | Project maintainers who do not follow or enforce the Code of Conduct in good 83 | faith may face temporary or permanent repercussions as determined by other 84 | members of the project's leadership. 
85 | 86 | ### Attribution 87 | 88 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, 89 | available at [http://contributor-covenant.org/version/1/4][version] 90 | 91 | [homepage]: http://contributor-covenant.org 92 | [version]: http://contributor-covenant.org/version/1/4/ -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | GNU General Public License 2 | ========================== 3 | 4 | [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) 5 | 6 | _Version 3, 29 June 2007_ 7 | _Copyright © 2007 Free Software Foundation, Inc. <>_ 8 | 9 | Everyone is permitted to copy and distribute verbatim copies of this license 10 | document, but changing it is not allowed. 11 | 12 | ## Preamble 13 | 14 | The GNU General Public License is a free, copyleft license for software and other 15 | kinds of works. 16 | 17 | The licenses for most software and other practical works are designed to take away 18 | your freedom to share and change the works. By contrast, the GNU General Public 19 | License is intended to guarantee your freedom to share and change all versions of a 20 | program--to make sure it remains free software for all its users. We, the Free 21 | Software Foundation, use the GNU General Public License for most of our software; it 22 | applies also to any other work released this way by its authors. You can apply it to 23 | your programs, too. 24 | 25 | When we speak of free software, we are referring to freedom, not price. 
Our General 26 | Public Licenses are designed to make sure that you have the freedom to distribute 27 | copies of free software (and charge for them if you wish), that you receive source 28 | code or can get it if you want it, that you can change the software or use pieces of 29 | it in new free programs, and that you know you can do these things. 30 | 31 | To protect your rights, we need to prevent others from denying you these rights or 32 | asking you to surrender the rights. Therefore, you have certain responsibilities if 33 | you distribute copies of the software, or if you modify it: responsibilities to 34 | respect the freedom of others. 35 | 36 | For example, if you distribute copies of such a program, whether gratis or for a fee, 37 | you must pass on to the recipients the same freedoms that you received. You must make 38 | sure that they, too, receive or can get the source code. And you must show them these 39 | terms so they know their rights. 40 | 41 | Developers that use the GNU GPL protect your rights with two steps: **(1)** assert 42 | copyright on the software, and **(2)** offer you this License giving you legal permission 43 | to copy, distribute and/or modify it. 44 | 45 | For the developers' and authors' protection, the GPL clearly explains that there is 46 | no warranty for this free software. For both users' and authors' sake, the GPL 47 | requires that modified versions be marked as changed, so that their problems will not 48 | be attributed erroneously to authors of previous versions. 49 | 50 | Some devices are designed to deny users access to install or run modified versions of 51 | the software inside them, although the manufacturer can do so. This is fundamentally 52 | incompatible with the aim of protecting users' freedom to change the software. The 53 | systematic pattern of such abuse occurs in the area of products for individuals to 54 | use, which is precisely where it is most unacceptable. 
Therefore, we have designed 55 | this version of the GPL to prohibit the practice for those products. If such problems 56 | arise substantially in other domains, we stand ready to extend this provision to 57 | those domains in future versions of the GPL, as needed to protect the freedom of 58 | users. 59 | 60 | Finally, every program is threatened constantly by software patents. States should 61 | not allow patents to restrict development and use of software on general-purpose 62 | computers, but in those that do, we wish to avoid the special danger that patents 63 | applied to a free program could make it effectively proprietary. To prevent this, the 64 | GPL assures that patents cannot be used to render the program non-free. 65 | 66 | The precise terms and conditions for copying, distribution and modification follow. 67 | 68 | ## TERMS AND CONDITIONS 69 | 70 | ### 0. Definitions 71 | 72 | “This License” refers to version 3 of the GNU General Public License. 73 | 74 | “Copyright” also means copyright-like laws that apply to other kinds of 75 | works, such as semiconductor masks. 76 | 77 | “The Program” refers to any copyrightable work licensed under this 78 | License. Each licensee is addressed as “you”. “Licensees” and 79 | “recipients” may be individuals or organizations. 80 | 81 | To “modify” a work means to copy from or adapt all or part of the work in 82 | a fashion requiring copyright permission, other than the making of an exact copy. The 83 | resulting work is called a “modified version” of the earlier work or a 84 | work “based on” the earlier work. 85 | 86 | A “covered work” means either the unmodified Program or a work based on 87 | the Program. 88 | 89 | To “propagate” a work means to do anything with it that, without 90 | permission, would make you directly or secondarily liable for infringement under 91 | applicable copyright law, except executing it on a computer or modifying a private 92 | copy. 
Propagation includes copying, distribution (with or without modification), 93 | making available to the public, and in some countries other activities as well. 94 | 95 | To “convey” a work means any kind of propagation that enables other 96 | parties to make or receive copies. Mere interaction with a user through a computer 97 | network, with no transfer of a copy, is not conveying. 98 | 99 | An interactive user interface displays “Appropriate Legal Notices” to the 100 | extent that it includes a convenient and prominently visible feature that **(1)** 101 | displays an appropriate copyright notice, and **(2)** tells the user that there is no 102 | warranty for the work (except to the extent that warranties are provided), that 103 | licensees may convey the work under this License, and how to view a copy of this 104 | License. If the interface presents a list of user commands or options, such as a 105 | menu, a prominent item in the list meets this criterion. 106 | 107 | ### 1. Source Code 108 | 109 | The “source code” for a work means the preferred form of the work for 110 | making modifications to it. “Object code” means any non-source form of a 111 | work. 112 | 113 | A “Standard Interface” means an interface that either is an official 114 | standard defined by a recognized standards body, or, in the case of interfaces 115 | specified for a particular programming language, one that is widely used among 116 | developers working in that language. 117 | 118 | The “System Libraries” of an executable work include anything, other than 119 | the work as a whole, that **(a)** is included in the normal form of packaging a Major 120 | Component, but which is not part of that Major Component, and **(b)** serves only to 121 | enable use of the work with that Major Component, or to implement a Standard 122 | Interface for which an implementation is available to the public in source code form. 
123 | A “Major Component”, in this context, means a major essential component 124 | (kernel, window system, and so on) of the specific operating system (if any) on which 125 | the executable work runs, or a compiler used to produce the work, or an object code 126 | interpreter used to run it. 127 | 128 | The “Corresponding Source” for a work in object code form means all the 129 | source code needed to generate, install, and (for an executable work) run the object 130 | code and to modify the work, including scripts to control those activities. However, 131 | it does not include the work's System Libraries, or general-purpose tools or 132 | generally available free programs which are used unmodified in performing those 133 | activities but which are not part of the work. For example, Corresponding Source 134 | includes interface definition files associated with source files for the work, and 135 | the source code for shared libraries and dynamically linked subprograms that the work 136 | is specifically designed to require, such as by intimate data communication or 137 | control flow between those subprograms and other parts of the work. 138 | 139 | The Corresponding Source need not include anything that users can regenerate 140 | automatically from other parts of the Corresponding Source. 141 | 142 | The Corresponding Source for a work in source code form is that same work. 143 | 144 | ### 2. Basic Permissions 145 | 146 | All rights granted under this License are granted for the term of copyright on the 147 | Program, and are irrevocable provided the stated conditions are met. This License 148 | explicitly affirms your unlimited permission to run the unmodified Program. The 149 | output from running a covered work is covered by this License only if the output, 150 | given its content, constitutes a covered work. This License acknowledges your rights 151 | of fair use or other equivalent, as provided by copyright law. 
152 | 153 | You may make, run and propagate covered works that you do not convey, without 154 | conditions so long as your license otherwise remains in force. You may convey covered 155 | works to others for the sole purpose of having them make modifications exclusively 156 | for you, or provide you with facilities for running those works, provided that you 157 | comply with the terms of this License in conveying all material for which you do not 158 | control copyright. Those thus making or running the covered works for you must do so 159 | exclusively on your behalf, under your direction and control, on terms that prohibit 160 | them from making any copies of your copyrighted material outside their relationship 161 | with you. 162 | 163 | Conveying under any other circumstances is permitted solely under the conditions 164 | stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 165 | 166 | ### 3. Protecting Users' Legal Rights From Anti-Circumvention Law 167 | 168 | No covered work shall be deemed part of an effective technological measure under any 169 | applicable law fulfilling obligations under article 11 of the WIPO copyright treaty 170 | adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention 171 | of such measures. 172 | 173 | When you convey a covered work, you waive any legal power to forbid circumvention of 174 | technological measures to the extent such circumvention is effected by exercising 175 | rights under this License with respect to the covered work, and you disclaim any 176 | intention to limit operation or modification of the work as a means of enforcing, 177 | against the work's users, your or third parties' legal rights to forbid circumvention 178 | of technological measures. 179 | 180 | ### 4. 
Conveying Verbatim Copies 181 | 182 | You may convey verbatim copies of the Program's source code as you receive it, in any 183 | medium, provided that you conspicuously and appropriately publish on each copy an 184 | appropriate copyright notice; keep intact all notices stating that this License and 185 | any non-permissive terms added in accord with section 7 apply to the code; keep 186 | intact all notices of the absence of any warranty; and give all recipients a copy of 187 | this License along with the Program. 188 | 189 | You may charge any price or no price for each copy that you convey, and you may offer 190 | support or warranty protection for a fee. 191 | 192 | ### 5. Conveying Modified Source Versions 193 | 194 | You may convey a work based on the Program, or the modifications to produce it from 195 | the Program, in the form of source code under the terms of section 4, provided that 196 | you also meet all of these conditions: 197 | 198 | * **a)** The work must carry prominent notices stating that you modified it, and giving a 199 | relevant date. 200 | * **b)** The work must carry prominent notices stating that it is released under this 201 | License and any conditions added under section 7. This requirement modifies the 202 | requirement in section 4 to “keep intact all notices”. 203 | * **c)** You must license the entire work, as a whole, under this License to anyone who 204 | comes into possession of a copy. This License will therefore apply, along with any 205 | applicable section 7 additional terms, to the whole of the work, and all its parts, 206 | regardless of how they are packaged. This License gives no permission to license the 207 | work in any other way, but it does not invalidate such permission if you have 208 | separately received it. 
209 | * **d)** If the work has interactive user interfaces, each must display Appropriate Legal 210 | Notices; however, if the Program has interactive interfaces that do not display 211 | Appropriate Legal Notices, your work need not make them do so. 212 | 213 | A compilation of a covered work with other separate and independent works, which are 214 | not by their nature extensions of the covered work, and which are not combined with 215 | it such as to form a larger program, in or on a volume of a storage or distribution 216 | medium, is called an “aggregate” if the compilation and its resulting 217 | copyright are not used to limit the access or legal rights of the compilation's users 218 | beyond what the individual works permit. Inclusion of a covered work in an aggregate 219 | does not cause this License to apply to the other parts of the aggregate. 220 | 221 | ### 6. Conveying Non-Source Forms 222 | 223 | You may convey a covered work in object code form under the terms of sections 4 and 224 | 5, provided that you also convey the machine-readable Corresponding Source under the 225 | terms of this License, in one of these ways: 226 | 227 | * **a)** Convey the object code in, or embodied in, a physical product (including a 228 | physical distribution medium), accompanied by the Corresponding Source fixed on a 229 | durable physical medium customarily used for software interchange. 
230 | * **b)** Convey the object code in, or embodied in, a physical product (including a 231 | physical distribution medium), accompanied by a written offer, valid for at least 232 | three years and valid for as long as you offer spare parts or customer support for 233 | that product model, to give anyone who possesses the object code either **(1)** a copy of 234 | the Corresponding Source for all the software in the product that is covered by this 235 | License, on a durable physical medium customarily used for software interchange, for 236 | a price no more than your reasonable cost of physically performing this conveying of 237 | source, or **(2)** access to copy the Corresponding Source from a network server at no 238 | charge. 239 | * **c)** Convey individual copies of the object code with a copy of the written offer to 240 | provide the Corresponding Source. This alternative is allowed only occasionally and 241 | noncommercially, and only if you received the object code with such an offer, in 242 | accord with subsection 6b. 243 | * **d)** Convey the object code by offering access from a designated place (gratis or for 244 | a charge), and offer equivalent access to the Corresponding Source in the same way 245 | through the same place at no further charge. You need not require recipients to copy 246 | the Corresponding Source along with the object code. If the place to copy the object 247 | code is a network server, the Corresponding Source may be on a different server 248 | (operated by you or a third party) that supports equivalent copying facilities, 249 | provided you maintain clear directions next to the object code saying where to find 250 | the Corresponding Source. Regardless of what server hosts the Corresponding Source, 251 | you remain obligated to ensure that it is available for as long as needed to satisfy 252 | these requirements. 
253 | * **e)** Convey the object code using peer-to-peer transmission, provided you inform 254 | other peers where the object code and Corresponding Source of the work are being 255 | offered to the general public at no charge under subsection 6d. 256 | 257 | A separable portion of the object code, whose source code is excluded from the 258 | Corresponding Source as a System Library, need not be included in conveying the 259 | object code work. 260 | 261 | A “User Product” is either **(1)** a “consumer product”, which 262 | means any tangible personal property which is normally used for personal, family, or 263 | household purposes, or **(2)** anything designed or sold for incorporation into a 264 | dwelling. In determining whether a product is a consumer product, doubtful cases 265 | shall be resolved in favor of coverage. For a particular product received by a 266 | particular user, “normally used” refers to a typical or common use of 267 | that class of product, regardless of the status of the particular user or of the way 268 | in which the particular user actually uses, or expects or is expected to use, the 269 | product. A product is a consumer product regardless of whether the product has 270 | substantial commercial, industrial or non-consumer uses, unless such uses represent 271 | the only significant mode of use of the product. 272 | 273 | “Installation Information” for a User Product means any methods, 274 | procedures, authorization keys, or other information required to install and execute 275 | modified versions of a covered work in that User Product from a modified version of 276 | its Corresponding Source. The information must suffice to ensure that the continued 277 | functioning of the modified object code is in no case prevented or interfered with 278 | solely because modification has been made. 
279 | 280 | If you convey an object code work under this section in, or with, or specifically for 281 | use in, a User Product, and the conveying occurs as part of a transaction in which 282 | the right of possession and use of the User Product is transferred to the recipient 283 | in perpetuity or for a fixed term (regardless of how the transaction is 284 | characterized), the Corresponding Source conveyed under this section must be 285 | accompanied by the Installation Information. But this requirement does not apply if 286 | neither you nor any third party retains the ability to install modified object code 287 | on the User Product (for example, the work has been installed in ROM). 288 | 289 | The requirement to provide Installation Information does not include a requirement to 290 | continue to provide support service, warranty, or updates for a work that has been 291 | modified or installed by the recipient, or for the User Product in which it has been 292 | modified or installed. Access to a network may be denied when the modification itself 293 | materially and adversely affects the operation of the network or violates the rules 294 | and protocols for communication across the network. 295 | 296 | Corresponding Source conveyed, and Installation Information provided, in accord with 297 | this section must be in a format that is publicly documented (and with an 298 | implementation available to the public in source code form), and must require no 299 | special password or key for unpacking, reading or copying. 300 | 301 | ### 7. Additional Terms 302 | 303 | “Additional permissions” are terms that supplement the terms of this 304 | License by making exceptions from one or more of its conditions. Additional 305 | permissions that are applicable to the entire Program shall be treated as though they 306 | were included in this License, to the extent that they are valid under applicable 307 | law. 
If additional permissions apply only to part of the Program, that part may be 308 | used separately under those permissions, but the entire Program remains governed by 309 | this License without regard to the additional permissions. 310 | 311 | When you convey a copy of a covered work, you may at your option remove any 312 | additional permissions from that copy, or from any part of it. (Additional 313 | permissions may be written to require their own removal in certain cases when you 314 | modify the work.) You may place additional permissions on material, added by you to a 315 | covered work, for which you have or can give appropriate copyright permission. 316 | 317 | Notwithstanding any other provision of this License, for material you add to a 318 | covered work, you may (if authorized by the copyright holders of that material) 319 | supplement the terms of this License with terms: 320 | 321 | * **a)** Disclaiming warranty or limiting liability differently from the terms of 322 | sections 15 and 16 of this License; or 323 | * **b)** Requiring preservation of specified reasonable legal notices or author 324 | attributions in that material or in the Appropriate Legal Notices displayed by works 325 | containing it; or 326 | * **c)** Prohibiting misrepresentation of the origin of that material, or requiring that 327 | modified versions of such material be marked in reasonable ways as different from the 328 | original version; or 329 | * **d)** Limiting the use for publicity purposes of names of licensors or authors of the 330 | material; or 331 | * **e)** Declining to grant rights under trademark law for use of some trade names, 332 | trademarks, or service marks; or 333 | * **f)** Requiring indemnification of licensors and authors of that material by anyone 334 | who conveys the material (or modified versions of it) with contractual assumptions of 335 | liability to the recipient, for any liability that these contractual assumptions 336 | directly impose on those 
licensors and authors. 337 | 338 | All other non-permissive additional terms are considered “further 339 | restrictions” within the meaning of section 10. If the Program as you received 340 | it, or any part of it, contains a notice stating that it is governed by this License 341 | along with a term that is a further restriction, you may remove that term. If a 342 | license document contains a further restriction but permits relicensing or conveying 343 | under this License, you may add to a covered work material governed by the terms of 344 | that license document, provided that the further restriction does not survive such 345 | relicensing or conveying. 346 | 347 | If you add terms to a covered work in accord with this section, you must place, in 348 | the relevant source files, a statement of the additional terms that apply to those 349 | files, or a notice indicating where to find the applicable terms. 350 | 351 | Additional terms, permissive or non-permissive, may be stated in the form of a 352 | separately written license, or stated as exceptions; the above requirements apply 353 | either way. 354 | 355 | ### 8. Termination 356 | 357 | You may not propagate or modify a covered work except as expressly provided under 358 | this License. Any attempt otherwise to propagate or modify it is void, and will 359 | automatically terminate your rights under this License (including any patent licenses 360 | granted under the third paragraph of section 11). 361 | 362 | However, if you cease all violation of this License, then your license from a 363 | particular copyright holder is reinstated **(a)** provisionally, unless and until the 364 | copyright holder explicitly and finally terminates your license, and **(b)** permanently, 365 | if the copyright holder fails to notify you of the violation by some reasonable means 366 | prior to 60 days after the cessation. 
367 | 368 | Moreover, your license from a particular copyright holder is reinstated permanently 369 | if the copyright holder notifies you of the violation by some reasonable means, this 370 | is the first time you have received notice of violation of this License (for any 371 | work) from that copyright holder, and you cure the violation prior to 30 days after 372 | your receipt of the notice. 373 | 374 | Termination of your rights under this section does not terminate the licenses of 375 | parties who have received copies or rights from you under this License. If your 376 | rights have been terminated and not permanently reinstated, you do not qualify to 377 | receive new licenses for the same material under section 10. 378 | 379 | ### 9. Acceptance Not Required for Having Copies 380 | 381 | You are not required to accept this License in order to receive or run a copy of the 382 | Program. Ancillary propagation of a covered work occurring solely as a consequence of 383 | using peer-to-peer transmission to receive a copy likewise does not require 384 | acceptance. However, nothing other than this License grants you permission to 385 | propagate or modify any covered work. These actions infringe copyright if you do not 386 | accept this License. Therefore, by modifying or propagating a covered work, you 387 | indicate your acceptance of this License to do so. 388 | 389 | ### 10. Automatic Licensing of Downstream Recipients 390 | 391 | Each time you convey a covered work, the recipient automatically receives a license 392 | from the original licensors, to run, modify and propagate that work, subject to this 393 | License. You are not responsible for enforcing compliance by third parties with this 394 | License. 395 | 396 | An “entity transaction” is a transaction transferring control of an 397 | organization, or substantially all assets of one, or subdividing an organization, or 398 | merging organizations. 
If propagation of a covered work results from an entity 399 | transaction, each party to that transaction who receives a copy of the work also 400 | receives whatever licenses to the work the party's predecessor in interest had or 401 | could give under the previous paragraph, plus a right to possession of the 402 | Corresponding Source of the work from the predecessor in interest, if the predecessor 403 | has it or can get it with reasonable efforts. 404 | 405 | You may not impose any further restrictions on the exercise of the rights granted or 406 | affirmed under this License. For example, you may not impose a license fee, royalty, 407 | or other charge for exercise of rights granted under this License, and you may not 408 | initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging 409 | that any patent claim is infringed by making, using, selling, offering for sale, or 410 | importing the Program or any portion of it. 411 | 412 | ### 11. Patents 413 | 414 | A “contributor” is a copyright holder who authorizes use under this 415 | License of the Program or a work on which the Program is based. The work thus 416 | licensed is called the contributor's “contributor version”. 417 | 418 | A contributor's “essential patent claims” are all patent claims owned or 419 | controlled by the contributor, whether already acquired or hereafter acquired, that 420 | would be infringed by some manner, permitted by this License, of making, using, or 421 | selling its contributor version, but do not include claims that would be infringed 422 | only as a consequence of further modification of the contributor version. For 423 | purposes of this definition, “control” includes the right to grant patent 424 | sublicenses in a manner consistent with the requirements of this License. 
425 | 426 | Each contributor grants you a non-exclusive, worldwide, royalty-free patent license 427 | under the contributor's essential patent claims, to make, use, sell, offer for sale, 428 | import and otherwise run, modify and propagate the contents of its contributor 429 | version. 430 | 431 | In the following three paragraphs, a “patent license” is any express 432 | agreement or commitment, however denominated, not to enforce a patent (such as an 433 | express permission to practice a patent or covenant not to sue for patent 434 | infringement). To “grant” such a patent license to a party means to make 435 | such an agreement or commitment not to enforce a patent against the party. 436 | 437 | If you convey a covered work, knowingly relying on a patent license, and the 438 | Corresponding Source of the work is not available for anyone to copy, free of charge 439 | and under the terms of this License, through a publicly available network server or 440 | other readily accessible means, then you must either **(1)** cause the Corresponding 441 | Source to be so available, or **(2)** arrange to deprive yourself of the benefit of the 442 | patent license for this particular work, or **(3)** arrange, in a manner consistent with 443 | the requirements of this License, to extend the patent license to downstream 444 | recipients. “Knowingly relying” means you have actual knowledge that, but 445 | for the patent license, your conveying the covered work in a country, or your 446 | recipient's use of the covered work in a country, would infringe one or more 447 | identifiable patents in that country that you have reason to believe are valid. 
448 | 449 | If, pursuant to or in connection with a single transaction or arrangement, you 450 | convey, or propagate by procuring conveyance of, a covered work, and grant a patent 451 | license to some of the parties receiving the covered work authorizing them to use, 452 | propagate, modify or convey a specific copy of the covered work, then the patent 453 | license you grant is automatically extended to all recipients of the covered work and 454 | works based on it. 455 | 456 | A patent license is “discriminatory” if it does not include within the 457 | scope of its coverage, prohibits the exercise of, or is conditioned on the 458 | non-exercise of one or more of the rights that are specifically granted under this 459 | License. You may not convey a covered work if you are a party to an arrangement with 460 | a third party that is in the business of distributing software, under which you make 461 | payment to the third party based on the extent of your activity of conveying the 462 | work, and under which the third party grants, to any of the parties who would receive 463 | the covered work from you, a discriminatory patent license **(a)** in connection with 464 | copies of the covered work conveyed by you (or copies made from those copies), or **(b)** 465 | primarily for and in connection with specific products or compilations that contain 466 | the covered work, unless you entered into that arrangement, or that patent license 467 | was granted, prior to 28 March 2007. 468 | 469 | Nothing in this License shall be construed as excluding or limiting any implied 470 | license or other defenses to infringement that may otherwise be available to you 471 | under applicable patent law. 472 | 473 | ### 12. No Surrender of Others' Freedom 474 | 475 | If conditions are imposed on you (whether by court order, agreement or otherwise) 476 | that contradict the conditions of this License, they do not excuse you from the 477 | conditions of this License. 
If you cannot convey a covered work so as to satisfy 478 | simultaneously your obligations under this License and any other pertinent 479 | obligations, then as a consequence you may not convey it at all. For example, if you 480 | agree to terms that obligate you to collect a royalty for further conveying from 481 | those to whom you convey the Program, the only way you could satisfy both those terms 482 | and this License would be to refrain entirely from conveying the Program. 483 | 484 | ### 13. Use with the GNU Affero General Public License 485 | 486 | Notwithstanding any other provision of this License, you have permission to link or 487 | combine any covered work with a work licensed under version 3 of the GNU Affero 488 | General Public License into a single combined work, and to convey the resulting work. 489 | The terms of this License will continue to apply to the part which is the covered 490 | work, but the special requirements of the GNU Affero General Public License, section 491 | 13, concerning interaction through a network will apply to the combination as such. 492 | 493 | ### 14. Revised Versions of this License 494 | 495 | The Free Software Foundation may publish revised and/or new versions of the GNU 496 | General Public License from time to time. Such new versions will be similar in spirit 497 | to the present version, but may differ in detail to address new problems or concerns. 498 | 499 | Each version is given a distinguishing version number. If the Program specifies that 500 | a certain numbered version of the GNU General Public License “or any later 501 | version” applies to it, you have the option of following the terms and 502 | conditions either of that numbered version or of any later version published by the 503 | Free Software Foundation. If the Program does not specify a version number of the GNU 504 | General Public License, you may choose any version ever published by the Free 505 | Software Foundation. 
506 | 507 | If the Program specifies that a proxy can decide which future versions of the GNU 508 | General Public License can be used, that proxy's public statement of acceptance of a 509 | version permanently authorizes you to choose that version for the Program. 510 | 511 | Later license versions may give you additional or different permissions. However, no 512 | additional obligations are imposed on any author or copyright holder as a result of 513 | your choosing to follow a later version. 514 | 515 | ### 15. Disclaimer of Warranty 516 | 517 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 518 | EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES 519 | PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER 520 | EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 521 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE 522 | QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE 523 | DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 524 | 525 | ### 16. Limitation of Liability 526 | 527 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY 528 | COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS 529 | PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, 530 | INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE 531 | PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE 532 | OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE 533 | WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE 534 | POSSIBILITY OF SUCH DAMAGES. 535 | 536 | ### 17. 
Interpretation of Sections 15 and 16 537 | 538 | If the disclaimer of warranty and limitation of liability provided above cannot be 539 | given local legal effect according to their terms, reviewing courts shall apply local 540 | law that most closely approximates an absolute waiver of all civil liability in 541 | connection with the Program, unless a warranty or assumption of liability accompanies 542 | a copy of the Program in return for a fee. 543 | 544 | _END OF TERMS AND CONDITIONS_ 545 | 546 | ## How to Apply These Terms to Your New Programs 547 | 548 | If you develop a new program, and you want it to be of the greatest possible use to 549 | the public, the best way to achieve this is to make it free software which everyone 550 | can redistribute and change under these terms. 551 | 552 | To do so, attach the following notices to the program. It is safest to attach them 553 | to the start of each source file to most effectively state the exclusion of warranty; 554 | and each file should have at least the “copyright” line and a pointer to 555 | where the full notice is found. 556 | 557 | 558 | Copyright (C) 559 | 560 | This program is free software: you can redistribute it and/or modify 561 | it under the terms of the GNU General Public License as published by 562 | the Free Software Foundation, either version 3 of the License, or 563 | (at your option) any later version. 564 | 565 | This program is distributed in the hope that it will be useful, 566 | but WITHOUT ANY WARRANTY; without even the implied warranty of 567 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 568 | GNU General Public License for more details. 569 | 570 | You should have received a copy of the GNU General Public License 571 | along with this program. If not, see . 572 | 573 | Also add information on how to contact you by electronic and paper mail. 
574 | 575 | If the program does terminal interaction, make it output a short notice like this 576 | when it starts in an interactive mode: 577 | 578 | Copyright (C) 579 | This program comes with ABSOLUTELY NO WARRANTY; for details type 'show w'. 580 | This is free software, and you are welcome to redistribute it 581 | under certain conditions; type 'show c' for details. 582 | 583 | The hypothetical commands `show w` and `show c` should show the appropriate parts of 584 | the General Public License. Of course, your program's commands might be different; 585 | for a GUI interface, you would use an “about box”. 586 | 587 | You should also get your employer (if you work as a programmer) or school, if any, to 588 | sign a “copyright disclaimer” for the program, if necessary. For more 589 | information on this, and how to apply and follow the GNU GPL, see 590 | <>. 591 | 592 | The GNU General Public License does not permit incorporating your program into 593 | proprietary programs. If your program is a subroutine library, you may consider it 594 | more useful to permit linking proprietary applications with the library. If this is 595 | what you want to do, use the GNU Lesser General Public License instead of this 596 | License. But first, please read 597 | <>. 
-------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include README.md 2 | include CHANGELOG.md 3 | include LICENSE.md 4 | include CONTRIBUTING.md -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # README 2 | 3 | [![PyPI Latest Version](https://badge.fury.io/py/pdf-importer.svg)](https://badge.fury.io/py/pdf-importer) 4 | [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) 5 | 6 | ## Statement PDF Importer 7 | 8 | **pdf-importer** is a PDF parser for credit card statements. 9 | It accepts statements from the following issuers: 10 | 11 | - [Cembra & Cumulus](https://www.cembra.ch/en/cards/cembra-mastercard/) MasterCard 12 | - [SwissCard Cashback](https://www.swisscard.ch/en/private-customers/products) (AMEX / VISA / MasterCard) 13 | 14 | The data can be saved to a CSV file compatible with the [Wallet by budgetbakers](https://budgetbakers.com/) import feature. 15 | 16 | ## Dependencies 17 | 18 | - [Python 3.6](https://www.python.org/downloads/release/python-360/) or later and [pip 10.0](https://pip.pypa.io/en/stable/) or later. 19 | - [camelot-py](https://camelot-py.readthedocs.io/en/master/) and 20 | [opencv-python](https://github.com/opencv/opencv-python) for PDF parsing. 21 | - [python-dateutil](https://dateutil.readthedocs.io/en/stable/) for date format management. 22 | - [pandas](https://pandas.pydata.org/) for combining the extracted tables (the CSV export itself uses the standard `csv` module).
23 | 24 | ## Installation 25 | 26 | You can install the package by cloning the [GitHub repository](https://github.com/c-vigo/StatementPDFImporter) or directly 27 | using [pip](https://pip.pypa.io/en/stable/): 28 | 29 | ``` 30 | python -m pip install pdf-importer 31 | ``` 32 | 33 | ## Usage 34 | 35 | You can parse a PDF statement simply with 36 | 37 | ``` 38 | python -m pdf_importer [filename] [type] [-o csv_file] 39 | ``` 40 | where 41 | 42 | - *filename* is the full path to the PDF file 43 | - *type* is either *cembra* or *cashback* 44 | - *csv_file* is the full path to the CSV file where the data is saved. 45 | 46 | ## Authors 47 | 48 | * [**Carlos Vigo**](mailto:carviher1990@gmail.com?subject=[GitHub%-%pdf-importer]) - *Initial work* - 49 | [GitHub](https://github.com/c-vigo) 50 | 51 | ## Contributing 52 | 53 | Please read our [contributing policy](CONTRIBUTING.md) for details on our code of 54 | conduct, and the process for submitting pull requests to us. 55 | 56 | ## Versioning 57 | 58 | We use [Git](https://git-scm.com/) for versioning. For the versions available, see the 59 | [tags on this repository](https://gitlab.ethz.ch/exotic-matter/cw-beam/pdf-importer). 60 | 61 | ## License 62 | 63 | This project is licensed under the [GNU GPLv3 License](LICENSE.md) 64 | 65 | ## Built With 66 | 67 | * [PyCharm Professional 2020](https://www.jetbrains.com/pycharm//) - The IDE used 68 | -------------------------------------------------------------------------------- /pdf_importer/Importers.py: -------------------------------------------------------------------------------- 1 | """ Collection of importers. 
2 | """ 3 | 4 | # Imports 5 | 6 | # Third party 7 | import camelot 8 | from dateutil.parser import parse 9 | from pandas import concat 10 | 11 | 12 | def extract_cembra(filename): 13 | entries = [] 14 | 15 | tables = camelot.read_pdf(filename, pages='2-end') 16 | 17 | for pdf_table in tables: 18 | df = pdf_table.df 19 | for _, row in df.iterrows(): 20 | try: 21 | date = parse(row[1].strip(), dayfirst=True).date() 22 | _ = parse(row[0].strip(), dayfirst=True).date() # validation only: must also be a date 23 | text = row[2] 24 | credit = row[3].replace('\'', '') 25 | debit = row[4].replace('\'', '') 26 | amount = -float(debit) if debit else float(credit) # debits are stored as negative amounts 27 | entries.append([date, amount, text]) 28 | except ValueError: 29 | pass # not a transaction row (no valid date or amount): skip it 30 | 31 | return entries 32 | 33 | 34 | def extract_cashback(filename): 35 | entries = [] 36 | 37 | # noinspection PyUnresolvedReferences 38 | table1 = camelot.read_pdf( 39 | filename, 40 | pages='1', 41 | flavor='stream', 42 | table_areas=['50,320,560,50'], 43 | columns=['120,530'] 44 | ) 45 | table2 = camelot.read_pdf( 46 | filename, 47 | pages='2-end', 48 | flavor='stream', 49 | table_areas=['50,800,560,50'], 50 | columns=['120,530'] 51 | ) 52 | 53 | df = concat([table1[0].df, table2[0].df]) # only the first table of each read is used 54 | 55 | for _, row in df.iterrows(): 56 | try: 57 | date = parse(row[0].strip(), dayfirst=True).date() 58 | text = row[1].replace("\n", " ") 59 | amount = -float(row[2].replace("'", "")) # charges are stored as negative amounts 60 | 61 | if text == "YOUR PAYMENT – THANK YOU": 62 | amount = -amount # payments to the card are positive 63 | 64 | entries.append([date, amount, text]) 65 | except ValueError: 66 | pass # not a transaction row (no valid date or amount): skip it 67 | 68 | return entries 69 | -------------------------------------------------------------------------------- /pdf_importer/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # Author: Carlos Vigo 3 | # Contact: carviher1990@gmail.com 4 | 5 | """ PDF parser for credit card statements.
6 | It accepts statements from the following issuers: 7 | - Cembra (Cumulus MasterCard) 8 | - SwissCard Cashback (AMEX / VISA / MasterCard) 9 | The data can be saved to a CSV file compatible with the Wallet import feature. 10 | """ 11 | 12 | # Local imports 13 | from . import pdf_importer 14 | from .__project__ import __author__, __copyright__, __short_version__, __version__, __project_name__ 15 | 16 | # __all__ must list attribute names as strings, not the attribute values 17 | __all__ = [ 18 | '__author__', 19 | '__copyright__', 20 | '__short_version__', 21 | '__version__', 22 | '__project_name__', 23 | 'pdf_importer', 24 | ] 25 | -------------------------------------------------------------------------------- /pdf_importer/__main__.py: -------------------------------------------------------------------------------- 1 | from .pdf_importer import pdf_importer 2 | 3 | if __name__ == "__main__": 4 | pdf_importer() 5 | -------------------------------------------------------------------------------- /pdf_importer/__project__.py: -------------------------------------------------------------------------------- 1 | __author__ = 'Carlos Vigo ' 2 | __email__ = '' 3 | __short_author__ = 'Carlos Vigo' 4 | __copyright__ = '2021, Carlos Vigo' 5 | __package_name__ = 'pdf-importer' 6 | __module_name__ = 'pdf_importer.py' 7 | __project_name__ = 'Statement PDF Importer' 8 | __url__ = 'https://github.com/c-vigo/StatementPDFImporter' 9 | __documentation__ = 'https://github.com/c-vigo/StatementPDFImporter' 10 | __version__ = '0.5' 11 | __short_version__ = '0.5' 12 | __description__ = 'A PDF importer to generate CSV files from bank statements' 13 | 14 | -------------------------------------------------------------------------------- /pdf_importer/pdf_importer.py: -------------------------------------------------------------------------------- 1 | """ PDF parser for credit card statements. 2 | It accepts statements from the following issuers: 3 | - Cembra (Cumulus MasterCard) 4 | - SwissCard Cashback (AMEX / VISA / MasterCard) 5 | The data can be saved to a CSV file compatible with the Wallet import feature.
6 | """ 7 | 8 | # Imports 9 | import csv 10 | from argparse import ArgumentParser 11 | 12 | # Local packages 13 | from .__project__ import ( 14 | __documentation__ as docs_url, 15 | __module_name__ as module, 16 | __description__ as prog_desc, 17 | ) 18 | from .Importers import extract_cembra, extract_cashback 19 | 20 | 21 | def pdf_importer(): 22 | """The main routine. It parses the input arguments and acts accordingly.""" 23 | 24 | # The argument parser 25 | ap = ArgumentParser( 26 | prog=module, 27 | description=prog_desc, 28 | add_help=True, 29 | epilog='Check out the package documentation for more information:\n{}'.format(docs_url) 30 | ) 31 | 32 | # Positional arguments: 33 | # 1. File name 34 | ap.add_argument( 35 | 'filename', 36 | help='PDF file to parse', 37 | type=str, 38 | ) 39 | 40 | # 2. Statement type 41 | ap.add_argument( 42 | 'type', 43 | help='statement type', 44 | type=str, 45 | choices=['cembra', 'cashback'] 46 | ) 47 | 48 | # Optional argument: output file (-o as documented in the README) 49 | ap.add_argument( 50 | '-o', 51 | '--output', 52 | dest='output', 53 | help='output CSV file', 54 | type=str, 55 | default=None 56 | ) 57 | 58 | # Parse the arguments 59 | args = ap.parse_args() 60 | 61 | # Extract data from PDF 62 | if args.type == 'cembra': 63 | print('Processing file {} as a Cembra PDF statement.'.format(args.filename)) 64 | entries = extract_cembra(args.filename) 65 | elif args.type == 'cashback': 66 | entries = extract_cashback(args.filename) 67 | else: 68 | raise RuntimeError('Invalid statement type {}'.format(args.type)) 69 | 70 | # Print to console 71 | total_value = 0.
72 | for entry in entries: 73 | print('{} {:+7.2f} {}'.format(entry[0], entry[1], entry[2])) 74 | total_value += entry[1] 75 | print('\nTotal: {:+7.2f}'.format(total_value)) 76 | 77 | # Save to CSV file 78 | if args.output is not None: 79 | print('Saving data to {}.'.format(args.output)) 80 | with open(args.output, mode='w', newline='') as f: # newline='' as recommended by the csv module docs 81 | writer = csv.writer( 82 | f, 83 | delimiter=';', 84 | quotechar='"', 85 | quoting=csv.QUOTE_MINIMAL 86 | ) 87 | for entry in entries: 88 | writer.writerow(entry) 89 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | # Minimum requirements for the build system to execute. 3 | requires = [ 4 | "setuptools", 5 | "wheel", 6 | "pip>=10.0.0", 7 | ] 8 | build-backend = "setuptools.build_meta:__legacy__" 9 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [aliases] 2 | test=pytest 3 | check=flake8 4 | 5 | [flake8] 6 | max-line-length=120 -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # Author: Carlos Vigo 3 | # Contact: carviher1990@gmail.com 4 | 5 | from setuptools import setup 6 | 7 | from os.path import join, dirname, abspath 8 | from sys import path as sys_path 9 | sys_path.append(abspath('pdf_importer')) 10 | import __project__ # noqa: E402 11 | 12 | 13 | # Read the README.md file 14 | with open(join(dirname(__file__), 'README.md'), "r") as fh: 15 | long_description = fh.read() 16 | 17 | setup( 18 | name=__project__.__package_name__, 19 | version=__project__.__version__, 20 | author=__project__.__short_author__, 21 | author_email=__project__.__email__, 22 | description=__project__.__description__, 23 |
long_description=long_description, 24 | long_description_content_type="text/markdown", 25 | url=__project__.__url__, 26 | packages=['pdf_importer'], 27 | classifiers=[ 28 | "Programming Language :: Python :: 3.6", 29 | "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", 30 | "Operating System :: POSIX :: Linux", 31 | ], 32 | license='GPLv3', 33 | keywords=[ 34 | 'PDF' 35 | ], 36 | python_requires='>=3.6', 37 | setup_requires=[ 38 | 'pip>=10.0', 39 | 'wheel', 40 | 'setuptools>=30', 41 | ], 42 | install_requires=[ 43 | 'camelot-py', 44 | 'python-dateutil', 45 | 'opencv-python', 46 | 'pandas' 47 | ], 48 | include_package_data=True 49 | ) 50 | --------------------------------------------------------------------------------
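The row handling in `pdf_importer/Importers.py` and the CSV writing in `pdf_importer/pdf_importer.py` can be tried without a PDF or camelot. The block below is a minimal sketch of that pattern, not the package's own code: `parse_amount`, `rows_to_entries`, `entries_to_csv`, the sample rows and the fixed `%d.%m.%Y` date format are all assumptions made for illustration (the package itself parses dates with python-dateutil). Stripping commas as well as apostrophes is also an assumption, based on the changelog entries about thousands separators.

```python
import csv
import io
from datetime import datetime


def parse_amount(text):
    """Convert a statement amount such as "1'234.50" to a float.

    Hypothetical helper: strips apostrophe and comma thousands separators.
    Raises ValueError for empty or non-numeric text, as float() does.
    """
    return float(text.replace("'", "").replace(",", "").strip())


def rows_to_entries(rows, date_format="%d.%m.%Y"):
    """Turn (date, description, amount) string rows into [date, amount, text].

    Rows that fail date or amount parsing (headers, totals, blank rows) are
    skipped, mirroring the try/except ValueError used by the importers.
    The date format is an assumption made for this sketch.
    """
    entries = []
    for raw_date, raw_text, raw_amount in rows:
        try:
            date = datetime.strptime(raw_date.strip(), date_format).date()
            amount = -parse_amount(raw_amount)  # charges become negative amounts
        except ValueError:
            continue
        text = raw_text.replace("\n", " ")  # keep each description on one line
        entries.append([date, amount, text])
    return entries


def entries_to_csv(entries):
    """Serialise entries with the csv.writer settings used by pdf_importer()."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=";", quotechar='"', quoting=csv.QUOTE_MINIMAL)
    writer.writerows(entries)
    return buffer.getvalue()


sample_rows = [
    ("Date", "Description", "Amount"),  # header row: skipped
    ("03.12.2021", "GROCERY\nSTORE", "1'234.50"),
    ("05.12.2021", "COFFEE; BAR", "4.80"),
]
entries = rows_to_entries(sample_rows)
csv_text = entries_to_csv(entries)
print(csv_text)
```

With `QUOTE_MINIMAL`, only fields that contain the `;` delimiter, the quote character or a line break are quoted, so ordinary descriptions stay unquoted in the output.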