├── .gitignore
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE.md
├── MANIFEST.in
├── README.md
├── pdf_importer
│   ├── Importers.py
│   ├── __init__.py
│   ├── __main__.py
│   ├── __project__.py
│   └── pdf_importer.py
├── pyproject.toml
├── setup.cfg
└── setup.py
/.gitignore:
--------------------------------------------------------------------------------
1 | /.idea/
2 | /pdf_importer.egg-info/
3 | /.eggs/
4 | /build/
5 | /dist/
6 | /docs/build/
7 | /docs/source/api/
8 | .vscode/
9 | __pycache__/
10 |
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | # Changelog
2 |
3 | All notable changes to this project will be documented in this file.
4 |
5 | The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6 | and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7 |
8 | ## [0.5] - 2021-12-26
9 |
10 | - Fix [issue #6](https://github.com/c-vigo/StatementPDFImporter/issues/6): statements were silently skipped if the
11 | amount contained an apostrophe
12 |
13 | - Fix [issue #4](https://github.com/c-vigo/StatementPDFImporter/issues/4): last transactions got omitted in long
14 | statements (over two pages)
15 |
16 | ## [0.4] - 2021-03-27
17 |
18 | - Fix bug: amounts above 1000 with comma-separated thousands
19 |
20 | ## [0.3] - 2021-02-24
21 |
22 | - Fix [issue #2](https://github.com/c-vigo/StatementPDFImporter/issues/2): newline in transaction description messing with CSV output
23 |
24 | ## [0.2] - 2021-02-07
25 |
26 | - Hotfix
27 |
28 | ## [0.1] - 2021-02-03
29 |
30 | - First version of the package
31 | - Supported statements:
32 |   - [Cembra & Cumulus](https://www.cembra.ch/en/cards/cembra-mastercard/) MasterCard
33 |   - [SwissCard Cashback](https://www.swisscard.ch/en/private-customers/products) (AMEX / VISA / MasterCard)
34 |
35 | [Unreleased]: https://github.com/c-vigo/StatementPDFImporter/compare/v0.5...HEAD
36 | [0.5]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.5
37 | [0.4]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.4
38 | [0.3]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.3
39 | [0.2]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.2
40 | [0.1]: https://github.com/c-vigo/StatementPDFImporter/tree/v0.1
41 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 |
3 | When contributing to this repository, please first discuss the change you wish to make via issue,
4 | email, or any other method with the owners of this repository before making a change.
5 |
6 | Please note that we have a code of conduct; please follow it in all your interactions with the project.
7 |
8 | ## Pull Request Process
9 |
10 | 1. Ensure any install or build dependencies are removed before the end of the layer when doing a
11 | build.
12 | 2. Update the README.md with details of changes to the interface; this includes new environment
13 | variables, exposed ports, useful file locations and container parameters.
14 | 3. Increase the version numbers in any example files and the README.md to the new version that this
15 | Pull Request would represent. The versioning scheme we use is [SemVer](http://semver.org/).
16 | 4. You may merge the Pull Request in once you have the sign-off of two other developers, or if you
17 | do not have permission to do that, you may request the second reviewer to merge it for you.
18 |
19 | ## Code of Conduct
20 |
21 | ### Our Pledge
22 |
23 | In the interest of fostering an open and welcoming environment, we as
24 | contributors and maintainers pledge to making participation in our project and
25 | our community a harassment-free experience for everyone, regardless of age, body
26 | size, disability, ethnicity, gender identity and expression, level of experience,
27 | nationality, personal appearance, race, religion, or sexual identity and
28 | orientation.
29 |
30 | ### Our Standards
31 |
32 | Examples of behavior that contributes to creating a positive environment
33 | include:
34 |
35 | * Using welcoming and inclusive language
36 | * Being respectful of differing viewpoints and experiences
37 | * Gracefully accepting constructive criticism
38 | * Focusing on what is best for the community
39 | * Showing empathy towards other community members
40 |
41 | Examples of unacceptable behavior by participants include:
42 |
43 | * The use of sexualized language or imagery and unwelcome sexual attention or
44 | advances
45 | * Trolling, insulting/derogatory comments, and personal or political attacks
46 | * Public or private harassment
47 | * Publishing others' private information, such as a physical or electronic
48 | address, without explicit permission
49 | * Other conduct which could reasonably be considered inappropriate in a
50 | professional setting
51 |
52 | ### Our Responsibilities
53 |
54 | Project maintainers are responsible for clarifying the standards of acceptable
55 | behavior and are expected to take appropriate and fair corrective action in
56 | response to any instances of unacceptable behavior.
57 |
58 | Project maintainers have the right and responsibility to remove, edit, or
59 | reject comments, commits, code, wiki edits, issues, and other contributions
60 | that are not aligned to this Code of Conduct, or to ban temporarily or
61 | permanently any contributor for other behaviors that they deem inappropriate,
62 | threatening, offensive, or harmful.
63 |
64 | ### Scope
65 |
66 | This Code of Conduct applies both within project spaces and in public spaces
67 | when an individual is representing the project or its community. Examples of
68 | representing a project or community include using an official project e-mail
69 | address, posting via an official social media account, or acting as an appointed
70 | representative at an online or offline event. Representation of a project may be
71 | further defined and clarified by project maintainers.
72 |
73 | ### Enforcement
74 |
75 | Instances of abusive, harassing, or otherwise unacceptable behavior may be
76 | reported by contacting the project team at [INSERT EMAIL ADDRESS]. All
77 | complaints will be reviewed and investigated and will result in a response that
78 | is deemed necessary and appropriate to the circumstances. The project team is
79 | obligated to maintain confidentiality with regard to the reporter of an incident.
80 | Further details of specific enforcement policies may be posted separately.
81 |
82 | Project maintainers who do not follow or enforce the Code of Conduct in good
83 | faith may face temporary or permanent repercussions as determined by other
84 | members of the project's leadership.
85 |
86 | ### Attribution
87 |
88 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
89 | available at [http://contributor-covenant.org/version/1/4][version]
90 |
91 | [homepage]: http://contributor-covenant.org
92 | [version]: http://contributor-covenant.org/version/1/4/
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | GNU General Public License
2 | ==========================
3 |
4 | [GNU GPL v3](https://www.gnu.org/licenses/gpl-3.0)
5 |
6 | _Version 3, 29 June 2007_
7 | _Copyright © 2007 Free Software Foundation, Inc. <https://fsf.org/>_
8 |
9 | Everyone is permitted to copy and distribute verbatim copies of this license
10 | document, but changing it is not allowed.
11 |
12 | ## Preamble
13 |
14 | The GNU General Public License is a free, copyleft license for software and other
15 | kinds of works.
16 |
17 | The licenses for most software and other practical works are designed to take away
18 | your freedom to share and change the works. By contrast, the GNU General Public
19 | License is intended to guarantee your freedom to share and change all versions of a
20 | program--to make sure it remains free software for all its users. We, the Free
21 | Software Foundation, use the GNU General Public License for most of our software; it
22 | applies also to any other work released this way by its authors. You can apply it to
23 | your programs, too.
24 |
25 | When we speak of free software, we are referring to freedom, not price. Our General
26 | Public Licenses are designed to make sure that you have the freedom to distribute
27 | copies of free software (and charge for them if you wish), that you receive source
28 | code or can get it if you want it, that you can change the software or use pieces of
29 | it in new free programs, and that you know you can do these things.
30 |
31 | To protect your rights, we need to prevent others from denying you these rights or
32 | asking you to surrender the rights. Therefore, you have certain responsibilities if
33 | you distribute copies of the software, or if you modify it: responsibilities to
34 | respect the freedom of others.
35 |
36 | For example, if you distribute copies of such a program, whether gratis or for a fee,
37 | you must pass on to the recipients the same freedoms that you received. You must make
38 | sure that they, too, receive or can get the source code. And you must show them these
39 | terms so they know their rights.
40 |
41 | Developers that use the GNU GPL protect your rights with two steps: **(1)** assert
42 | copyright on the software, and **(2)** offer you this License giving you legal permission
43 | to copy, distribute and/or modify it.
44 |
45 | For the developers' and authors' protection, the GPL clearly explains that there is
46 | no warranty for this free software. For both users' and authors' sake, the GPL
47 | requires that modified versions be marked as changed, so that their problems will not
48 | be attributed erroneously to authors of previous versions.
49 |
50 | Some devices are designed to deny users access to install or run modified versions of
51 | the software inside them, although the manufacturer can do so. This is fundamentally
52 | incompatible with the aim of protecting users' freedom to change the software. The
53 | systematic pattern of such abuse occurs in the area of products for individuals to
54 | use, which is precisely where it is most unacceptable. Therefore, we have designed
55 | this version of the GPL to prohibit the practice for those products. If such problems
56 | arise substantially in other domains, we stand ready to extend this provision to
57 | those domains in future versions of the GPL, as needed to protect the freedom of
58 | users.
59 |
60 | Finally, every program is threatened constantly by software patents. States should
61 | not allow patents to restrict development and use of software on general-purpose
62 | computers, but in those that do, we wish to avoid the special danger that patents
63 | applied to a free program could make it effectively proprietary. To prevent this, the
64 | GPL assures that patents cannot be used to render the program non-free.
65 |
66 | The precise terms and conditions for copying, distribution and modification follow.
67 |
68 | ## TERMS AND CONDITIONS
69 |
70 | ### 0. Definitions
71 |
72 | “This License” refers to version 3 of the GNU General Public License.
73 |
74 | “Copyright” also means copyright-like laws that apply to other kinds of
75 | works, such as semiconductor masks.
76 |
77 | “The Program” refers to any copyrightable work licensed under this
78 | License. Each licensee is addressed as “you”. “Licensees” and
79 | “recipients” may be individuals or organizations.
80 |
81 | To “modify” a work means to copy from or adapt all or part of the work in
82 | a fashion requiring copyright permission, other than the making of an exact copy. The
83 | resulting work is called a “modified version” of the earlier work or a
84 | work “based on” the earlier work.
85 |
86 | A “covered work” means either the unmodified Program or a work based on
87 | the Program.
88 |
89 | To “propagate” a work means to do anything with it that, without
90 | permission, would make you directly or secondarily liable for infringement under
91 | applicable copyright law, except executing it on a computer or modifying a private
92 | copy. Propagation includes copying, distribution (with or without modification),
93 | making available to the public, and in some countries other activities as well.
94 |
95 | To “convey” a work means any kind of propagation that enables other
96 | parties to make or receive copies. Mere interaction with a user through a computer
97 | network, with no transfer of a copy, is not conveying.
98 |
99 | An interactive user interface displays “Appropriate Legal Notices” to the
100 | extent that it includes a convenient and prominently visible feature that **(1)**
101 | displays an appropriate copyright notice, and **(2)** tells the user that there is no
102 | warranty for the work (except to the extent that warranties are provided), that
103 | licensees may convey the work under this License, and how to view a copy of this
104 | License. If the interface presents a list of user commands or options, such as a
105 | menu, a prominent item in the list meets this criterion.
106 |
107 | ### 1. Source Code
108 |
109 | The “source code” for a work means the preferred form of the work for
110 | making modifications to it. “Object code” means any non-source form of a
111 | work.
112 |
113 | A “Standard Interface” means an interface that either is an official
114 | standard defined by a recognized standards body, or, in the case of interfaces
115 | specified for a particular programming language, one that is widely used among
116 | developers working in that language.
117 |
118 | The “System Libraries” of an executable work include anything, other than
119 | the work as a whole, that **(a)** is included in the normal form of packaging a Major
120 | Component, but which is not part of that Major Component, and **(b)** serves only to
121 | enable use of the work with that Major Component, or to implement a Standard
122 | Interface for which an implementation is available to the public in source code form.
123 | A “Major Component”, in this context, means a major essential component
124 | (kernel, window system, and so on) of the specific operating system (if any) on which
125 | the executable work runs, or a compiler used to produce the work, or an object code
126 | interpreter used to run it.
127 |
128 | The “Corresponding Source” for a work in object code form means all the
129 | source code needed to generate, install, and (for an executable work) run the object
130 | code and to modify the work, including scripts to control those activities. However,
131 | it does not include the work's System Libraries, or general-purpose tools or
132 | generally available free programs which are used unmodified in performing those
133 | activities but which are not part of the work. For example, Corresponding Source
134 | includes interface definition files associated with source files for the work, and
135 | the source code for shared libraries and dynamically linked subprograms that the work
136 | is specifically designed to require, such as by intimate data communication or
137 | control flow between those subprograms and other parts of the work.
138 |
139 | The Corresponding Source need not include anything that users can regenerate
140 | automatically from other parts of the Corresponding Source.
141 |
142 | The Corresponding Source for a work in source code form is that same work.
143 |
144 | ### 2. Basic Permissions
145 |
146 | All rights granted under this License are granted for the term of copyright on the
147 | Program, and are irrevocable provided the stated conditions are met. This License
148 | explicitly affirms your unlimited permission to run the unmodified Program. The
149 | output from running a covered work is covered by this License only if the output,
150 | given its content, constitutes a covered work. This License acknowledges your rights
151 | of fair use or other equivalent, as provided by copyright law.
152 |
153 | You may make, run and propagate covered works that you do not convey, without
154 | conditions so long as your license otherwise remains in force. You may convey covered
155 | works to others for the sole purpose of having them make modifications exclusively
156 | for you, or provide you with facilities for running those works, provided that you
157 | comply with the terms of this License in conveying all material for which you do not
158 | control copyright. Those thus making or running the covered works for you must do so
159 | exclusively on your behalf, under your direction and control, on terms that prohibit
160 | them from making any copies of your copyrighted material outside their relationship
161 | with you.
162 |
163 | Conveying under any other circumstances is permitted solely under the conditions
164 | stated below. Sublicensing is not allowed; section 10 makes it unnecessary.
165 |
166 | ### 3. Protecting Users' Legal Rights From Anti-Circumvention Law
167 |
168 | No covered work shall be deemed part of an effective technological measure under any
169 | applicable law fulfilling obligations under article 11 of the WIPO copyright treaty
170 | adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention
171 | of such measures.
172 |
173 | When you convey a covered work, you waive any legal power to forbid circumvention of
174 | technological measures to the extent such circumvention is effected by exercising
175 | rights under this License with respect to the covered work, and you disclaim any
176 | intention to limit operation or modification of the work as a means of enforcing,
177 | against the work's users, your or third parties' legal rights to forbid circumvention
178 | of technological measures.
179 |
180 | ### 4. Conveying Verbatim Copies
181 |
182 | You may convey verbatim copies of the Program's source code as you receive it, in any
183 | medium, provided that you conspicuously and appropriately publish on each copy an
184 | appropriate copyright notice; keep intact all notices stating that this License and
185 | any non-permissive terms added in accord with section 7 apply to the code; keep
186 | intact all notices of the absence of any warranty; and give all recipients a copy of
187 | this License along with the Program.
188 |
189 | You may charge any price or no price for each copy that you convey, and you may offer
190 | support or warranty protection for a fee.
191 |
192 | ### 5. Conveying Modified Source Versions
193 |
194 | You may convey a work based on the Program, or the modifications to produce it from
195 | the Program, in the form of source code under the terms of section 4, provided that
196 | you also meet all of these conditions:
197 |
198 | * **a)** The work must carry prominent notices stating that you modified it, and giving a
199 | relevant date.
200 | * **b)** The work must carry prominent notices stating that it is released under this
201 | License and any conditions added under section 7. This requirement modifies the
202 | requirement in section 4 to “keep intact all notices”.
203 | * **c)** You must license the entire work, as a whole, under this License to anyone who
204 | comes into possession of a copy. This License will therefore apply, along with any
205 | applicable section 7 additional terms, to the whole of the work, and all its parts,
206 | regardless of how they are packaged. This License gives no permission to license the
207 | work in any other way, but it does not invalidate such permission if you have
208 | separately received it.
209 | * **d)** If the work has interactive user interfaces, each must display Appropriate Legal
210 | Notices; however, if the Program has interactive interfaces that do not display
211 | Appropriate Legal Notices, your work need not make them do so.
212 |
213 | A compilation of a covered work with other separate and independent works, which are
214 | not by their nature extensions of the covered work, and which are not combined with
215 | it such as to form a larger program, in or on a volume of a storage or distribution
216 | medium, is called an “aggregate” if the compilation and its resulting
217 | copyright are not used to limit the access or legal rights of the compilation's users
218 | beyond what the individual works permit. Inclusion of a covered work in an aggregate
219 | does not cause this License to apply to the other parts of the aggregate.
220 |
221 | ### 6. Conveying Non-Source Forms
222 |
223 | You may convey a covered work in object code form under the terms of sections 4 and
224 | 5, provided that you also convey the machine-readable Corresponding Source under the
225 | terms of this License, in one of these ways:
226 |
227 | * **a)** Convey the object code in, or embodied in, a physical product (including a
228 | physical distribution medium), accompanied by the Corresponding Source fixed on a
229 | durable physical medium customarily used for software interchange.
230 | * **b)** Convey the object code in, or embodied in, a physical product (including a
231 | physical distribution medium), accompanied by a written offer, valid for at least
232 | three years and valid for as long as you offer spare parts or customer support for
233 | that product model, to give anyone who possesses the object code either **(1)** a copy of
234 | the Corresponding Source for all the software in the product that is covered by this
235 | License, on a durable physical medium customarily used for software interchange, for
236 | a price no more than your reasonable cost of physically performing this conveying of
237 | source, or **(2)** access to copy the Corresponding Source from a network server at no
238 | charge.
239 | * **c)** Convey individual copies of the object code with a copy of the written offer to
240 | provide the Corresponding Source. This alternative is allowed only occasionally and
241 | noncommercially, and only if you received the object code with such an offer, in
242 | accord with subsection 6b.
243 | * **d)** Convey the object code by offering access from a designated place (gratis or for
244 | a charge), and offer equivalent access to the Corresponding Source in the same way
245 | through the same place at no further charge. You need not require recipients to copy
246 | the Corresponding Source along with the object code. If the place to copy the object
247 | code is a network server, the Corresponding Source may be on a different server
248 | (operated by you or a third party) that supports equivalent copying facilities,
249 | provided you maintain clear directions next to the object code saying where to find
250 | the Corresponding Source. Regardless of what server hosts the Corresponding Source,
251 | you remain obligated to ensure that it is available for as long as needed to satisfy
252 | these requirements.
253 | * **e)** Convey the object code using peer-to-peer transmission, provided you inform
254 | other peers where the object code and Corresponding Source of the work are being
255 | offered to the general public at no charge under subsection 6d.
256 |
257 | A separable portion of the object code, whose source code is excluded from the
258 | Corresponding Source as a System Library, need not be included in conveying the
259 | object code work.
260 |
261 | A “User Product” is either **(1)** a “consumer product”, which
262 | means any tangible personal property which is normally used for personal, family, or
263 | household purposes, or **(2)** anything designed or sold for incorporation into a
264 | dwelling. In determining whether a product is a consumer product, doubtful cases
265 | shall be resolved in favor of coverage. For a particular product received by a
266 | particular user, “normally used” refers to a typical or common use of
267 | that class of product, regardless of the status of the particular user or of the way
268 | in which the particular user actually uses, or expects or is expected to use, the
269 | product. A product is a consumer product regardless of whether the product has
270 | substantial commercial, industrial or non-consumer uses, unless such uses represent
271 | the only significant mode of use of the product.
272 |
273 | “Installation Information” for a User Product means any methods,
274 | procedures, authorization keys, or other information required to install and execute
275 | modified versions of a covered work in that User Product from a modified version of
276 | its Corresponding Source. The information must suffice to ensure that the continued
277 | functioning of the modified object code is in no case prevented or interfered with
278 | solely because modification has been made.
279 |
280 | If you convey an object code work under this section in, or with, or specifically for
281 | use in, a User Product, and the conveying occurs as part of a transaction in which
282 | the right of possession and use of the User Product is transferred to the recipient
283 | in perpetuity or for a fixed term (regardless of how the transaction is
284 | characterized), the Corresponding Source conveyed under this section must be
285 | accompanied by the Installation Information. But this requirement does not apply if
286 | neither you nor any third party retains the ability to install modified object code
287 | on the User Product (for example, the work has been installed in ROM).
288 |
289 | The requirement to provide Installation Information does not include a requirement to
290 | continue to provide support service, warranty, or updates for a work that has been
291 | modified or installed by the recipient, or for the User Product in which it has been
292 | modified or installed. Access to a network may be denied when the modification itself
293 | materially and adversely affects the operation of the network or violates the rules
294 | and protocols for communication across the network.
295 |
296 | Corresponding Source conveyed, and Installation Information provided, in accord with
297 | this section must be in a format that is publicly documented (and with an
298 | implementation available to the public in source code form), and must require no
299 | special password or key for unpacking, reading or copying.
300 |
301 | ### 7. Additional Terms
302 |
303 | “Additional permissions” are terms that supplement the terms of this
304 | License by making exceptions from one or more of its conditions. Additional
305 | permissions that are applicable to the entire Program shall be treated as though they
306 | were included in this License, to the extent that they are valid under applicable
307 | law. If additional permissions apply only to part of the Program, that part may be
308 | used separately under those permissions, but the entire Program remains governed by
309 | this License without regard to the additional permissions.
310 |
311 | When you convey a copy of a covered work, you may at your option remove any
312 | additional permissions from that copy, or from any part of it. (Additional
313 | permissions may be written to require their own removal in certain cases when you
314 | modify the work.) You may place additional permissions on material, added by you to a
315 | covered work, for which you have or can give appropriate copyright permission.
316 |
317 | Notwithstanding any other provision of this License, for material you add to a
318 | covered work, you may (if authorized by the copyright holders of that material)
319 | supplement the terms of this License with terms:
320 |
321 | * **a)** Disclaiming warranty or limiting liability differently from the terms of
322 | sections 15 and 16 of this License; or
323 | * **b)** Requiring preservation of specified reasonable legal notices or author
324 | attributions in that material or in the Appropriate Legal Notices displayed by works
325 | containing it; or
326 | * **c)** Prohibiting misrepresentation of the origin of that material, or requiring that
327 | modified versions of such material be marked in reasonable ways as different from the
328 | original version; or
329 | * **d)** Limiting the use for publicity purposes of names of licensors or authors of the
330 | material; or
331 | * **e)** Declining to grant rights under trademark law for use of some trade names,
332 | trademarks, or service marks; or
333 | * **f)** Requiring indemnification of licensors and authors of that material by anyone
334 | who conveys the material (or modified versions of it) with contractual assumptions of
335 | liability to the recipient, for any liability that these contractual assumptions
336 | directly impose on those licensors and authors.
337 |
338 | All other non-permissive additional terms are considered “further
339 | restrictions” within the meaning of section 10. If the Program as you received
340 | it, or any part of it, contains a notice stating that it is governed by this License
341 | along with a term that is a further restriction, you may remove that term. If a
342 | license document contains a further restriction but permits relicensing or conveying
343 | under this License, you may add to a covered work material governed by the terms of
344 | that license document, provided that the further restriction does not survive such
345 | relicensing or conveying.
346 |
347 | If you add terms to a covered work in accord with this section, you must place, in
348 | the relevant source files, a statement of the additional terms that apply to those
349 | files, or a notice indicating where to find the applicable terms.
350 |
351 | Additional terms, permissive or non-permissive, may be stated in the form of a
352 | separately written license, or stated as exceptions; the above requirements apply
353 | either way.
354 |
355 | ### 8. Termination
356 |
357 | You may not propagate or modify a covered work except as expressly provided under
358 | this License. Any attempt otherwise to propagate or modify it is void, and will
359 | automatically terminate your rights under this License (including any patent licenses
360 | granted under the third paragraph of section 11).
361 |
362 | However, if you cease all violation of this License, then your license from a
363 | particular copyright holder is reinstated **(a)** provisionally, unless and until the
364 | copyright holder explicitly and finally terminates your license, and **(b)** permanently,
365 | if the copyright holder fails to notify you of the violation by some reasonable means
366 | prior to 60 days after the cessation.
367 |
368 | Moreover, your license from a particular copyright holder is reinstated permanently
369 | if the copyright holder notifies you of the violation by some reasonable means, this
370 | is the first time you have received notice of violation of this License (for any
371 | work) from that copyright holder, and you cure the violation prior to 30 days after
372 | your receipt of the notice.
373 |
374 | Termination of your rights under this section does not terminate the licenses of
375 | parties who have received copies or rights from you under this License. If your
376 | rights have been terminated and not permanently reinstated, you do not qualify to
377 | receive new licenses for the same material under section 10.
378 |
379 | ### 9. Acceptance Not Required for Having Copies
380 |
381 | You are not required to accept this License in order to receive or run a copy of the
382 | Program. Ancillary propagation of a covered work occurring solely as a consequence of
383 | using peer-to-peer transmission to receive a copy likewise does not require
384 | acceptance. However, nothing other than this License grants you permission to
385 | propagate or modify any covered work. These actions infringe copyright if you do not
386 | accept this License. Therefore, by modifying or propagating a covered work, you
387 | indicate your acceptance of this License to do so.
388 |
389 | ### 10. Automatic Licensing of Downstream Recipients
390 |
391 | Each time you convey a covered work, the recipient automatically receives a license
392 | from the original licensors, to run, modify and propagate that work, subject to this
393 | License. You are not responsible for enforcing compliance by third parties with this
394 | License.
395 |
396 | An “entity transaction” is a transaction transferring control of an
397 | organization, or substantially all assets of one, or subdividing an organization, or
398 | merging organizations. If propagation of a covered work results from an entity
399 | transaction, each party to that transaction who receives a copy of the work also
400 | receives whatever licenses to the work the party's predecessor in interest had or
401 | could give under the previous paragraph, plus a right to possession of the
402 | Corresponding Source of the work from the predecessor in interest, if the predecessor
403 | has it or can get it with reasonable efforts.
404 |
405 | You may not impose any further restrictions on the exercise of the rights granted or
406 | affirmed under this License. For example, you may not impose a license fee, royalty,
407 | or other charge for exercise of rights granted under this License, and you may not
408 | initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging
409 | that any patent claim is infringed by making, using, selling, offering for sale, or
410 | importing the Program or any portion of it.
411 |
412 | ### 11. Patents
413 |
414 | A “contributor” is a copyright holder who authorizes use under this
415 | License of the Program or a work on which the Program is based. The work thus
416 | licensed is called the contributor's “contributor version”.
417 |
418 | A contributor's “essential patent claims” are all patent claims owned or
419 | controlled by the contributor, whether already acquired or hereafter acquired, that
420 | would be infringed by some manner, permitted by this License, of making, using, or
421 | selling its contributor version, but do not include claims that would be infringed
422 | only as a consequence of further modification of the contributor version. For
423 | purposes of this definition, “control” includes the right to grant patent
424 | sublicenses in a manner consistent with the requirements of this License.
425 |
426 | Each contributor grants you a non-exclusive, worldwide, royalty-free patent license
427 | under the contributor's essential patent claims, to make, use, sell, offer for sale,
428 | import and otherwise run, modify and propagate the contents of its contributor
429 | version.
430 |
431 | In the following three paragraphs, a “patent license” is any express
432 | agreement or commitment, however denominated, not to enforce a patent (such as an
433 | express permission to practice a patent or covenant not to sue for patent
434 | infringement). To “grant” such a patent license to a party means to make
435 | such an agreement or commitment not to enforce a patent against the party.
436 |
437 | If you convey a covered work, knowingly relying on a patent license, and the
438 | Corresponding Source of the work is not available for anyone to copy, free of charge
439 | and under the terms of this License, through a publicly available network server or
440 | other readily accessible means, then you must either **(1)** cause the Corresponding
441 | Source to be so available, or **(2)** arrange to deprive yourself of the benefit of the
442 | patent license for this particular work, or **(3)** arrange, in a manner consistent with
443 | the requirements of this License, to extend the patent license to downstream
444 | recipients. “Knowingly relying” means you have actual knowledge that, but
445 | for the patent license, your conveying the covered work in a country, or your
446 | recipient's use of the covered work in a country, would infringe one or more
447 | identifiable patents in that country that you have reason to believe are valid.
448 |
449 | If, pursuant to or in connection with a single transaction or arrangement, you
450 | convey, or propagate by procuring conveyance of, a covered work, and grant a patent
451 | license to some of the parties receiving the covered work authorizing them to use,
452 | propagate, modify or convey a specific copy of the covered work, then the patent
453 | license you grant is automatically extended to all recipients of the covered work and
454 | works based on it.
455 |
456 | A patent license is “discriminatory” if it does not include within the
457 | scope of its coverage, prohibits the exercise of, or is conditioned on the
458 | non-exercise of one or more of the rights that are specifically granted under this
459 | License. You may not convey a covered work if you are a party to an arrangement with
460 | a third party that is in the business of distributing software, under which you make
461 | payment to the third party based on the extent of your activity of conveying the
462 | work, and under which the third party grants, to any of the parties who would receive
463 | the covered work from you, a discriminatory patent license **(a)** in connection with
464 | copies of the covered work conveyed by you (or copies made from those copies), or **(b)**
465 | primarily for and in connection with specific products or compilations that contain
466 | the covered work, unless you entered into that arrangement, or that patent license
467 | was granted, prior to 28 March 2007.
468 |
469 | Nothing in this License shall be construed as excluding or limiting any implied
470 | license or other defenses to infringement that may otherwise be available to you
471 | under applicable patent law.
472 |
473 | ### 12. No Surrender of Others' Freedom
474 |
475 | If conditions are imposed on you (whether by court order, agreement or otherwise)
476 | that contradict the conditions of this License, they do not excuse you from the
477 | conditions of this License. If you cannot convey a covered work so as to satisfy
478 | simultaneously your obligations under this License and any other pertinent
479 | obligations, then as a consequence you may not convey it at all. For example, if you
480 | agree to terms that obligate you to collect a royalty for further conveying from
481 | those to whom you convey the Program, the only way you could satisfy both those terms
482 | and this License would be to refrain entirely from conveying the Program.
483 |
484 | ### 13. Use with the GNU Affero General Public License
485 |
486 | Notwithstanding any other provision of this License, you have permission to link or
487 | combine any covered work with a work licensed under version 3 of the GNU Affero
488 | General Public License into a single combined work, and to convey the resulting work.
489 | The terms of this License will continue to apply to the part which is the covered
490 | work, but the special requirements of the GNU Affero General Public License, section
491 | 13, concerning interaction through a network will apply to the combination as such.
492 |
493 | ### 14. Revised Versions of this License
494 |
495 | The Free Software Foundation may publish revised and/or new versions of the GNU
496 | General Public License from time to time. Such new versions will be similar in spirit
497 | to the present version, but may differ in detail to address new problems or concerns.
498 |
499 | Each version is given a distinguishing version number. If the Program specifies that
500 | a certain numbered version of the GNU General Public License “or any later
501 | version” applies to it, you have the option of following the terms and
502 | conditions either of that numbered version or of any later version published by the
503 | Free Software Foundation. If the Program does not specify a version number of the GNU
504 | General Public License, you may choose any version ever published by the Free
505 | Software Foundation.
506 |
507 | If the Program specifies that a proxy can decide which future versions of the GNU
508 | General Public License can be used, that proxy's public statement of acceptance of a
509 | version permanently authorizes you to choose that version for the Program.
510 |
511 | Later license versions may give you additional or different permissions. However, no
512 | additional obligations are imposed on any author or copyright holder as a result of
513 | your choosing to follow a later version.
514 |
515 | ### 15. Disclaimer of Warranty
516 |
517 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
518 | EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
519 | PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER
520 | EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
521 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE
522 | QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE
523 | DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
524 |
525 | ### 16. Limitation of Liability
526 |
527 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY
528 | COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS
529 | PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL,
530 | INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
531 | PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE
532 | OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
533 | WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
534 | POSSIBILITY OF SUCH DAMAGES.
535 |
536 | ### 17. Interpretation of Sections 15 and 16
537 |
538 | If the disclaimer of warranty and limitation of liability provided above cannot be
539 | given local legal effect according to their terms, reviewing courts shall apply local
540 | law that most closely approximates an absolute waiver of all civil liability in
541 | connection with the Program, unless a warranty or assumption of liability accompanies
542 | a copy of the Program in return for a fee.
543 |
544 | _END OF TERMS AND CONDITIONS_
545 |
546 | ## How to Apply These Terms to Your New Programs
547 |
548 | If you develop a new program, and you want it to be of the greatest possible use to
549 | the public, the best way to achieve this is to make it free software which everyone
550 | can redistribute and change under these terms.
551 |
552 | To do so, attach the following notices to the program. It is safest to attach them
553 | to the start of each source file to most effectively state the exclusion of warranty;
554 | and each file should have at least the “copyright” line and a pointer to
555 | where the full notice is found.
556 |
557 |
558 | Copyright (C)
559 |
560 | This program is free software: you can redistribute it and/or modify
561 | it under the terms of the GNU General Public License as published by
562 | the Free Software Foundation, either version 3 of the License, or
563 | (at your option) any later version.
564 |
565 | This program is distributed in the hope that it will be useful,
566 | but WITHOUT ANY WARRANTY; without even the implied warranty of
567 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
568 | GNU General Public License for more details.
569 |
570 | You should have received a copy of the GNU General Public License
571 | along with this program. If not, see .
572 |
573 | Also add information on how to contact you by electronic and paper mail.
574 |
575 | If the program does terminal interaction, make it output a short notice like this
576 | when it starts in an interactive mode:
577 |
578 |     <program> Copyright (C) <year> <name of author>
579 |     This program comes with ABSOLUTELY NO WARRANTY; for details type 'show w'.
580 |     This is free software, and you are welcome to redistribute it
581 |     under certain conditions; type 'show c' for details.
582 |
583 | The hypothetical commands `show w` and `show c` should show the appropriate parts of
584 | the General Public License. Of course, your program's commands might be different;
585 | for a GUI interface, you would use an “about box”.
586 |
587 | You should also get your employer (if you work as a programmer) or school, if any, to
588 | sign a “copyright disclaimer” for the program, if necessary. For more
589 | information on this, and how to apply and follow the GNU GPL, see
590 | <https://www.gnu.org/licenses/>.
591 |
592 | The GNU General Public License does not permit incorporating your program into
593 | proprietary programs. If your program is a subroutine library, you may consider it
594 | more useful to permit linking proprietary applications with the library. If this is
595 | what you want to do, use the GNU Lesser General Public License instead of this
596 | License. But first, please read
597 | <https://www.gnu.org/licenses/why-not-lgpl.html>.
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include README.md
2 | include CHANGELOG.md
3 | include LICENSE.md
4 | include CONTRIBUTING.md
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # README
2 |
3 | [PyPI version](https://badge.fury.io/py/pdf-importer)
4 | [License: GPL v3](https://www.gnu.org/licenses/gpl-3.0)
5 |
6 | ## Statement PDF Importer
7 |
8 | **pdf-importer** is a PDF parser for credit card statements.
9 | It accepts statements from the following issuers:
10 |
11 | - [Cembra & Cumulus](https://www.cembra.ch/en/cards/cembra-mastercard/) MasterCard
12 | - [SwissCard Cashback](https://www.swisscard.ch/en/private-customers/products) (AMEX / VISA / MasterCard)
13 |
14 | The data can be saved to a CSV file compatible with the import feature of [Wallet by budgetbakers](https://budgetbakers.com/).
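
As an illustration of that export step, the sketch below writes a few made-up entries as semicolon-separated rows using only the standard library. The `[date, amount, text]` layout and the `;` delimiter follow the package source; the helper name `entries_to_csv` and the sample transactions are hypothetical.

```python
import csv
import io
from datetime import date


def entries_to_csv(entries, stream):
    """Write [date, amount, text] entries as semicolon-separated rows (hypothetical helper)."""
    writer = csv.writer(
        stream,
        delimiter=';',
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,
    )
    for entry in entries:
        writer.writerow(entry)


# Made-up sample data: charges are negative, payments are positive
sample_entries = [
    [date(2021, 12, 1), -12.5, 'COFFEE SHOP'],
    [date(2021, 12, 3), 100.0, 'YOUR PAYMENT'],
]

buffer = io.StringIO()
entries_to_csv(sample_entries, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch independent of the file system; passing an open file object instead would produce the same rows on disk.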
15 |
16 | ## Dependencies
17 |
18 | - [Python 3.6](https://www.python.org/downloads/release/python-360/) and [pip 10.0](https://pip.pypa.io/en/stable/).
19 | - [camelot-py](https://camelot-py.readthedocs.io/en/master/) and
20 | [opencv-python](https://github.com/opencv/opencv-python) for PDF parsing.
21 | - [python-dateutil](https://dateutil.readthedocs.io/en/stable/) for date format management.
22 | - [pandas](https://pandas.pydata.org/) for CSV export.
23 |
24 | ## Installation
25 |
26 | You can install the package by cloning the [GitHub repository](https://github.com/c-vigo/StatementPDFImporter) or directly
27 | using [pip](https://pip.pypa.io/en/stable/):
28 |
29 | ```
30 | python -m pip install pdf-importer
31 | ```
32 |
33 | ## Usage
34 |
35 | To parse a PDF statement, run:
36 |
37 | ```
38 | python -m pdf_importer [filename] [type] [-o csv_file]
39 | ```
40 | where
41 |
42 | - *filename* is the full path to the PDF file
43 | - *type* is either *cembra* or *cashback*
44 | - *csv_file* (optional) is the full path to the CSV file where the data is saved; without it, the entries are only printed to the console.
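
The command line above maps naturally onto an `argparse` parser. The following is a minimal, self-contained sketch, not the package's actual code: the function name `build_parser` and the long option `--output` are assumptions made for illustration; only the two positional arguments, the two statement types, and `-o` come from the usage line.

```python
from argparse import ArgumentParser


def build_parser():
    """Build a parser mirroring the usage line (hypothetical helper)."""
    parser = ArgumentParser(prog='pdf_importer')
    parser.add_argument('filename', help='PDF file to parse')
    parser.add_argument('type', choices=['cembra', 'cashback'], help='statement type')
    # Assumed spelling of the optional output flag; defaults to None (no CSV written)
    parser.add_argument('-o', '--output', default=None, help='output CSV file')
    return parser


# Parse an example argument list instead of sys.argv
args = build_parser().parse_args(['statement.pdf', 'cembra', '-o', 'statement.csv'])
print(args.filename, args.type, args.output)
```

Restricting *type* with `choices` makes the parser reject unknown statement types before any PDF is opened.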
45 |
46 | ## Authors
47 |
48 | * [**Carlos Vigo**](mailto:carviher1990@gmail.com?subject=[GitHub%-%pdf-importer]) - *Initial work* -
49 | [GitHub](https://github.com/c-vigo)
50 |
51 | ## Contributing
52 |
53 | Please read our [contributing policy](CONTRIBUTING.md) for details on our code of
54 | conduct, and the process for submitting pull requests to us.
55 |
56 | ## Versioning
57 |
58 | We use [Git](https://git-scm.com/) for versioning. For the versions available, see the
59 | [tags on this repository](https://github.com/c-vigo/StatementPDFImporter/tags).
60 |
61 | ## License
62 |
63 | This project is licensed under the [GNU GPLv3 License](LICENSE.md).
64 |
65 | ## Built With
66 |
67 | * [PyCharm Professional 2020](https://www.jetbrains.com/pycharm/) - The IDE used
68 |
--------------------------------------------------------------------------------
/pdf_importer/Importers.py:
--------------------------------------------------------------------------------
1 | """ Collection of importers.
2 |
3 | Each importer returns a list of ``[date, amount, text]`` entries.
4 | """
5 |
6 | # Third party
7 | import camelot
8 | from dateutil.parser import parse
9 | from pandas import concat
10 |
11 |
12 | def extract_cembra(filename):
13 |     """Extract the transactions of a Cembra statement (tables from page 2 onwards)."""
14 |     entries = []
15 |
16 |     tables = camelot.read_pdf(filename, pages='2-end')
17 |
18 |     for pdf_table in tables:
19 |         for _, row in pdf_table.df.iterrows():
20 |             try:
21 |                 date = parse(row[1].strip(), dayfirst=True).date()
22 |                 # Parsed only as a check: rows whose first column is not a date are skipped
23 |                 _ = parse(row[0].strip(), dayfirst=True).date()
24 |                 text = row[2]
25 |                 # Remove apostrophes from the amounts, e.g. 1'234.50
26 |                 credit = row[3].replace('\'', '')
27 |                 debit = row[4].replace('\'', '')
28 |                 # Debits are stored as negative amounts, credits as positive ones
29 |                 amount = -float(debit) if debit else float(credit)
30 |                 entries.append([date, amount, text])
31 |             except ValueError:
32 |                 # Not a transaction row (no valid dates or amount)
33 |                 pass
34 |
35 |     return entries
36 |
37 |
38 | def extract_cashback(filename):
39 |     """Extract the transactions of a SwissCard Cashback statement."""
40 |     entries = []
41 |
42 |     # The first page uses a different table area than the following pages
43 |     # noinspection PyUnresolvedReferences
44 |     table1 = camelot.read_pdf(
45 |         filename,
46 |         pages='1',
47 |         flavor='stream',
48 |         table_areas=['50,320,560,50'],
49 |         columns=['120,530']
50 |     )
51 |     table2 = camelot.read_pdf(
52 |         filename,
53 |         pages='2-end',
54 |         flavor='stream',
55 |         table_areas=['50,800,560,50'],
56 |         columns=['120,530']
57 |     )
58 |
59 |     # Only the first table returned by each call is used
60 |     df = concat([table1[0].df, table2[0].df])
61 |
62 |     for _, row in df.iterrows():
63 |         try:
64 |             date = parse(row[0].strip(), dayfirst=True).date()
65 |             # Newlines in the description would break the CSV output
66 |             text = row[1].replace("\n", " ")
67 |             # Charges are stored as negative amounts
68 |             amount = -float(row[2].replace("'", ""))
69 |
70 |             # Payments to the card are credits, so the sign is flipped back
71 |             if text == "YOUR PAYMENT – THANK YOU":
72 |                 amount = -amount
73 |
74 |             entries.append([date, amount, text])
75 |         except ValueError:
76 |             pass
77 |
78 |     return entries
79 |
--------------------------------------------------------------------------------
/pdf_importer/__init__.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | # Author: Carlos Vigo
3 | # Contact: carviher1990@gmail.com
4 |
5 | """ PDF parser for credit card statements.
6 | It accepts statements from the following issuers:
7 | - Cembra (Cumulus MasterCard)
8 | - SwissCard Cashback (AMEX / VISA / MasterCard)
9 | The data can be saved to a CSV file compatible with Wallet import feature.
10 | """
11 |
12 | # Local imports
13 | from . import pdf_importer
14 | from .__project__ import (
15 |     __author__,
16 |     __copyright__,
17 |     __short_version__,
18 |     __version__,
19 |     __project_name__,
20 | )
21 |
22 | # __all__ must list the exported names as strings, not their values
23 | __all__ = [
24 |     '__author__',
25 |     '__copyright__',
26 |     '__short_version__',
27 |     '__version__',
28 |     '__project_name__',
29 |     'pdf_importer',
30 | ]
31 |
--------------------------------------------------------------------------------
/pdf_importer/__main__.py:
--------------------------------------------------------------------------------
1 | from .pdf_importer import pdf_importer
2 |
3 | if __name__ == "__main__":
4 |     pdf_importer()
5 |
--------------------------------------------------------------------------------
/pdf_importer/__project__.py:
--------------------------------------------------------------------------------
1 | __author__ = 'Carlos Vigo '
2 | __email__ = ''
3 | __short_author__ = 'Carlos Vigo'
4 | __copyright__ = '2021, Carlos Vigo'
5 | __package_name__ = 'pdf-importer'
6 | __module_name__ = 'pdf_importer.py'
7 | __project_name__ = 'Statement PDF Importer'
8 | __url__ = 'https://github.com/c-vigo/StatementPDFImporter'
9 | __documentation__ = 'https://github.com/c-vigo/StatementPDFImporter'
10 | __version__ = '0.5'
11 | __short_version__ = '0.5'
12 | __description__ = 'A PDF importer to generate CSV files from bank statements'
13 |
14 |
--------------------------------------------------------------------------------
/pdf_importer/pdf_importer.py:
--------------------------------------------------------------------------------
1 | """ PDF parser for credit card statements.
2 | It accepts statements from the following issuers:
3 | - Cembra (Cumulus MasterCard)
4 | - SwissCard Cashback (AMEX / VISA / MasterCard)
5 | The data can be saved to a CSV file compatible with Wallet import feature.
6 | """
7 |
8 | # Imports
9 | import csv
10 | from argparse import ArgumentParser
11 |
12 | # Local packages
13 | from .__project__ import (
14 |     __documentation__ as docs_url,
15 |     __module_name__ as module,
16 |     __description__ as prog_desc,
17 | )
18 | from .Importers import extract_cembra, extract_cashback
19 |
20 |
21 | def pdf_importer():
22 |     """The main routine. It parses the input arguments and acts accordingly."""
23 |
24 |     # The argument parser
25 |     ap = ArgumentParser(
26 |         prog=module,
27 |         description=prog_desc,
28 |         add_help=True,
29 |         epilog='Check out the package documentation for more information:\n{}'.format(docs_url)
30 |     )
31 |
32 |     # Positional arguments:
33 |     # 1. File name
34 |     ap.add_argument(
35 |         'filename',
36 |         help='PDF file to parse',
37 |         type=str,
38 |     )
39 |
40 |     # 2. Statement type
41 |     ap.add_argument(
42 |         'type',
43 |         help='statement type',
44 |         type=str,
45 |         choices=['cembra', 'cashback']
46 |     )
47 |
48 |     # Optional argument: output file (short flag -o, long flag --output)
49 |     ap.add_argument(
50 |         '-o',
51 |         '--output',
52 |         dest='output',
53 |         help='output CSV file',
54 |         type=str,
55 |         default=None
56 |     )
57 |
58 |     # Parse the arguments
59 |     args = ap.parse_args()
60 |
61 |     # Extract data from PDF
62 |     if args.type == 'cembra':
63 |         print('Processing file {} as a Cembra PDF statement.'.format(args.filename))
64 |         entries = extract_cembra(args.filename)
65 |     elif args.type == 'cashback':
66 |         print('Processing file {} as a Cashback PDF statement.'.format(args.filename))
67 |         entries = extract_cashback(args.filename)
68 |     else:
69 |         raise RuntimeError('Invalid statement type {}'.format(args.type))
70 |
71 |     # Print to console
72 |     total_value = 0.
73 |     for entry in entries:
74 |         print('{} {:+7.2f} {}'.format(entry[0], entry[1], entry[2]))
75 |         total_value += entry[1]
76 |     print('\nTotal: {:+7.2f}'.format(total_value))
77 |
78 |     # Save to CSV file
79 |     if args.output is not None:
80 |         print('Saving data to {}.'.format(args.output))
81 |         # newline='' lets the csv module control the line endings
82 |         with open(args.output, mode='w', newline='') as f:
83 |             writer = csv.writer(
84 |                 f,
85 |                 delimiter=';',
86 |                 quotechar='"',
87 |                 quoting=csv.QUOTE_MINIMAL
88 |             )
89 |             for entry in entries:
90 |                 writer.writerow(entry)
91 |
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [build-system]
2 | # Minimum requirements for the build system to execute.
3 | requires = [
4 |     "setuptools",
5 |     "wheel",
6 |     "pip>=10.0.0",
7 | ]
8 | build-backend = "setuptools.build_meta:__legacy__"
9 |
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [aliases]
2 | test=pytest
3 | check=flake8
4 |
5 | [flake8]
6 | max-line-length=120
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | # Author: Carlos Vigo
3 | # Contact: carviher1990@gmail.com
4 |
5 | from setuptools import setup
6 |
7 | from os.path import join, dirname, abspath
8 | from sys import path as sys_path
9 | sys_path.append(abspath('pdf_importer'))
10 | import __project__ # noqa: E402
11 |
12 |
13 | # Read the README.md file
14 | with open(join(dirname(__file__), 'README.md'), "r") as fh:
15 |     long_description = fh.read()
16 |
17 | setup(
18 |     name=__project__.__package_name__,
19 |     version=__project__.__version__,
20 |     author=__project__.__short_author__,
21 |     author_email=__project__.__email__,
22 |     description=__project__.__description__,
23 |     long_description=long_description,
24 |     long_description_content_type="text/markdown",
25 |     url=__project__.__url__,
26 |     packages=['pdf_importer'],
27 |     classifiers=[
28 |         "Programming Language :: Python :: 3.6",
29 |         "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
30 |         "Operating System :: POSIX :: Linux",
31 |     ],
32 |     license='GPLv3',
33 |     keywords=[
34 |         'PDF'
35 |     ],
36 |     python_requires='>=3.6',
37 |     setup_requires=[
38 |         'pip>=10.0',
39 |         'wheel',
40 |         'setuptools>=30',
41 |     ],
42 |     install_requires=[
43 |         'camelot-py',
44 |         'python-dateutil',
45 |         'opencv-python',
46 |         'pandas'
47 |     ],
48 |     include_package_data=True
49 | )
50 |
--------------------------------------------------------------------------------