├── tests ├── __init__.py └── test_controller.py ├── docs ├── .ruby-version ├── .ruby-gemset ├── testingdayslides │ ├── css │ │ └── highlight │ │ │ └── zenburn.css │ ├── favicon.ico │ ├── dist │ │ ├── theme │ │ │ └── fonts │ │ │ │ ├── league-gothic │ │ │ │ ├── LICENSE │ │ │ │ ├── league-gothic.eot │ │ │ │ ├── league-gothic.ttf │ │ │ │ ├── league-gothic.woff │ │ │ │ └── league-gothic.css │ │ │ │ └── source-sans-pro │ │ │ │ ├── source-sans-pro-italic.eot │ │ │ │ ├── source-sans-pro-italic.ttf │ │ │ │ ├── source-sans-pro-italic.woff │ │ │ │ ├── source-sans-pro-regular.eot │ │ │ │ ├── source-sans-pro-regular.ttf │ │ │ │ ├── source-sans-pro-regular.woff │ │ │ │ ├── source-sans-pro-semibold.eot │ │ │ │ ├── source-sans-pro-semibold.ttf │ │ │ │ ├── source-sans-pro-semibold.woff │ │ │ │ ├── source-sans-pro-semibolditalic.eot │ │ │ │ ├── source-sans-pro-semibolditalic.ttf │ │ │ │ ├── source-sans-pro-semibolditalic.woff │ │ │ │ ├── source-sans-pro.css │ │ │ │ └── LICENSE │ │ └── reset.css │ └── plugin │ │ ├── math │ │ ├── plugin.js │ │ ├── mathjax2.js │ │ ├── mathjax3.js │ │ └── katex.js │ │ ├── highlight │ │ ├── monokai.css │ │ └── zenburn.css │ │ └── zoom │ │ ├── zoom.esm.js │ │ └── zoom.js ├── .gitignore ├── favicon.ico ├── img │ ├── docker1.png │ ├── docker2.png │ ├── docker3.png │ ├── mailbagit-gui.png │ ├── smart_screen.png │ ├── mailbagit-demo1.png │ ├── mailbagit-demo2.png │ ├── release_assets.png │ ├── windows_install.png │ ├── allow_executable.png │ ├── mailbagit-guided.png │ ├── mailbag-colab-notebook.png │ └── mailbag-colab-notebook2.png ├── _sass │ └── custom │ │ └── custom.scss ├── diagrams │ └── mailbagOverview.png ├── persona_images │ ├── andreaLee.jpg │ ├── IsaacHoffman.jpg │ ├── TeresaBurns.jpg │ ├── aaronSantos.jpg │ ├── emilyCooper.jpg │ ├── issue_number.png │ ├── garyRichardson.jpg │ └── nicholasGarza.jpg ├── spec │ └── Mailbag_Specification_prerelease.pdf ├── _data │ ├── wont_have.csv │ ├── could_have.csv │ ├── must_have.csv │ ├── should_have.csv │ └── 
user_stories.csv ├── 404.html ├── design_documents │ ├── user_stories.md │ └── requirements.md ├── install.md ├── trying.md ├── use.md ├── about.md ├── events.md ├── using_mailbagit │ ├── logging.md │ ├── plugins.md │ ├── lossiness.md │ ├── concerns.md │ ├── mailbagit-gui.md │ ├── warcs.md │ ├── errors.md │ └── mailbagit.md ├── Gemfile ├── design_documents.md ├── installing_mailbagit │ ├── exe.md │ ├── pdf.md │ ├── docker.md │ └── python.md ├── _config.yml ├── contributors.md ├── news.md ├── Gemfile.lock ├── cfp.md └── index.md ├── mailbagit ├── helper │ ├── __init__.py │ ├── common.py │ └── controller.py ├── globals.py ├── formats │ └── example.py ├── derivatives │ ├── example.py │ ├── txt.py │ ├── html.py │ ├── pdf_chrome.py │ └── pdf.py ├── email_account.py ├── derivative.py └── loggerx.py ├── data ├── mbox-sample1 │ ├── 1 │ │ ├── Message_Path.txt │ │ ├── HTML_Encoding.txt │ │ ├── Mailbag_Message_ID.txt │ │ ├── Text_Encoding.txt │ │ ├── Derivatives_Path.txt │ │ ├── Original_File.txt │ │ ├── Content_Type.txt │ │ ├── Subject.txt │ │ ├── Date.txt │ │ ├── From.txt │ │ ├── To.txt │ │ ├── Message_ID.txt │ │ ├── Headers.pickle │ │ ├── Message.pickle │ │ └── Text_Body.txt │ └── 2 │ │ ├── Message_Path.txt │ │ ├── HTML_Encoding.txt │ │ ├── Mailbag_Message_ID.txt │ │ ├── Text_Encoding.txt │ │ ├── Derivatives_Path.txt │ │ ├── Original_File.txt │ │ ├── Content_Type.txt │ │ ├── Date.txt │ │ ├── To.txt │ │ ├── From.txt │ │ ├── Subject.txt │ │ ├── Message_ID.txt │ │ ├── Headers.pickle │ │ ├── Message.pickle │ │ └── Text_Body.txt ├── eml-2016-06-23_144430_6e449c77fe │ └── 1 │ │ ├── Message_Path.txt │ │ ├── Derivatives_Path.txt │ │ ├── HTML_Encoding.txt │ │ ├── Mailbag_Message_ID.txt │ │ ├── Text_Encoding.txt │ │ ├── Content_Type.txt │ │ ├── Date.txt │ │ ├── Subject.txt │ │ ├── To.txt │ │ ├── From.txt │ │ ├── Original_File.txt │ │ ├── HTML_Body.txt │ │ ├── Message_ID.txt │ │ ├── Headers.pickle │ │ ├── Message.pickle │ │ └── Text_Body.txt ├── msg-Digitization Archiving Solutions │ 
└── 1 │ │ ├── Message_Path.txt │ │ ├── Derivatives_Path.txt │ │ ├── HTML_Encoding.txt │ │ ├── Mailbag_Message_ID.txt │ │ ├── Text_Encoding.txt │ │ ├── Attachments │ │ ├── 1 │ │ │ ├── Name.txt │ │ │ ├── MimeType.txt │ │ │ ├── WrittenName.txt │ │ │ ├── Content_ID.txt │ │ │ └── File.bin │ │ ├── 2 │ │ │ ├── MimeType.txt │ │ │ ├── Name.txt │ │ │ ├── WrittenName.txt │ │ │ ├── Content_ID.txt │ │ │ └── File.bin │ │ └── 3 │ │ │ ├── MimeType.txt │ │ │ ├── Content_ID.txt │ │ │ ├── Name.txt │ │ │ ├── WrittenName.txt │ │ │ └── File.bin │ │ ├── Content_Type.txt │ │ ├── Date.txt │ │ ├── From.txt │ │ ├── To.txt │ │ ├── Original_File.txt │ │ ├── Subject.txt │ │ ├── Message_ID.txt │ │ ├── Headers.pickle │ │ └── Text_Body.txt ├── pst-outlook2019_MSO_16.0.10377.20023_64-bit │ ├── 1 │ │ ├── Mailbag_Message_ID.txt │ │ ├── To.txt │ │ ├── HTML_Encoding.txt │ │ ├── Text_Encoding.txt │ │ ├── Content_Type.txt │ │ ├── From.txt │ │ ├── Date.txt │ │ ├── Subject.txt │ │ ├── Message_ID.txt │ │ ├── Original_File.txt │ │ ├── Message_Path.txt │ │ ├── Derivatives_Path.txt │ │ └── Headers.pickle │ └── 2 │ │ ├── HTML_Encoding.txt │ │ ├── Text_Encoding.txt │ │ ├── Mailbag_Message_ID.txt │ │ ├── To.txt │ │ ├── Content_Type.txt │ │ ├── From.txt │ │ ├── Subject.txt │ │ ├── Date.txt │ │ ├── Message_ID.txt │ │ ├── Original_File.txt │ │ ├── Message_Path.txt │ │ ├── Derivatives_Path.txt │ │ └── Headers.pickle ├── Digitization Archiving Solutions.msg ├── outlook2019_MSO_16.0.10377.20023_64-bit.pst └── 2016-06-23_144430_6e449c77fe.eml ├── pyproject.toml ├── .dockerignore ├── mailbagit.py ├── mailbagit-gui.py ├── mailbagit-guided.py ├── .gitmodules ├── .gitattributes ├── docker-compose.yml ├── docker-compose-dev.yml ├── .github ├── ISSUE_TEMPLATE │ ├── mailbag-component.md │ ├── feature_request.md │ └── bug_report.md ├── pull_request_template.md └── workflows │ └── build.yml ├── Dockerfile.production ├── Dockerfile ├── dumpData.py ├── LICENSE ├── release.md ├── setup.py ├── branching_workflow.md ├── .gitignore 
└── README.md /tests/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /docs/.ruby-version: -------------------------------------------------------------------------------- 1 | 2.6.5 2 | -------------------------------------------------------------------------------- /mailbagit/helper/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /docs/.ruby-gemset: -------------------------------------------------------------------------------- 1 | mailbag 2 | -------------------------------------------------------------------------------- /data/mbox-sample1/1/Message_Path.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/mbox-sample1/2/Message_Path.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/mbox-sample1/1/HTML_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-8 -------------------------------------------------------------------------------- /data/mbox-sample1/1/Mailbag_Message_ID.txt: -------------------------------------------------------------------------------- 1 | 1 -------------------------------------------------------------------------------- /data/mbox-sample1/1/Text_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-8 -------------------------------------------------------------------------------- /data/mbox-sample1/2/HTML_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-8 
-------------------------------------------------------------------------------- /data/mbox-sample1/2/Mailbag_Message_ID.txt: -------------------------------------------------------------------------------- 1 | 2 -------------------------------------------------------------------------------- /data/mbox-sample1/2/Text_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-8 -------------------------------------------------------------------------------- /data/mbox-sample1/1/Derivatives_Path.txt: -------------------------------------------------------------------------------- 1 | sample1 -------------------------------------------------------------------------------- /data/mbox-sample1/2/Derivatives_Path.txt: -------------------------------------------------------------------------------- 1 | sample1 -------------------------------------------------------------------------------- /docs/testingdayslides/css/highlight/zenburn.css: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [tool.black] 2 | line-length = 140 -------------------------------------------------------------------------------- /data/mbox-sample1/1/Original_File.txt: -------------------------------------------------------------------------------- 1 | sample1.mbox -------------------------------------------------------------------------------- /data/mbox-sample1/2/Original_File.txt: -------------------------------------------------------------------------------- 1 | sample1.mbox -------------------------------------------------------------------------------- /.dockerignore: -------------------------------------------------------------------------------- 1 | dist/ 2 | build/ 3 | sampleData/ 4 | docs/ 
-------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Message_Path.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/mbox-sample1/1/Content_Type.txt: -------------------------------------------------------------------------------- 1 | multipart/alternative -------------------------------------------------------------------------------- /data/mbox-sample1/1/Subject.txt: -------------------------------------------------------------------------------- 1 | The GOP’s horrendous damage -------------------------------------------------------------------------------- /data/mbox-sample1/2/Content_Type.txt: -------------------------------------------------------------------------------- 1 | multipart/alternative -------------------------------------------------------------------------------- /mailbagit.py: -------------------------------------------------------------------------------- 1 | from mailbagit import cli 2 | 3 | cli() 4 | -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Derivatives_Path.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/HTML_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-8 -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Mailbag_Message_ID.txt: -------------------------------------------------------------------------------- 1 | 1 -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Text_Encoding.txt: 
-------------------------------------------------------------------------------- 1 | utf-8 -------------------------------------------------------------------------------- /data/mbox-sample1/2/Date.txt: -------------------------------------------------------------------------------- 1 | Thu, 23 Jun 2016 13:52:42 +0000 -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Message_Path.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /docs/.gitignore: -------------------------------------------------------------------------------- 1 | _site 2 | .sass-cache 3 | .jekyll-metadata 4 | -------------------------------------------------------------------------------- /mailbagit-gui.py: -------------------------------------------------------------------------------- 1 | from mailbagit import gui 2 | 3 | gui() 4 | -------------------------------------------------------------------------------- /data/mbox-sample1/1/Date.txt: -------------------------------------------------------------------------------- 1 | Thu, 30 Jun 2016 12:22:39 +0000 (GMT) -------------------------------------------------------------------------------- /data/mbox-sample1/1/From.txt: -------------------------------------------------------------------------------- 1 | Chuck Schumer -------------------------------------------------------------------------------- /data/mbox-sample1/1/To.txt: -------------------------------------------------------------------------------- 1 | ualbanymodernpoliticalarchives@gmail.com -------------------------------------------------------------------------------- /data/mbox-sample1/2/To.txt: -------------------------------------------------------------------------------- 1 | ualbanymodernpoliticalarchives@gmail.com 
-------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Derivatives_Path.txt: -------------------------------------------------------------------------------- 1 | . -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/HTML_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-8 -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Mailbag_Message_ID.txt: -------------------------------------------------------------------------------- 1 | 1 -------------------------------------------------------------------------------- /data/mbox-sample1/2/From.txt: -------------------------------------------------------------------------------- 1 | Andy from Google -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Text_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-16-le -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/HTML_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-8 -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Text_Encoding.txt: -------------------------------------------------------------------------------- 1 | utf-8 -------------------------------------------------------------------------------- /mailbagit-guided.py: -------------------------------------------------------------------------------- 1 | from mailbagit import guided 2 | 3 | guided() 4 | -------------------------------------------------------------------------------- 
/data/eml-2016-06-23_144430_6e449c77fe/1/Content_Type.txt: -------------------------------------------------------------------------------- 1 | multipart/alternative -------------------------------------------------------------------------------- /data/mbox-sample1/2/Subject.txt: -------------------------------------------------------------------------------- 1 | UAlbany, welcome to your new Google Account -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/1/Name.txt: -------------------------------------------------------------------------------- 1 | image001.png -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Content_Type.txt: -------------------------------------------------------------------------------- 1 | multipart/mixed -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Date.txt: -------------------------------------------------------------------------------- 1 | 2022-02-03 15:01:10-05:00 -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Mailbag_Message_ID.txt: -------------------------------------------------------------------------------- 1 | 1 -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/To.txt: -------------------------------------------------------------------------------- 1 | gwiedeman@albany.edu -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Mailbag_Message_ID.txt: -------------------------------------------------------------------------------- 1 | 2 -------------------------------------------------------------------------------- 
/data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/To.txt: -------------------------------------------------------------------------------- 1 | gwiedeman@albany.edu -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Date.txt: -------------------------------------------------------------------------------- 1 | Thu, 23 Jun 2016 10:44:26 -0400 -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Subject.txt: -------------------------------------------------------------------------------- 1 | Thank you for your submission -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/1/MimeType.txt: -------------------------------------------------------------------------------- 1 | image/png -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/HTML_Encoding.txt: -------------------------------------------------------------------------------- 1 | windows-1252 -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Text_Encoding.txt: -------------------------------------------------------------------------------- 1 | windows-1252 -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/To.txt: -------------------------------------------------------------------------------- 1 | ualbanymodernpoliticalarchives@gmail.com -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/1/WrittenName.txt: -------------------------------------------------------------------------------- 1 | image001.png 
-------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/2/MimeType.txt: -------------------------------------------------------------------------------- 1 | application/pdf -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/3/MimeType.txt: -------------------------------------------------------------------------------- 1 | application/pdf -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Content_Type.txt: -------------------------------------------------------------------------------- 1 | multipart/alternative -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Content_Type.txt: -------------------------------------------------------------------------------- 1 | multipart/alternative -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/From.txt: -------------------------------------------------------------------------------- 1 | Nadler for Congress -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Original_File.txt: -------------------------------------------------------------------------------- 1 | 2016-06-23_144430_6e449c77fe.eml -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/From.txt: -------------------------------------------------------------------------------- 1 | Matt McCabe -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/To.txt: -------------------------------------------------------------------------------- 1 | Matt McCabe 
-------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/From.txt: -------------------------------------------------------------------------------- 1 | Today at UAlbany -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/From.txt: -------------------------------------------------------------------------------- 1 | Today at UAlbany -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Subject.txt: -------------------------------------------------------------------------------- 1 | Good Morning - Today at UAlbany -------------------------------------------------------------------------------- /docs/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/favicon.ico -------------------------------------------------------------------------------- /data/mbox-sample1/1/Message_ID.txt: -------------------------------------------------------------------------------- 1 | <1314145029.920292481467289359777.JavaMail.app@rbg21.atlis1> -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/2/Name.txt: -------------------------------------------------------------------------------- 1 | CrowleyAdvantage-Hardware.pdf -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Original_File.txt: -------------------------------------------------------------------------------- 1 | Digitization Archiving Solutions.msg -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Date.txt: 
-------------------------------------------------------------------------------- 1 | Thu, 30 Sep 2021 17:30:56 +0000 (UTC) -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Subject.txt: -------------------------------------------------------------------------------- 1 | Today at UAlbany - Focus on Research -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Date.txt: -------------------------------------------------------------------------------- 1 | Tue, 05 Oct 2021 13:59:44 +0000 (UTC) -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/1/Content_ID.txt: -------------------------------------------------------------------------------- 1 | image001.png@01D81905.3EA3C240 -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/2/WrittenName.txt: -------------------------------------------------------------------------------- 1 | CrowleyAdvantage-Hardware.pdf -------------------------------------------------------------------------------- /docs/img/docker1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/docker1.png -------------------------------------------------------------------------------- /docs/img/docker2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/docker2.png -------------------------------------------------------------------------------- /docs/img/docker3.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/docker3.png -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/HTML_Body.txt: -------------------------------------------------------------------------------- 1 |

2 | Thank you for joining our campaign.
3 | -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Message_ID.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Message_ID.txt: -------------------------------------------------------------------------------- 1 | <2Y2JQkdFSGmQAIdjAOswdQ@geopod-ismtpd-3-0> -------------------------------------------------------------------------------- /data/mbox-sample1/2/Message_ID.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Subject.txt: -------------------------------------------------------------------------------- 1 | The Crowley Company - Digitization & Archiving Solutions -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Original_File.txt: -------------------------------------------------------------------------------- 1 | outlook2019_MSO_16.0.10377.20023_64-bit.pst -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Message_ID.txt: -------------------------------------------------------------------------------- 1 | <9QwEIJPSQMyxlfUN2p369A@geopod-ismtpd-canary-0> -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Original_File.txt: -------------------------------------------------------------------------------- 1 | outlook2019_MSO_16.0.10377.20023_64-bit.pst -------------------------------------------------------------------------------- /docs/img/mailbagit-gui.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/mailbagit-gui.png -------------------------------------------------------------------------------- /docs/img/smart_screen.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/smart_screen.png -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Message_Path.txt: -------------------------------------------------------------------------------- 1 | Top of Outlook data file/Inbox/Today at UAlbany -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Message_Path.txt: -------------------------------------------------------------------------------- 1 | Top of Outlook data file/Inbox/Today at UAlbany -------------------------------------------------------------------------------- /docs/img/mailbagit-demo1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/mailbagit-demo1.png -------------------------------------------------------------------------------- /docs/img/mailbagit-demo2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/mailbagit-demo2.png -------------------------------------------------------------------------------- /docs/img/release_assets.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/release_assets.png -------------------------------------------------------------------------------- /docs/img/windows_install.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/windows_install.png -------------------------------------------------------------------------------- /docs/img/allow_executable.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/allow_executable.png -------------------------------------------------------------------------------- /docs/img/mailbagit-guided.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/mailbagit-guided.png -------------------------------------------------------------------------------- /docs/_sass/custom/custom.scss: -------------------------------------------------------------------------------- 1 | .main-content .highlight pre.highlight code { 2 | line-height: 1.5; 3 | display: block; 4 | } -------------------------------------------------------------------------------- /docs/diagrams/mailbagOverview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/diagrams/mailbagOverview.png -------------------------------------------------------------------------------- /docs/persona_images/andreaLee.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/persona_images/andreaLee.jpg -------------------------------------------------------------------------------- /docs/testingdayslides/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/favicon.ico 
-------------------------------------------------------------------------------- /data/mbox-sample1/1/Headers.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/mbox-sample1/1/Headers.pickle -------------------------------------------------------------------------------- /data/mbox-sample1/1/Message.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/mbox-sample1/1/Message.pickle -------------------------------------------------------------------------------- /data/mbox-sample1/2/Headers.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/mbox-sample1/2/Headers.pickle -------------------------------------------------------------------------------- /data/mbox-sample1/2/Message.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/mbox-sample1/2/Message.pickle -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/2/Content_ID.txt: -------------------------------------------------------------------------------- 1 | 604E1A43B0B2504E99BDF75DFDC8DD69@namprd04.prod.outlook.com -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/3/Content_ID.txt: -------------------------------------------------------------------------------- 1 | 538183EAA0AF4B46A9F5AFC2A54B557D@namprd04.prod.outlook.com -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/3/Name.txt: 
-------------------------------------------------------------------------------- 1 | Crowley Modern Library Award Winning Scanners and Services_2022.pdf -------------------------------------------------------------------------------- /docs/img/mailbag-colab-notebook.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/mailbag-colab-notebook.png -------------------------------------------------------------------------------- /docs/img/mailbag-colab-notebook2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/img/mailbag-colab-notebook2.png -------------------------------------------------------------------------------- /docs/persona_images/IsaacHoffman.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/persona_images/IsaacHoffman.jpg -------------------------------------------------------------------------------- /docs/persona_images/TeresaBurns.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/persona_images/TeresaBurns.jpg -------------------------------------------------------------------------------- /docs/persona_images/aaronSantos.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/persona_images/aaronSantos.jpg -------------------------------------------------------------------------------- /docs/persona_images/emilyCooper.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/persona_images/emilyCooper.jpg 
-------------------------------------------------------------------------------- /docs/persona_images/issue_number.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/persona_images/issue_number.png -------------------------------------------------------------------------------- /docs/persona_images/garyRichardson.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/persona_images/garyRichardson.jpg -------------------------------------------------------------------------------- /docs/persona_images/nicholasGarza.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/persona_images/nicholasGarza.jpg -------------------------------------------------------------------------------- /data/Digitization Archiving Solutions.msg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/Digitization Archiving Solutions.msg -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/3/WrittenName.txt: -------------------------------------------------------------------------------- 1 | Crowley Modern Library Award Winning Scanners and Services_2022.pdf -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Message_ID.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /docs/spec/Mailbag_Specification_prerelease.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/spec/Mailbag_Specification_prerelease.pdf -------------------------------------------------------------------------------- /data/outlook2019_MSO_16.0.10377.20023_64-bit.pst: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/outlook2019_MSO_16.0.10377.20023_64-bit.pst -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Derivatives_Path.txt: -------------------------------------------------------------------------------- 1 | outlook2019_MSO_16.0.10377.20023_64-bit/Top of Outlook data file/Inbox/Today at UAlbany -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Derivatives_Path.txt: -------------------------------------------------------------------------------- 1 | outlook2019_MSO_16.0.10377.20023_64-bit/Top of Outlook data file/Inbox/Today at UAlbany -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "docs/mailbag-specification"] 2 | path = docs/mailbag-specification 3 | url = git@github.com:UAlbanyArchives/mailbag-specification.git 4 | -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/league-gothic/LICENSE: -------------------------------------------------------------------------------- 1 | SIL Open Font License (OFL) 2 | http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=OFL 3 | -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Headers.pickle: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/eml-2016-06-23_144430_6e449c77fe/1/Headers.pickle -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Message.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/eml-2016-06-23_144430_6e449c77fe/1/Message.pickle -------------------------------------------------------------------------------- /docs/_data/wont_have.csv: -------------------------------------------------------------------------------- 1 | Number,Category,Won't have,Requirement Type 2 | 43,Input,Input from Office 365 APIs,Functional 3 | 44,Input,Input from Gmail APIs,Functional 4 | -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Headers.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/msg-Digitization Archiving Solutions/1/Headers.pickle -------------------------------------------------------------------------------- /data/eml-2016-06-23_144430_6e449c77fe/1/Text_Body.txt: -------------------------------------------------------------------------------- 1 | Dear Friend, 2 | 3 | Thank you for joining our campaign. 
4 | 5 | Contact Information: 6 | 7 | 8 | 9 |   12222 -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/1/File.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/msg-Digitization Archiving Solutions/1/Attachments/1/File.bin -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/2/File.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/msg-Digitization Archiving Solutions/1/Attachments/2/File.bin -------------------------------------------------------------------------------- /data/msg-Digitization Archiving Solutions/1/Attachments/3/File.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/msg-Digitization Archiving Solutions/1/Attachments/3/File.bin -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Headers.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/1/Headers.pickle -------------------------------------------------------------------------------- /data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Headers.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/data/pst-outlook2019_MSO_16.0.10377.20023_64-bit/2/Headers.pickle -------------------------------------------------------------------------------- 
/docs/testingdayslides/dist/theme/fonts/league-gothic/league-gothic.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/league-gothic/league-gothic.eot -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/league-gothic/league-gothic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/league-gothic/league-gothic.ttf -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/league-gothic/league-gothic.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/league-gothic/league-gothic.woff -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-italic.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-italic.eot -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-italic.ttf -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-italic.woff: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-italic.woff -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-regular.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-regular.eot -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-regular.ttf -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-regular.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-regular.woff -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibold.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibold.eot -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibold.ttf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibold.ttf -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibold.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibold.woff -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibolditalic.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibolditalic.eot -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibolditalic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibolditalic.ttf -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibolditalic.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/HEAD/docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro-semibolditalic.woff -------------------------------------------------------------------------------- /.gitattributes: 
-------------------------------------------------------------------------------- 1 | * text=auto eol=lf 2 | 3 | # Binary files. 4 | *.png binary 5 | *.jpg binary 6 | *.msg binary 7 | *.pst binary 8 | *.eml binary 9 | *.mbox binary 10 | *.pickle binary 11 | HTML_Body.txt binary 12 | Text_Body.txt binary 13 | data/* binary -------------------------------------------------------------------------------- /docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: '3.1' 2 | 3 | services: 4 | mailbagit: 5 | image: ualbanyarchives/mailbagit:latest 6 | volumes: 7 | - '.:/data' 8 | tty: true 9 | stdin_open: true 10 | working_dir: /data 11 | entrypoint: /bin/bash -------------------------------------------------------------------------------- /mailbagit/globals.py: -------------------------------------------------------------------------------- 1 | def init(): 2 | global log_level, style 3 | log_level = "WARNING" 4 | style = { 5 | "g": ["\033[92m", "\033[0m"], # green start and end 6 | "cy": ["\033[96m", "\033[0m"], 7 | "b": ["\033[1m", "\033[0m"], # bold start and end 8 | } 9 | -------------------------------------------------------------------------------- /docker-compose-dev.yml: -------------------------------------------------------------------------------- 1 | version: '3.1' 2 | 3 | services: 4 | mailbagit: 5 | image: ualbanyarchives/mailbagit:dev 6 | volumes: 7 | - '../sampleData:/data' 8 | - '.:/mailbagit' 9 | tty: true 10 | stdin_open: true 11 | working_dir: /mailbagit 12 | command: bash -c "pip install --upgrade pip && pip install -e . 
&& cd /data && bash" -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/league-gothic/league-gothic.css: -------------------------------------------------------------------------------- 1 | @font-face { 2 | font-family: 'League Gothic'; 3 | src: url('./league-gothic.eot'); 4 | src: url('./league-gothic.eot?#iefix') format('embedded-opentype'), 5 | url('./league-gothic.woff') format('woff'), 6 | url('./league-gothic.ttf') format('truetype'); 7 | 8 | font-weight: normal; 9 | font-style: normal; 10 | } 11 | -------------------------------------------------------------------------------- /docs/testingdayslides/plugin/math/plugin.js: -------------------------------------------------------------------------------- 1 | import {KaTeX} from "./katex"; 2 | import {MathJax2} from "./mathjax2"; 3 | import {MathJax3} from "./mathjax3"; 4 | 5 | const defaultTypesetter = MathJax2; 6 | 7 | /*! 8 | * This plugin is a wrapper for the MathJax2, 9 | * MathJax3 and KaTeX typesetter plugins. 10 | */ 11 | export default Plugin = Object.assign( defaultTypesetter(), { 12 | KaTeX, 13 | MathJax2, 14 | MathJax3 15 | } ); -------------------------------------------------------------------------------- /docs/404.html: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | --- 4 | 5 | 18 | 19 |
20 | 404 21 | 22 | Page not found :( 23 | The requested page could not be found. 24 |
25 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/mailbag-component.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Mailbagit Component 3 | about: Describes components of the Mailbagit tool for assignment 4 | title: "[Component Name]" 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | ## The problem the component solves 11 | 12 | ... 13 | 14 | 15 | ## Relevant part of mailbag spec? 16 | 17 | ... 18 | 19 | ## Type of component 20 | 21 | - [ ] Core 22 | - [ ] Input 23 | - [ ] Attachments 24 | - [ ] Derivatives conversion 25 | - [ ] Reporting/Exporting 26 | - [ ] GUI 27 | - [ ] Distribution 28 | 29 | ## Expected contribution 30 | 31 | - [ ] Pull Request 32 | - [ ] Comment with proposed solution 33 | 34 | ## Major challenges or things to keep in mind 35 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: '' 5 | labels: enhancement 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 
21 | -------------------------------------------------------------------------------- /docs/design_documents/user_stories.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: User Stories 4 | permalink: /user_stories/ 5 | parent: Design Documents 6 | nav_order: 2 7 | --- 8 | 9 | # User Stories 10 | 11 | If you have feedback, please [Suggest a User Story](https://docs.google.com/forms/d/e/1FAIpQLSf68TbV5mbZ48pm_lrGB_SK4oxZO0FtGeUqOfoK6gEQ5iq2WA/viewform?usp=sf_link) by April 25th! 12 | 13 | 14 | {% for row in site.data.user_stories %} 15 | {% if forloop.first %} 16 | 17 | {% for pair in row %} 18 | 19 | {% endfor %} 20 | 21 | {% endif %} 22 | {% tablerow pair in row %} 23 | {{ pair[1] }} 24 | {% endtablerow %} 25 | {% endfor %} 26 |
{{ pair[0] }}
-------------------------------------------------------------------------------- /docs/install.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Installing mailbagit 4 | permalink: /install/ 5 | nav_order: 3 6 | has_children: true 7 | --- 8 | 9 | # Installing mailbagit 10 | 11 | Mailbag contains multiple tools for packaging mailbags according to the [Mailbag Specification]({{ site.baseurl }}/spec/). 12 | 13 | There are three different ways to install and run `mailbagit`. 14 | 15 | 1. [Python install with pip]({{ site.baseurl}}/python) 16 | 2. [Using Windows executables]({{ site.baseurl}}/exe) 17 | 3. [Using a Docker image]({{ site.baseurl}}/docker) 18 | 19 | If you install `mailbagit` with Python or use the Windows executables, you will need to install either wkhtmltopdf or Google Chrome to create [PDF derivatives]({{ site.baseurl}}/pdf). 20 | 21 | The Docker image comes with all dependencies preinstalled. 22 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: '' 5 | labels: bug 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Screenshots** 24 | If applicable, add screenshots to help explain your problem. 25 | 26 | **Environment (please complete the following information):** 27 | - GUI or command line?: 28 | - OS: [e.g. iOS] 29 | - Python version: 30 | 31 | **Additional context** 32 | Add any other context about the problem here. 
33 | -------------------------------------------------------------------------------- /docs/trying.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Trying mailbagit 4 | permalink: /trying/ 5 | nav_order: 2 6 | --- 7 | 8 | # Trying mailbagit 9 | 10 | You can try out mailbagit step-by-step using a Jupyter Notebook in Google Colab. 11 | 12 | 1. Go to [colab.research.google.com](https://colab.research.google.com) 13 | 2. Select the GitHub tab 14 | 3. Enter the GitHub URL: https://github.com/UAlbanyArchives/mailbagit-workshop 15 | 4. Select the "mailbagit_workshop.ipynb" notebook 16 | 17 | ![Screenshot of selecting mailbagit_workshop.ipynb in Google Colab]({{ site.baseurl }}/img/mailbag-colab-notebook.png) 18 | 19 | This runs on a folder in your Google Drive, so you can also try adding your own email export files next to the sample data. 20 | 21 | ![Screenshot of the notebook running in Google Colab]({{ site.baseurl }}/img/mailbag-colab-notebook2.png) -------------------------------------------------------------------------------- /mailbagit/formats/example.py: -------------------------------------------------------------------------------- 1 | # This is an example parser, meant to show how 2 | # to hook up a real parser 3 | 4 | # Does nothing currently 5 | from mailbagit.email_account import EmailAccount 6 | from mailbagit.models import Email 7 | 8 | 9 | class ExampleAccount(EmailAccount): 10 | format_name = "example" 11 | format_details = "" 12 | format_agent = "" 13 | format_agent_version = "" 14 | 15 | def __init__(self, args, source_parent_dir, mailbag_name, **kwargs): 16 | print("Parsity parse") 17 | # code goes here to set up mailbox and pull out any relevant account_data 18 | self._account_data = {} 19 | self._messages = [] 20 | 21 | @property 22 | def account_data(self): 23 | return self._account_data 24 | 25 | def messages(self): 26 | for message in self._messages: 27 | yield Email(**message) 28 | 
-------------------------------------------------------------------------------- /docs/use.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Using mailbagit 4 | permalink: /use/ 5 | nav_order: 4 6 | has_children: true 7 | --- 8 | 9 | # Using Mailbagit 10 | 11 | Mailbag contains multiple tools for packaging mailbags according to the [Mailbag Specification]({{ site.baseurl }}/spec/). 12 | 13 | * [mailbagit]({{ site.baseurl }}/mailbagit/), a command line tool for creating mailbags 14 | * [mailbagit-gui]({{ site.baseurl }}/mailbagit-gui/), a basic graphical user interface which performs the same functions as `mailbagit` 15 | * [mailbagit-guided]({{ site.baseurl }}/mailbagit-guided/), a simplified CLI interface designed for accessible use by non-technical users 16 | 17 | ⚠️ Creating PDF or WARC derivatives may cause [privacy or security concerns]({{ site.baseurl }}/concerns) in some contexts. 18 | 19 | You can [customize logging]({{ site.baseurl }}/logging/) and [create plugins]({{ site.baseurl }}/plugins/) for all `mailbagit` options. 20 | -------------------------------------------------------------------------------- /.github/pull_request_template.md: -------------------------------------------------------------------------------- 1 | ## Type of Contribution 2 | 3 | - [ ] Bugfix (non-breaking change which fixes an issue) 4 | - [ ] New component 5 | - [ ] Refactoring (no functional changes) 6 | - [ ] Documentation-only 7 | 8 | ## What does this implement/fix? Explain your changes. 9 | 10 | ... 11 | 12 | ## Link to issue? 13 | 14 | ... 15 | 16 | - [ ] Issue closed 17 | - [ ] Remain open 18 | 19 | ## Pull Request Checklist 20 | 21 | Please check if your PR fulfills the following requirements: 22 | - [ ] Make sure you are requesting to the develop branch. Don't PR to main! 
23 | - [ ] This contribution has sufficient documentation 24 | - [ ] Tests for the changes have been added 25 | - [ ] All tests pass 26 | 27 | #### How has this been tested? 28 | **Operating System:** … 29 | **Python Version:** … 30 | 31 | ## Licensing 32 | - [ ] I agree that the Mailbag Project and the University at Albany, SUNY can release this code under the [MIT license](https://github.com/UAlbanyArchives/mailbagit/blob/main/LICENSE). 33 | -------------------------------------------------------------------------------- /docs/testingdayslides/dist/reset.css: -------------------------------------------------------------------------------- 1 | /* http://meyerweb.com/eric/tools/css/reset/ 2 | v4.0 | 20180602 3 | License: none (public domain) 4 | */ 5 | 6 | html, body, div, span, applet, object, iframe, 7 | h1, h2, h3, h4, h5, h6, p, blockquote, pre, 8 | a, abbr, acronym, address, big, cite, code, 9 | del, dfn, em, img, ins, kbd, q, s, samp, 10 | small, strike, strong, sub, sup, tt, var, 11 | b, u, i, center, 12 | dl, dt, dd, ol, ul, li, 13 | fieldset, form, label, legend, 14 | table, caption, tbody, tfoot, thead, tr, th, td, 15 | article, aside, canvas, details, embed, 16 | figure, figcaption, footer, header, hgroup, 17 | main, menu, nav, output, ruby, section, summary, 18 | time, mark, audio, video { 19 | margin: 0; 20 | padding: 0; 21 | border: 0; 22 | font-size: 100%; 23 | font: inherit; 24 | vertical-align: baseline; 25 | } 26 | /* HTML5 display-role reset for older browsers */ 27 | article, aside, details, figcaption, figure, 28 | footer, header, hgroup, main, menu, nav, section { 29 | display: block; 30 | } -------------------------------------------------------------------------------- /docs/about.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Project Team 4 | permalink: /about/ 5 | nav_order: 5 6 | --- 7 | 8 | ### Project Team 9 | 10 | * Gregory Wiedeman, University at Albany, SUNY 11 
| * Mark Wolfe, University at Albany, SUNY 12 | * Karen Kiorpes, University at Albany, SUNY 13 | * Harit Garg, University at Albany, SUNY 14 | * Baibhav Rajbhandari, University at Albany, SUNY 15 | 16 | ### Advisory Board 17 | 18 | * Rachel Appel, University of Pennsylvania 19 | * Hillel Arnold, Rockefeller Archive Center 20 | * Mat Kelly, Drexel University 21 | * Albert Rozo, Penn State University 22 | * Nathan Tallman, Penn State University 23 | * Bonnie Weddle, New York State Archives 24 | 25 | ### Consultant Developer 26 | 27 | * Dave Mayo 28 | 29 | You can email gwiedeman [at] albany [dot] edu if you're interested in contacting the project team. 30 | 31 | The [Email Archiving Interest Group listserv](https://mailman.stanford.edu/mailman/listinfo/email_archiving_interest_group) is also a great community of users focused on email preservation. 32 | -------------------------------------------------------------------------------- /.github/workflows/build.yml: -------------------------------------------------------------------------------- 1 | name: build 2 | 3 | on: [push, pull_request] 4 | 5 | jobs: 6 | lint: 7 | 8 | runs-on: ubuntu-latest 9 | strategy: 10 | matrix: 11 | python-version: ['3.9', '3.10', '3.11', '3.12', '3.13', '3.13'] 12 | 13 | steps: 14 | - uses: actions/checkout@v2 15 | - name: Set up Python ${{ matrix.python-version }} 16 | uses: actions/setup-python@v2 17 | with: 18 | python-version: ${{ matrix.python-version }} 19 | - name: Install OS dependencies 20 | run: | 21 | sudo timedatectl set-timezone America/New_York 22 | sudo apt-get update 23 | sudo apt-get install libgtk-3-dev 24 | - name: Install Python dependencies 25 | run: | 26 | pip install attrdict3 27 | pip install -U -f https://extras.wxpython.org/wxPython4/extras/linux/gtk3/ubuntu-22.04 wxPython 28 | pip install . 
29 | pip install libpff-python==20211114 30 | - name: Test with pytest 31 | run: | 32 | pytest -vv 33 | -------------------------------------------------------------------------------- /Dockerfile.production: -------------------------------------------------------------------------------- 1 | FROM python:3.11-slim-bookworm 2 | MAINTAINER Gregory Wiedeman gwiedeman@albany.edu 3 | 4 | ENV TZ=America/New_York \ 5 | DEBIAN_FRONTEND=noninteractive \ 6 | MAILBAGIT_LOG_LEVEL=info \ 7 | IN_CONTAINER=true 8 | 9 | RUN apt-get -y update 10 | RUN apt-get -y upgrade 11 | RUN apt-get install -y curl 12 | 13 | # needed to build some python libraries 14 | RUN apt-get install -y gcc dpkg-dev 15 | 16 | # wkhtmltopdf deps 17 | RUN apt-get install -y xfonts-75dpi xfonts-base 18 | 19 | RUN pip install libpff-python==20231205 20 | 21 | RUN curl -L -o /tmp/google-chrome-stable_current_amd64.deb \ 22 | https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb 23 | RUN apt-get install -y /tmp/google-chrome-stable_current_amd64.deb 24 | 25 | RUN curl -L -o /tmp/wkhtmltox_0.12.6.1-3.bookworm_amd64.deb \ 26 | https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6.1-3/wkhtmltox_0.12.6.1-3.bookworm_amd64.deb 27 | RUN dpkg -i /tmp/wkhtmltox_0.12.6.1-3.bookworm_amd64.deb 28 | 29 | RUN pip install mailbagit -U 30 | -------------------------------------------------------------------------------- /docs/_data/could_have.csv: -------------------------------------------------------------------------------- 1 | Number,Category,Could have,Requirement Type,Possible System Dependancies 2 | 33,Core,"Lists important email metadata separate from source data and filesystem (like a CSV or SQLite), as a tag file",Technical, 3 | 34,Core,Provide an estimated time to completion for users during long-running processes,Technical, 4 | 35,Core,Provide a method of keeping or excluding specific email folders while creating Mailbags,Functional, 5 | 36,Core-PDF,Generate PDFs using customizable 
CSS,Functional,"wkhtmltopdf, htmldoc, if not xhtml2pdf" 6 | 37,Core-WARC,Customizable scoping rules for WARC creation,Functional,wget 7 | 38,Reporting,Export reports of email headers or text of individual emails,Functional, 8 | 39,Dependency management,Common dependencies across major platforms,Technical, 9 | 40,Dependency management,Individual mini-specs for future system dependencies,Technical, 10 | 41,Installation,"All core dependencies install via a single command, such as using pip",Technical, 11 | 42,Installation,Able to install on a Windows machine without admin privileges,Technical, 12 | -------------------------------------------------------------------------------- /mailbagit/derivatives/example.py: -------------------------------------------------------------------------------- 1 | # This is an example derivative, meant to show how 2 | # to hook up a real parser 3 | from mailbagit.loggerx import get_logger 4 | 5 | log = get_logger() 6 | 7 | # Does nothing currently 8 | from mailbagit.derivative import Derivative 9 | 10 | 11 | class ExampleDerivative(Derivative): 12 | derivative_name = "example" 13 | derivative_format = "example" 14 | derivative_agent = "" 15 | derivative_agent_version = "" 16 | 17 | def __init__(self, email_account, args, mailbag_dir): 18 | log.debug(f"Setup {self.derivative_name} derivatives") 19 | 20 | # Sets up self.format_subdirectory 21 | super().__init__(args, mailbag_dir) 22 | 23 | def do_task_per_account(self): 24 | print(self.account.account_data()) 25 | 26 | def do_task_per_message(self, message): 27 | print(self.format_subdirectory) 28 | if message.Message_ID: 29 | log.debug(message.Message_ID) 30 | elif message.Subject: 31 | log.debug(message.Subject) 32 | 33 | return message 34 | -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | FROM python:3.11-slim-bookworm 2 | 3 | ENV TZ=America/New_York \ 4 | 
DEBIAN_FRONTEND=noninteractive \ 5 | MAILBAGIT_LOG_LEVEL=debug\ 6 | IN_CONTAINER=true 7 | 8 | #RUN mkdir /mailbagit 9 | #WORKDIR /mailbagit 10 | #ADD . /mailbagit/ 11 | 12 | # manually add to $PYTHONPATH because https://github.com/python/importlib_metadata/issues/364 13 | ENV PYTHONPATH=/mailbagit 14 | 15 | RUN apt-get -y update 16 | RUN apt-get -y upgrade 17 | RUN apt-get install -y curl 18 | 19 | # needed to build some python libraries 20 | RUN apt-get install -y gcc dpkg-dev 21 | 22 | # wkhtmltopdf deps 23 | RUN apt-get install -y xfonts-75dpi xfonts-base 24 | 25 | RUN pip install libpff-python==20231205 26 | 27 | RUN curl -L -o /tmp/google-chrome-stable_current_amd64.deb \ 28 | https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb 29 | RUN apt-get install -y /tmp/google-chrome-stable_current_amd64.deb 30 | 31 | RUN curl -L -o /tmp/wkhtmltox_0.12.6.1-3.bookworm_amd64.deb \ 32 | https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6.1-3/wkhtmltox_0.12.6.1-3.bookworm_amd64.deb 33 | RUN dpkg -i /tmp/wkhtmltox_0.12.6.1-3.bookworm_amd64.deb 34 | -------------------------------------------------------------------------------- /dumpData.py: -------------------------------------------------------------------------------- 1 | import os 2 | from bagit import _make_parser, Bag 3 | from mailbagit.email_account import EmailAccount 4 | from mailbagit.controller import Controller 5 | 6 | bagit_parser = _make_parser() 7 | bagit_parser.description = f"Mailbagit ({bagit_parser.description})" 8 | args = bagit_parser.parse_args() 9 | parsers = ["pst", "mbox", "msg", "eml"] 10 | 11 | args.path = args.directory[0] 12 | args.keep = True 13 | if os.path.isfile(args.path): 14 | source_parent_dir = os.path.dirname(args.path) 15 | else: 16 | source_parent_dir = args.path 17 | mailbag_dir = "data" 18 | mailbag_name = "New_Mailbag" 19 | 20 | for parser in parsers: 21 | args.input = parser 22 | args.derivatives = ["html"] 23 | args.dry_run = True 24 | args.mailbag_name = 
"test_data" 25 | args.companion_files = False 26 | 27 | controller = Controller(args) 28 | 29 | # mail_account: EmailAccount = controller.format(args.directory[0], args) 30 | mail_account: EmailAccount = controller.format(args, source_parent_dir, mailbag_dir, mailbag_name) 31 | count = 0 32 | for message in mail_account.messages(): 33 | count += 1 34 | message.Mailbag_Message_ID = count 35 | message.dump() 36 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 M. E. Grenander Department of Special Collections & Archives, University at Albany, SUNY 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /docs/testingdayslides/plugin/highlight/monokai.css: -------------------------------------------------------------------------------- 1 | /* 2 | Monokai style - ported by Luigi Maselli - http://grigio.org 3 | */ 4 | 5 | .hljs { 6 | display: block; 7 | overflow-x: auto; 8 | padding: 0.5em; 9 | background: #272822; 10 | color: #ddd; 11 | } 12 | 13 | .hljs-tag, 14 | .hljs-keyword, 15 | .hljs-selector-tag, 16 | .hljs-literal, 17 | .hljs-strong, 18 | .hljs-name { 19 | color: #f92672; 20 | } 21 | 22 | .hljs-code { 23 | color: #66d9ef; 24 | } 25 | 26 | .hljs-class .hljs-title { 27 | color: white; 28 | } 29 | 30 | .hljs-attribute, 31 | .hljs-symbol, 32 | .hljs-regexp, 33 | .hljs-link { 34 | color: #bf79db; 35 | } 36 | 37 | .hljs-string, 38 | .hljs-bullet, 39 | .hljs-subst, 40 | .hljs-title, 41 | .hljs-section, 42 | .hljs-emphasis, 43 | .hljs-type, 44 | .hljs-built_in, 45 | .hljs-builtin-name, 46 | .hljs-selector-attr, 47 | .hljs-selector-pseudo, 48 | .hljs-addition, 49 | .hljs-variable, 50 | .hljs-template-tag, 51 | .hljs-template-variable { 52 | color: #a6e22e; 53 | } 54 | 55 | .hljs-comment, 56 | .hljs-quote, 57 | .hljs-deletion, 58 | .hljs-meta { 59 | color: #75715e; 60 | } 61 | 62 | .hljs-keyword, 63 | .hljs-selector-tag, 64 | .hljs-literal, 65 | .hljs-doctag, 66 | .hljs-title, 67 | .hljs-section, 68 | .hljs-type, 69 | .hljs-selector-id { 70 | font-weight: bold; 71 | } 72 | -------------------------------------------------------------------------------- /release.md: -------------------------------------------------------------------------------- 1 | # Building a release 2 | 3 | 1. Test develop 4 | 2. Bump version in `setup.py` and `mailbagit/__init__.py` 5 | 3. Update [Spec compliance version](https://github.com/UAlbanyArchives/mailbagit/blob/develop/mailbagit/controller.py#L114) if necessary. 6 | 4. Build and push dev Docker image 7 | 5. PR and merge to main 8 | 6. 
Build and test Windows executables 9 | 7. [Tag a release](https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository) 10 | 8. Write release notes 11 | 9. Upload Windows executables 12 | 10. Build and publish website 13 | 11. Build and push to pypi 14 | 12. Build and push prod Docker image 15 | 16 | ## Building and pushing dev Docker image 17 | 18 | ``` 19 | docker build --no-cache -t ualbanyarchives/mailbagit:dev . 20 | docker push ualbanyarchives/mailbagit:dev 21 | ``` 22 | 23 | ## Build Windows executables 24 | 25 | ``` 26 | pyinstaller mailbagit.spec 27 | pyinstaller mailbagit-gui.spec 28 | pyinstaller mailbagit-guided.spec 29 | ``` 30 | 31 | ## Build and push to pypi 32 | ``` 33 | python -m build 34 | twine upload dist/*.gz dist/*.whl 35 | ``` 36 | 37 | ## Building and pushing prod Docker image 38 | 39 | ``` 40 | docker build --no-cache -t ualbanyarchives/mailbagit:latest -f Dockerfile.production . 41 | docker push ualbanyarchives/mailbagit:latest 42 | ``` -------------------------------------------------------------------------------- /docs/testingdayslides/plugin/highlight/zenburn.css: -------------------------------------------------------------------------------- 1 | /* 2 | 3 | Zenburn style from voldmar.ru (c) Vladimir Epifanov 4 | based on dark.css by Ivan Sagalaev 5 | 6 | */ 7 | 8 | .hljs { 9 | display: block; 10 | overflow-x: auto; 11 | padding: 0.5em; 12 | background: #3f3f3f; 13 | color: #dcdcdc; 14 | } 15 | 16 | .hljs-keyword, 17 | .hljs-selector-tag, 18 | .hljs-tag { 19 | color: #e3ceab; 20 | } 21 | 22 | .hljs-template-tag { 23 | color: #dcdcdc; 24 | } 25 | 26 | .hljs-number { 27 | color: #8cd0d3; 28 | } 29 | 30 | .hljs-variable, 31 | .hljs-template-variable, 32 | .hljs-attribute { 33 | color: #efdcbc; 34 | } 35 | 36 | .hljs-literal { 37 | color: #efefaf; 38 | } 39 | 40 | .hljs-subst { 41 | color: #8f8f8f; 42 | } 43 | 44 | .hljs-title, 45 | .hljs-name, 46 | .hljs-selector-id, 47 | .hljs-selector-class, 48 
| .hljs-section, 49 | .hljs-type { 50 | color: #efef8f; 51 | } 52 | 53 | .hljs-symbol, 54 | .hljs-bullet, 55 | .hljs-link { 56 | color: #dca3a3; 57 | } 58 | 59 | .hljs-deletion, 60 | .hljs-string, 61 | .hljs-built_in, 62 | .hljs-builtin-name { 63 | color: #cc9393; 64 | } 65 | 66 | .hljs-addition, 67 | .hljs-comment, 68 | .hljs-quote, 69 | .hljs-meta { 70 | color: #7f9f7f; 71 | } 72 | 73 | 74 | .hljs-emphasis { 75 | font-style: italic; 76 | } 77 | 78 | .hljs-strong { 79 | font-weight: bold; 80 | } 81 | -------------------------------------------------------------------------------- /docs/events.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Virtual Testing Days 4 | permalink: /events/ 5 | nav_order: 8 6 | --- 7 | 8 | # Virtual Testing Days 9 | 10 | The Mailbag Project team plans to hold multiple events where interested users can get together remotely and experiment using `mailbagit` with guidance from the project team. 11 | 12 | We will walk through installing and using `mailbagit`, and participants will then have an opportunity for hands-on use with processing email. Project Team members will also be available for questions and feedback. 13 | 14 | Participants are encouraged to bring their own email data to use or experiment with provided datasets. 15 | 16 | ## Register for one of the following dates: 17 | 18 | [**May 27th, 2022 at 2pm-4pm Eastern Time**](https://albany.zoom.us/meeting/register/tJUod-qspj4sHtKlPU4veaZZmVaZgOacWLwM) 19 | 20 | [**June 8th, 2022 at 10am-12pm Eastern Time**](https://albany.zoom.us/meeting/register/tJAodOCsrzwjE9wrzE_8BDVz9fltYfEaj_EG) 21 | 22 | ## Slides 23 | 24 | [Virtual Testing Day slides]({{ site.baseurl}}/testingdayslides) 25 | 26 | ## Virtual Testing Days Docker image and Jupyter notebook 27 | 28 | If you have Docker installed, you can use the Virtual Testing Day Docker image, which includes sample data and a Jupyter Notebook. 
The Github repo contains [instructions on how to pull and run this locally](https://github.com/UAlbanyArchives/mailbagit-notebook). 29 | -------------------------------------------------------------------------------- /docs/using_mailbagit/logging.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Configuring Logging 4 | permalink: /logging/ 5 | parent: Using mailbagit 6 | nav_order: 5 7 | --- 8 | 9 | # Configuring Logging 10 | 11 | * The level of logs displayed by `mailbagit` is based on an environment variable `MAILBAGIT_LOG_LEVEL`. 12 | 13 | * Log levels are available in the following order: `NOTSET`, `DEBUG`, `INFO`, `WARN`, `ERROR`, and `CRITICAL`. 14 | For example, when the `MAILBAGIT_LOG_LEVEL` is `DEBUG`, `mailbagit` displays logs of all levels. 15 | And when `MAILBAGIT_LOG_LEVEL` is `WARN`, it displays logs of level `WARN` and above, i.e. `WARN`, `ERROR`, or `CRITICAL`. 16 | 17 | * If no `MAILBAGIT_LOG_LEVEL` environment variable is set, `mailbagit` will default to `WARN`. 18 | 19 | Unix example: 20 | 21 | ``` 22 | export MAILBAGIT_LOG_LEVEL=info 23 | echo $MAILBAGIT_LOG_LEVEL 24 | > info 25 | ``` 26 | 27 | Windows Powershell example: 28 | 29 | ``` 30 | $env:MAILBAGIT_LOG_LEVEL='debug' 31 | $env:MAILBAGIT_LOG_LEVEL 32 | > debug 33 | ``` 34 | 35 | On Windows, you can also [set environment variables by searching "edit environment variable"](https://www.onmsft.com/how-to/how-to-set-an-environment-variable-in-windows-10). 
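The threshold rule described in the bullets above can be sketched with the standard library's `logging` levels. This is only an illustration under stated assumptions, not `mailbagit`'s own code: the helper names `resolve_log_level` and `shown_levels` are hypothetical, and treating an unrecognized value as `WARN` is a choice made for this sketch.

```python
import logging
import os


def resolve_log_level(environ=None):
    """Return the numeric level for MAILBAGIT_LOG_LEVEL, defaulting to WARN."""
    environ = os.environ if environ is None else environ
    name = environ.get("MAILBAGIT_LOG_LEVEL", "WARN").upper()
    # Assumption of this sketch: unknown names fall back to WARN.
    level = getattr(logging, name, logging.WARNING)
    return level if isinstance(level, int) else logging.WARNING


def shown_levels(threshold):
    """List the level names that pass a given numeric threshold."""
    names = ["DEBUG", "INFO", "WARN", "ERROR", "CRITICAL"]
    return [n for n in names if getattr(logging, n) >= threshold]


if __name__ == "__main__":
    print(shown_levels(resolve_log_level()))
```

Passing a plain dictionary instead of `os.environ` makes it easy to check how a given setting would behave without changing the real environment.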
36 | 37 | 38 | 39 | ### Example of the logger initiation and usage in `Python`: 40 | 41 | ``` 42 | from structlog import get_logger 43 | from mailbagit import loggerx 44 | loggerx.configure() 45 | log = get_logger() 46 | log.error("Error message here") 47 | log.info("Information message here") 48 | ``` -------------------------------------------------------------------------------- /docs/_data/must_have.csv: -------------------------------------------------------------------------------- 1 | Number,Category,Must have,Requirement Type,Possible System Dependancies 2 | 1,Core,Creates bags that meet the Bagit specification,Technical, 3 | 2,Core,Establish fixity for email by generating commonly used checksums,Functional, 4 | 3,Core,Check fixity by validating checksums,Functional, 5 | 4,Core,Create metadata in bag-info.txt to ensure bags are fully actionable,Technical, 6 | 5,Core,Update metadata in bag-info.txt,Technical, 7 | 6,Core,Package attachments and establish fixity by generating checksums,Functional, 8 | 7,Core,Preserve email folder structure and ensure it is actionable,Functional, 9 | 8,Core-PDF,Optionally generate PDFs for each email body that displays basic header information,Functional,"wkhtmltopdf, htmldoc, if not xhtml2pdf" 10 | 9,Core-PDF,Easy process for exporting PDFs for each email in bulk,Functional, 11 | 10,Core-PDF,Easy process for exporting PDFs for individual emails,Functional, 12 | 11,Input,Input from MBOX/EML,Functional, 13 | 12,Input,Input from PST,Functional,libpff has Windows dependencies 14 | 13,Input,Input from IMAP,Functional, 15 | 14,Installation,All core functionality works on Windows/macOS/Common Linux distros,Technical,Python on Windows 16 | 15,Installation,Documentation of any manually-installed dependencies,Functional, 17 | 45,Core,"Preserve the original input data, unless directed to exclude folders or messages",Technical, 18 | 46,Core,Preserve links as attachments,Technical, 19 | 48,Core,Creates mailbags that conform to the Mailbag 
specification,Technical, 20 | -------------------------------------------------------------------------------- /docs/Gemfile: -------------------------------------------------------------------------------- 1 | source "https://rubygems.org" 2 | 3 | # Hello! This is where you manage which Jekyll version is used to run. 4 | # When you want to use a different version, change it below, save the 5 | # file and run `bundle install`. Run Jekyll with `bundle exec`, like so: 6 | # 7 | # bundle exec jekyll serve 8 | # 9 | # This will help ensure the proper Jekyll version is running. 10 | # Happy Jekylling! 11 | #gem "jekyll", "~> 3.8.5" 12 | gem "jekyll", "~> 4.2.2" 13 | gem "rake" 14 | 15 | # This is the default theme for new Jekyll sites. You may change this to anything you like. 16 | #gem "minima", "~> 2.0" 17 | #gem "just-the-docs" 18 | 19 | # If you want to use GitHub Pages, remove the "gem "jekyll"" above and 20 | # uncomment the line below. To upgrade, run `bundle update github-pages`. 21 | # gem "github-pages", group: :jekyll_plugins 22 | 23 | # If you have any plugins, put them here! 24 | group :jekyll_plugins do 25 | gem "jekyll-feed", "~> 0.15.1" 26 | gem "jekyll-remote-theme" 27 | gem "jekyll-seo-tag" 28 | gem "jekyll-sitemap" 29 | gem "jekyll-watch" 30 | gem "jekyll-sass-converter" 31 | gem "jekyll-include-cache" 32 | #gem "github-pages", "226" 33 | end 34 | 35 | # Windows does not include zoneinfo files, so bundle the tzinfo-data gem 36 | gem "tzinfo-data", platforms: [:mingw, :mswin, :x64_mingw, :jruby] 37 | 38 | # Performance-booster for watching directories on Windows 39 | gem "wdm", "~> 0.1.0" if Gem.win_platform? 
40 | 41 | # Jekyll dependency updates 42 | gem "addressable", ">= 2.8.0" 43 | gem "kramdown", ">= 2.3.1" 44 | -------------------------------------------------------------------------------- /docs/using_mailbagit/plugins.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Plugins 4 | permalink: /plugins/ 5 | parent: Using mailbagit 6 | nav_order: 6 7 | --- 8 | 9 | # Plugins 10 | 11 | You can make plugins for both the mailbagit format parsers and derivatives classes. This allows you to easily override the behavior for how mailbagit reads email exports or creates derivative formats. New input and derivative formats may also be provided to mailbagit to extend its functionality. 12 | 13 | The files that can be overridden can be found in the [Github repo](https://github.com/UAlbanyArchives/mailbagit) in the [mailbagit/formats](https://github.com/UAlbanyArchives/mailbagit/tree/main/mailbagit/formats) and [mailbagit/derivatives](https://github.com/UAlbanyArchives/mailbagit/tree/main/mailbagit/derivatives) subdirectories. 14 | 15 | By default, mailbagit will look for formats and derivatives in the following places: 16 | 17 | 1. a `formats` or `derivatives` subdirectory within a directory specified in the `MAILBAGIT_PLUGIN_DIR` environment variable. 18 | 19 | Unix Example: 20 | ``` 21 | mkdir ~/myplugindir 22 | mkdir ~/myplugindir/formats 23 | touch ~/myplugindir/formats/pst.py 24 | export MAILBAGIT_PLUGIN_DIR=$HOME/myplugindir 25 | ``` 26 | 2. `.mailbagit/formats` and `.mailbagit/derivatives` subdirectories in the user's home directory. 27 | 28 | Unix Example: 29 | ``` 30 | mkdir -p ~/.mailbagit/derivatives 31 | touch ~/.mailbagit/derivatives/pdf.py 32 | ``` 33 | 34 | Example Windows path: 35 | ``` 36 | C:\Users\[my_username]\.mailbagit\formats\imap.py 37 | ``` 38 | 3. The formats and derivatives built into mailbagit. 
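The lookup order in the list above can be sketched as a small helper that builds the candidate plugin directories, highest priority first. This is a hedged illustration, not how mailbagit itself is implemented: `plugin_search_path` is a hypothetical name, the environment variable is spelled as in the Unix example (`MAILBAGIT_PLUGIN_DIR`), and the built-in formats and derivatives (step 3) are left out because they ship inside the installed package.

```python
import os
from pathlib import Path


def plugin_search_path(kind, environ=None, home=None):
    """Return candidate plugin directories for `kind`, highest priority first.

    `kind` is "formats" or "derivatives".
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    candidates = []
    # 1. Directory named by the environment variable, if it is set.
    plugin_dir = environ.get("MAILBAGIT_PLUGIN_DIR")
    if plugin_dir:
        candidates.append(Path(plugin_dir) / kind)
    # 2. The .mailbagit directory in the user's home directory.
    candidates.append(home / ".mailbagit" / kind)
    return candidates


if __name__ == "__main__":
    for directory in plugin_search_path("formats"):
        print(directory, "(exists)" if directory.is_dir() else "(missing)")
```

Running the script prints each candidate directory and whether it exists, which can help confirm where a plugin file such as `pst.py` should be placed.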
39 | -------------------------------------------------------------------------------- /docs/testingdayslides/dist/theme/fonts/source-sans-pro/source-sans-pro.css: -------------------------------------------------------------------------------- 1 | @font-face { 2 | font-family: 'Source Sans Pro'; 3 | src: url('./source-sans-pro-regular.eot'); 4 | src: url('./source-sans-pro-regular.eot?#iefix') format('embedded-opentype'), 5 | url('./source-sans-pro-regular.woff') format('woff'), 6 | url('./source-sans-pro-regular.ttf') format('truetype'); 7 | font-weight: normal; 8 | font-style: normal; 9 | } 10 | 11 | @font-face { 12 | font-family: 'Source Sans Pro'; 13 | src: url('./source-sans-pro-italic.eot'); 14 | src: url('./source-sans-pro-italic.eot?#iefix') format('embedded-opentype'), 15 | url('./source-sans-pro-italic.woff') format('woff'), 16 | url('./source-sans-pro-italic.ttf') format('truetype'); 17 | font-weight: normal; 18 | font-style: italic; 19 | } 20 | 21 | @font-face { 22 | font-family: 'Source Sans Pro'; 23 | src: url('./source-sans-pro-semibold.eot'); 24 | src: url('./source-sans-pro-semibold.eot?#iefix') format('embedded-opentype'), 25 | url('./source-sans-pro-semibold.woff') format('woff'), 26 | url('./source-sans-pro-semibold.ttf') format('truetype'); 27 | font-weight: 600; 28 | font-style: normal; 29 | } 30 | 31 | @font-face { 32 | font-family: 'Source Sans Pro'; 33 | src: url('./source-sans-pro-semibolditalic.eot'); 34 | src: url('./source-sans-pro-semibolditalic.eot?#iefix') format('embedded-opentype'), 35 | url('./source-sans-pro-semibolditalic.woff') format('woff'), 36 | url('./source-sans-pro-semibolditalic.ttf') format('truetype'); 37 | font-weight: 600; 38 | font-style: italic; 39 | } 40 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | # /usr/bin/env python3 2 | import setuptools 3 | import sys 4 | 5 | with 
open("README.md", "r", encoding="utf-8") as fh: 6 | long_description = fh.read() 7 | 8 | setuptools.setup( 9 | name="mailbagit", 10 | version="0.7.5", 11 | author="Gregory Wiedeman", 12 | author_email="gwiedeman@albany.edu", 13 | description="A tool for preserving email in multiple preservation formats.", 14 | long_description=long_description, 15 | long_description_content_type="text/markdown", 16 | url="https://github.com/UAlbanyArchives/mailbag", 17 | packages=setuptools.find_namespace_packages(exclude=("tests",)), 18 | classifiers=[ 19 | "Programming Language :: Python :: 3", 20 | "License :: OSI Approved :: MIT License", 21 | "Operating System :: OS Independent", 22 | ], 23 | entry_points={"console_scripts": ["mailbagit=mailbagit:cli", "mailbagit-gui=mailbagit:gui", "mailbagit-guided=mailbagit:guided"]}, 24 | install_requires=[ 25 | "bagit>=1.8.1,<2", 26 | "beautifulsoup4>=4.11.1,<5", 27 | "black>=23.7,<26", 28 | "jsonmodels>=2.2,<=2.5.0", 29 | "extract_msg>=0.42.0", 30 | "structlog>=21.1.0,<22", 31 | "packaging>=21.0,<25", 32 | "python-json-logger>=2.0.2,<3", 33 | "pyparsing>=2.1.0,<3", 34 | "pytest>=7.0.1,<8", 35 | "python-dotenv>0.19.0,<=0.19.1", 36 | "chardet>=4.0.0,<5", 37 | "warcio>=1.7.4,<2", 38 | "requests >=2.27.1,<3", 39 | "cssutils >=2.4.2,<3", 40 | "python-json-logger >=2.0.2,<3", 41 | ], 42 | extras_require={ 43 | "gui": ["gooey>=1.0.8.1,<2"], 44 | "pst": ["libpff-python==20231205"], 45 | "dev": ["pyinstaller==5.0.1,<6"], 46 | }, 47 | python_requires=">=3.9", 48 | ) 49 | -------------------------------------------------------------------------------- /docs/_data/should_have.csv: -------------------------------------------------------------------------------- 1 | Number,Category,Should have,Requirement Type,Possible System Dependancies 2 | 16,Core,A process for compressing Mailbags and managing compressed Mailbags,Functional, 3 | 17,Core,"A process for external tools to use email source data, like RATOM or ePADD to use MBOX data within a 
Mailbag",Functional, 4 | 18,Core,Provide Beyond the Repository (BTR) compliant BagIt Profiles,Technical, 5 | 19,Core,Provide feedback to users about what the tool is doing and its pace of progress,Functional, 6 | 20,Core-WARC,Optionally generate WARC files that include embedded CSS and Images,Functional,wget if not warcio 7 | 21,Core-WARC,Optionally capture external links within WARC files,Functional,wget if not warcio 8 | 22,Core-WARC,Easy process for exporting or connecting with a tool to replay WARC files,Functional, 9 | 23,Reporting,Report and export attachments,Functional, 10 | 24,Reporting,Report the email folders included in a Mailbag and the number of included email in each folder and the account overall,Technical, 11 | 25,Dependency management,"Modular/extendable dependencies that work out-of-the-box, but can be overridden",Functional, 12 | 26,Dependency management,Dependencies configurable via YAML or similar,Technical, 13 | 27,Dependency management,Multiple conversion options/plugins for PDFs,Technical,"wkhtmltopdf, htmldoc, if not xhtml2pdf" 14 | 28,Dependency management,Multiple conversion options/plugins for WARCs,Technical, 15 | 29,Dependency management,Process for upgrading and managing dependencies,Functional, 16 | 30,Installation,Provide Docker image,Technical, 17 | 31,Installation,Provide an executable for multiple platforms,Technical, 18 | 32,Installation,Clear documentation for managing system privileges,Functional, 19 | 47,Core-WARC,Optionally capture attached links within WARC files,Functional, 20 | 49,Input,Input from MSG files,Functional, 21 | -------------------------------------------------------------------------------- /docs/design_documents.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Design Documents 4 | permalink: /design_docs/ 5 | nav_order: 6 6 | has_children: true 7 | --- 8 | 9 | 10 | ## Design Documents 11 | 12 | The Project Team and Advisory Board 
developed these design documents to help define the Mailbag tool and communicate its features and limits. We welcome your feedback, particularly during the community comment period that ends after April 25th, 2021. 13 | 14 | * [Comment on Personas](https://docs.google.com/forms/d/e/1FAIpQLSeZ3WLjaPdJWjPMYSc4BjTVIIv_fnHm7D1vKk1pqsx9tnNQsQ/viewform?usp=sf_link) 15 | * [Suggest a User Story](https://docs.google.com/forms/d/e/1FAIpQLSf68TbV5mbZ48pm_lrGB_SK4oxZO0FtGeUqOfoK6gEQ5iq2WA/viewform?usp=sf_link) 16 | * [Suggest or Prioritize a Requirement](https://docs.google.com/forms/d/e/1FAIpQLScg34b0NJhuDWaUUvyWJxyK5bBGf9Hh9N0n76XElsoBJd7S1Q/viewform?usp=sf_link) 17 | 18 | ## Project Outcomes 19 | 20 | ### [Mailbag Specification](https://archives.albany.edu/mailbag/spec/) 21 | 22 | * A functional specification to package email exports using multiple formats in a [Bagit "bag"](https://tools.ietf.org/html/rfc8493). 23 | * Ensures a structure where email messages will remain actionable among multiple formats. 24 | 25 | ### Command line Mailbag tool and Python library 26 | 27 | * Will accept common email export formats and create valid "mailbags" according to the Mailbag Specification. 28 | * Will optionally convert email exports to derivative formats and include them in a mailbag. 29 | * Will include basic reporting functionality to list included messages. 30 | * Will allow easy pathways to other email appraisal and processing tools. 31 | 32 | ### Basic graphical user interface (GUI) wrapper for command line Mailbag tool. 33 | 34 | * Will include at least the basic functionality of the command line Mailbag tool. 35 | * Will be accessible to users using assistive technology. 
-------------------------------------------------------------------------------- /docs/design_documents/requirements.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Requirements 4 | permalink: /requirements/ 5 | parent: Design Documents 6 | nav_order: 3 7 | --- 8 | 9 | # Requirements 10 | 11 | If you have feedback, please [Suggest or prioritize a Requirement](https://docs.google.com/forms/d/e/1FAIpQLScg34b0NJhuDWaUUvyWJxyK5bBGf9Hh9N0n76XElsoBJd7S1Q/viewform?usp=sf_link) by April 25th! 12 | 13 | ## Must Have 14 | 15 | 16 | {% for row in site.data.must_have %} 17 | {% if forloop.first %} 18 | 19 | {% for pair in row %} 20 | 21 | {% endfor %} 22 | 23 | {% endif %} 24 | {% tablerow pair in row %} 25 | {{ pair[1] }} 26 | {% endtablerow %} 27 | {% endfor %} 28 |
{{ pair[0] }}
29 | 30 | ## Should Have 31 | 32 | 33 | {% for row in site.data.should_have %} 34 | {% if forloop.first %} 35 | 36 | {% for pair in row %} 37 | 38 | {% endfor %} 39 | 40 | {% endif %} 41 | {% tablerow pair in row %} 42 | {{ pair[1] }} 43 | {% endtablerow %} 44 | {% endfor %} 45 |
{{ pair[0] }}
46 | 47 | ## Could Have 48 | 49 | 50 | {% for row in site.data.could_have %} 51 | {% if forloop.first %} 52 | 53 | {% for pair in row %} 54 | 55 | {% endfor %} 56 | 57 | {% endif %} 58 | {% tablerow pair in row %} 59 | {{ pair[1] }} 60 | {% endtablerow %} 61 | {% endfor %} 62 |
{{ pair[0] }}
63 | 64 | ## Won't Have 65 | 66 | 67 | {% for row in site.data.wont_have %} 68 | {% if forloop.first %} 69 | 70 | {% for pair in row %} 71 | 72 | {% endfor %} 73 | 74 | {% endif %} 75 | {% tablerow pair in row %} 76 | {{ pair[1] }} 77 | {% endtablerow %} 78 | {% endfor %} 79 |
{{ pair[0] }}
-------------------------------------------------------------------------------- /docs/installing_mailbagit/exe.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Windows Executables 4 | permalink: /exe/ 5 | parent: Installing mailbagit 6 | nav_order: 3 7 | --- 8 | 9 | # Windows Executables 10 | 11 | Executables are available for Windows that contain all dependencies except the [PDF dependencies]({{ site.baseurl }}/pdf). Download the latest `mailbagit.exe` and `mailbagit-gui.exe` files from the [Github releases](https://github.com/UAlbanyArchives/mailbag/releases). 12 | 13 | ![Screenshot showing where to download the executable files under "assets" in the latest Github release.]({{ site.baseurl }}/img/release_assets.png) 14 | 15 | You can run `mailbagit.exe` in the command line [just as you would use the `mailbagit` command]({{ site.baseurl }}/mailbagit): 16 | 17 | ``` 18 | mailbagit.exe path/to/export.pst -i pst -d eml warc -m my_mailbag -r 19 | ``` 20 | 21 | `mailbagit-gui.exe` will run just by double-clicking it. 22 | 23 | These executables are unsigned, so Windows will likely give you a "Windows protected your PC" warning from Microsoft Defender SmartScreen. You will need sufficient permissions to allow unsigned executables on your machine. 24 | 25 | ![Screenshot Microsoft Defender SmartScreen preventing mailbagit-gui.exe from running due to an unsigned executable.]({{ site.baseurl }}/img/smart_screen.png) 26 | 27 | ### Unblocking `mailbagit.exe` and `mailbagit-gui.exe` 28 | 29 | 1. Right-click on `mailbagit.exe` or `mailbagit-gui.exe` and select "Properties". 30 | 2. Near the bottom right corner of the panel, check the box labeled "Unblock". 31 | 3. Select "Apply" and then "OK". 32 | 33 | ![Screenshot showing how to unblock mailbagit-gui.exe.]({{ site.baseurl }}/img/allow_executable.png) 34 | 35 | You should now be able to click on and run the executable. 
If the Unblock option is not visible, you probably don't have sufficient permissions to run the executables on your computer. 36 | -------------------------------------------------------------------------------- /docs/using_mailbagit/lossiness.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Lossiness in Derivatives 4 | permalink: /lossiness/ 5 | parent: Using mailbagit 6 | nav_order: 7 7 | --- 8 | 9 | # Lossiness in Derivatives 10 | 11 | `mailbagit` is committed to maintaining as many of the meaningful characteristics of each email message across derivatives as feasible. However, practical limitations can cause some derivative formats to be lossy: 12 | 13 | * Some derivative formats, such as PDFs, do not have the structure to appropriately store all parts of a message. 14 | * In other cases, such as PSTs and MSGs, dependency libraries do not provide complete access to all the technical characteristics of a message. 15 | * `mailbagit` maintains what we feel are the meaningful characteristics of each message as defined by the `mailbagit` [email model](https://github.com/UAlbanyArchives/mailbagit/blob/develop/mailbagit/models.py). 16 | * Derivative EML or MBOX files created from PST or MSG source messages will contain all message headers, correctly encoded HTML and plain text bodies if present, and attachments. 17 | * The part structure of a message is not maintained. 18 | * Generating EMLs from an MBOX or the reverse writes the full Python email message object if possible, so this should be lossless and contain the full part structure of a message. 19 | * For some MBOXs or EMLs with missing or inconsistently documented encoding, `mailbagit` tries to write the full message object. If it cannot, it writes a warning and falls back to writing derivatives from the [model](https://github.com/UAlbanyArchives/mailbagit/blob/develop/mailbagit/models.py). 
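The "write the full message object, otherwise fall back to the model" behavior in the last bullet can be sketched with the standard library's `email` package. This is a simplified illustration under stated assumptions rather than `mailbagit`'s actual code: `to_eml_bytes` is a hypothetical helper, and the `fields` dictionary with keys such as `Subject` and `Text_Body` merely stands in for the email model.

```python
import logging
from email.message import EmailMessage

log = logging.getLogger(__name__)


def to_eml_bytes(original, fields):
    """Serialize `original` if possible; otherwise rebuild a simpler message."""
    try:
        # Lossless path: keeps the full part structure of the message.
        return original.as_bytes()
    except Exception as error:  # encoding problems can surface in several forms
        log.warning("Falling back to model fields: %s", error)
    # Lossy path: only the fields captured in the (hypothetical) model survive.
    rebuilt = EmailMessage()
    for header in ("From", "To", "Subject", "Date"):
        if fields.get(header):
            rebuilt[header] = fields[header]
    rebuilt.set_content(fields.get("Text_Body", ""))
    return rebuilt.as_bytes()
```

The fallback path drops the original part structure, which is the kind of lossiness described above.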
20 | 21 | Mailbag's approach of keeping messages in multiple formats is designed to mitigate these issues by maintaining the original source format alongside all derivatives. 22 | 23 | However, if this does not fit your use case, please let us know by [submitting an issue](https://github.com/UAlbanyArchives/mailbagit/issues)! 24 | -------------------------------------------------------------------------------- /docs/_config.yml: -------------------------------------------------------------------------------- 1 | # Welcome to Jekyll! 2 | # 3 | # This config file is meant for settings that affect your whole blog, values 4 | # which you are expected to set up once and rarely edit after that. If you find 5 | # yourself editing this file very often, consider using Jekyll's data files 6 | # feature for the data you need to update frequently. 7 | # 8 | # For technical reasons, this file is *NOT* reloaded automatically when you use 9 | # 'bundle exec jekyll serve'. If you change this file, please restart the server process. 10 | 11 | # Site settings 12 | # These are used to personalize your new site. If you look in the HTML files, 13 | # you will see them accessed via {{ site.title }}, {{ site.email }}, and so on. 14 | # You can create any custom variable you would like, and they will be accessible 15 | # in the templates via {{ site.myvariable }}. 16 | title: Mailbag 17 | email: your-email@example.com 18 | description: >- # this means to ignore newlines until "baseurl:" 19 | A Stable Package for Email in Multiple Formats 20 | baseurl: "/mailbag" # the subpath of your site, e.g. /blog 21 | url: "" # the base hostname & protocol for your site, e.g.
http://example.com 22 | twitter_username: jekyllrb 23 | github_username: jekyll 24 | 25 | # Build settings 26 | markdown: kramdown 27 | #theme: just-the-docs 28 | #remote_theme: pmarsceill/just-the-docs 29 | remote_theme: just-the-docs/just-the-docs 30 | plugins: 31 | - jekyll-feed 32 | 33 | # Aux links for the upper right navigation 34 | aux_links: 35 | "Mailbag on Github": 36 | - "https://github.com/UAlbanyArchives/mailbagit" 37 | 38 | #include: docker-compose.yml 39 | 40 | # Exclude from processing. 41 | # The following items will not be processed, by default. Create a custom list 42 | # to override the default setting. 43 | # exclude: 44 | # - Gemfile 45 | # - Gemfile.lock 46 | # - node_modules 47 | # - vendor/bundle/ 48 | # - vendor/cache/ 49 | # - vendor/gems/ 50 | # - vendor/ruby/ 51 | -------------------------------------------------------------------------------- /docs/contributors.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Call for Contributors (closed) 4 | permalink: /contributors/ 5 | nav_order: 9 6 | --- 7 | 8 | # Call for Contributors (closed) 9 | 10 | The Mailbag project team is seeking community members interested in contributing to the `mailbagit` tool. This is open to those with Python experience looking to address [known issues](https://github.com/UAlbanyArchives/mailbag/issues) or review existing code, as well as non-coders interested in improving documentation. 11 | 12 | We have a limited amount of funding to provide honorariums in support of this work. We expect to support about 60-70 hours in total. 
13 | 14 | 15 | ## Requirements 16 | 17 | Interested contributors are expected to possess skills and experience in: 18 | 19 | - digital preservation 20 | - born-digital records 21 | - an understanding of and experience working with the Bagit specification 22 | - experience manipulating data in an archives or library setting (spreadsheets definitely count) 23 | 24 | Preferred contributors would have experience in: 25 | 26 | - Creating and managing email exports 27 | - Processing email archives 28 | 29 | We strongly encourage self-nominations from individuals who identify with underrepresented and/or marginalized populations, including those not well-represented within libraries, archives, or technology (e.g. women, people of color, LGBTQ+, ability/disability, non-binary gender identities, etc.). We also encourage applications from members of underrepresented and/or marginalized groups that don't fit into the categories listed above. 30 | 31 | ## Application Process 32 | 33 | Applications will consist of: 34 | 35 | - An existing CV or resume 36 | - A paragraph stating the area(s) or issue(s) you're interested in contributing to and your overall interest in the project. We expect this to be about 200 words. 37 | 38 | Interested applicants should complete the [application form](https://docs.google.com/forms/d/e/1FAIpQLSfUHr5L27DUWxo7zXKUuoNjJgVcItmU76TfMObbCIpd_afvbA/viewform?usp=sf_link) and email a CV or resume to gwiedeman [at] albany [dot] edu. -------------------------------------------------------------------------------- /docs/installing_mailbagit/pdf.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: PDF Derivatives 4 | permalink: /pdf/ 5 | parent: Installing mailbagit 6 | nav_order: 5 7 | --- 8 | 9 | # PDF derivatives 10 | 11 | Unless you're using a Docker image, mailbagit is unable to make PDF derivatives out-of-the-box.
For this option to be available, you need to have either [wkhtmltopdf](https://wkhtmltopdf.org/) or [Google Chrome](https://www.google.com/chrome/) installed and added to your `PATH`. 12 | 13 | [Learn more about `PATH` on Windows](https://www.maketecheasier.com/what-is-the-windows-path/). 14 | 15 | ⚠️ Creating PDF or WARC derivatives may cause [privacy or security concerns]({{ site.baseurl }}/concerns) in some contexts. 16 | 17 | ## wkhtmltopdf 18 | 19 | Installing [wkhtmltopdf](https://wkhtmltopdf.org/) and adding `wkhtmltopdf` or `wkhtmltopdf.exe` to your `PATH` will make the `pdf` derivative option available. You can test this by entering `wkhtmltopdf -V` or `wkhtmltopdf.exe -V` into a command line terminal, which should print the wkhtmltopdf version. 20 | 21 | ## Chrome Headless 22 | 23 | Installing [Google Chrome](https://www.google.com/chrome/) and adding `chrome`, `chrome.exe`, `google-chrome`, or `Google Chrome` to your `PATH` will make the `pdf-chrome` derivative option available. If you have Google Chrome installed already, you may just need to add it to your `PATH`. 24 | 25 | For Windows, Chrome usually installs in one of these locations by default: 26 | ``` 27 | C:\Program Files\Google\Chrome\Application 28 | C:\Program Files (x86)\Google\Chrome\Application 29 | ``` 30 | 31 | For Macs, the usual install path is `/Applications/Google\ Chrome.app/Contents/MacOS/`. 32 | 33 | You can test if Chrome is correctly added to your `PATH` by entering the corresponding command into a command line terminal: 34 | 35 | ``` 36 | chrome https://archives.albany.edu/mailbag 37 | chrome.exe https://archives.albany.edu/mailbag 38 | google-chrome https://archives.albany.edu/mailbag 39 | Google\ Chrome https://archives.albany.edu/mailbag 40 | ``` 41 | 42 | If any of these commands open a Chrome browser window, you're all set!
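
If you would rather check both PDF options at once without opening a browser, a small Python script can look up the executable names on your `PATH`. This is a minimal sketch that uses the names listed on this page and the standard library's `shutil.which`. It is an independent check, not necessarily how `mailbagit` itself detects these programs, and `find_on_path` and `available_pdf_derivatives` are names made up for this example.

```python
import shutil

# Executable names taken from this page; assumed, not read from mailbagit's code.
WKHTMLTOPDF_NAMES = ["wkhtmltopdf", "wkhtmltopdf.exe"]
CHROME_NAMES = ["chrome", "chrome.exe", "google-chrome", "Google Chrome"]


def find_on_path(names):
    """Return the full path of the first name found on PATH, or None."""
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    return None


def available_pdf_derivatives():
    """Map each PDF derivative option to the executable found for it, if any."""
    return {
        "pdf": find_on_path(WKHTMLTOPDF_NAMES),
        "pdf-chrome": find_on_path(CHROME_NAMES),
    }


if __name__ == "__main__":
    for option, path in available_pdf_derivatives().items():
        status = path if path else "not found on PATH"
        print(f"{option}: {status}")
```

An option reported as "not found on PATH" means none of the listed names could be resolved, so the matching directory likely still needs to be added to your `PATH`.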
43 | -------------------------------------------------------------------------------- /docs/news.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: News 4 | permalink: /news/ 5 | nav_order: 7 6 | --- 7 | 8 | ## News 9 | 10 | * [Project Announcement](https://library.albany.edu/news/archives-email-grant-announcement), March 15, 2021. 11 | * [Email Archives Building Capacity and Community First Round Awards](https://www.library.illinois.edu/news/email-archives-building-capacity-and-community/), April 17, 2021. 12 | 13 | ## Conference Presentations 14 | 15 | * Wiedeman, Gregory and Mark Wolfe, "The Mailbag Project and Building Digital Preservation Tools Around Filesystems," [ Coalition for Networked Information (CNI) Fall Meeting](https://www.youtube.com/watch?v=9ubzlVMvudk) (remote) December 6, 2021. 16 | 17 | * Wiedeman, Gregory, "Mailbag: A Stable Package for Email in Multiple Formats," 17th International Conference on Digital Preservation (iPres), Beijing, China (remote) October 22, 2021. 18 | 19 | * Wiedeman, Gregory, "Functional Specifications for Mailbag and DigiPres," [BitCurator Users Forum](https://bitcuratorconsortium.org/session-6-tools-and-demos-showcase/) (remote) October 14, 2021. 20 | 21 | * Wiedeman, Gregory, "Preserving Email in Multiple Formats: An Overview of the Mailbag Project," [CoSA SERI webinar](https://www.statearchivists.org/events/event-description?CalendarEventKey=c86fc2b4-befd-40d5-b7cc-7a97d5cb9422&Home=%2Fevents%2Fcalendar) (remote) September 21, 2021. 22 | 23 | * Wiedeman, Gregory, "Mailbag (Building Capacity and Community for Email Archives)," [Society of American Archivists (SAA) Annual Meeting 2021](https://www2.archivists.org/am2021), prerecorded, Live Q&A August 6, 2021. 24 | 25 | * Wiedeman, Gregory, "Mailbag: A Stable Package for Email with Multiple Masters," [Best Practices Exchange 2021](https://bpexchange.wordpress.com/2021-schedule/), May 18, 2021. 
26 | 27 | * Wiedeman, Gregory, "Mailbag: A Stable Package for Email with Multiple Masters," [Coalition for Networked Information (CNI)](https://cnispring21mtg.sched.com/event/iBXp), March 15, 2021. 28 | 29 | -------------------------------------------------------------------------------- /branching_workflow.md: -------------------------------------------------------------------------------- 1 | # Mailbagit Branching Workflow 2 | 3 | ## Find an Issue 4 | 5 | * Review the [project](https://github.com/UAlbanyArchives/mailbagit/projects/1) 6 | * Select an [issue](https://github.com/UAlbanyArchives/mailbagit/issues) 7 | * Identify the Issue number 8 | 9 | Screenshot showing where the issue number is on Github 10 | 11 | ## Viewing Branches 12 | 13 | * View local branches 14 | 15 | ```bash 16 | git branch 17 | ``` 18 | 19 | * View local and remote branches 20 | 21 | ```bash 22 | git branch -a 23 | ``` 24 | 25 | ## Creating a new feature to address the issue 26 | 27 | 1. Move the issue to "in progress" in the [project](https://github.com/UAlbanyArchives/mailbagit/projects/1) 28 | 2. Create a new branch named `feature-[issue number]` 29 | 30 | ```bash 31 | git checkout -b feature-[issue number] 32 | ``` 33 | 34 | 3. Edit the branch locally 35 | 4. Run local tests 36 | 37 | ```bash 38 | black . 39 | pytest 40 | ``` 41 | 42 | 5. Push branch to remote 43 | 44 | ```bash 45 | git push origin feature-[issue number] 46 | ``` 47 | 48 | 6. [Create a Pull Request](https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) from your branch to develop 49 | 7. Move the Issue in the [project](https://github.com/UAlbanyArchives/mailbagit/projects/1) to "Ready for Review" 50 | 51 | ## Reviewing 52 | 53 | 1. View the "Ready for Review" column in the [project](https://github.com/UAlbanyArchives/mailbagit/projects/1) and identify the issue number 54 | 2. 
View the original [issue](https://github.com/UAlbanyArchives/mailbagit/issues) and the [pull request](https://github.com/UAlbanyArchives/mailbagit/pulls) 55 | 3. Check out the branch 56 | 57 | ```bash 58 | git fetch 59 | git checkout feature-[issue number] 60 | ``` 61 | 62 | 4. Review the code and ensure it effectively addresses the issue 63 | 5. Run local tests 64 | 65 | ```bash 66 | black . 67 | pytest 68 | ``` 69 | 70 | 6. Merge the branch to develop 71 | 72 | ```bash 73 | git checkout develop 74 | git merge feature-[issue number] 75 | ``` 76 | 77 | 7. Move the Issue in the [project](https://github.com/UAlbanyArchives/mailbagit/projects/1) to "Reviewer Approved" -------------------------------------------------------------------------------- /docs/using_mailbagit/concerns.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Privacy and security concerns 4 | permalink: /concerns/ 5 | parent: Using mailbagit 6 | nav_order: 8 7 | --- 8 | 9 | # Privacy and security concerns 10 | 11 | ## File Inclusions in PDFs and WARCs 12 | 13 | Automatically converting the HTML within emails to PDFs and WARCs can cause security vulnerabilities in some contexts. Since it is [no longer actively maintained](https://wkhtmltopdf.org/status.html), wkhtmltopdf does not recommend running it on user-supplied and unsanitized HTML. 14 | 15 | * [File Inclusion Vulnerability](https://www.virtuesecurity.com/kb/wkhtmltopdf-file-inclusion-vulnerability-2/) 16 | 17 | These issues primarily apply when running wkhtmltopdf in a web application where a potentially malicious user has access to the PDFs that are generated. Using Google Chrome may mitigate a few of these vulnerabilities, but overall shares most of these risks. This also applies to WARC derivatives. 18 | 19 | We don't suggest using `mailbagit` in a web application, as a malicious email may leak information about the computer used to process it into a PDF.
Since `mailbagit` is designed for email preservation, in many contexts it may be unlikely for you to encounter malicious emails designed specifically to cause issues with `mailbagit`, but it is still helpful to be aware of this issue. Running `mailbagit` in a [Docker container]({{ site.baseurl }}/docker) mitigates many of these concerns. 20 | 21 | ## Email Trackers 22 | 23 | Additionally, both PDF and WARC derivatives require rendering the HTML bodies for every email message packaged by `mailbagit`. This will send HTTP GET requests from your IP address for any ``, `