├── .graphics
│   ├── taupe-icon.png
│   ├── noun-bird-233023.svg
│   ├── README.md
│   └── taupe-icon.svg
├── bin
│   ├── README.md
│   └── taupe
├── SUPPORT.md
├── requirements.txt
├── CITATION.cff
├── requirements-dev.txt
├── codemeta.json
├── CHANGES.md
├── .gitattributes
├── CONTRIBUTING.md
├── LICENSE
├── taupe
│   ├── __init__.py
│   ├── exit_codes.py
│   └── __main__.py
├── setup.cfg
├── .flake8
├── setup.py
├── .gitignore
├── CODE_OF_CONDUCT.md
├── Makefile
└── README.md
/.graphics/taupe-icon.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mhucka/taupe/HEAD/.graphics/taupe-icon.png
--------------------------------------------------------------------------------
/bin/README.md:
--------------------------------------------------------------------------------
1 | # About the shell script in this directory
2 |
3 | The shell script in this directory is mainly for testing and development. During development, I run Taupe from a terminal emulator by starting it simply like this:
4 |
5 | ```sh
6 | ./taupe
7 | ```
8 |
9 | When Taupe is installed on a computer using `pip` or `pipx`, a different wrapper script is installed, not the one that is in this directory. The one here is merely a convenience.
10 |
--------------------------------------------------------------------------------
/.graphics/noun-bird-233023.svg:
--------------------------------------------------------------------------------
1 |
2 |
5 |
--------------------------------------------------------------------------------
/.graphics/README.md:
--------------------------------------------------------------------------------
1 | # Icon for Taupe
2 |
3 | The [vector artwork](https://thenounproject.com/icon/bird-233023/) of a bird, used as the icon for this repository, was created by [Noe Araujo](https://thenounproject.com/noearaujo/) from the Noun Project. It is licensed under the Creative Commons [CC-BY 3.0](https://creativecommons.org/licenses/by/3.0/) license.
4 |
5 | I edited the logo in [Boxy SVG](https://boxy-svg.com), a native SVG editor for macOS, to change the icon color to [taupe](https://en.wikipedia.org/wiki/Taupe).
6 |
--------------------------------------------------------------------------------
/SUPPORT.md:
--------------------------------------------------------------------------------
1 | Support
2 | =======
3 |
4 | Thank you for your interest in this project. If you are experiencing problems or have questions, the following are the preferred methods of reaching someone:
5 |
6 | 1. Report a new issue using the [issue tracker](https://github.com/mhucka/taupe/issues).
7 | 2. Send email to the primary maintainer: [mhucka@caltech.edu](mailto:mhucka@caltech.edu).
8 | 3. Send email to an individual involved in the project. People's names appear in the top-level `README.md` file in the source code repository.
9 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | # =============================================================================
2 | # @file requirements.txt
3 | # @brief Python dependencies for Taupe
4 | # @created 2022-11-18
5 | # @license Please see the file named LICENSE in the project directory
6 | # @website https://github.com/mhucka/taupe
7 | # =============================================================================
8 |
9 | aenum >= 3.1.0
10 | commonpy == 1.9.5
11 | plac == 1.3.5
12 | rich >= 12.6.0
13 | setuptools == 58.3.0
14 | sidetrack >= 2.0.1
15 |
--------------------------------------------------------------------------------
/CITATION.cff:
--------------------------------------------------------------------------------
1 | cff-version: 1.2.0
2 | message: "If you use this software, please cite it as below."
3 | title: Taupe
4 | authors:
5 | - family-names: Hucka
6 | given-names: Michael
7 | orcid: https://orcid.org/0000-0001-9105-5960
8 | abstract: Twitter archive URL parser
9 | repository-code: "https://github.com/mhucka/taupe"
10 | type: software
11 | version: 1.2.0
12 | license-url: "https://github.com/mhucka/taupe/blob/main/LICENSE"
13 | keywords:
14 | - Twitter
15 | - archiving
16 | - data processing
17 | - CSV
18 | - comma separated values
19 | - JSON
20 | - software
21 | date-released: 2022-11-23
22 |
23 |
--------------------------------------------------------------------------------
/.graphics/taupe-icon.svg:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/requirements-dev.txt:
--------------------------------------------------------------------------------
1 | # =============================================================================
2 | # @file requirements-dev.txt
3 | # @brief Python dependencies for Taupe for development
4 | # @created 2022-11-18
5 | # @license Please see the file named LICENSE in the project directory
6 | # @website https://github.com/mhucka/taupe
7 | # =============================================================================
8 |
9 | -r requirements.txt
10 |
11 | pytest >= 6.2.5
12 | pytest-cov >= 3.0.0
13 | pytest-mock >= 3.7.0
14 |
15 | flake8 >= 4.0.1
16 | flake8-bugbear >= 22.4.25
17 | flake8-builtins >= 1.5.3
18 | flake8-comprehensions >= 3.8.0
19 | flake8-executable >= 2.1.1
20 | flake8_implicit_str_concat >= 0.3.0
21 | flake8-pie >= 0.15.0
22 | flake8-simplify >= 0.19.2
23 |
24 | twine
25 |
--------------------------------------------------------------------------------
/codemeta.json:
--------------------------------------------------------------------------------
1 | {
2 | "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
3 | "@type": "SoftwareSourceCode",
4 | "description": "Twitter archive URL parser",
5 | "name": "taupe",
6 | "codeRepository": "https://github.com/mhucka/taupe",
7 | "issueTracker": "https://github.com/mhucka/taupe/issues",
8 | "license": "https://github.com/mhucka/taupe/blob/main/LICENSE",
9 | "version": "1.2.0",
10 | "author": [
11 | {
12 | "@type": "Person",
13 | "givenName": "Michael",
14 | "familyName": "Hucka",
15 | "affiliation": "California Institute of Technology Library",
16 | "email": "mhucka@caltech.edu",
17 | "@id": "https://orcid.org/0000-0001-9105-5960"
18 | }],
19 | "developmentStatus": "active",
20 | "downloadUrl": "https://github.com/mhucka/taupe/archive/main.zip",
21 | "keywords": [
22 | "software"
23 | ],
24 | "maintainer": "https://orcid.org/0000-0001-9105-5960"
25 | }
26 |
--------------------------------------------------------------------------------
/CHANGES.md:
--------------------------------------------------------------------------------
1 | # Change log for Taupe
2 |
3 | ## Version 1.2.0 (2022-11-23)
4 |
5 | This release only corrects inconsistent statements about the license terms of the software. There are no functional or other changes in this release.
6 |
7 |
8 | ## Version 1.1.0 (2022-11-22)
9 |
10 | This update brings more output format options. The option `--extract` now accepts many more values to control the output. For example, it is now possible to produce a plain list of URLs of your tweets. Please see the help text or the [README](https://github.com/mhucka/taupe#the-structure-of-the-output) for the details.
11 |
12 |
13 | ## Version 1.0.0 (2022-11-18)
14 |
15 | Changes since the last release:
16 | * The [README](https://github.com/mhucka/taupe/blob/main/README.md) has been edited and enhanced.
17 | * The help text printed for `taupe --help` has been edited to (hopefully) improve clarity.
18 |
19 |
20 | ## Version 0.0.1
21 |
22 | First release of complete working version.
23 |
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
1 | # -*- mode: sh; -*-
2 |
3 | # Set the default behavior, in case people don't have core.autocrlf set.
4 | # .............................................................................
5 |
6 | * text=auto
7 |
8 | # Specify what's text and should be normalized.
9 | # .............................................................................
10 |
11 | *.py text
12 | *.in text
13 | *.rst text
14 | *.cfg text
15 | *.ini text
16 | *.yml text
17 | *.json text
18 | *.bat text
19 | *.sh text
20 | LICENSE text
21 | CONTRIBUTING text
22 |
23 | # Denote all files that are truly binary and should not be modified.
24 | # .............................................................................
25 |
26 | *.png binary
27 | *.jpg binary
28 | *.xls binary
29 | *.doc binary
30 |
31 | # This next one is because in other projects, we've had problems with git
32 | # getting confused about line endings when people using Windows and Mac edit
33 | # the same files.
34 | # .............................................................................
35 |
36 | *.csv binary diff=csv
37 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Guidelines for contributing to this project
2 |
3 | Any constructive contributions – bug reports, pull requests (code or documentation), suggestions for improvements, and more – are welcome.
4 |
5 | ## Conduct
6 |
7 | Everyone is asked to read and respect the [code of conduct](CODE_OF_CONDUCT.md) before participating in this project.
8 |
9 | ## Coordinating work
10 |
11 | A quick way to find out what is currently in the near-term plans for this project is to look at the [GitHub issue tracker](https://github.com/mhucka/taupe/issues), but the possibilities are not limited to what you see there – if you have ideas for new features and enhancements, please feel free to write them up as a new issue or contact the developers directly!
12 |
13 | ## Submitting contributions
14 |
15 | Please feel free to contact the primary author (Mike Hucka) directly, or even better, jump right in and use the standard GitHub approach of forking the repo and creating a pull request. When committing code changes and submitting pull requests, please write a clear log message for your commits.
16 |
--------------------------------------------------------------------------------
/bin/taupe:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # =============================================================================
3 | # @file taupe
4 | # @brief Simple interface to run taupe, for testing and exploration
5 | # @author Michael Hucka
6 | # @license Please see the file named LICENSE in the project directory
7 | # @website https://github.com/mhucka/taupe
8 | # =============================================================================
9 |
10 | # Standard library and third-party imports.
11 | import os
12 | import sys
13 | import plac
14 |
15 | # Allow this program to be executed directly from the 'bin' directory.
16 | try:
17 |     thisdir = os.path.dirname(os.path.abspath(__file__))
18 |     sys.path.append(os.path.join(thisdir, '..'))
19 | except Exception:
20 |     sys.path.append('..')
21 |
22 | # Hand over to the command line interface.
23 | import taupe
24 | from taupe.__main__ import main as main
25 |
26 | if __name__ == '__main__':
27 |     if len(sys.argv) > 1 and sys.argv[1] == 'help':
28 |         plac.call(main, ['-h'])
29 |     else:
30 |         plac.call(main)
31 |
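The argument handling at the end of the script above can be sketched in isolation. This is a minimal illustration only: `normalize_args` is a hypothetical helper that does not exist in the repository, and it leaves out the call to `plac` so that it runs with the standard library alone.

```python
def normalize_args(argv):
    """Map a bare 'help' first argument to '-h'; otherwise pass arguments through.

    Hypothetical helper, not part of Taupe. It mirrors the branch in bin/taupe
    that calls plac with ['-h'] when the first argument is the word 'help'.
    """
    args = list(argv[1:])          # drop the program name, as sys.argv[0] would be
    if args and args[0] == 'help':
        return ['-h']
    return args


if __name__ == '__main__':
    import sys
    print(normalize_args(sys.argv))
```

In the real script the resulting list would be handed to `plac.call(main, ...)`; here it is only printed.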
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Copyright (c) 2022 by Michael Hucka.
2 |
3 | Permission is hereby granted, free of charge, to any person obtaining a copy
4 | of this software and associated documentation files (the "Software"), to deal
5 | in the Software without restriction, including without limitation the rights
6 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
7 | copies of the Software, and to permit persons to whom the Software is
8 | furnished to do so, subject to the following conditions:
9 |
10 | The above copyright notice and this permission notice shall be included in
11 | all copies or substantial portions of the Software.
12 |
13 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
15 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
16 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
17 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
18 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
19 | SOFTWARE.
20 |
--------------------------------------------------------------------------------
/taupe/__init__.py:
--------------------------------------------------------------------------------
1 | '''
2 | Taupe: Extract the URLs from your personal Twitter archive
3 |
4 | This file is part of https://github.com/mhucka/taupe/.
5 |
6 | Copyright (c) 2022 by Michael Hucka.
7 | This code is open-source software released under the MIT license.
8 | Please see the file "LICENSE" for more information.
9 | '''
10 |
11 | # Package metadata ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12 | #
13 | # ╭────────────────────── Notice ── Notice ── Notice ─────────────────────╮
14 | # | The following values are automatically updated at every release |
15 | # | by the Makefile. Manual changes to these values will be lost. |
16 | # ╰────────────────────── Notice ── Notice ── Notice ─────────────────────╯
17 |
18 | __version__ = '1.2.0'
19 | __description__ = 'Taupe: a tool to extract URLs from your personal Twitter archive'
20 | __url__ = 'https://github.com/mhucka/taupe'
21 | __author__ = 'Mike Hucka'
22 | __email__ = 'mhucka@caltech.edu'
23 | __license__ = 'MIT'
24 |
25 |
26 | # Miscellaneous utilities.
27 | # .............................................................................
28 |
29 | def print_version():
30 |     print(f'{__name__} version {__version__}')
31 |     print(f'Authors: {__author__}')
32 |     print(f'URL: {__url__}')
33 |     print(f'License: {__license__}')
34 |
--------------------------------------------------------------------------------
/taupe/exit_codes.py:
--------------------------------------------------------------------------------
1 | '''
2 | exit_codes.py: define exit codes for program return values
3 |
4 | This file is part of https://github.com/mhucka/taupe/.
5 |
6 | Copyright (c) 2022 by Michael Hucka.
7 | This code is open-source software released under the MIT license.
8 | Please see the file "LICENSE" for more information.
9 | '''
10 |
11 | from aenum import Enum, MultiValue
12 |
13 |
14 | # I adapted the clever approach posted by the author of the Python aenum
15 | # package, Ethan Furman, to Stack Overflow on 2016-03-13 at
16 | # https://stackoverflow.com/a/35964875/743730
17 | # The most important bit is realizing you can define __int__().
18 |
19 | class ExitCode(Enum):
20 |     '''Class of exit codes that this program may return.
21 |
22 |     The numeric value of a given code can be obtained by using int(). For
23 |     example, int(ExitCode.success) will produce 0.
24 |     '''
25 |
26 |     _init_ = 'value meaning'
27 |     _settings_ = MultiValue
28 |
29 |     success = 0, "success -- program completed normally"
30 |     user_interrupt = 1, "the user interrupted the program's execution"
31 |     bad_arg = 2, "encountered a bad or missing value for an option"
32 |     file_error = 3, "encountered a problem with a file or directory"
33 |     exception = 4, "a miscellaneous exception or fatal error occurred"
34 |
35 |     def __int__(self):
36 |         return self.value
37 |
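For readers without `aenum` installed, the same idea (an enum member that carries both a numeric value and a description, and that supports `int()`) can be approximated with the standard library. The following is a sketch under that assumption, not the project's implementation: it uses `enum.Enum` with a custom `__new__` in place of aenum's `_init_` and `MultiValue`, so lookup by the meaning string is not supported here. The `describe` function is a hypothetical helper added for illustration.

```python
from enum import Enum


class ExitCode(Enum):
    """Exit codes pairing a numeric value with a human-readable meaning.

    Stdlib-only approximation of the aenum-based class; an assumption made
    for illustration, not the code Taupe ships.
    """

    def __new__(cls, value, meaning):
        member = object.__new__(cls)
        member._value_ = value      # the numeric exit status
        member.meaning = meaning    # extra attribute holding the description
        return member

    success = 0, "success -- program completed normally"
    user_interrupt = 1, "the user interrupted the program's execution"
    bad_arg = 2, "encountered a bad or missing value for an option"

    def __int__(self):
        return self.value


def describe(code):
    """Hypothetical helper: format an exit code for a log message."""
    return f"{int(code)}: {code.meaning}"


if __name__ == '__main__':
    print(describe(ExitCode.bad_arg))
```

A caller could then finish with `sys.exit(int(ExitCode.bad_arg))`, which is the pattern the docstring's `int()` remark suggests.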
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | # =============================================================================
2 | # @file setup.cfg
3 | # @brief Package metadata and PyPI configuration
4 | # @created 2021-10-16
5 | # @license Please see the file named LICENSE in the project directory
6 | # @website https://github.com/mhucka/taupe
7 | # =============================================================================
8 |
9 | [metadata]
10 | name = taupe
11 | version = 1.2.0
12 | description = Taupe: a tool to extract URLs from your personal Twitter archive
13 | author = Mike Hucka
14 | author_email = mhucka@caltech.edu
15 | license = MIT
16 | license_files = LICENSE
17 | url = https://github.com/mhucka/taupe
18 | # The remaining items below are used by PyPI.
19 | project_urls =
20 | Source Code = https://github.com/mhucka/taupe
21 | Bug Tracker = https://github.com/mhucka/taupe/issues
22 | keywords = Python, applications
23 | classifiers =
24 | Development Status :: 3 - Alpha
25 | Environment :: Console
26 | License :: OSI Approved :: MIT License
27 | Intended Audience :: Science/Research
28 | Operating System :: MacOS :: MacOS X
29 | Operating System :: POSIX
30 | Operating System :: POSIX :: Linux
31 | Operating System :: Unix
32 | Programming Language :: Python
33 | Programming Language :: Python :: 3.8
34 | long_description = file:README.md
35 | long_description_content_type = text/markdown
36 |
37 | [options]
38 | packages = find:
39 | zip_safe = False
40 | python_requires = >= 3.8
41 |
42 | [options.entry_points]
43 | console_scripts =
44 | taupe = taupe.__main__:console_scripts_main
45 |
46 | [tool:pytest]
47 | pythonpath = .
48 |
49 |
--------------------------------------------------------------------------------
/.flake8:
--------------------------------------------------------------------------------
1 | # =========================================================== -*- conf-toml -*-
2 | # @file .flake8
3 | # @brief Project-wide Flake8 configuration
4 | # @created 2022-05-10
5 | # @license Please see the file named LICENSE in the project directory
6 | # @website https://github.com/mhucka/taupe
7 | #
8 | # Note: as of version 4.0, flake8 does NOT read global configuration files
9 | # from ~/.flake8 or ~/.config/flake8. If you had such a config file of your
10 | # own, and you're looking at this config file and wondering how the two will
11 | # interact, the answer is simple: they won't. Only this file matters.
12 | #
13 | # The following flake8 plugins are assumed to be installed:
14 | # flake8-bugbear
15 | # flake8-builtins
16 | # flake8-comprehensions
17 | # flake8-executable
18 | # flake8-implicit-str-concat
19 | # flake8-pie
20 | # flake8-simplify
21 | # =============================================================================
22 |
23 | [flake8]
24 | # I try to stick to 80 chars, but sometimes it's more readable to go longer.
25 | max-line-length = 90
26 |
27 | ignore =
28 | # We prefer to put spaces around the = in keyword arg lists.
29 | E251,
30 | # We prefer two lines between methods of a class.
31 | E303,
32 | # Sometimes we want to align keywords, and these rules run counter to it.
33 | E271,
34 | E221,
35 | # In some situations, it's more readable to omit spaces around operators
36 | # and colons.
37 | E203,
38 | E226,
39 | # According to Flake8 docs at https://www.flake8rules.com/rules/W503.html
40 | # line breaks *should* come before a binary operator, but as of version 4,
41 | # Flake8 still flags the breaks as bad. So:
42 | W503
43 | # I disagree with this one.
44 | B005
45 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # =============================================================================
3 | # @file setup.py
4 | # @brief Installation setup file
5 | # @created 2022-11-18
6 | # @license Please see the file named LICENSE in the project directory
7 | # @website https://github.com/mhucka/taupe
8 | #
9 | # Note: configuration metadata is maintained in setup.cfg. This file exists
10 | # primarily to hook in setup.cfg and requirements.txt.
11 | # =============================================================================
12 |
13 | from setuptools import setup
14 |
15 |
16 | def requirements(file):
17 |     from os import path
18 |     required = []
19 |     requirements_file = path.join(path.abspath(path.dirname(__file__)), file)
20 |     if path.exists(requirements_file):
21 |         with open(requirements_file, encoding='utf-8') as f:
22 |             required = [ln for ln in filter(str.strip, f.read().splitlines())
23 |                         if not ln.startswith('#')]
24 |         if any(item.startswith(('-', '.', '/')) for item in required):
25 |             # The requirements.txt uses pip features. Try to use pip's parser.
26 |             try:
27 |                 from pip._internal.req import parse_requirements
28 |                 from pip._internal.network.session import PipSession
29 |                 parsed = parse_requirements(requirements_file, PipSession())
30 |                 required = [item.requirement for item in parsed]
31 |             except ImportError:
32 |                 # No pip, or not the expected version. Give up & return as-is.
33 |                 pass
34 |     return required
35 |
36 |
37 | setup(
38 |     setup_requires = ['wheel'],
39 |     install_requires = requirements('requirements.txt'),
40 |     extras_require={'dev': requirements('requirements-dev.txt')},
41 | )
42 |
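The first stage of `requirements()` above, reading a requirements file while skipping blank lines and comments and then checking for pip-specific entries, can be tried on its own. The sketch below is an illustration under stated assumptions: `read_requirements` and `uses_pip_features` are hypothetical names, the pip parser fallback is omitted, and unlike the original the lines are stripped of surrounding whitespace before being returned.

```python
import tempfile
from pathlib import Path


def read_requirements(path):
    """Return the non-blank, non-comment lines of a requirements file.

    Hypothetical helper for illustration; a missing file yields an empty list.
    """
    file = Path(path)
    if not file.exists():
        return []
    lines = file.read_text(encoding="utf-8").splitlines()
    stripped = (ln.strip() for ln in lines)
    return [ln for ln in stripped if ln and not ln.startswith("#")]


def uses_pip_features(requirements):
    """True if any entry is an option (-r, -e, ...) or a local path."""
    return any(item.startswith(("-", ".", "/")) for item in requirements)


# Try it on a throwaway file shaped like requirements-dev.txt.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "requirements-dev.txt"
    sample.write_text(
        "# dev deps\n\n-r requirements.txt\npytest >= 6.2.5\n", encoding="utf-8"
    )
    found = read_requirements(sample)
    print(found)
    print(uses_pip_features(found))
```

The `-r requirements.txt` entry is what triggers the pip-parser branch in `setup.py`; a plain list of package specifiers would be returned unchanged.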
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # =========================================================== -*- gitignore -*-
2 | # @file .gitignore
3 | # @brief Files and patterns for files and subdirs that git should ignore
4 | # @date 2022-10-16
5 | # @license Please see the file named LICENSE in the project directory
6 | # @website https://github.com/mhucka/taupe
7 | #
8 | # The approach I suggest is to add ONLY project-specific rules here. Put
9 | # rules that apply to your way of doing things (and the particular tools you
10 | # happen to use) into a global git ignore file as described in the section
11 | # "Configuring ignored files for all repositories on your computer" here:
12 | # https://docs.github.com/en/get-started/getting-started-with-git/ignoring-files
13 | # (accessed on 2022-07-14). For example, Emacs checkpoint and backup files are
14 | # things that are not specific to a given project; rather, Emacs users will
15 | # see them created everywhere, in all projects, because they're a byproduct
16 | # of using Emacs, not a consequence of working on a particular project. Thus,
17 | # they belong in a user's global ignores list, not in this project .gitignore.
18 | #
19 | # A useful starting point for global .gitignore file contents can be found at
20 | # https://github.com/github/gitignore/tree/main/Global (as of 2022-07-14).
21 | # =============================================================================
22 |
23 | # Python-specific things to ignore (relevant because this is a Python project).
24 | # .............................................................................
25 |
26 | __pycache__/
27 | *.py[cod]
28 | *$py.class
29 | *.egg-info/
30 | .eggs/
31 | .pytest_cache
32 | .coverage
33 |
34 | # Project-specific things to ignore:
35 | # .............................................................................
36 |
37 | build
38 | dist
39 | *.tmp
40 | *.bak
41 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | Contributor Covenant Code of Conduct
2 | ====================================
3 |
4 | ## Our Pledge
5 |
6 | In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.
7 |
8 | ## Our Standards
9 |
10 | Examples of behavior that contributes to creating a positive environment include:
11 |
12 | * Using welcoming and inclusive language
13 | * Being respectful of differing viewpoints and experiences
14 | * Gracefully accepting constructive criticism
15 | * Focusing on what is best for the community
16 | * Showing empathy towards other community members
17 |
18 | Examples of unacceptable behavior by participants include:
19 |
20 | * The use of sexualized language or imagery and unwelcome sexual attention or advances
21 | * Trolling, insulting/derogatory comments, and personal or political attacks
22 | * Public or private harassment
23 | * Publishing others' private information, such as a physical or electronic address, without explicit permission
24 | * Other conduct which could reasonably be considered inappropriate in a professional setting
25 |
26 | ## Our Responsibilities
27 |
28 | Project contributors are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
29 |
30 | Project contributors have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
31 |
32 | ## Scope
33 |
34 | This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project contributors.
35 |
36 | ## Enforcement
37 |
38 | If a contributor engages in harassing behaviour, the project organizer(s) may take any action they deem appropriate, including warning the offender or expelling them from online forums, online project resources, face-to-face meetings, or any other project-related activity or resource.
39 |
40 | If you are being harassed, notice that someone else is being harassed, or have any other concerns, please contact a member of the project team immediately. Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
41 |
42 | ## Attribution
43 |
44 | Portions of this Code of Conduct were adapted from Electron's [Contributor Covenant Code of Conduct](https://github.com/electron/electron/blob/master/CODE_OF_CONDUCT.md), which itself was adapted from the [Contributor Covenant](http://contributor-covenant.org/version/1/4), version 1.4.
45 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | # =============================================================================
2 | # @file Makefile
3 | # @brief Makefile for some steps in creating new releases on GitHub
4 | # @date 2021-10-16
5 | # @license Please see the file named LICENSE in the project directory
6 | # @website https://github.com/mhucka/taupe
7 | # =============================================================================
8 |
9 | .ONESHELL: # Run all commands in the same shell.
10 | .SHELLFLAGS += -e # Exit at the first error.
11 |
12 | # This Makefile uses syntax that needs at least GNU Make version 3.82.
13 | # The following test is based on the approach posted by Eldar Abusalimov to
14 | # Stack Overflow in 2012 at https://stackoverflow.com/a/12231321/743730
15 |
16 | ifeq ($(filter undefine,$(value .FEATURES)),)
17 | $(error Unsupported version of Make. \
18 | This Makefile does not work properly with GNU Make $(MAKE_VERSION); \
19 | it needs GNU Make version 3.82 or later)
20 | endif
21 |
22 | # Before we go any further, test if certain programs are available.
23 | # The following is based on the approach posted by Jonathan Ben-Avraham to
24 | # Stack Overflow in 2014 at https://stackoverflow.com/a/25668869
25 |
26 | programs_needed = awk curl gh git jq sed python3 pyinstaller pandoc inliner create-dmg
27 | TEST := $(foreach p,$(programs_needed),\
28 | $(if $(shell which $(p)),_,$(error Cannot find program "$(p)")))
29 |
30 | # Set some basic variables. These are quick to set; we set additional
31 | # variables using "set-vars" but only when the others are needed.
32 |
33 | name := $(strip $(shell awk -F "=" '/^name/ {print $$2}' setup.cfg))
34 | version := $(strip $(shell awk -F "=" '/^version/ {print $$2}' setup.cfg))
35 | url := $(strip $(shell awk -F "=" '/^url/ {print $$2}' setup.cfg))
36 | desc := $(strip $(shell awk -F "=" '/^description / {print $$2}' setup.cfg))
37 | author := $(strip $(shell awk -F "=" '/^author / {print $$2}' setup.cfg))
38 | email := $(strip $(shell awk -F "=" '/^author_email/ {print $$2}' setup.cfg))
39 | license := $(strip $(shell awk -F "=" '/^license / {print $$2}' setup.cfg))
40 | platform := $(strip $(shell python3 -c 'import sys; print(sys.platform)'))
41 | os := $(subst darwin,macos,$(platform))
42 | branch := $(shell git rev-parse --abbrev-ref HEAD)
43 | initfile := $(name)/__init__.py
44 | distdir := dist/$(os)
45 | builddir := build/$(os)
46 |
47 |
48 | # Print help if no command is given ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
49 |
50 | help:
51 | @echo 'Available commands:'
52 | @echo ''
53 | @echo 'make'
54 | @echo 'make help'
55 | @echo ' Print this summary of available commands.'
56 | @echo ''
57 | @echo 'make report'
58 | @echo ' Print variables set in this Makefile from various sources.'
59 | @echo ' This is useful to verify the values that have been parsed.'
60 | @echo ''
61 | @echo 'make lint'
62 | @echo ' Run Python linters like flake8.'
63 | @echo ''
64 | @echo 'make test'
65 | @echo ' Run pytest.'
66 | @echo ''
67 | @echo 'make install'
68 | @echo ' Install the project in dev mode.'
69 | @echo ''
70 | @echo 'make release'
71 | @echo ' Do a release on GitHub. This will push changes to GitHub,'
72 | @echo ' open an editor to let you edit release notes, and run'
73 | @echo ' "gh release create" followed by "gh release upload".'
74 | @echo ' Note: this will NOT upload to PyPI, nor create binaries.'
75 | @echo ''
76 | @echo 'make packages'
77 | @echo ' Create the distribution files for PyPI.'
78 | @echo ' Do this manually to check that everything looks okay before uploading.'
79 | @echo ' After doing this, do a "make test-pypi".'
80 | @echo ''
81 | @echo 'make test-pypi'
82 | @echo ' Upload distribution to test.pypi.org.'
83 | @echo ' Do this before doing "make pypi" for real.'
84 | @echo ''
85 | @echo 'make pypi'
86 | @echo ' Upload distribution to pypi.org.'
87 | @echo ''
88 | @echo 'make clean'
89 | @echo ' Clean up various files generated by this Makefile.'
90 | @echo ''
91 | @echo 'make really-clean'
92 | @echo ' Like "make clean", but more so.'
93 | @echo ''
94 | @echo 'make completely-clean'
95 | @echo ' The ultimate in cleaning.'
96 |
97 |
98 | # Gather additional values we sometimes need ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
99 | #
100 | # These variables take longer to compute, and for some actions like "make help"
101 | # they are unnecessary and annoying to wait for.
102 |
103 | .SILENT: vars
104 | vars:
105 | $(info Gathering data -- this takes a few moments ...)
106 | $(eval repo := $(strip $(shell gh repo view | head -1 | cut -f2 -d':')))
107 | $(eval api_url := https://api.github.com)
108 | $(eval id := $(shell curl -s $(api_url)/repos/$(repo) | jq '.id'))
109 | $(info Gathering data -- this takes a few moments ... Done.)
110 |
111 | report: vars
112 | @echo name = $(name)
113 | @echo version = $(version)
114 | @echo url = $(url)
115 | @echo desc = $(desc)
116 | @echo author = $(author)
117 | @echo email = $(email)
118 | @echo license = $(license)
119 | @echo branch = $(branch)
120 | @echo repo = $(repo)
121 | @echo id = $(id)
122 | @echo initfile = $(initfile)
123 | @echo distdir = $(distdir)
124 | @echo builddir = $(builddir)
125 |
126 |
127 | # make lint & make test ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
128 |
129 | lint:
130 | flake8 taupe
131 |
132 | test tests:;
133 | pytest -v --cov=taupe -l tests/
134 |
135 |
136 | # make install ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
137 |
138 | install:
139 | python3 -m pip install -e .[dev]
140 |
141 |
142 | # make release ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
143 |
144 | release: | test-branch release-on-github print-instructions
145 |
146 | test-branch:;
147 | ifneq ($(branch),main)
148 | $(error Current git branch != main. Merge changes into main first!)
149 | endif
150 |
151 | update-init:;
152 | @sed -i .bak -e "s|^\(__version__ *=\).*|\1 '$(version)'|" $(initfile)
153 | @sed -i .bak -e "s|^\(__description__ *=\).*|\1 '$(desc)'|" $(initfile)
154 | @sed -i .bak -e "s|^\(__url__ *=\).*|\1 '$(url)'|" $(initfile)
155 | @sed -i .bak -e "s|^\(__author__ *=\).*|\1 '$(author)'|" $(initfile)
156 | @sed -i .bak -e "s|^\(__email__ *=\).*|\1 '$(email)'|" $(initfile)
157 | @sed -i .bak -e "s|^\(__license__ *=\).*|\1 '$(license)'|" $(initfile)
158 |
159 | update-meta:;
160 | @sed -i .bak -e "/version/ s/[0-9].[0-9][0-9]*.[0-9][0-9]*/$(version)/" codemeta.json
161 |
162 | update-citation:;
163 | $(eval date := $(shell date "+%F"))
164 | @sed -i .bak -e "/^date-released/ s/[0-9][0-9-]*/$(date)/" CITATION.cff
165 | @sed -i .bak -e "/^version/ s/[0-9].[0-9][0-9]*.[0-9][0-9]*/$(version)/" CITATION.cff
166 |
167 | edited := codemeta.json $(initfile) CITATION.cff
168 |
169 | commit-updates:;
170 | git add $(edited)
171 | git diff-index --quiet HEAD $(edited) || \
172 | git commit -m"Update stored version number" $(edited)
173 |
174 | release-on-github: | vars update-init update-meta update-citation commit-updates
175 | $(eval tmp_file := $(shell mktemp /tmp/release-notes-$(name).XXXX))
176 | git push -v --all
177 | git push -v --tags
178 | @$(info ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓)
179 | @$(info ┃ Write release notes in the file that gets opened in your ┃)
180 | @$(info ┃ editor. Close the editor to complete the release process. ┃)
181 | @$(info ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛)
182 | sleep 2
183 | $(EDITOR) $(tmp_file)
184 | gh release create v$(version) -t "Release $(version)" -F $(tmp_file)
185 |
186 | print-instructions: vars
187 | @$(info ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓)
188 | @$(info ┃ Next steps: ┃)
189 | @$(info ┃ 1. Check https://github.com/$(repo)/releases )
190 | @$(info ┃ 2. Wait a few seconds to let web services do their work ┃)
191 | @$(info ┃ 3. Run "make packages" & check the results ┃)
192 | @$(info ┃ 4. Run "make test-pypi" to push to test.pypi.org ┃)
193 | @$(info ┃ 5. Check https://test.pypi.org/project/$(name) )
194 | @$(info ┃ 6. Run "make pypi" to push to pypi for real ┃)
195 | @$(info ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛)
196 |
197 | packages: vars
198 | -mkdir -p $(builddir) $(distdir)
199 | python3 setup.py sdist --dist-dir $(distdir)
200 | python3 setup.py bdist_wheel --dist-dir $(distdir)
201 | python3 -m twine check $(distdir)/$(name)-$(version).tar.gz
202 |
203 | # Note: for the next action to work, the repository "testpypi" needs to be
204 | # defined in your ~/.pypirc file. Here is an example file:
205 | #
206 | # [distutils]
207 | # index-servers =
208 | # pypi
209 | # testpypi
210 | #
211 | # [testpypi]
212 | # repository = https://test.pypi.org/legacy/
213 | # username = YourPyPIlogin
214 | # password = YourPyPIpassword
215 | #
216 | # You could copy-paste the above to ~/.pypirc, substitute your user name and
217 | # password, and things should work after that. See the following for more info:
218 | # https://packaging.python.org/en/latest/specifications/pypirc/
219 |
220 | test-pypi: packages
221 | python3 -m twine upload --verbose --repository testpypi \
222 | $(distdir)/$(name)-$(version)*.{whl,gz}
223 |
224 | pypi: packages
225 | python3 -m twine upload $(distdir)/$(name)-$(version)*.{gz,whl}
226 |
227 |
228 | # Cleanup and miscellaneous directives ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
229 |
230 | clean: clean-dist clean-build clean-release clean-other
231 | @echo ✨ Cleaned! ✨
232 |
233 | really-clean: clean really-clean-dist really-clean-build
234 |
235 | completely-clean: really-clean clean-other
236 | rm -rf build dist
237 |
238 | clean-build:;
239 | rm -rf $(builddir)/lib $(builddir)/bdist.*
240 |
241 | clean-dist: vars
242 | rm -fr $(distdir)/$(name) $(distdir)/$(name)-$(version)-py3-none-any.whl
243 |
244 | really-clean-build: clean-build
245 | rm -rf $(builddir)/*.*
246 |
247 | really-clean-dist: clean-dist
248 | rm -fr $(distdir)/*.*
249 |
250 | clean-release:;
251 | rm -rf $(name).egg-info codemeta.json.bak $(initfile).bak CITATION.cff.bak README.md.bak
252 |
253 | clean-other:;
254 | rm -fr __pycache__ $(name)/__pycache__ .eggs
255 | rm -rf .cache
256 | rm -rf .pytest_cache
257 |
258 | .PHONY: release release-on-github update-init update-meta update-citation \
259 | print-instructions packages clean test-pypi pypi extra-files dmg \
260 | pyinstaller clean clean-dist clean-build clean-release clean-other \
261 | really-clean really-clean-dist really-clean-build completely-clean
262 |
263 | .PHONY: help vars report release test-branch \
264 | update-init update-meta update-citation commit-updates \
265 | release-on-github print-instructions update-doi \
266 | packages test-pypi pypi clean really-clean completely-clean \
267 | clean-dist really-clean-dist clean-build really-clean-build \
268 | clean-release clean-other dmg pyinstaller extra-files
269 |
270 | .SILENT: clean clean-dist clean-build clean-release clean-other really-clean \
271 | really-clean-dist really-clean-build completely-clean
272 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Taupe
2 |
3 | A simple program to extract the URLs of your tweets, retweets, replies, quote tweets, and "likes" from a personal Twitter archive.
4 |
5 | [License: MIT](https://choosealicense.com/licenses/mit)
6 | [Latest release](https://github.com/mhucka/taupe/releases)
7 |
8 |
9 | ## Table of contents
10 |
11 | * [Introduction](#introduction)
12 | * [Installation](#installation)
13 | * [Usage](#usage)
14 | * [Known issues and limitations](#known-issues-and-limitations)
15 | * [Relationships to other similar tools](#relationships-to-other-similar-tools)
16 | * [Getting help](#getting-help)
17 | * [Contributing](#contributing)
18 | * [License](#license)
19 | * [Acknowledgments](#acknowledgments)
20 |
21 |
22 | ## Introduction
23 |
24 | When you [download your personal Twitter archive](https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive), you receive a [ZIP](https://en.wikipedia.org/wiki/ZIP_(file_format)) file. Its contents are not in a format that is convenient for further processing. For example, you may want to send the URLs of your tweets to the [Wayback Machine at the Internet Archive](https://archive.org/web/) or pass them to another tool. For tasks like that, you first need to extract the URLs from your Twitter archive. That's the purpose of Taupe.
25 |
26 | _Taupe_ (a loose acronym of Twitter archive URL parser) takes a Twitter archive ZIP file, extracts the URLs corresponding to your tweets, retweets, replies, quote tweets, and liked tweets, and outputs the results in a [comma-separated values (CSV)](https://en.wikipedia.org/wiki/Comma-separated_values) format that you can easily use with other software tools. Once you have [installed it](#installation), using `taupe` is easy:
27 | ```shell
28 | # Extract tweets, retweets, replies, and quote tweets:
29 | taupe /path/to/your/twitter-archive.zip
30 |
31 | # Extract likes:
32 | taupe --extract likes /path/to/your/twitter-archive.zip
33 |
34 | # Learn more:
35 | taupe --help
36 | ```
37 |
38 | ## Installation
39 |
40 | There are multiple ways of installing Taupe. Please choose the alternative that suits you.
41 |
42 | ### _Alternative 1: installing Taupe using `pipx`_
43 |
44 | [Pipx](https://pypa.github.io/pipx/) lets you install Python programs in a way that isolates Python dependencies, and yet the resulting `taupe` command can be run from any shell and directory – like any normal program on your computer. If you use `pipx` on your system, you can install Taupe with the following command:
45 | ```sh
46 | pipx install taupe
47 | ```
48 |
49 | Pipx can also let you run Taupe directly using `pipx run taupe`, although in that case, you must always prefix every Taupe command with `pipx run`. Consult the [documentation for `pipx run`](https://github.com/pypa/pipx#walkthrough-running-an-application-in-a-temporary-virtual-environment) for more information.
50 |
51 |
52 | ### _Alternative 2: installing Taupe using `pip`_
53 |
54 | You should be able to install `taupe` with [`pip`](https://pip.pypa.io/en/stable/installing/) for Python 3. To install `taupe` from the [Python package repository (PyPI)](https://pypi.org), run the following command:
55 | ```sh
56 | python3 -m pip install taupe
57 | ```
58 |
59 | As an alternative to getting it from [PyPI](https://pypi.org), you can use `pip` to install `taupe` directly from GitHub:
60 | ```sh
61 | python3 -m pip install git+https://github.com/mhucka/taupe.git
62 | ```
63 |
64 | _If you already installed Taupe once before_, and want to update to the latest version, add `--upgrade` to the end of either command line above.
65 |
66 |
67 | ### _Alternative 3: installing Taupe from sources_
68 |
69 | If you prefer to install Taupe directly from the source code, you can do that too. To get a copy of the files, you can clone the GitHub repository:
70 | ```sh
71 | git clone https://github.com/mhucka/taupe
72 | ```
73 |
74 | Alternatively, you can download the software source files as a ZIP archive directly from your browser using this link:
75 |
76 | Next, after getting a copy of the files, run `setup.py` inside the code directory:
77 | ```sh
78 | cd taupe
79 | python3 setup.py install
80 | ```
81 |
82 |
83 | ## Usage
84 |
85 | If the installation process described above is successful, you should end up with a program named `taupe` in a location where software is normally installed on your computer. Running `taupe` should be as simple as running any other command-line program. For example, the following command should print a helpful message to your terminal:
86 | ```shell
87 | taupe --help
88 | ```
89 |
90 | If not given the option `--help` or `--version`, this program expects to be given a [personal Twitter archive file](https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive), either on the command line (as an argument) or on standard input (from a pipe or file redirection). Here's an example (and note this path is fake – substitute a real path on your computer when you do this!):
91 | ```shell
92 | taupe /path/to/twitter-archive.zip
93 | ```
94 |
95 | The URLs produced by `taupe` will be, by default, as they appear in the archive. If you want to [normalize the URLs](https://developer.twitter.com/en/blog/community/2020/getting-to-the-canonical-url-for-a-tweet) into the canonical form `https://twitter.com/twitter/status/TWEETID`, use the option `--canonical-urls` (`-c` for short):
96 | ```shell
97 | taupe -c /path/to/twitter-archive.zip
98 | ```
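
To make the idea of URL normalization concrete, here is a minimal Python sketch of what rewriting a tweet URL into the canonical form could look like. It is not Taupe's own implementation: the function name `canonical_url` and the regular expression are hypothetical, and the assumption that archive URLs look like `https://twitter.com/ACCOUNT/status/TWEETID` or `https://twitter.com/i/web/status/TWEETID` is only an illustration.

```python
import re

# Assumed URL shapes (an illustration, not taken from Taupe's source):
#   https://twitter.com/ACCOUNT/status/TWEETID
#   https://twitter.com/i/web/status/TWEETID
_STATUS_URL = re.compile(r'^https://twitter\.com/(?:i/web|[^/]+)/status/(\d+)')


def canonical_url(url):
    '''Return the canonical form of a tweet URL, or the URL unchanged.'''
    match = _STATUS_URL.match(url)
    if not match:
        # Not a recognizable tweet URL, so leave it as it is.
        return url
    # Replace the account name with the fixed name "twitter".
    return f'https://twitter.com/twitter/status/{match.group(1)}'


print(canonical_url('https://twitter.com/mhucka/status/1580774654217625600'))
```

The tweet identifier alone is enough to locate a tweet, which is why the account name can be replaced by a fixed placeholder.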
99 |
100 |
101 | ### The structure of the output
102 |
103 | The option `--extract` controls both the content and the format of the output. The following options are recognized:
104 |
105 | | Value | Synonym | Output |
106 | |------------------|----------------|--------|
107 | | `all-tweets` | `tweets` | CSV table with all tweets and details (default) |
108 | | `my-tweets` | | list of URLs of only your original tweets |
109 | | `retweets` | | list of URLs of tweets that are retweets |
110 | | `quoted-tweets` | `quote-tweets` | list of URLs of other tweets you quoted |
111 | | `replied-tweets` | `reply-tweets` | list of URLs of other tweets you replied to |
112 | | `liked` | `likes` | list of URLs of tweets you "liked" |
113 |
114 |
115 | #### `all-tweets`
116 |
117 | When using `--extract all-tweets` (the default), `taupe` produces a table with four columns. Each row of the table corresponds to a type of event in the Twitter timeline: a tweet, a retweet, a reply to another tweet, or a quote tweet. The values in the columns provide details about the event. The following is a summary of the structure:
118 |
119 | | Column 1 | Column 2 | Column 3 | Column 4 |
120 | |:-------------:|----------|----------|----------|
121 | | tweet timestamp in ISO format | The URL of the tweet | The type; one of `tweet`, `reply`, `retweet`, or `quote` | (For type `reply` or `quote`.) The URL of the original or source tweet |
122 |
123 | The last column only has a value for replies and quote-tweets; in those cases, the URL in the column refers to the tweet being replied to or the tweet being quoted. The fourth column does not have a value for retweets even though it would be desirable, because the Twitter archive – strangely – does not provide the URLs of retweeted tweets.
124 |
125 | Here is an example of the output:
126 | ```text
127 | 2022-09-21T22:36:29+00:00,https://twitter.com/mhucka/status/1572716422857658368,quote,https://twitter.com/poppy_northcutt/status/1572714310077673472
128 | 2022-10-10T22:04:20+00:00,https://twitter.com/mhucka/status/1579593701965582336,reply,https://twitter.com/arfon/status/1579572453726355456
129 | 2022-10-14T04:17:01+00:00,https://twitter.com/mhucka/status/1580774654217625600,tweet
130 | 2022-10-25T14:49:06+00:00,https://twitter.com/mhucka/status/1584919989307715586,retweet
131 | ...
132 | ```
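
As a rough illustration of how this table might be consumed by other software, the following Python sketch parses rows in the four-column layout described above using the standard `csv` module. The helper name `parse_rows` and the dictionary keys are hypothetical and not part of Taupe; the sample rows are the ones shown in the example output.

```python
import csv
import io

SAMPLE = '''\
2022-09-21T22:36:29+00:00,https://twitter.com/mhucka/status/1572716422857658368,quote,https://twitter.com/poppy_northcutt/status/1572714310077673472
2022-10-10T22:04:20+00:00,https://twitter.com/mhucka/status/1579593701965582336,reply,https://twitter.com/arfon/status/1579572453726355456
2022-10-14T04:17:01+00:00,https://twitter.com/mhucka/status/1580774654217625600,tweet
2022-10-25T14:49:06+00:00,https://twitter.com/mhucka/status/1584919989307715586,retweet
'''


def parse_rows(text):
    '''Parse Taupe-style CSV text into a list of dictionaries.'''
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # Skip blank lines.
        # Column 4 is absent for tweets and retweets, so pad to 4 fields.
        fields = fields + [''] * (4 - len(fields))
        timestamp, url, kind, source = fields[:4]
        rows.append({'timestamp': timestamp,
                     'url': url,
                     'type': kind,
                     'source_url': source or None})
    return rows


rows = parse_rows(SAMPLE)
# Column 4 values for quote tweets: the URLs of the tweets being quoted.
quoted = [row['source_url'] for row in rows if row['type'] == 'quote']
print(quoted)
```

Padding short rows to four fields keeps the unpacking uniform whether or not a row carries a source tweet URL.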
133 |
134 | #### `my-tweets`
135 |
136 | When using `--extract my-tweets`, the output is just a single column (a list) of URLs, one per line, of just your original tweets. This list corresponds to the values of column 2 in the `--extract all-tweets` case above when the type is `tweet`.
137 |
138 |
139 | #### `retweets`
140 |
141 | When using `--extract retweets`, the output is a single column (a list) of URLs, one per line, of tweets that are retweets of other tweets. This list corresponds to the values of column 2 above when the type is `retweet`. **Important**: the Twitter archive does not contain the original tweet's URL, only the URL of your retweet. Consequently, the output for `--extract retweets` is _your_ retweet's URL, not the URL of the source tweet.
142 |
143 |
144 | #### `quoted-tweets`
145 |
146 | When using `--extract quoted-tweets`, the output is a list of the URLs of other tweets that you have quoted. It corresponds to the subset of column 4 values above when the type is "quote". Note that these are the source tweet URLs, not the URLs of your tweets.
147 |
148 |
149 | #### `replied-tweets`
150 |
151 | When using `--extract replied-tweets`, the output is a list of the URLs of other tweets that you have replied to. It corresponds to the subset of column 4 values above when the type is "reply". Note that these are the source tweet URLs, not the URLs of your tweets.
152 |
153 |
154 | #### `likes`
155 |
156 | When using the option `--extract likes`, the output will only contain one column: the URLs of the "liked" tweets. `taupe` cannot provide more detail because the Twitter archive format does not contain date/time information for "likes". (This is also why "likes" are _not_ part of the output when `--extract all-tweets` is used – there is no possible value for column 1.)
157 |
158 | Here is an example of the output when using `--extract likes` in combination with `--canonical-urls`:
159 | ```text
160 | https://twitter.com/twitter/status/1588146224376463365
161 | https://twitter.com/twitter/status/1588349144803905536
162 | https://twitter.com/twitter/status/1590475356976578560
163 | ...
164 | ```
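
A list like this is convenient input for other tools. As one hedged example, the Python sketch below turns each line into a request URL for the Wayback Machine. It makes no network calls. The function name `wayback_save_urls` is hypothetical, and the `https://web.archive.org/save/` prefix is an assumption about the Internet Archive's "Save Page Now" service that you should verify against its current documentation before relying on it.

```python
# Assumed prefix for the Wayback Machine's "Save Page Now" service.
WAYBACK_SAVE_PREFIX = 'https://web.archive.org/save/'


def wayback_save_urls(lines):
    '''Build one save-request URL per tweet URL found in the given lines.'''
    urls = []
    for line in lines:
        url = line.strip()
        # Ignore blank lines and anything that is not an https URL.
        if url.startswith('https://'):
            urls.append(WAYBACK_SAVE_PREFIX + url)
    return urls


likes = ['https://twitter.com/twitter/status/1588146224376463365\n',
         '\n',
         'https://twitter.com/twitter/status/1588349144803905536\n']
for save_url in wayback_save_urls(likes):
    print(save_url)
```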
165 |
166 |
167 | ### Other options recognized by `taupe`
168 |
169 | Running `taupe` with the option `--help` will make it print help text and exit without doing anything else.
170 |
171 | The option `--output` controls where `taupe` writes the output. If `--output` is omitted or its value is `-` (a single dash), the output is written to the terminal (stdout). Otherwise, the value must be the path of a file to write to.
172 |
173 | If given the `--version` option, this program will print its version and other information, and exit without doing anything else.
174 |
175 | If given the `--debug` argument, `taupe` will output a detailed trace of what it is doing. The debug trace will be sent to the given destination, which can be `-` to indicate console output, or a file path to send the debug output to a file.
176 |
177 | ### _Summary of command-line options_
178 |
179 | The following table summarizes all the command line options available.
180 |
181 | | Short | Long form opt | Meaning | Default | |
182 | |---------------|------------------------|----------------------|---------|---|
183 | | `-c` | `--canonical-urls` | Normalize Twitter URLs | Leave as-is| |
184 | | `-h` | `--help` | Print help info and exit | | |
185 | | `-e` _E_ | `--extract` _E_ | Extract URL type _E_ | `all-tweets` | ⚑ |
186 | | `-o` _O_ | `--output` _O_ | Write output to file _O_ | Terminal | ✦ |
187 | | `-V` | `--version` | Print program version & exit | | |
188 | | `-@` _OUT_ | `--debug` _OUT_ | Write debug output to _OUT_ | | ⚐ |
189 |
190 | ⚑ Recognized values: `all-tweets`, `tweets`, `my-tweets`, `retweets`, `quoted-tweets`, `replied-tweets`, and `likes`. See [section above](#the-structure-of-the-output) for more information.
191 | ✦ To write to the console, you can also use the character `-` as the value of _O_; otherwise, _O_ must be the name of a file where the output should be written.
192 | ⚐ To write to the console, use the character `-` as the value of _OUT_; otherwise, _OUT_ must be the name of a file where the output should be written.
193 |
194 |
195 | ## Known issues and limitations
196 |
197 | This program assumes that the Twitter archive ZIP file is in the format which Twitter produced in mid-November 2022. Twitter probably used a different format in the past, and may change the format again in the future, so `taupe` may or may not work on Twitter archives obtained in different historical periods.
198 |
199 | The Twitter archive format for "likes" contains only the tweet identifier and the text of the tweet; consequently, `taupe` cannot provide date/time information for this case.
200 |
201 | This program does all its work in memory, which means that `taupe`'s ability to process a given archive depends on its size and how much RAM the computer has. It has only been tested with modest-sized archives. It is unknown how it will behave with exceptionally large archives.
202 |
203 |
204 | ## Relationships to other similar tools
205 |
206 | To the author's knowledge, Taupe is the only tool that will directly and easily extract the URLs of tweets and "likes" from a Twitter archive ZIP file. There do exist other software tools for working with Twitter archives; the following is a (possibly incomplete) list:
207 | * [twitter-archive-parser](https://github.com/timhutton/twitter-archive-parser) – convert the contents of a Twitter archive into Markdown and extract other information such as lists of followers.
208 | * [Save Your Threads](https://archive.social) – lets you download signed PDFs of Twitter URLs.
209 | * [tweetback Twitter Archive](https://github.com/tweetback/tweetback) – "Take ownership of your Twitter data".
210 | * [twitter-tools](https://github.com/selfawaresoup/twitter-tools) – perform various operations, such as getting details about specific tweets, using the Twitter API.
211 | * [Twitter-Archive](https://github.com/jarulsamy/Twitter-Archive) – a Python CLI tool to download media from bookmarked tweets.
212 | * [get_twitter_bookmarks.py](https://gist.github.com/divyajyotiuk/9fb29c046e1dfcc8d5683684d7068efe#file-get_twitter_bookmarks_v3-py) – extract the URLs from bookmarked tweets; requires first using your web browser's developer interface to grab Twitter's bookmarks JSON data.
213 | * [archive.alt-text.org](https://github.com/alt-text-org/www.alt-text.org) – a tool for saving the alt text you've written on Twitter.
214 | * [twitter-archive-tweets](https://observablehq.com/@enjalot/twitter-archive-tweets) – a notebook to use as a starting point for processing tweets from your Twitter archive.
215 | * [fork of TWINT](https://github.com/woluxwolu/twint) – a fork of the now-defunct [Twitter Intelligence Tool](https://github.com/twintproject/twint).
216 | * [pleroma-bot](https://github.com/robertoszek/pleroma-bot) – bot for mirroring your favorite Twitter accounts in the Fediverse as well as migrating your own to the Fediverse using a Twitter archive.
217 | * [twitter-archive-analysis](https://github.com/dangoldin/twitter-archive-analysis) – a script to analyze your Twitter archive.
218 | * [twitter-archive-reader](https://github.com/alkihis/twitter-archive-reader) – explore tweets, DMs, media and more in a Twitter archive.
219 | * [twitter-archive-converter](https://github.com/leandrojmp/twitter-archive-converter) – extract tweets from a Twitter archive.
220 |
221 |
222 | ## Getting help
223 |
224 | If you find a problem or have a request or suggestion, please submit it in [the GitHub issue tracker](https://github.com/mhucka/taupe/issues) for this repository.
225 |
226 |
227 | ## Contributing
228 |
229 | I would be happy to receive your help and participation if you are interested. Everyone is asked to read and respect the [code of conduct](CODE_OF_CONDUCT.md) when participating in this project. Please feel free to [report issues](https://github.com/mhucka/taupe/issues) or do a [pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests) to fix bugs or add new features.
230 |
231 |
232 | ## License
233 |
234 | This software is Copyright (C) 2022, by Michael Hucka. This software is freely distributed under the MIT license. Please see the [LICENSE](LICENSE) file for more information.
235 |
236 |
237 | ## Acknowledgments
238 |
239 | This work is a personal project developed by the author, using computing equipment owned by the [California Institute of Technology Library](https://www.library.caltech.edu).
240 |
241 | The [vector artwork](https://thenounproject.com/icon/bird-233023/) of a bird, used as the icon for this repository, was created by [Noe Araujo](https://thenounproject.com/noearaujo/) from the Noun Project. It is licensed under the Creative Commons [CC-BY 3.0](https://creativecommons.org/licenses/by/3.0/) license. I manually changed the color to be a shade of taupe.
242 |
243 | Taupe uses multiple other open-source packages, without which it would have taken much longer to write the software. I want to acknowledge this debt. In alphabetical order, the packages are:
244 | * [Aenum](https://github.com/ethanfurman/aenum) – Python package for advanced enumerations
245 | * [CommonPy](https://github.com/caltechlibrary/commonpy) – a collection of commonly-useful Python functions
246 | * [Plac](https://github.com/ialbert/plac) – a command line argument parser
247 | * [Rich](https://github.com/Textualize/rich) – library for writing styled text to the terminal
248 | * [Sidetrack](https://github.com/caltechlibrary/sidetrack) – simple debug logging/tracing package
249 | * [Twine](https://github.com/pypa/twine) – utilities for publishing Python packages on [PyPI](https://pypi.org)
250 |
--------------------------------------------------------------------------------
/taupe/__main__.py:
--------------------------------------------------------------------------------
1 | '''
2 | Taupe: Extract the URLs from your personal Twitter archive
3 |
4 | This file is part of https://github.com/mhucka/taupe/.
5 |
6 | Copyright (c) 2022 by Michael Hucka.
7 | This code is open-source software released under the MIT license.
8 | Please see the file "LICENSE" for more information.
9 | '''
10 |
11 | import sys
12 | if sys.version_info < (3, 8):
13 |     print('taupe requires Python version 3.8 or higher,')
14 |     print('but the current version is ' + str(sys.version_info.major)
15 |           + '.' + str(sys.version_info.minor) + '.')
16 |     sys.exit(1)
17 |
18 | # Note: this code uses lazy loading. Additional imports are made later.
19 | from commonpy.data_structures import CaseFoldDict
20 | import errno
21 | import plac
22 | from sidetrack import set_debug, log
23 |
24 | from .exit_codes import ExitCode
25 |
26 |
27 | # Constants.
28 | # .............................................................................
29 |
30 | # Mapping of recognized --extract argument values to canonical names.
31 | EXTRACT_OPTIONS = CaseFoldDict({'all-tweets' : 'all-tweets',
32 | 'tweets' : 'all-tweets',
33 | 'my-tweets' : 'my-tweets',
34 | 'my-tweet' : 'my-tweets',
35 | 'my' : 'my-tweets',
36 | 'mine' : 'my-tweets',
37 | 'retweets' : 'retweets',
38 | 'retweet' : 'retweets',
39 | 'quoted-tweets' : 'quote-tweets',
40 | 'quote-tweets' : 'quote-tweets',
41 | 'quoted' : 'quote-tweets',
42 | 'replied-tweets' : 'reply-tweets',
43 | 'reply-tweets' : 'reply-tweets',
44 | 'replied' : 'reply-tweets',
45 | 'reply' : 'reply-tweets',
46 | 'likes' : 'likes',
47 | 'liked' : 'likes',
48 | 'like' : 'likes'})
49 |
50 | # Main program.
51 | # .............................................................................
52 |
53 | @plac.annotations(
54 | canonical_urls = ('convert URLs to canonical Twitter URL form' , 'flag' , 'c'),
55 | extract = ('extract info "E" (default: tweets)' , 'option', 'e'),
56 | output = ('write output to destination "O" (default: stdout)', 'option', 'o'),
57 | version = ('print program version info and exit' , 'flag' , 'V'),
58 | debug = ('write debug trace to "OUT" ("-" for console)' , 'option', '@'),
59 | archive_file = 'path to Twitter archive ZIP file',
60 | )
61 | def main(canonical_urls = False, extract = 'E', output = 'O', version = False,
62 | debug = 'OUT', *archive_file):
63 |     '''Taupe extracts URLs from your downloaded personal Twitter archive.
64 |
65 | At its most basic, taupe ("Twitter Archive Url ParsEr") expects to be given
66 | the path to a Twitter archive ZIP file from which it should extract the URLs
67 | of tweets, replies, retweets, and quote tweets, and print the results:
68 |
69 | taupe /path/to/twitter-archive.zip
70 |
71 | If instead you want taupe to extract the URLs of "liked" tweets (see the next
72 | section for the difference), use the optional argument '--extract likes':
73 |
74 | taupe --extract likes /path/to/twitter-archive.zip
75 |
76 | The URLs produced by taupe will be, by default, as they appear in the archive,
77 | which means they will have account names in them. If you prefer to normalize
78 | the URLs to the canonical form https://twitter.com/twitter/status/TWEETID, use
79 | the optional argument '--canonical-urls':
80 |
81 | taupe --canonical-urls /path/to/twitter-archive.zip
82 |
83 | If you want to send the output to a file instead of the terminal, you can use
84 | the option '--output' and give it a destination file:
85 |
86 | taupe --output /tmp/urls.txt --canonical-urls /path/to/twitter-archive.zip
87 |
88 | The structure of the output
89 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~
90 |
91 | The option '--extract' controls both the content and the format of the output.
92 | The following options are recognized:
93 |
94 |       Value           Synonym       Output
95 |       --------------  ------------  -----------------------------------------------
96 |       all-tweets      tweets        CSV table with all tweets and details (default)
97 |       my-tweets                     list of URLs of only your original tweets
98 |       retweets                      list of URLs of tweets that are retweets
99 |       quoted-tweets   quote-tweets  list of URLs of (other) tweets you quoted
100 |       replied-tweets  reply-tweets  list of URLs of (other) tweets you replied to
101 |
102 |       liked           likes         list of URLs of tweets you "liked"
103 |
104 | When using '--extract all-tweets' (the default), taupe produces a table with
105 | four columns. Each row of the table corresponds to a tweet of some kind. The
106 | values in the columns provide details:
107 |
108 |       Column 1   Column 2   Column 3       Column 4
109 |       ---------  ---------  -------------  ---------------------------------
110 |       timestamp  tweet URL  type of tweet  URL of quoted or replied-to tweet
111 |
112 | The last column only has a value for replies and quote-tweets; in those cases,
113 | it provides the URL of the tweet being replied to or the tweet being quoted.
114 | The fourth column does not have a value for retweets even though it would be
115 | desirable, because the Twitter archive (strangely) does not provide the
116 | URLs of retweeted tweets. Note also that this format does NOT include your
117 | "liked" tweets; those are available using a different option described below.
118 |
119 | When using '--extract my-tweets', the output is just a single column (a list)
120 | of URLs, one per line, corresponding to just your original tweets. This list
121 | corresponds exactly to column 2 in the '--extract all-tweets' case above.
122 |
123 | When using '--extract retweets', the output is a single column (a list) of
124 | URLs, one per line, of tweets that are retweets of other tweets. This list
125 | corresponds to the values of column 2 above when the type is 'retweet'.
126 | IMPORTANT: the Twitter archive does not contain the original tweet's URL,
127 | only the URL of your retweet. Consequently, the output of '--extract retweets'
128 | is YOUR retweet's URL, not the URL of the source tweet.
129 |
130 | When using '--extract quoted-tweets', the output is a list of the URLs of
131 | other people's tweets that you have quoted. It corresponds to the subset of
132 | column 4 values above when the type is "quote"; i.e., the source tweet URL,
133 | not the URL of your tweet.
134 |
135 | When using '--extract replied-tweets', the output is a list of the URLs of
136 | other people's tweets that you have replied to. It corresponds to the subset
137 | of column 4 values above when the type is "reply"; i.e., the source tweet URL,
138 | not the URL of your tweet.
139 |
140 | Finally, when using '--extract likes', the output will contain a list of the
141 | URLs of tweets you have "liked" on Twitter. Taupe cannot provide more details
142 | (not even timestamps) because the Twitter archive format does not contain the
143 | information.
144 |
145 | Other options recognized by taupe
146 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147 |
148 | Running taupe with the option '--help' will make it print help text and exit
149 | without doing anything else.
150 |
151 | The option '--output' controls where taupe writes the output. If the value
152 | given to '--output' is "-" (a single dash), the output is written to the
153 | terminal (stdout). Otherwise, the value must be a file.
154 |
155 | If given the '--version' option, this program will print its version and other
156 | information, and exit without doing anything else.
157 |
158 | If given the '--debug' argument, taupe will output details about what it is
159 | doing. The debug trace will be sent to the given destination, which can be "-"
160 | to indicate console output, or a file path to send the debug output to a file.
161 |
162 | Return values
163 | ~~~~~~~~~~~~~
164 |
165 | Taupe exits with a return code of 0 if no problem is encountered. Otherwise,
166 | it returns a nonzero value. The following table lists the possible values:
167 |
168 | 0 = success -- program completed normally
169 | 1 = the user interrupted the program's execution
170 | 2 = encountered a bad or missing value for an option
171 | 3 = file error -- encountered a problem with a file
172 | 4 = an exception or fatal error occurred
173 |
174 | Command-line options summary
175 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
176 | '''
177 |
178 |     # Process arguments & handle early exits ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
179 |
180 |     debugging = (debug != 'OUT')
181 |     if debugging:
182 |         set_debug(True, debug)
183 |         import faulthandler
184 |         faulthandler.enable()
185 |
186 |     if version:
187 |         from taupe import print_version
188 |         print_version()
189 |         sys.exit(int(ExitCode.success))
190 |
191 |     log('starting.')
192 |     log('command line: ' + str(sys.argv))
193 |
194 |     extract = 'all-tweets' if extract == 'E' else extract
195 |     if extract not in EXTRACT_OPTIONS:
196 |         stop('Unrecognized value for --extract option: ' + extract, ExitCode.bad_arg)
197 |     else:
198 |         requested = EXTRACT_OPTIONS[extract]
199 |
200 |     archive_file = '-' if not archive_file else archive_file[0]
201 |     if archive_file == '-' and sys.stdin.isatty():
202 |         stop('Need archive as argument or via pipe/redirection.', ExitCode.bad_arg)
203 |     elif archive_file != '-':
204 |         from commonpy.file_utils import readable
205 |         from os.path import exists, isfile
206 |         if not exists(archive_file):
207 |             stop(f'Path does not appear to exist: {archive_file}', ExitCode.bad_arg)
208 |         if not isfile(archive_file):
209 |             stop(f'Path is not a file: {archive_file}', ExitCode.bad_arg)
210 |         if not readable(archive_file):
211 |             stop(f'File is not readable: {archive_file}', ExitCode.file_error)
212 |
213 |     output = '-' if output == 'O' else output
214 |     if output != '-':
215 |         from commonpy.file_utils import writable
216 |         if not writable(output):
217 |             stop(f'Unable to write to destination: {output}', ExitCode.file_error)
218 |
219 |     # Do the main work --------------------------------------------------------
220 |
221 |     exit_code = ExitCode.success
222 |     try:
223 |         if archive_file == '-':
224 |             log('reading archive from stdin')
225 |             import io
226 |             archive_file = io.BytesIO(sys.stdin.buffer.read())
227 |
228 |         data = parsed_data(archive_file, requested, canonical_urls)
229 |         filtered_data = filter(None, map(data_filter(requested), data))
230 |         write_data(filtered_data, output)
231 |     except KeyboardInterrupt:
232 |         # Catch it, but don't treat it as an error; just stop execution.
233 |         log('keyboard interrupt received')
234 |         exit_code = ExitCode.user_interrupt
235 |     except Exception as ex:  # noqa: PIE786
236 |         exit_code = ExitCode.exception
237 |         import traceback
238 |         exception = sys.exc_info()
239 |         details = ''.join(traceback.format_exception(*exception))
240 |         log('exception: ' + str(ex) + '\n\n' + details)
241 |         if debugging and debug == '-':
242 |             from rich.console import Console
243 |             Console().print_exception()
244 |         else:
245 |             import taupe
246 |             line = 'unknown'
247 |             tb = ex.__traceback__
248 |             while tb.tb_next:
249 |                 tb = tb.tb_next
250 |             line = tb.tb_lineno
251 |             stop('Oh no! Taupe encountered an error. Please consider reporting'
252 |                  f' this to the developer. Your version of {taupe.__name__} is'
253 |                  f' {taupe.__version__} and the error occurred on line {line}.'
254 |                  f' For information about how to report this, please see the'
255 |                  f' project page at ' + taupe.__url__)
256 |
257 |     # Exit with status code ---------------------------------------------------
258 |
259 |     log(f'exiting with exit code {int(exit_code)}.')
260 |     sys.exit(int(exit_code))
261 |
262 |
263 | # Miscellaneous helpers.
264 | # .............................................................................
265 |
266 | # The functions for extracting URLs from the .js files (currently only like.js
267 | # and tweets.js) return a common intermediate format consisting of an iterable
268 | # that produces 4-tuples:
269 | #
270 | #   (date, url of my tweet, type, url of referenced tweet)
271 | #
272 | # The "type" can be one of "tweet", "reply", "retweet", "quote", or "like".
273 | # Some of the slots in the tuple are not filled in for all types. Notably, if
274 | # the type is "like", the date and tweet url are empty (because for a "liked"
275 | # tweet, it only makes sense to talk about the referenced tweet's URL).
276 | # Conversely, if we're not extracting "likes", then the referenced tweet url
277 | # slot only has a value for types "reply" and "quote".
278 | #
279 | # This kind of funneling of all types into a common intermediate form, even
280 | # though there is heterogeneity in the underlying data, is done to shorten
281 | # and simplify the code and not really for performance reasons. Performance
282 | # is currently not a concern because the expectation is that users won't run
283 | # this program very often anyway.
284 |
285 | def data_filter(requested):
286 |     return {
287 |         'all-tweets'  : lambda row: ','.join(row),
288 |         'my-tweets'   : lambda row: row[1],
289 |         'retweets'    : lambda row: row[1] if row[2] == 'retweet' else '',
290 |         'quote-tweets': lambda row: row[3] if row[2] == 'quote' else '',
291 |         'reply-tweets': lambda row: row[3] if row[2] == 'reply' else '',
292 |         'likes'       : lambda row: row[3],
293 |     }.get(requested)
294 |
295 |
296 | def likes_from(likes_file, username, canonical_urls = False):
297 |     '''Return the URLs from the like.js file in a Twitter archive.'''
298 |     import json
299 |     # The file starts with "window.YTD.like.part0 = ". Skip that and it's json.
300 |     likes_json = json.loads(likes_file[23:])
301 |     log(f'extracted {len(likes_json)} likes from the likes file')
302 |     likes_urls = (item['like']['expandedUrl'] for item in likes_json)
303 |     account = 'twitter' if canonical_urls else username
304 |     # Return the same 4-tuple format as tweets_from(...).
305 |     return (('', '', 'like', url.replace('i/web', account)) for url in likes_urls)
306 |
307 |
308 | def tweets_from(tweets_file, username, canonical_urls = False):
309 |     '''Return tuples of parsed data from tweets.js in a Twitter archive.'''
310 |     from dateutil.parser import parse
311 |     import json
312 |     import re
313 |
314 |     ending_in_twitter_url = re.compile(r'.*(https://t.co/\S+)$')
315 |
316 |     # Helper functions.
317 |
318 |     def user_from_tweet_url(url):
319 |         if canonical_urls:
320 |             return 'twitter'
321 |         else:
322 |             # Extract USERNAME from https://twitter.com/USERNAME/status/TWEETID
323 |             fragment = url[20:]
324 |             return fragment[: fragment.find('/')]
325 |
326 |     def tweet_url(tweet):
327 |         account = 'twitter' if canonical_urls else username
328 |         return 'https://twitter.com/' + account + '/status/' + tweet['id_str']
329 |
330 |     def tweet_date(tweet):
331 |         date = parse(tweet['created_at'])
332 |         return date.isoformat()
333 |
334 |     def tweet_data(tweet):
335 |         tdate = tweet_date(tweet)
336 |         turl = tweet_url(tweet)
337 |
338 |         # Figure out the type & extract reference URLs. Look for specific
339 |         # cases; default case is normal tweet, possibly with embedded media.
340 |         ttype = 'tweet'
341 |         tref = ''
342 |         if tweet.get('in_reply_to_status_id_str', None):
343 |             # Easiest case: replies.
344 |             ttype = 'reply'
345 |             if canonical_urls:
346 |                 author = 'twitter'
347 |             elif 'in_reply_to_screen_name' not in tweet:
348 |                 # This happens if the tweet being replied to has been deleted.
349 |                 log(f'reply tweet {tweet["id"]} refers to a deleted tweet')
350 |                 author = 'twitter'
351 |             else:
352 |                 author = tweet['in_reply_to_screen_name']
353 |             tweet_id = tweet['in_reply_to_status_id_str']
354 |             tref = 'https://twitter.com/' + author + '/status/' + tweet_id
355 |         elif tweet['full_text'].startswith('RT @'):
356 |             ttype = 'retweet'
357 |             # In my archive, the full_text of retweeted tweets is truncated,
358 |             # and the tweet object doesn't contain the retweeted tweet's id
359 |             # or a URL. (This is so even though, when I look up my retweet on
360 |             # Twitter, it shows info about the original tweet.) The archive is
361 |             # thus incomplete and I see no way to get the retweeted tweet's id.
362 |             tref = ''
363 |         elif (match := ending_in_twitter_url.match(tweet['full_text'])):
364 |             # This can be either a quote tweet or just a tweet with media in it.
365 |             embedded_url = match.group(1)
366 |             for entity in tweet['entities']['urls']:
367 |                 if entity['url'] != embedded_url:
368 |                     continue
369 |                 # Found the entity info for the URL we pulled from the text.
370 |                 expanded_url = entity['expanded_url']
371 |                 if not expanded_url.startswith('https://twitter.com'):
372 |                     # This is not a quote tweet after all.
373 |                     break
374 |                 author = user_from_tweet_url(expanded_url)
375 |                 tweet_id = expanded_url[expanded_url.rfind('/') + 1:]
376 |                 tref = 'https://twitter.com/' + author + '/status/' + tweet_id
377 |                 ttype = 'quote'
378 |                 break
379 |
380 |         return (tdate, turl, ttype, tref)
381 |
382 |     # The 26 is to skip the "window.YTD.tweets.part0 =" text at the start.
383 |     all_tweets = json.loads(tweets_file[26:])
384 |     log(f'found a total of {len(all_tweets)} tweets in the tweets file')
385 |     return sorted(tweet_data(tweet_json['tweet']) for tweet_json in all_tweets)
386 |
387 |
388 | def username_from(account_file):
389 |     '''Return the "username" from the account.js file in a Twitter archive.'''
390 |     import json
391 |     # The file starts w/ "window.YTD.account.part0 = ". Skip it; rest is json.
392 |     account_json = json.loads(account_file[27:])
393 |     username = account_json[0]['account']['username']
394 |     log(f'found username "{username}"')
395 |     return username
396 |
397 |
398 | def parsed_data(source_zip, requested, canonical_urls):
399 |     from zipfile import is_zipfile, ZipFile, BadZipFile, LargeZipFile
400 |     if not is_zipfile(source_zip):
401 |         stop('The input does not appear to be a ZIP file.', ExitCode.bad_arg)
402 |     log(f'parsing Twitter data to extract {requested}')
403 |     try:
404 |         username = None
405 |         with ZipFile(source_zip) as zf:
406 |             # First find the account name because we need it to construct URLs.
407 |             for item in zf.namelist():
408 |                 if item == 'data/account.js':
409 |                     with zf.open(item) as file_:
410 |                         username = username_from(file_.read())
411 |                     break
412 |             if not username:
413 |                 stop(f'Cannot find account.js file in {source_zip}', ExitCode.file_error)
414 |
415 |             # Now extract the tweets.
416 |             for item in zf.namelist():
417 |                 if item == 'data/like.js' and requested == 'likes':
418 |                     with zf.open(item) as file_:
419 |                         return likes_from(file_.read(), username, canonical_urls)
420 |                     break
421 |                 elif item == 'data/tweets.js':
422 |                     with zf.open(item) as file_:
423 |                         return tweets_from(file_.read(), username, canonical_urls)
424 |                     break
425 |         log('done parsing Twitter data')
426 |     except BadZipFile:
427 |         stop('Unable to parse ZIP archive.', ExitCode.file_error)
428 |     except LargeZipFile:
429 |         stop('Unable to parse very large ZIP archive.', ExitCode.file_error)
430 |
431 |
432 | def write_data(rows, dest):
433 |     log(f'writing output to {dest}')
434 |     try:
435 |         if dest == '-':
436 |             print(*rows, flush = True, sep = '\n')
437 |             sys.stdout.flush()
438 |         else:
439 |             with open(dest, 'w') as output:
440 |                 output.write('\n'.join(rows))
441 |     except IOError as ex:
442 |         # Check for broken pipe, as happens when the output is sent to "head".
443 |         if ex.errno == errno.EPIPE:
444 |             log('broken pipe')
445 |             import os
446 |             # This solution comes from a 2015-05-07 posting by user "mklement0"
447 |             # to Stack Overflow at https://stackoverflow.com/a/30091579/743730.
448 |             # Python flushes standard streams on exit, so redirect remaining
449 |             # output to devnull to avoid another BrokenPipeError at shutdown.
450 |             devnull = os.open(os.devnull, os.O_WRONLY)
451 |             os.dup2(devnull, sys.stdout.fileno())
452 |         else:
453 |             # A real error, not merely a broken pipe. Bubble up to caller.
454 |             raise
455 |
456 |
457 | def stop(msg, err = ExitCode.exception):
458 |     '''Print an error message and exit with an exit code.'''
459 |     log('printing to terminal: ' + msg)
460 |     from rich import print
461 |     print('[red]' + msg + '[/]')
462 |     log(f'exiting with exit code {int(err)}.')
463 |     sys.exit(int(err))
464 |
465 |
466 | # Main entry point.
467 | # .............................................................................
468 |
469 | # The following entry point definition is for the console_scripts keyword
470 | # option to setuptools. The entry point for console_scripts has to be a
471 | # function that takes zero arguments.
472 | def console_scripts_main():
473 |     plac.call(main)
474 |
475 |
476 | # The following allows users to invoke this using "python3 -m taupe" and also
477 | # pass it an argument of "help" to get the help text.
478 | if __name__ == '__main__':
479 |     if len(sys.argv) > 1 and sys.argv[1] == 'help':
480 |         plac.call(main, ['-h'])
481 |     else:
482 |         plac.call(main)
483 |
484 |
485 | # For Emacs users
486 | # .............................................................................
487 | # Local Variables:
488 | # mode: python
489 | # python-indent-offset: 4
490 | # End:
491 |
--------------------------------------------------------------------------------
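The comments in `taupe/__main__.py` above describe a common intermediate format of 4-tuples, `(date, url of my tweet, type, url of referenced tweet)`, which is then narrowed down by mapping a per-row function over the rows and dropping empty results. The following is a minimal, standalone sketch of that idea, not code from Taupe itself. The sample rows and URLs are made up, and the names `sample_rows`, `column_if_type`, and `extract` are hypothetical helpers introduced only for illustration.

```python
# Hypothetical rows in the (date, my tweet URL, type, referenced tweet URL)
# layout; none of these URLs come from a real archive.
sample_rows = [
    ('2022-11-01T09:00:00+00:00', 'https://twitter.com/me/status/1', 'tweet', ''),
    ('2022-11-02T09:00:00+00:00', 'https://twitter.com/me/status/2', 'reply',
     'https://twitter.com/alice/status/10'),
    ('2022-11-03T09:00:00+00:00', 'https://twitter.com/me/status/3', 'retweet', ''),
    ('2022-11-04T09:00:00+00:00', 'https://twitter.com/me/status/4', 'quote',
     'https://twitter.com/bob/status/20'),
]


def column_if_type(column, wanted_type):
    '''Return a function mapping a row to row[column], or '' if the type differs.'''
    return lambda row: row[column] if row[2] == wanted_type else ''


def extract(rows, row_filter):
    '''Apply row_filter to every row and drop the empty results.'''
    return list(filter(None, map(row_filter, rows)))


# Referenced-tweet URLs (slot 3) for quotes and replies; own URLs (slot 1)
# for retweets, since the retweet rows carry no referenced URL.
quote_urls = extract(sample_rows, column_if_type(3, 'quote'))
reply_urls = extract(sample_rows, column_if_type(3, 'reply'))
retweet_urls = extract(sample_rows, column_if_type(1, 'retweet'))

# Joining all four slots with commas gives one CSV-style line per row.
csv_lines = extract(sample_rows, ','.join)

print(quote_urls)
print(reply_urls)
print(retweet_urls)
print(len(csv_lines))
```

Returning an empty string for rows of the wrong type, and then filtering out falsy values, keeps each per-type selector to a single expression. The cost is that a matching row whose chosen slot is legitimately empty is dropped as well.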