├── .gitignore
├── LICENSE
├── README.md
├── horsephrase
    ├── __init__.py
    ├── __main__.py
    ├── _guess_guess.py
    ├── _implementation.py
    ├── _regen_words.py
    ├── py.typed
    └── words.txt
├── mypy.ini
├── requirements.txt
├── setup.cfg
├── setup.py
├── tests
    └── test_horsephrase.py
└── tox.ini


/.gitignore:
--------------------------------------------------------------------------------
1 | *.egg-info
2 | /build
3 | /dist
4 | .tox
5 | *.pyc


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | The MIT License (MIT)
 2 | 
 3 | Copyright (c) 2014 Glyph
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of
 6 | this software and associated documentation files (the "Software"), to deal in
 7 | the Software without restriction, including without limitation the rights to
 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
 9 | the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
17 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
18 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
19 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Horsephrase
 2 | 
 3 | Horsephrase is a human-readable password generator.
 4 | 
 5 | [XKCD makes some good points about password entropy](http://xkcd.com/936/), and
 6 | I thought I'd create a tool to help follow that advice.  It has been updated
 7 | somewhat from the XKCD strip's guidance.  For example, "a thousand guesses per
 8 | second" is an extremely low number; `horsephrase` instead assumes attackers can
 9 | perform a trillion guesses per second.
10 | 
11 | ## When To Use Horsephrase
12 | 
13 | For as many of your passwords as possible, you do *not* want to try to
14 | creatively, or randomly, come up with new ones.  You cannot possibly remember
15 | all the passwords a normal person needs to use. You should be using a password
16 | manager, such as [Dashlane](https://www.dashlane.com),
17 | [LastPass](https://lastpass.com), [KeePass](http://keepass.info) or
18 | [1Password](https://agilebits.com/onepassword).
19 | 
20 | For *most* of your passwords, you should just be using your password manager's
21 | "generate" function to generate passwords which are long, totally random line
22 | noise that you could not possibly remember and could not easily communicate
23 | without copying and pasting.
24 | 
25 | However, ultimately you need a *few* passwords you can remember and possibly
26 | pronounce:
27 | 
28 | 1. an unlock code for your phone, which you have to type in
29 | 2. a login password for your local computer
30 | 3. a master password for that password manager
31 | 4. WiFi passwords which need to be frequently shared via analog means, since
32 |    the device they're being typed into isn't on the network yet
33 | 5. the password to certain online accounts, such as app stores, which may be
34 |    necessary to access new devices or get access to the account that lets you
35 |    install your password manager of choice onto a device.
36 | 
37 | For *these* passwords, `horsephrase` can come in handy.
38 | 
39 | ## How To Use Horsephrase
40 | 
41 | You can generate a new password by simply typing:
42 | 
43 | ```console
44 | $ horsephrase generate
45 | ```
46 | 
47 | at a command prompt.
48 | 
49 | You can customize `horsephrase` a little by supplying your own word list and
50 | choosing how many words to use; see `horsephrase --help` for details.  To
51 | estimate how long it would take an attacker to guess your password, assuming a
52 | trillion guesses per second and your current word list and word count, use the
53 | `estimate` command instead; it prints a human-readable time interval after
54 | which an attacker will probably have guessed your password.  You should
55 | probably rotate your password significantly more often than this, since your
56 | passwords can be compromised in ways other than simply guessing.  The default
57 | configuration of `horsephrase` should be good enough that you don't need to
58 | tweak it much:
59 | 
60 | ```console
61 | $ horsephrase estimate
62 | 116 years, 20 weeks, 1 day, 21 hours, 13 minutes, and 30 seconds
63 | ```
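
The figure above can be reproduced by hand. The following is a minimal sketch, not the tool's own code path: it assumes a 23,600-word list (the `NUM_WORDS` value in `_regen_words.py`), the default of 5 words, a trillion guesses per second, and an attacker who finds the password after searching half of the possibilities. The constant names are illustrative only.

```python
# Assumptions (labeled, not read from the installed package):
WORDS = 23_600                 # size of the word list
COUNT = 5                      # words per passphrase
GUESSES_PER_SECOND = 10**12    # a trillion guesses per second
OPTIMISM = 2                   # attacker succeeds after 1/2 of the search space

combinations = WORDS ** COUNT
seconds = combinations // (GUESSES_PER_SECOND * OPTIMISM)

# Break the total into units, using 7-day weeks and 52-week years.
minutes, secs = divmod(seconds, 60)
hours, mins = divmod(minutes, 60)
days, hrs = divmod(hours, 24)
weeks, dys = divmod(days, 7)
years, wks = divmod(weeks, 52)

print(seconds)                             # 3660412410
print(years, wks, dys, hrs, mins, secs)    # 116 20 1 21 13 30
```

Under those assumptions the breakdown matches the sample output: 116 years, 20 weeks, 1 day, 21 hours, 13 minutes, and 30 seconds.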
64 | 
65 | ## Technical Note
66 | 
67 | Just so you know, `horsephrase` uses Python's `SystemRandom` API, which pulls
68 | entropy from `/dev/urandom`, which is
69 | [the correct way to do it](http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/).
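
As a rough illustration of that approach, the sketch below picks words with `random.SystemRandom`, which draws from the operating system's entropy source rather than a seeded generator. The `sample_words` list is a hypothetical stand-in; the real tool reads its words from `words.txt`.

```python
from random import SystemRandom

# Hypothetical stand-in word list, for illustration only.
sample_words = ["correct", "horse", "battery", "staple", "orange", "window"]

# SystemRandom is backed by os.urandom, so it cannot be seeded or replayed.
rng = SystemRandom()

# Choose four words independently (repeats are allowed) and join with spaces.
phrase = " ".join(rng.choice(sample_words) for _ in range(4))
print(phrase)
```

Each run prints a different phrase, so no expected output is shown.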
70 | 


--------------------------------------------------------------------------------
/horsephrase/__init__.py:
--------------------------------------------------------------------------------
1 | 
2 | from ._implementation import output
3 | from .__main__ import main
4 | 
5 | __all__ = [
6 |     'output',
7 |     'main'
8 | ]
9 | 


--------------------------------------------------------------------------------
/horsephrase/__main__.py:
--------------------------------------------------------------------------------
  1 | 
  2 | from __future__ import annotations
  3 | from ._implementation import generate, words
  4 | from ._guess_guess import humantime, how_long
  5 | 
  6 | from typing import Iterator, ContextManager
  7 | 
  8 | import sys
  9 | 
 10 | if sys.version_info < (3,8):
 11 |     from typing_extensions import Protocol
 12 | else:
 13 |     from typing import Protocol
 14 | 
 15 | from os.path import join as pathjoin, normpath
 16 | import argparse
 17 | import string
 18 | from io import BytesIO, StringIO, TextIOWrapper
 19 | 
 20 | def do_generate(namespace: NameSpace) -> None:
 21 |     print(generate(number=namespace.count,
 22 |                    words=namespace.wordlist,
 23 |                    joiner=namespace.joiner))
 24 | 
 25 | def do_estimate(namespace: NameSpace) -> None:
 26 |     seconds = how_long(length=namespace.count,
 27 |                        choices=len(namespace.wordlist),
 28 |                        speed=namespace.guesses_per_second)
 29 |     if namespace.numeric:
 30 |         result = repr(seconds)
 31 |     else:
 32 |         result = humantime(seconds)
 33 |     print(result)
 34 | 
 35 | 
 36 | def make_parser() -> argparse.ArgumentParser:
 37 |     parser = argparse.ArgumentParser(description="Generate secure passwords.")
 38 |     parser.add_argument(
 39 |         "--count", type=int, default=None,
 40 |         help="the number of tokens (words or hex digits) to print"
 41 |     )
 42 | 
 43 |     sourcegroup = parser.add_mutually_exclusive_group()
 44 |     sourcegroup.add_argument(
 45 |         "--words", type=argparse.FileType("r"),
 46 |         default=normpath(pathjoin(__file__, "..", "words.txt")),
 47 |         help="the filename of a words file to use"
 48 |     )
 49 |     sourcegroup.add_argument(
 50 |         "--hex", action="store_true",
 51 |         help="generate a hexadecimal key"
 52 |     )
 53 |     sourcegroup.add_argument(
 54 |         "--letters", action="store_true", help="generate a short combination of letters for easier typing, e.g. for access points"
 55 |     )
 56 |     return parser
 57 | 
 58 | 
 59 | without_subparsers = make_parser()
 60 | parser = make_parser()
 61 | subparsers = parser.add_subparsers()
 62 | subparsers.required = True
 63 | 
 64 | parser_generate = subparsers.add_parser("generate")
 65 | parser_generate.set_defaults(do_verb=do_generate)
 66 | 
 67 | parser_estimate = subparsers.add_parser("estimate")
 68 | parser_estimate.add_argument("--guesses-per-second",
 69 |                              default=1000 * 1000 * 1000 * 1000,
 70 |                              type=int)
 71 | parser_estimate.add_argument("--numeric", action="store_true")
 72 | parser_estimate.set_defaults(do_verb=do_estimate)
 73 | 
 74 | import contextlib
 75 | 
 76 | @contextlib.contextmanager
 77 | def captured_output() -> Iterator[StringIO]:
 78 |     buffer = StringIO()
 79 |     out = sys.stdout
 80 |     err = sys.stderr
 81 |     sys.stdout = buffer
 82 |     sys.stderr = buffer
 83 |     try:
 84 |         yield buffer
 85 |     finally:
 86 |         sys.stdout = out
 87 |         sys.stderr = err
 88 | 
 89 | class NameSpace(Protocol):
 90 |     wordlist: list[str]
 91 |     count: int
 92 |     joiner: str
 93 |     guesses_per_second: int
 94 |     numeric: bool
 95 |     hex: bool
 96 |     letters: bool
 97 |     words: ContextManager[TextIOWrapper]
 98 | 
 99 |     def do_verb(self, namespace: NameSpace) -> None:
100 |         pass
101 | 
102 | def parse_command_line(argv: list[str]) -> NameSpace:
103 |     try:
104 |         with captured_output() as captured:
105 |             return parser.parse_args(argv)
106 |     except:
107 |         try:
108 |             with captured_output() as recaptured:
109 |                 without_subparsers.parse_args(argv)
110 |         except:
111 |             print(captured.getvalue(), end="")
112 |             raise
113 |         else:
114 |             print(recaptured.getvalue(), end="")
115 |         return parser.parse_args(argv + ["generate"])
116 | 
117 | 
118 | def main() -> None:
119 |     namespace = parse_command_line(sys.argv[1:])
120 |     if namespace.hex:
121 |         namespace.wordlist = list(sorted(set(string.hexdigits.lower())))
122 |         namespace.joiner = ""
123 |         if namespace.count is None:
124 |             namespace.count = 20
125 |     elif namespace.letters:
126 |         namespace.wordlist = list(string.ascii_letters)
127 |         namespace.joiner = ""
128 |         if namespace.count is None:
129 |             namespace.count = 13
130 |     else:
131 |         if namespace.count is None:
132 |             namespace.count = 5
133 |         namespace.joiner = " "
134 |         with namespace.words as wordfile:
135 |             namespace.wordlist = wordfile.read().split()
136 |     namespace.do_verb(namespace)
137 | 
138 | if __name__ == '__main__':
139 |     main()
140 | 


--------------------------------------------------------------------------------
/horsephrase/_guess_guess.py:
--------------------------------------------------------------------------------
 1 | # Guess how many guesses it will take to guess a password.
 2 | 
 3 | from __future__ import annotations
 4 | from typing import Iterable
 5 | 
 6 | from ._implementation import words
 7 | 
 8 | def how_long(length: int=4, choices: int=len(words), speed: int=1000 * 1000 * 1000 * 1000,
 9 |              optimism: int=2) -> int:
10 |     """
11 |     How long might it take to guess a password?
12 | 
13 |     @param length: the number of words that we're going to choose.
14 | 
15 |     @param choices: the number of words we might choose between.
16 | 
17 |     @param speed: the speed of our hypothetical password guesser, in guesses
18 |         per second.
19 | 
20 |     @param optimism: When we start guessing all the options, we probably won't
21 |         have to guess I{all} of them to get a hit.  This assumes that the
22 |         guesser will have to guess only C{1/optimism} of the total number of
23 |         possible options before it finds a hit.
24 |     """
25 |     # https://github.com/python/mypy/issues/7765
26 |     assert choices > 0
27 |     assert length > 0
28 |     count: int = (choices ** length)
29 |     return count // (speed * optimism)
30 | 
31 | 
32 | 
33 | def redivmod(initial_value: float, factors: Iterable[tuple[int,str]]) -> str:
34 |     """
35 |     Chop up C{initial_value} according to the list of C{factors} and return a
36 |     formatted string.
37 |     """
38 |     result: list[str] = []
39 |     value = initial_value
40 |     for divisor, label in factors:
41 |         if not divisor:
42 |             remainder = value
43 |             if not remainder:
44 |                 break
45 |         else:
46 |             value, remainder = divmod(value, divisor)
47 |             if not value and not remainder:
48 |                 break
49 |         if remainder == 1:
50 |             # depluralize
51 |             label = label[:-1]
52 |         addition = str(remainder) + ' ' + str(label)
53 |         result.insert(0, addition)
54 |     if len(result) > 1:
55 |         result[-1] = "and " + result[-1]
56 |     if result:
57 |         return ', '.join(result)
58 |     else:
59 |         return "instantly"
60 | 
61 | 
62 | 
63 | def humantime(seconds: float) -> str:
64 |     """
65 |     A human-readable interpretation of a time interval.
66 | 
67 |     @param seconds: A number of seconds.
68 | 
69 |     @return: A string describing the time interval.
70 |     """
71 |     return redivmod(seconds, [(60, "seconds"),
72 |                               (60, "minutes"),
73 |                               (24, "hours"),
74 |                               (7, "days"),
75 |                               (52, "weeks"),
76 |                               (0, "years")])
77 | 
78 | if __name__ == "__main__":
79 |     import sys
80 |     print(humantime(how_long(*map(int, sys.argv[1:]))))
81 | 


--------------------------------------------------------------------------------
/horsephrase/_implementation.py:
--------------------------------------------------------------------------------
 1 | from __future__ import annotations
 2 | import sys
 3 | 
 4 | # See http://jbauman.com/aboutgsl.html
 5 | parent_module = ".".join(__name__.split(".")[:-1])
 6 | wordfile_name = "words.txt"
 7 | 
 8 | from typing import Callable
 9 | 
10 | _read_text: Callable[[str, str], str]
11 | 
12 | if sys.version_info >= (3, 9):
13 |     from importlib.resources import files
14 | 
15 |     def _read_text(module: str, filename: str) -> str:
16 |         with (files(module) / filename).open() as f:
17 |             return f.read()
18 | 
19 | else:
20 |     from importlib.resources import read_text as _read_text
21 | 
22 | words = _read_text(parent_module, wordfile_name).strip().split("\n")
23 | 
24 | 
25 | from random import SystemRandom
26 | from typing import Sequence
27 | 
28 | 
29 | def generate(
30 |     number: int = 4,
31 |     choice: Callable[[Sequence[str]], str] = SystemRandom().choice,
32 |     words: list[str] = words,
33 |     joiner: str = " ",
34 | ) -> str:
35 |     """
36 |     Generate a random passphrase from the GSL.
37 |     """
38 |     return joiner.join(choice(words) for each in range(number))
39 | 
40 | 
41 | def output() -> int:
42 |     print(generate())
43 |     return 0
44 | 


--------------------------------------------------------------------------------
/horsephrase/_regen_words.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Run with 'python -m horsephrase._regen_words > horsephrase/words.txt'
 3 | 
 4 | - Stop allowing words of 3 characters or fewer; if we have the possibility
 5 |   of words that short, it's trivially possible to attack the password as
 6 |   letters rather than as selections from this list.
 7 | 
 8 | - Add words (both sourced from https://norvig.com/ngrams/) from a list of
 9 |   correctly-spelled words (the YAWL) in the order of word
10 |   frequency (count_1w.txt) until we reach a desirable count
11 | 
12 | - The oldest recorded living human -- courtesy of this list
13 |   https://en.wikipedia.org/wiki/List_of_oldest_living_people - is
14 |   presently 116 years and 74 days old. Determine the desirable count
15 |   from the number of guesses that will exceed that time with a 5-word
16 |   passphrase assuming a trillion guesses per second.
17 | 
18 | - Remove words that are offensive, triggering or otherwise in poor taste.
19 |   Horsephrases should be communicable to people over phone lines without
20 |   being embarrassing, offensive or unprofessional.
21 |   If generating a horsephrase generates something offensive, add the sha256 of
22 |   the offending word to _removed_words, run the command at the start of this
23 |   module, and open a PR with both changes.
24 | """
25 | from typing import Iterator, Iterable
26 | 
27 | 
28 | if __name__ != "__main__":
29 |     raise ImportError("module is not importable")
30 | 
31 | import hashlib
32 | import itertools
33 | 
34 | import requests
35 | 
36 | # Sized so that a 5-word passphrase outlasts the oldest
37 | # living human; see the module docstring.
38 | NUM_WORDS = 23600
39 | 
40 | # Removed words are specified by their hash,
41 | # as we do not want to offend people who read the source.
42 | _removed_words = set([
43 |     '31506a8448a761a448a08aa69d9116ea8a6cb1c6b3f4244b3043051f69c9cc3c',
44 |     'e9b6438440bf1991a49cfc2032b47e4bde26b7d7a6bf7594ec6f308ca1f5797c',
45 | ])
46 | 
47 | def get_words(session: requests.Session) -> Iterator[str]:
48 |     yawl = session.get("https://norvig.com/ngrams/word.list")
49 |     correct = set(yawl.text.split())
50 |     counts = session.get("https://norvig.com/ngrams/count_1w.txt")
51 |     for line in counts.text.splitlines():
52 |         word, count = line.split()
53 |         if word not in correct:
54 |             continue
55 |         yield word
56 | 
57 | def valid_words(words: Iterable[str]) -> Iterator[str]:
58 |     for word in words:
59 |         if len(word) <= 3:
60 |             continue
61 |         digest = hashlib.sha256(word.encode('ascii')).hexdigest()
62 |         if digest in _removed_words:
63 |             continue
64 |         yield word
65 | 
66 | for word in sorted(itertools.islice(valid_words(get_words(requests.Session())),
67 |                                     NUM_WORDS)):
68 |     print(word)
69 | 


--------------------------------------------------------------------------------
/horsephrase/py.typed:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/glyph/horsephrase/eea5c9845ef8b6287f935d3359c95b6cf87bd0a5/horsephrase/py.typed


--------------------------------------------------------------------------------
/mypy.ini:
--------------------------------------------------------------------------------
1 | [mypy]
2 | strict=True


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | tox==4.0.3
2 | pytest==7.2.0
3 | 


--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [bdist_wheel]
2 | # no py2 tag for pure-python dist:
3 | universal=0
4 | 


--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
 1 | 
 2 | from setuptools import setup
 3 | 
 4 | setup(
 5 |     name="horsephrase",
 6 |     version="2022.12.9.1",
 7 |     description="Secure password generator.",
 8 |     long_description=(
 9 |         "Like http://correcthorsebatterystaple.net/ except it's not a web page"
10 |         " which is logging your passwords and sending them all to the NSA."
11 |     ),
12 |     author="Glyph",
13 |     author_email="glyph@twistedmatrix.com",
14 |     maintainer="Glyph",
15 |     maintainer_email="glyph@twistedmatrix.com",
16 |     url="https://github.com/glyph/horsephrase/",
17 |     packages=["horsephrase"],
18 |     package_data=dict(
19 |         horsephrase=["*.txt", "py.typed"],
20 |     ),
21 |     license="MIT",
22 |     python_requires='>=3.7',
23 |     classifiers=[
24 |         "Programming Language :: Python :: 3.9",
25 |         "Programming Language :: Python :: 3.10",
26 |         "Programming Language :: Python :: 3.11",
27 |     ],
28 |     entry_points={
29 |         "console_scripts": [
30 |             "horsephrase = horsephrase.__main__:main",
31 |         ],
32 |     },
33 |     extras_require={
34 |         ':python_version < "3.8"': ['typing_extensions'],
35 |         'dev': ['requests'],
36 |     }
37 | )
38 | 


--------------------------------------------------------------------------------
/tests/test_horsephrase.py:
--------------------------------------------------------------------------------
 1 | from io import StringIO
 2 | 
 3 | import builtins as builtins_module
 4 | from unittest import mock
 5 | import sys
 6 | 
 7 | 
 8 | import horsephrase
 9 | 
10 | 
11 | def test_generate() -> None:
12 |     pw = horsephrase._implementation.generate()
13 |     assert all(word in horsephrase._implementation.words for word in pw.split(" "))
14 | 
15 | 
16 | def test_estimate() -> None:
17 |     with mock.patch.object(builtins_module, "print") as mock_print:
18 |         namespace = mock.MagicMock(
19 |             count=5,
20 |             wordlist=horsephrase._implementation.words,
21 |             guesses_per_second=1000 * 1000 * 1000 * 1000,
22 |             numeric=True,
23 |         )
24 |         horsephrase.__main__.do_estimate(namespace)
25 |         assert mock_print.call_count
26 | 
27 | 
28 | def test_main() -> None:
29 |     with mock.patch.object(builtins_module, "print") as mock_print, mock.patch.object(
30 |         sys, "argv", ["horsephrase", "generate"]
31 |     ):
32 |         horsephrase.__main__.main()
33 |         assert mock_print.call_count
34 | 


--------------------------------------------------------------------------------
/tox.ini:
--------------------------------------------------------------------------------
 1 | [tox]
 2 | envlist = py37,py38,py39,py310,py311,mypy
 3 | [testenv]
 4 | deps=pytest
 5 | commands=pytest -s
 6 | [testenv:mypy]
 7 | commands=mypy horsephrase tests
 8 | deps=
 9 |     mypy
10 |     types-requests


--------------------------------------------------------------------------------