├── debian
│   ├── compat
│   ├── source
│   │   └── format
│   ├── clean
│   ├── watch
│   ├── upstream
│   │   └── metadata
│   ├── rules
│   ├── changelog
│   ├── control
│   └── copyright
├── .gitignore
├── tldp
│   ├── .gitignore
│   ├── __init__.py
│   ├── doctypes
│   │   ├── __init__.py
│   │   ├── rst.py
│   │   ├── example.py
│   │   ├── markdown.py
│   │   ├── asciidoc.py
│   │   ├── linuxdoc.py
│   │   ├── docbook4xml.py
│   │   ├── docbook5xml.py
│   │   └── docbooksgml.py
│   ├── ldpcollection.py
│   ├── typeguesser.py
│   ├── config.py
│   ├── inventory.py
│   ├── outputs.py
│   ├── sources.py
│   └── utils.py
├── requirements.txt
├── tests
│   ├── sample-documents
│   │   ├── Linuxdoc-Larger
│   │   │   ├── images
│   │   │   │   ├── tiny.png
│   │   │   │   └── another.png
│   │   │   └── Linuxdoc-Larger.sgml
│   │   ├── ISO-8859-1.sgml
│   │   ├── Unknown-Doctype.xqf
│   │   ├── DocBook-4.2-WHYNOT
│   │   │   ├── images
│   │   │   │   ├── warning.jpg
│   │   │   │   ├── warning.png
│   │   │   │   └── warning.svg
│   │   │   ├── disappearing.xml
│   │   │   └── DocBook-4.2-WHYNOT.xml
│   │   ├── DocBookSGML-Larger
│   │   │   ├── images
│   │   │   │   └── bullet.png
│   │   │   ├── DocBookSGML-Larger.sgml
│   │   │   └── index.sgml
│   │   ├── docbook5xml-simple.xml
│   │   ├── linuxdoc-simple.sgml
│   │   ├── Bad-Dir-Multiple-Doctypes
│   │   │   ├── Bad-Dir-Multiple-Doctypes.xml
│   │   │   └── Bad-Dir-Multiple-Doctypes.sgml
│   │   ├── docbook4xml-simple.xml
│   │   ├── docbooksgml-simple.sgml
│   │   ├── docbook4xml-broken.xml
│   │   └── asciidoc-complete.txt
│   ├── test_config.py
│   ├── example.py
│   ├── test_outputs.py
│   ├── test_typeguesser.py
│   ├── long_inventory.py
│   ├── test_inventory.py
│   ├── long_driver.py
│   ├── test_cascadingconfig.py
│   ├── test_sources.py
│   ├── tldptesttools.py
│   └── test_utils.py
├── MANIFEST.in
├── tox.ini
├── contrib
│   ├── debian-release.sh
│   ├── rpm-release.py
│   ├── tldp.spec
│   └── tldp.spec.in
├── NOTES.rst
├── extras
│   ├── xsl
│   │   ├── tldp-one-page.xsl
│   │   ├── tldp-chapters.xsl
│   │   ├── tldp-common.xsl
│   │   ├── tldp-sections.xsl
│   │   └── tldp-print.xsl
│   └── css
│       └── style.css
├── LICENSE
├── TODO
├── .travis.yml
├── setup.py
├── etc
│   └── ldptool.ini
├── docs
│   └── conf.py
└── ChangeLog
/debian/compat:
--------------------------------------------------------------------------------
1 | 9
2 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | .coverage
--------------------------------------------------------------------------------
/debian/source/format:
--------------------------------------------------------------------------------
1 | 3.0 (quilt)
2 |
--------------------------------------------------------------------------------
/tldp/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | *.swp
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | networkx
2 | nose
3 | coverage
--------------------------------------------------------------------------------
/debian/clean:
--------------------------------------------------------------------------------
1 | tldp.egg-info/
2 | docs/_build/
3 | .coverage/
4 | .tox/
--------------------------------------------------------------------------------
/tests/sample-documents/Linuxdoc-Larger/images/tiny.png:
--------------------------------------------------------------------------------
1 | ../../DocBookSGML-Larger/images/bullet.png
--------------------------------------------------------------------------------
/tests/sample-documents/Linuxdoc-Larger/images/another.png:
--------------------------------------------------------------------------------
1 |
../../DocBookSGML-Larger/images/bullet.png -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include README.rst 2 | recursive-include etc * 3 | recursive-include docs * 4 | recursive-include contrib * 5 | -------------------------------------------------------------------------------- /tests/sample-documents/ISO-8859-1.sgml: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tLDP/python-tldp/HEAD/tests/sample-documents/ISO-8859-1.sgml -------------------------------------------------------------------------------- /debian/watch: -------------------------------------------------------------------------------- 1 | version=3 2 | opts=uversionmangle=s/(rc|a|b|c)/~$1/ \ 3 | https://pypi.debian.net/tldp/tldp-(.+)\.(?:zip|tgz|tbz|txz|(?:tar\.(?:gz|bz2|xz))) 4 | -------------------------------------------------------------------------------- /tests/sample-documents/Unknown-Doctype.xqf: -------------------------------------------------------------------------------- 1 | 2 | This is a deliberately weird file...without document signature and 3 | with a completely unknown extension. 4 | -------------------------------------------------------------------------------- /tests/sample-documents/DocBook-4.2-WHYNOT/images/warning.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tLDP/python-tldp/HEAD/tests/sample-documents/DocBook-4.2-WHYNOT/images/warning.jpg -------------------------------------------------------------------------------- /tests/sample-documents/DocBook-4.2-WHYNOT/images/warning.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tLDP/python-tldp/HEAD/tests/sample-documents/DocBook-4.2-WHYNOT/images/warning.png -------------------------------------------------------------------------------- /tests/sample-documents/DocBookSGML-Larger/images/bullet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tLDP/python-tldp/HEAD/tests/sample-documents/DocBookSGML-Larger/images/bullet.png -------------------------------------------------------------------------------- /debian/upstream/metadata: -------------------------------------------------------------------------------- 1 | Bug-Database: https://github.com/tLDP/LDP/issues 2 | Contact: discuss@en.tldp.org 3 | Name: python-tldp 4 | Repository: https://github.com/tLDP/LDP 5 | -------------------------------------------------------------------------------- /tests/sample-documents/DocBook-4.2-WHYNOT/disappearing.xml: -------------------------------------------------------------------------------- 1 |
2 | 3 | I am just a disappearing file, for use in the long_inventory.py test. 4 | 5 |
6 |
--------------------------------------------------------------------------------
/tests/sample-documents/docbook5xml-simple.xml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="utf-8"?>
2 | <article xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="en">
3 | <title>Simple article</title>
4 | <para>This is a ridiculously terse article.</para>
5 | </article>
6 | -------------------------------------------------------------------------------- /tests/sample-documents/linuxdoc-simple.sgml: -------------------------------------------------------------------------------- 1 | 2 |
3 | B 4 | <author>A 5 | <date>2016-02-11 6 | <abstract> abstract </abstract> 7 | <toc> 8 | <sect>Introduction 9 | <p> 10 | <sect>Stuff. 11 | <p> 12 | <sect>More-stuff. 13 | <p> 14 | </article> 15 | -------------------------------------------------------------------------------- /debian/rules: -------------------------------------------------------------------------------- 1 | #!/usr/bin/make -f 2 | # export PYBUILD_VERBOSE=1 3 | export PYBUILD_NAME=tldp 4 | 5 | %: 6 | dh $@ --with=python3 --buildsystem=pybuild 7 | 8 | override_dh_installman: 9 | (cd docs && \ 10 | sphinx-build -b man -d _build/doctrees . _build/man) 11 | dh_installman docs/_build/man/ldptool.1 12 | -------------------------------------------------------------------------------- /tests/sample-documents/Bad-Dir-Multiple-Doctypes/Bad-Dir-Multiple-Doctypes.xml: -------------------------------------------------------------------------------- 1 | <?xml version="1.0" encoding="utf-8"?> 2 | <article xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="en"> 3 | <title>Bad Dir Multiple Doctypes (DocBook XML 5.0) 4 | This is a ridiculously terse article. 5 |
6 | -------------------------------------------------------------------------------- /tldp/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import 6 | from __future__ import unicode_literals 7 | 8 | import tldp.config 9 | import tldp.outputs 10 | import tldp.sources 11 | import tldp.inventory 12 | 13 | VERSION = "0.7.15" 14 | -------------------------------------------------------------------------------- /tests/sample-documents/Linuxdoc-Larger/Linuxdoc-Larger.sgml: -------------------------------------------------------------------------------- 1 | 2 |
3 | Linuxdoc Larger Document 4 | <author>Another Author 5 | <date>2016-02-11 6 | <abstract> abstract </abstract> 7 | <toc> 8 | <sect>Introduction 9 | <p> 10 | <sect>Stuff. 11 | <p> 12 | <sect>More-stuff. 13 | <p> 14 | </article> 15 | -------------------------------------------------------------------------------- /tldp/doctypes/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import 6 | 7 | from tldp.doctypes.asciidoc import Asciidoc 8 | from tldp.doctypes.linuxdoc import Linuxdoc 9 | from tldp.doctypes.docbooksgml import DocbookSGML 10 | from tldp.doctypes.docbook4xml import Docbook4XML 11 | from tldp.doctypes.docbook5xml import Docbook5XML 12 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | # Tox (http://tox.testrun.org/) is a tool for running tests 2 | # in multiple virtualenvs. This configuration file will run the 3 | # test suite on all supported python versions. To use it, "pip install tox" 4 | # and then run "tox" from this directory. 5 | 6 | [tox] 7 | envlist = py39, py310 8 | skip_missing_interpreters = True 9 | 10 | [testenv] 11 | commands = {envpython} setup.py test 12 | deps = 13 | networkx 14 | -------------------------------------------------------------------------------- /tldp/doctypes/rst.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import logging 10 | 11 | from tldp.doctypes.common import BaseDoctype 12 | 13 | logger = logging.getLogger(__name__) 14 | 15 | 16 | class RestructuredText(BaseDoctype): 17 | formatname = 'reStructuredText' 18 | extensions = ['.rst'] 19 | signatures = [] 20 | 21 | 22 | # 23 | # -- end of file 24 | -------------------------------------------------------------------------------- /tldp/doctypes/example.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import logging 10 | 11 | from tldp.doctypes.common import BaseDoctype 12 | 13 | logger = logging.getLogger(__name__) 14 | 15 | 16 | class Frobnitz(BaseDoctype): 17 | formatname = 'Frobnitz' 18 | extensions = ['.fb'] 19 | signatures = ['{{Frobnitz-Format 2.3}}'] 20 | 21 | # 22 | # -- end of file 23 | -------------------------------------------------------------------------------- /contrib/debian-release.sh: -------------------------------------------------------------------------------- 1 | #! 
/bin/bash 2 | # 3 | # 4 | 5 | set -e 6 | set -x 7 | set -o pipefail 8 | 9 | PACKAGE=$(dpkg-parsechangelog | awk '/Source:/{print $2}') 10 | VERSION=$(dpkg-parsechangelog | awk -F'[- ]' '/Version:/{print $2}') 11 | 12 | PREFIX="${PACKAGE}-${VERSION}" 13 | TARBALL="../${PACKAGE}_${VERSION}.orig.tar.xz" 14 | 15 | git archive \ 16 | --format tar \ 17 | --prefix "${PREFIX}/" \ 18 | "${PREFIX}" \ 19 | | xz \ 20 | --compress \ 21 | --to-stdout \ 22 | > "${TARBALL}" 23 | 24 | exec debuild "$@" 25 | 26 | # -- end of file 27 | -------------------------------------------------------------------------------- /tldp/doctypes/markdown.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import logging 10 | 11 | from tldp.doctypes.common import BaseDoctype 12 | 13 | logger = logging.getLogger(__name__) 14 | 15 | 16 | class Markdown(BaseDoctype): 17 | formatname = 'Markdown' 18 | extensions = ['.md'] 19 | signatures = [] 20 | tools = ['pandoc'] 21 | 22 | # 23 | # -- end of file 24 | -------------------------------------------------------------------------------- /contrib/rpm-release.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # 3 | # 4 | 5 | from __future__ import print_function 6 | 7 | import os 8 | import sys 9 | 10 | opd = os.path.dirname 11 | opj = os.path.join 12 | 13 | sys.path.insert(0, opd(opd(__file__))) 14 | from tldp import VERSION 15 | 16 | fin = open(opj(opd(__file__), 'tldp.spec.in')) 17 | fout = open(opj(opd(__file__), 'tldp.spec'), 'w') 18 | 19 | def transform(mapping, text): 20 | for tag, replacement in mapping.items(): 21 | text = text.replace(tag, replacement) 22 | return text 23 | 24 | subst = {'@VERSION@': VERSION} 25 | print(subst) 26 | 27 | fout.write(transform(subst, fin.read())) 28 | 29 | # -- end of file 30 | -------------------------------------------------------------------------------- /debian/changelog: -------------------------------------------------------------------------------- 1 | tldp (0.7.15-1) unstable; urgency=low 2 | 3 | * support Python3.8+: fix import for MutableMapping and other minor fixes 4 | 5 | tldp (0.7.14-1) unstable; urgency=low 6 | 7 | * Add --version option. 8 | 9 | -- Martin A. Brown <martin@linux-ip.net> Mon, 16 May 2016 16:54:47 +0000 10 | 11 | tldp (0.7.13-1) unstable; urgency=low 12 | 13 | * Fix testsuite when run as root (Closes: #824201). 14 | 15 | -- Martin A. Brown <martin@linux-ip.net> Fri, 13 May 2016 16:28:22 +0000 16 | 17 | tldp (0.7.12-1) unstable; urgency=low 18 | 19 | * Initial release (Closes: #822181) 20 | 21 | -- Martin A. Brown <martin@linux-ip.net> Wed, 27 Apr 2016 17:09:56 +0000 22 | -------------------------------------------------------------------------------- /NOTES.rst: -------------------------------------------------------------------------------- 1 | Notes to future self 2 | ++++++++++++++++++++ 3 | 4 | To release a new version for different software consumers. 5 | 6 | * commit all of the changes you want 7 | * bump version in tldp/__init__.py 8 | * adjust debian/changelog in accordance with Debian policy 9 | N.B. 
the version must match what you put in tldp/__init__.py
10 | * run 'python contrib/rpm-release.py' which will regenerate a
11 | contrib/tldp.spec with the correct version
12 | * commit debian/changelog tldp/__init__.py and contrib/tldp.spec
13 | * tag the release
14 | * run 'git push origin master --tags'
15 | * run 'python setup.py sdist upload -r pypi'
16 | * run 'bash contrib/debian-release.sh' (on a Debian-ish box)
17 |
18 |
19 |
--------------------------------------------------------------------------------
/extras/xsl/tldp-one-page.xsl:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="ISO-8859-1"?>
2 |
3 | <xsl:stylesheet version="1.0"
4 | xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
5 |
6 | <!-- TLDP One Page HTML XSL; create a single-page HTML output
7 | This is a small customization layer on top of upstream
8 | docbook-xsl-stylesheets. Since the XML_CATALOG_FILES will locate the
9 | installed version of the required import resources, we will use the
10 | system identifier in the xsl:import line.
11 | -->
12 | <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>
13 | <xsl:import href="tldp-common.xsl"/>
14 |
15 | <!-- This set of customizations is used to generate the entire XML
16 | document on a single HTML page. -->
17 |
18 | </xsl:stylesheet>
19 |
--------------------------------------------------------------------------------
/extras/xsl/tldp-chapters.xsl:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="ISO-8859-1"?>
2 |
3 | <xsl:stylesheet version="1.0"
4 | xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
5 |
6 | <!-- TLDP Chapters HTML XSL; break source into one output file per chapter
7 | This is a small customization layer on top of upstream
8 | docbook-xsl-stylesheets. Since the XML_CATALOG_FILES will locate the
9 | installed version of the required import resources, we will use the
10 | system identifier in the xsl:import line.
11 | -->
12 | <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl"/>
13 | <xsl:import href="tldp-common.xsl"/>
14 |
15 | <!-- Generate a separate HTML page for each preface, chapter or
16 | appendix. Contrast this behavior with the tldp-one-page.xsl
17 | and tldp-sections.xsl customizations. -->
18 | <xsl:param name="chunk.section.depth" select="0"></xsl:param>
19 |
20 | </xsl:stylesheet>
21 |
--------------------------------------------------------------------------------
/tests/sample-documents/docbook4xml-simple.xml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" standalone="no"?>
2 | <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3 | "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
4 | <article>
5 | <articleinfo>
6 | <title>T
7 | AB
8 | AB
9 |
10 | v0.0
11 | 2016-02-11
12 | AB
13 | Initial release.
14 |
15 | abstract
16 |
17 |
18 |
19 | Intro
20 | Text
21 |
22 | Intro
23 | Text
24 |
25 |
26 |
27 |
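[Editor's aside on the release procedure in NOTES.rst above: running 'python contrib/rpm-release.py' regenerates contrib/tldp.spec from contrib/tldp.spec.in by substituting the @VERSION@ placeholder, using the transform() helper shown in that script. The following minimal Python sketch illustrates only that substitution step; the spec_template string is a stand-in, the version comes from tldp/__init__.py, and the snippet is not itself a file in this repository.]

# Illustrative sketch only -- not part of the python-tldp sources.
# Mirrors the @VERSION@ substitution performed by contrib/rpm-release.py.
def transform(mapping, text):
    # replace every placeholder tag with its replacement value
    for tag, replacement in mapping.items():
        text = text.replace(tag, replacement)
    return text

spec_template = "%define version @VERSION@\n"   # stands in for contrib/tldp.spec.in
print(transform({'@VERSION@': '0.7.15'}, spec_template), end='')
# prints: %define version 0.7.15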
28 | -------------------------------------------------------------------------------- /tests/sample-documents/docbooksgml-simple.sgml: -------------------------------------------------------------------------------- 1 | 2 |
3 | 4 | T 5 | 6 | A B 7 | 8 |
devnull@example.org
9 |
10 |
11 | 2016-02-11 12 | abstract 13 | 14 | 15 | 1.0 16 | 2016-02-11 17 | AB 18 | Initial release. 19 | 20 | 21 | 22 |
23 | 24 | Introduction 25 | Text 26 | More stuff 27 | Text 28 | 29 | 30 |
31 | -------------------------------------------------------------------------------- /tests/sample-documents/Bad-Dir-Multiple-Doctypes/Bad-Dir-Multiple-Doctypes.sgml: -------------------------------------------------------------------------------- 1 | 2 |
3 | 4 | Bad Dir Multiple Doctypes (DocBook SGML 4.1) 5 | 6 | A B 7 | 8 |
devnull@example.org
9 |
10 |
11 | 2016-02-11 12 | abstract 13 | 14 | 15 | 1.0 16 | 2016-02-11 17 | AB 18 | Initial release. 19 | 20 | 21 | 22 |
23 | 24 | Introduction 25 | Text 26 | More stuff 27 | Text 28 | 29 | 30 |
31 | -------------------------------------------------------------------------------- /tests/sample-documents/docbook4xml-broken.xml: -------------------------------------------------------------------------------- 1 | 2 | 4 |
5 | 6 | T 7 | AB 8 | AB 9 | 10 | v0.0 11 | 2016-02-11 12 | AB 13 | Initial release. 14 | 15 | abstract 16 | 17 | 18 | 19 | Intro 20 | Text 21 | 22 | Intro 23 | Text 24 | 25 | 26 | 27 | 31 | 32 | -------------------------------------------------------------------------------- /extras/xsl/tldp-common.xsl: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 6 | 8 | 9 | 12 | text/css 13 | 14 | 15 | 19 | 20 | 21 | 22 | 23 | -------------------------------------------------------------------------------- /tests/sample-documents/DocBookSGML-Larger/DocBookSGML-Larger.sgml: -------------------------------------------------------------------------------- 1 | 3 | ]> 4 | 5 |
6 | 7 | T 8 | 9 | A B 10 | 11 |
devnull@example.org
12 |
13 |
14 | 2016-02-11 15 | abstract 16 | 17 | 18 | 1.0 19 | 2016-02-11 20 | AB 21 | Initial release. 22 | 23 | 24 | 25 |
26 | 27 | Introduction 28 | Text 29 | Text 30 | More stuff 31 | Text 32 | 33 | 34 | &index; 35 |
36 | -------------------------------------------------------------------------------- /tests/test_config.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import unicode_literals 7 | 8 | 9 | import unittest 10 | from argparse import Namespace 11 | 12 | # -- SUT 13 | from tldp.config import collectconfiguration 14 | 15 | 16 | class TestConfigWorks(unittest.TestCase): 17 | 18 | def test_basic(self): 19 | config, args = collectconfiguration('tag', []) 20 | self.assertIsInstance(config, Namespace) 21 | self.assertIsInstance(args, list) 22 | 23 | def test_singleoptarg(self): 24 | config, args = collectconfiguration('tag', ['--pubdir', '.']) 25 | self.assertEqual(config.pubdir, '.') 26 | 27 | def test_nonexistent_directory(self): 28 | argv = ['--pubdir', '/path/to/nonexistent/directory'] 29 | with self.assertRaises(ValueError) as ecm: 30 | config, args = collectconfiguration('tag', argv) 31 | e = ecm.exception 32 | self.assertTrue("/path/to/nonexistent/directory" in e.args[0]) 33 | 34 | # 35 | # -- end of file 36 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2016, Linux Documentation Project 2 | 3 | Permission is hereby granted, free of charge, to any person 4 | obtaining a copy of this software and associated documentation files 5 | (the "Software"), to deal in the Software without restriction, 6 | including without limitation the rights to use, copy, modify, merge, 7 | publish, distribute, sublicense, and/or sell copies of the Software, 8 | and to permit persons to whom the Software is furnished to do so, 9 | subject to the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be 12 | included in all copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 15 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 17 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 18 | BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 19 | ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /tests/sample-documents/DocBookSGML-Larger/index.sgml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 21 | 22 | Index 23 | 24 | T 25 | 26 | 27 | Text, 28 | Introduction 29 | 30 | 31 | 32 | 33 | 34 | -------------------------------------------------------------------------------- /TODO: -------------------------------------------------------------------------------- 1 | python-tldp TODO 2 | ================ 3 | 4 | Bugs 5 | ---- 6 | 7 | * when running --sourcedir $FILE, the error message is TERRIBLE; 8 | fix it; 9 | 10 | user-visible needs 11 | ------------------ 12 | 13 | * add a manpage 14 | 15 | * add support for .epub3 (or just .epub?) [python-epub ?] 
16 | 17 | * consider adding support for metadata extraction from documents 18 | 19 | * create TLDP customizations of DocBook 5.0 XSL (namespaced) files 20 | (if we wish to do so) 21 | 22 | code internals 23 | -------------- 24 | 25 | * generate contrib/tldp.spec at build time (?) 26 | 27 | * SourceDocument and OutputDirectory both have nearly-identical 28 | methods called detail() which define a format string; probably 29 | should be defined once in a parent class or something 30 | 31 | 32 | CascadingConfig 33 | --------------- 34 | * consider replacing CascadingConfig with something (better?) from PyPI 35 | 36 | * factor out CascadingConfig into its own project 37 | 38 | * smart_bool for config handling; /usr/lib64/python2.7/ConfigParser.py 39 | around line 364ff. 40 | _boolean_states = {'1': True, 'yes': True, 'true': True, 'on': True, 41 | '0': False, 'no': False, 'false': False, 'off': False} 42 | 43 | -------------------------------------------------------------------------------- /extras/css/style.css: -------------------------------------------------------------------------------- 1 | /* 2 | style.css - a CSS stylesheet for use with HTML output produced by 3 | tldp-xsl stylesheets. Written by David Horton. 4 | */ 5 | 6 | 7 | body { 8 | 9 | /* 10 | Style the HMTL tag with a sans-serif font and 6% margin. 11 | A sans-serif font makes documents easier to read when displayed on 12 | a computer screen. Whitespace surrounding the document should 13 | make it easier to read both on screen and on printed paper. The 14 | value of 6% was chosen because it closely approximates a one-half 15 | inch margin on a US letter (8.5" by 11") paper. Since the margin 16 | is expressed as a percentage it should scale well in a web browser 17 | window. 18 | */ 19 | 20 | font-family: sans-serif; 21 | margin: 6%; 22 | } 23 | 24 | 25 | .programlisting, .screen { 26 | 27 | /* 28 | Style the programlisting and screen classes with a light gray 29 | background and a small bit of space between the object border and 30 | the text inside. The programlisting and screen classes are HTML 31 | representations of the and DocBook tags. 32 | */ 33 | 34 | background: lightgray; 35 | padding: 5px; 36 | } 37 | 38 | 39 | /* Add any desired customizations below. */ 40 | 41 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | sudo: required 3 | dist: trusty 4 | before_install: 5 | - sudo apt-get -qq update 6 | - sudo apt-get --assume-yes install htmldoc fop jing xsltproc asciidoc docbook docbook5-xml docbook-xsl-ns linuxdoc-tools-latex linuxdoc-tools-text sgml2x ldp-docbook-xsl ldp-docbook-dsssl html2text 7 | python: 8 | - "2.7" 9 | - "3.4" 10 | script: nosetests --cover-erase --with-coverage --cover-package tldp -- tests/long_driver.py tests/long_inventory.py tests/ 11 | 12 | # -- comments on install set on an Ubuntu system: 13 | # Here is the full set of packages that need to be installed in order for 14 | # this software to work/build. 
The leftmost string should say 'ii' for 15 | # each of the packages listed in this command-line: 16 | # 17 | # dpkg-query --list \ 18 | # asciidoc \ 19 | # docbook \ 20 | # docbook-dsssl \ 21 | # docbook-xsl \ 22 | # docbook-utils \ 23 | # docbook-xsl-ns \ 24 | # docbook5-xml \ 25 | # fop \ 26 | # htmldoc \ 27 | # htmldoc-common \ 28 | # html2text \ 29 | # jing \ 30 | # ldp-docbook-xsl \ 31 | # ldp-docbook-dsssl \ 32 | # libxml2-utils \ 33 | # linuxdoc-tools \ 34 | # linuxdoc-tools-text \ 35 | # linuxdoc-tools-latex \ 36 | # opensp \ 37 | # openjade \ 38 | # sgml2x \ 39 | # xsltproc \ 40 | # 41 | # 42 | -------------------------------------------------------------------------------- /extras/xsl/tldp-sections.xsl: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 6 | 12 | 13 | 14 | 15 | 20 | 21 | 22 | 24 | 25 | 26 | 27 | -------------------------------------------------------------------------------- /tests/sample-documents/DocBook-4.2-WHYNOT/DocBook-4.2-WHYNOT.xml: -------------------------------------------------------------------------------- 1 | 2 | 4 |
5 | 6 | T 7 | AB 8 | AB 9 | 10 | v0.0 11 | 2016-02-11 12 | AB 13 | Initial release. 14 | 15 | abstract 16 | 17 | 18 | 19 | Intro 20 | Text 21 | 22 | Intro 23 | Text 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | FIXME 36 | 37 | 38 | 39 |
40 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | import os 3 | 4 | import glob 5 | from setuptools import setup 6 | from tldp import VERSION 7 | 8 | 9 | with open(os.path.join(os.path.dirname(__file__), 'README.rst')) as r_file: 10 | readme = r_file.read() 11 | 12 | 13 | setup( 14 | name='tldp', 15 | version=VERSION, 16 | license='MIT', 17 | author='Martin A. Brown', 18 | author_email='martin@linux-ip.net', 19 | url="http://en.tldp.org/", 20 | description='automatic publishing tool for DocBook, Linuxdoc and Asciidoc', 21 | long_description=readme, 22 | packages=['tldp', 'tldp/doctypes'], 23 | test_suite='nose.collector', 24 | install_requires=['networkx', 'nose'], 25 | include_package_data=True, 26 | package_data={'extras': ['extras/collateindex.pl'], 27 | 'extras/xsl': glob.glob('extras/xsl/*.xsl'), 28 | 'extras/css': glob.glob('extras/css/*.css'), 29 | 'extras/dsssl': glob.glob('extras/dsssl/*.dsl'), 30 | }, 31 | data_files=[('/etc/ldptool', ['etc/ldptool.ini']), ], 32 | entry_points={ 33 | 'console_scripts': ['ldptool = tldp.driver:main', ], 34 | }, 35 | classifiers=[ 36 | 'Development Status :: 4 - Beta', 37 | 'Intended Audience :: Developers', 38 | 'License :: OSI Approved :: MIT License', 39 | 'Operating System :: OS Independent', 40 | 'Programming Language :: Python', 41 | 'Topic :: Software Development :: Documentation', 42 | 'Topic :: Software Development :: Libraries :: Python Modules', 43 | ], 44 | ) 45 | -------------------------------------------------------------------------------- /tests/sample-documents/DocBook-4.2-WHYNOT/images/warning.svg: -------------------------------------------------------------------------------- 1 | 2 | 4 | 13 | 28 | 29 | -------------------------------------------------------------------------------- /extras/xsl/tldp-print.xsl: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 6 | 12 | 13 | 14 | 18 | 19 | 20 | 21 | 24 | start 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 35 | 36 | 37 | 38 | -------------------------------------------------------------------------------- /tldp/ldpcollection.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | 8 | import sys 9 | 10 | if sys.version_info[:2] >= (3, 8): # pragma: no cover 11 | from collections.abc import MutableMapping 12 | else: # pragma: no cover 13 | from collections import MutableMapping 14 | 15 | 16 | class LDPDocumentCollection(MutableMapping): 17 | '''a dict-like container for DocumentCollection objects 18 | 19 | Intended to be subclassed. 20 | 21 | Implements all the usual dictionary stuff, but also provides sorted 22 | lists of documents in the collection. 
23 | ''' 24 | def __repr__(self): 25 | return '<%s:(%s docs)>' % (self.__class__.__name__, len(self)) 26 | 27 | def __delitem__(self, key): 28 | del self.__dict__[key] 29 | 30 | def __getitem__(self, key): 31 | return self.__dict__[key] 32 | 33 | def __setitem__(self, key, value): 34 | self.__dict__[key] = value 35 | 36 | def __iter__(self): 37 | return iter(self.__dict__) 38 | 39 | def __len__(self): 40 | return len(self.__dict__) 41 | 42 | def iterkeys(self): 43 | return iter(self.keys) 44 | 45 | def itervalues(self): 46 | for key in sorted(self, key=lambda x: x.lower()): 47 | yield self[key] 48 | 49 | def iteritems(self): 50 | for key in self.keys: 51 | yield (key, self[key]) 52 | 53 | def keys(self): 54 | return sorted(self, key=lambda x: x.lower()) 55 | 56 | def items(self): 57 | return [(key, self[key]) for key in self.keys()] 58 | 59 | def values(self): 60 | return [self[key] for key in self.keys()] 61 | 62 | # 63 | # -- end of file 64 | -------------------------------------------------------------------------------- /debian/control: -------------------------------------------------------------------------------- 1 | Source: tldp 2 | Maintainer: Martin A. Brown 3 | Section: text 4 | X-Python3-Version: >= 3.4 5 | Priority: optional 6 | Homepage: https://github.com/tLDP/python-tldp 7 | Build-Depends: debhelper (>= 9), 8 | dh-python, 9 | python3-all, 10 | python3-networkx, 11 | python3-nose, 12 | python3-coverage, 13 | python3-setuptools, 14 | python3-sphinx, 15 | htmldoc, 16 | fop, 17 | jing, 18 | sgml2x, 19 | xsltproc, 20 | asciidoc, 21 | docbook, 22 | docbook5-xml, 23 | docbook-xsl-ns, 24 | linuxdoc-tools-latex, 25 | linuxdoc-tools-text, 26 | ldp-docbook-xsl, 27 | ldp-docbook-dsssl, 28 | html2text 29 | Standards-Version: 3.9.8 30 | Vcs-Git: https://github.com/tLDP/python-tldp.git 31 | Vcs-Browser: https://github.com/tLDP/python-tldp 32 | 33 | Package: python3-tldp 34 | Architecture: all 35 | Depends: ${misc:Depends}, 36 | ${python3:Depends}, 37 | fop, 38 | jing, 39 | xsltproc, 40 | docbook, 41 | docbook5-xml, 42 | docbook-xsl-ns, 43 | htmldoc, 44 | html2text, 45 | sgml2x, 46 | asciidoc, 47 | linuxdoc-tools-latex, 48 | linuxdoc-tools-text, 49 | ldp-docbook-xsl, 50 | ldp-docbook-dsssl 51 | Description: automatic publishing tool for DocBook, Linuxdoc and Asciidoc 52 | The Linux Documentation Project (TLDP) stores hundreds of documents in 53 | DocBook SGML, DocBook XML, Linuxdoc and Asciidoc formats. This tool 54 | automatically detects the source format and generates a directory containing 55 | chunked and single-page HTML, a PDF and a plain text output. 56 | -------------------------------------------------------------------------------- /tldp/doctypes/asciidoc.py: -------------------------------------------------------------------------------- 1 | #! 
/usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import logging 10 | 11 | from tldp.utils import which 12 | from tldp.utils import arg_isexecutable, isexecutable 13 | from tldp.doctypes.common import depends 14 | from tldp.doctypes.docbook4xml import Docbook4XML 15 | 16 | logger = logging.getLogger(__name__) 17 | 18 | 19 | class Asciidoc(Docbook4XML): 20 | formatname = 'AsciiDoc' 21 | extensions = ['.txt'] 22 | signatures = [] 23 | 24 | required = {'asciidoc_asciidoc': isexecutable, 25 | 'asciidoc_xmllint': isexecutable, 26 | } 27 | required.update(Docbook4XML.required) 28 | 29 | def make_docbook45(self, **kwargs): 30 | s = '''"{config.asciidoc_asciidoc}" \\ 31 | --backend docbook45 \\ 32 | --out-file {output.validsource} \\ 33 | "{source.filename}"''' 34 | return self.shellscript(s, **kwargs) 35 | 36 | @depends(make_docbook45) 37 | def make_validated_source(self, **kwargs): 38 | s = '"{config.asciidoc_xmllint}" --noout --valid "{output.validsource}"' 39 | return self.shellscript(s, **kwargs) 40 | 41 | @classmethod 42 | def argparse(cls, p): 43 | descrip = 'executables and data files for %s' % (cls.formatname,) 44 | g = p.add_argument_group(title=cls.__name__, description=descrip) 45 | g.add_argument('--asciidoc-asciidoc', type=arg_isexecutable, 46 | default=which('asciidoc'), 47 | help='full path to asciidoc [%(default)s]') 48 | g.add_argument('--asciidoc-xmllint', type=arg_isexecutable, 49 | default=which('xmllint'), 50 | help='full path to xmllint [%(default)s]') 51 | 52 | # 53 | # -- end of file 54 | -------------------------------------------------------------------------------- /contrib/tldp.spec: -------------------------------------------------------------------------------- 1 | %define sourcename tldp 2 | %define name python-tldp 3 | %define version 0.7.15 4 | %define unmangled_version 0.7.15 5 | %define unmangled_version 0.7.15 6 | %define release 1 7 | 8 | Summary: automatic publishing tool for DocBook, Linuxdoc and Asciidoc 9 | Name: %{name} 10 | Version: %{version} 11 | Release: %{release} 12 | Source0: %{sourcename}-%{unmangled_version}.tar.gz 13 | License: MIT 14 | Group: Development/Libraries 15 | BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-buildroot 16 | Prefix: %{_prefix} 17 | BuildArch: noarch 18 | Vendor: Martin A. Brown 19 | BuildRequires: python-setuptools 20 | Requires: asciidoc 21 | Requires: jing 22 | Requires: htmldoc 23 | Requires: sgmltool 24 | Requires: openjade 25 | Requires: docbook-utils 26 | Requires: docbook-utils-minimal 27 | Requires: docbook-dsssl-stylesheets 28 | Requires: docbook-xsl-stylesheets 29 | Requires: docbook5-xsl-stylesheets 30 | Requires: libxslt-tools 31 | Requires: python-networkx 32 | 33 | %description 34 | tldp - automatic publishing tool for DocBook, Linuxdoc and Asciidoc 35 | =================================================================== 36 | A toolset for publishing multiple output formats (PDF, text, chunked HTML and 37 | single-page HTML) from each source document in one of the supported formats. 38 | 39 | * Asciidoc 40 | * Linuxdoc 41 | * Docbook SGML 3.x (though deprecated, please no new submissions) 42 | * Docbook SGML 4.x 43 | * Docbook XML 4.x 44 | * Docbook XML 5.x (basic support, as of 2016-03-10) 45 | 46 | TLDP = The Linux Documentation Project. 
47 | 48 | 49 | %prep 50 | %setup -n %{sourcename}-%{unmangled_version} 51 | 52 | %build 53 | python setup.py build 54 | 55 | %install 56 | python setup.py install --single-version-externally-managed -O1 --root=$RPM_BUILD_ROOT --record=INSTALLED_FILES 57 | install -D --mode 0644 docs/ldptool.1 %{buildroot}%{_mandir}/man1/ldptool.1 58 | perl -pi -e 's,(/etc/ldptool/ldptool.ini),%config(noreplace) $1,' INSTALLED_FILES 59 | 60 | %clean 61 | rm -rf $RPM_BUILD_ROOT 62 | 63 | %files -f INSTALLED_FILES 64 | %defattr(-,root,root) 65 | %{_mandir}/man1/ldptool.1* 66 | -------------------------------------------------------------------------------- /contrib/tldp.spec.in: -------------------------------------------------------------------------------- 1 | %define sourcename tldp 2 | %define name python-tldp 3 | %define version @VERSION@ 4 | %define unmangled_version @VERSION@ 5 | %define unmangled_version @VERSION@ 6 | %define release 1 7 | 8 | Summary: automatic publishing tool for DocBook, Linuxdoc and Asciidoc 9 | Name: %{name} 10 | Version: %{version} 11 | Release: %{release} 12 | Source0: %{sourcename}-%{unmangled_version}.tar.gz 13 | License: MIT 14 | Group: Development/Libraries 15 | BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-buildroot 16 | Prefix: %{_prefix} 17 | BuildArch: noarch 18 | Vendor: Martin A. Brown 19 | BuildRequires: python-setuptools 20 | Requires: asciidoc 21 | Requires: jing 22 | Requires: htmldoc 23 | Requires: sgmltool 24 | Requires: openjade 25 | Requires: docbook-utils 26 | Requires: docbook-utils-minimal 27 | Requires: docbook-dsssl-stylesheets 28 | Requires: docbook-xsl-stylesheets 29 | Requires: docbook5-xsl-stylesheets 30 | Requires: libxslt-tools 31 | Requires: python-networkx 32 | 33 | %description 34 | tldp - automatic publishing tool for DocBook, Linuxdoc and Asciidoc 35 | =================================================================== 36 | A toolset for publishing multiple output formats (PDF, text, chunked HTML and 37 | single-page HTML) from each source document in one of the supported formats. 38 | 39 | * Asciidoc 40 | * Linuxdoc 41 | * Docbook SGML 3.x (though deprecated, please no new submissions) 42 | * Docbook SGML 4.x 43 | * Docbook XML 4.x 44 | * Docbook XML 5.x (basic support, as of 2016-03-10) 45 | 46 | TLDP = The Linux Documentation Project. 
47 | 48 | 49 | %prep 50 | %setup -n %{sourcename}-%{unmangled_version} 51 | 52 | %build 53 | python setup.py build 54 | 55 | %install 56 | python setup.py install --single-version-externally-managed -O1 --root=$RPM_BUILD_ROOT --record=INSTALLED_FILES 57 | install -D --mode 0644 docs/ldptool.1 %{buildroot}%{_mandir}/man1/ldptool.1 58 | perl -pi -e 's,(/etc/ldptool/ldptool.ini),%config(noreplace) $1,' INSTALLED_FILES 59 | 60 | %clean 61 | rm -rf $RPM_BUILD_ROOT 62 | 63 | %files -f INSTALLED_FILES 64 | %defattr(-,root,root) 65 | %{_mandir}/man1/ldptool.1* 66 | -------------------------------------------------------------------------------- /tests/example.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import unicode_literals 7 | 8 | import os 9 | import codecs 10 | from argparse import Namespace 11 | 12 | import tldp.doctypes 13 | 14 | from tldptesttools import stem_and_ext 15 | 16 | opj = os.path.join 17 | opd = os.path.dirname 18 | opa = os.path.abspath 19 | sampledocs = opa(opj(opd(__file__), 'sample-documents')) 20 | 21 | 22 | def load_content(ex): 23 | with codecs.open(ex.filename, encoding='utf-8') as f: 24 | ex.content = f.read() 25 | ex.stem, ex.ext = stem_and_ext(ex.filename) 26 | 27 | 28 | ex_linuxdoc = Namespace( 29 | doctype=tldp.doctypes.linuxdoc.Linuxdoc, 30 | filename=opj(sampledocs, 'linuxdoc-simple.sgml'),) 31 | 32 | ex_docbooksgml = Namespace( 33 | doctype=tldp.doctypes.docbooksgml.DocbookSGML, 34 | filename=opj(sampledocs, 'docbooksgml-simple.sgml'),) 35 | 36 | ex_docbook4xml = Namespace( 37 | doctype=tldp.doctypes.docbook4xml.Docbook4XML, 38 | filename=opj(sampledocs, 'docbook4xml-simple.xml'),) 39 | 40 | ex_docbook5xml = Namespace( 41 | doctype=tldp.doctypes.docbook5xml.Docbook5XML, 42 | filename=opj(sampledocs, 'docbook5xml-simple.xml'),) 43 | 44 | ex_asciidoc = Namespace( 45 | doctype=tldp.doctypes.asciidoc.Asciidoc, 46 | filename=opj(sampledocs, 'asciidoc-complete.txt'),) 47 | 48 | ex_linuxdoc_dir = Namespace( 49 | doctype=tldp.doctypes.linuxdoc.Linuxdoc, 50 | filename=opj(sampledocs, 'Linuxdoc-Larger', 51 | 'Linuxdoc-Larger.sgml'),) 52 | 53 | ex_docbook4xml_dir = Namespace( 54 | doctype=tldp.doctypes.docbook4xml.Docbook4XML, 55 | filename=opj(sampledocs, 'DocBook-4.2-WHYNOT', 56 | 'DocBook-4.2-WHYNOT.xml'),) 57 | 58 | ex_docbooksgml_dir = Namespace( 59 | doctype=tldp.doctypes.docbooksgml.DocbookSGML, 60 | filename=opj(sampledocs, 'DocBookSGML-Larger', 61 | 'DocBookSGML-Larger.sgml'),) 62 | 63 | # -- a bit ugly, but grab each dict 64 | sources = [y for x, y in locals().items() if x.startswith('ex_')] 65 | 66 | for ex in sources: 67 | load_content(ex) 68 | 69 | unknown_doctype = Namespace( 70 | doctype=None, 71 | filename=opj(sampledocs, 'Unknown-Doctype.xqf'),) 72 | 73 | broken_docbook4xml = Namespace( 74 | doctype=tldp.doctypes.docbook4xml.Docbook4XML, 75 | filename=opj(sampledocs, 'docbook4xml-broken.xml'),) 76 | 77 | load_content(broken_docbook4xml) 78 | 79 | # -- end of file 80 | -------------------------------------------------------------------------------- /tests/test_outputs.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import 
unicode_literals 7 | 8 | import os 9 | import errno 10 | import unittest 11 | import random 12 | 13 | from tldptesttools import TestToolsFilesystem 14 | 15 | # -- SUT 16 | from tldp.outputs import OutputCollection 17 | from tldp.outputs import OutputDirectory 18 | from tldp.outputs import OutputNamingConvention 19 | 20 | 21 | class TestOutputNamingConvention(unittest.TestCase): 22 | 23 | def test_namesets(self): 24 | onc = OutputNamingConvention("/path/to/output/", "Stem") 25 | self.assertTrue(onc.name_txt.endswith(".txt")) 26 | self.assertTrue(onc.name_pdf.endswith(".pdf")) 27 | self.assertTrue(onc.name_epub.endswith(".epub")) 28 | self.assertTrue(onc.name_html.endswith(".html")) 29 | self.assertTrue(onc.name_htmls.endswith("-single.html")) 30 | self.assertTrue(onc.name_indexhtml.endswith("index.html")) 31 | 32 | 33 | class TestOutputCollection(TestToolsFilesystem): 34 | 35 | def test_not_a_directory(self): 36 | missing = os.path.join(self.tempdir, 'vanishing') 37 | with self.assertRaises(IOError) as ecm: 38 | OutputCollection(missing) 39 | e = ecm.exception 40 | self.assertEqual(errno.ENOENT, e.errno) 41 | 42 | def test_file_in_output_collection(self): 43 | reldir, absdir = self.adddir('collection') 44 | self.addfile('collection', __file__, stem='non-directory') 45 | oc = OutputCollection(absdir) 46 | self.assertEqual(0, len(oc)) 47 | 48 | def test_manyfiles(self): 49 | reldir, absdir = self.adddir('manyfiles') 50 | count = random.randint(8, 15) 51 | for x in range(count): 52 | self.adddir('manyfiles/Document-Stem-' + str(x)) 53 | oc = OutputCollection(absdir) 54 | self.assertEqual(count, len(oc)) 55 | 56 | 57 | class TestOutputDirectory(TestToolsFilesystem): 58 | 59 | def test_no_parent_dir(self): 60 | odoc = os.path.join(self.tempdir, 'non-existent-parent', 'stem') 61 | with self.assertRaises(IOError) as ecm: 62 | OutputDirectory(odoc) 63 | e = ecm.exception 64 | self.assertEqual(errno.ENOENT, e.errno) 65 | 66 | def test_iscomplete(self): 67 | reldir, absdir = self.adddir('outputs/Frobnitz-HOWTO') 68 | o = OutputDirectory(absdir) 69 | self.assertFalse(o.iscomplete) 70 | for prop in o.expected: 71 | fname = getattr(o, prop, None) 72 | assert fname is not None 73 | with open(fname, 'w'): 74 | pass 75 | self.assertTrue(o.iscomplete) 76 | self.assertTrue('Frobnitz' in str(o)) 77 | 78 | # 79 | # -- end of file 80 | -------------------------------------------------------------------------------- /tests/test_typeguesser.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import unicode_literals 7 | 8 | import os 9 | import codecs 10 | import unittest 11 | from tempfile import NamedTemporaryFile as ntf 12 | 13 | # -- Test Data 14 | import example 15 | 16 | # -- SUT 17 | from tldp.typeguesser import guess 18 | from tldp.doctypes.common import SignatureChecker 19 | 20 | # -- shorthand 21 | opj = os.path.join 22 | opd = os.path.dirname 23 | opa = os.path.abspath 24 | 25 | sampledocs = opj(opd(__file__), 'sample-documents') 26 | 27 | 28 | def genericGuessTest(content, ext): 29 | tf = ntf(prefix='tldp-guesser-test-', suffix=ext, delete=False) 30 | tf.close() 31 | with codecs.open(tf.name, 'w', encoding='utf-8') as f: 32 | f.write(content) 33 | dt = guess(tf.name) 34 | os.unlink(tf.name) 35 | return dt 36 | 37 | 38 | class TestDoctypes(unittest.TestCase): 39 | 40 | def testISO_8859_1(self): 
41 |         dt = guess(opj(sampledocs, 'ISO-8859-1.sgml'))
42 |         self.assertIsNotNone(dt)
43 |
44 |     def testDetectionBySignature(self):
45 |         for ex in example.sources:
46 |             if isinstance(ex.doctype, SignatureChecker):
47 |                 dt = genericGuessTest(ex.content, ex.ext)
48 |                 self.assertEqual(ex.doctype, dt)
49 |
50 |     def testDetectionByExtension(self):
51 |         for ex in example.sources:
52 |             if not isinstance(ex.doctype, SignatureChecker):
53 |                 dt = genericGuessTest(ex.content, ex.ext)
54 |                 self.assertEqual(ex.doctype, dt)
55 |
56 |     def testDetectionBogusExtension(self):
57 |         dt = genericGuessTest('franks-cheese-shop', '.wmix')
58 |         self.assertIsNone(dt)
59 |
60 |     def testDetectionMissingExtension(self):
61 |         dt = genericGuessTest('franks-cheese-shop', '')
62 |         self.assertIsNone(dt)
63 |
64 |     def testGuessNumber(self):
65 |         self.assertIsNone(guess(7))
66 |
67 |     def testGuessBadXML(self):
68 |         dt = genericGuessTest('XML', '.xml')
69 |         self.assertIsNone(dt)
70 |
71 |     def testGuessSingleMatchAsciidoc(self):
72 |         ex = example.ex_asciidoc
73 |         dt = genericGuessTest(ex.content, '.txt')
74 |         self.assertEqual(ex.doctype, dt)
75 |
76 |     def testGuessTooManyMatches(self):
77 |         a = example.ex_docbook4xml.content
78 |         b = example.ex_docbook5xml.content
79 |         four, fourdt = a + b, example.ex_docbook4xml.doctype
80 |         dt = genericGuessTest(four, '.xml')
81 |         self.assertIs(dt, fourdt)
82 |         five, fivedt = b + a, example.ex_docbook5xml.doctype
83 |         dt = genericGuessTest(five, '.xml')
84 |         self.assertIs(dt, fivedt)
85 |
86 | #
87 | # -- end of file
88 |
--------------------------------------------------------------------------------
/debian/copyright:
--------------------------------------------------------------------------------
1 | Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
2 | Upstream-Name: python-tldp
3 | Upstream-Contact: Martin A. Brown <martin@linux-ip.net>
4 | Source: https://github.com/tLDP/python-tldp
5 |
6 | Files: *
7 | Copyright: 2016 Linux Documentation Project
8 | License: MIT
9 |
10 | Files: extras/dsssl/ldp.dsl
11 | Copyright: 2000-2003 - Greg Ferguson (gferg@metalab.unc.edu)
12 | License: GPL-2.0+
13 |
14 | Files: extras/xsl/tldp-*.xsl extras/css/style.css
15 | Copyright: 2000-2002 - David Horton (dhorton@speakeasy.net)
16 | License: GFDL-1.2
17 |
18 | Files: tests/sample-documents/DocBook-4.2-WHYNOT/images/* tests/sample-documents/DocBookSGML-Larger/images/bullet.png
19 | Copyright: Copyright (C) 2011-2012 O'Reilly Media
20 | License: MIT
21 |
22 | Files: extras/collateindex.pl
23 | Copyright: 1997-2001 Norman Walsh
24 | License: MIT
25 |
26 | License: MIT
27 | Permission is hereby granted, free of charge, to any person obtaining a
28 | copy of this software and associated documentation files (the "Software"),
29 | to deal in the Software without restriction, including without limitation
30 | the rights to use, copy, modify, merge, publish, distribute, sublicense,
31 | and/or sell copies of the Software, and to permit persons to whom the
32 | Software is furnished to do so, subject to the following conditions:
33 | .
34 | The above copyright notice and this permission notice shall be included
35 | in all copies or substantial portions of the Software.
36 | .
37 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
38 | OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
39 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
40 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 41 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 42 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 43 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 44 | 45 | License: GPL-2.0+ 46 | This package is free software; you can redistribute it and/or modify 47 | it under the terms of the GNU General Public License as published by 48 | the Free Software Foundation; either version 2 of the License, or 49 | (at your option) any later version. 50 | . 51 | This package is distributed in the hope that it will be useful, 52 | but WITHOUT ANY WARRANTY; without even the implied warranty of 53 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 54 | GNU General Public License for more details. 55 | . 56 | You should have received a copy of the GNU General Public License 57 | along with this program. If not, see 58 | . 59 | On Debian systems, the complete text of the GNU General 60 | Public License version 2 can be found in "/usr/share/common-licenses/GPL-2". 61 | 62 | License: GFDL-1.2 63 | Permission is granted to copy, distribute and/or modify this document 64 | under the terms of the GNU Free Documentation License, Version 1.2 65 | or any later version published by the Free Software Foundation; 66 | with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. 67 | A copy of the license is included in the section entitled "GNU 68 | Free Documentation License". 69 | . 70 | On Debian systems, the complete text of the GFDL-1.2 can be found in 71 | /usr/share/common-licenses/GFDL-1.2 72 | -------------------------------------------------------------------------------- /tests/long_inventory.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | 7 | import io 8 | import os 9 | import codecs 10 | import shutil 11 | 12 | from tldptesttools import TestInventoryBase, TestSourceDocSkeleton 13 | 14 | # -- Test Data 15 | import example 16 | 17 | # -- SUT 18 | import tldp.driver 19 | 20 | opb = os.path.basename 21 | opj = os.path.join 22 | 23 | 24 | class TestInventoryHandling(TestInventoryBase): 25 | 26 | def test_lifecycle(self): 27 | self.add_docbook4xml_xsl_to_config() 28 | c = self.config 29 | argv = self.argv 30 | argv.extend(['--publish']) 31 | argv.extend(['--docbook4xml-xslprint', c.docbook4xml_xslprint]) 32 | argv.extend(['--docbook4xml-xslchunk', c.docbook4xml_xslchunk]) 33 | argv.extend(['--docbook4xml-xslsingle', c.docbook4xml_xslsingle]) 34 | mysource = TestSourceDocSkeleton(c.sourcedir) 35 | ex = example.ex_docbook4xml_dir 36 | exdir = os.path.dirname(ex.filename) 37 | mysource.copytree(exdir) 38 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 39 | self.assertEqual(1, len(inv.new.keys())) 40 | 41 | # -- run first build (will generate MD5SUMS file 42 | # 43 | exitcode = tldp.driver.run(argv) 44 | self.assertEqual(exitcode, os.EX_OK) 45 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 46 | self.assertEqual(1, len(inv.published.keys())) 47 | 48 | # -- remove the generated MD5SUMS file, ensure rebuild occurs 49 | # 50 | doc = inv.published.values().pop() 51 | os.unlink(doc.output.MD5SUMS) 52 | self.assertEqual(dict(), doc.output.md5sums) 53 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 54 | self.assertEqual(1, len(inv.stale.keys())) 55 | if 
not os.path.isdir(c.builddir): 56 | os.mkdir(c.builddir) 57 | exitcode = tldp.driver.run(argv) 58 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 59 | self.assertEqual(1, len(inv.published.keys())) 60 | 61 | # -- remove a source file, add a source file, change a source file 62 | # 63 | main = opj(mysource.dirname, opb(exdir), opb(ex.filename)) 64 | disappearing = opj(mysource.dirname, opb(exdir), 'disappearing.xml') 65 | brandnew = opj(mysource.dirname, opb(exdir), 'brandnew.xml') 66 | shutil.copy(disappearing, brandnew) 67 | os.unlink(opj(mysource.dirname, opb(exdir), 'disappearing.xml')) 68 | with codecs.open(main, 'w', encoding='utf-8') as f: 69 | f.write(ex.content.replace('FIXME', 'TOTALLY-FIXED')) 70 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 71 | self.assertEqual(1, len(inv.stale.keys())) 72 | stdout = io.StringIO() 73 | c.verbose = True 74 | tldp.driver.detail(c, inv.all.values(), file=stdout) 75 | stdout.seek(0) 76 | data = stdout.read() 77 | self.assertTrue('new source' in data) 78 | self.assertTrue('gone source' in data) 79 | self.assertTrue('changed source' in data) 80 | 81 | # -- rebuild (why not?) 82 | # 83 | if not os.path.isdir(c.builddir): 84 | os.mkdir(c.builddir) 85 | exitcode = tldp.driver.run(argv) 86 | self.assertEqual(exitcode, os.EX_OK) 87 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 88 | self.assertEqual(1, len(inv.published.keys())) 89 | 90 | # -- remove a file (known extraneous file, build should succeed) 91 | 92 | # 93 | # -- end of file 94 | -------------------------------------------------------------------------------- /tldp/doctypes/linuxdoc.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import logging 10 | 11 | from tldp.utils import which 12 | from tldp.utils import arg_isexecutable, isexecutable 13 | from tldp.doctypes.common import BaseDoctype, SignatureChecker, depends 14 | 15 | logger = logging.getLogger(__name__) 16 | 17 | 18 | class Linuxdoc(BaseDoctype, SignatureChecker): 19 | formatname = 'Linuxdoc' 20 | extensions = ['.sgml'] 21 | signatures = [' "{output.name_txt}" \\ 49 | -style pretty \\ 50 | -nobs \\ 51 | "{output.name_htmls}"''' 52 | return self.shellscript(s, **kwargs) 53 | 54 | @depends(make_name_htmls) 55 | def make_name_pdf(self, **kwargs): 56 | s = '''"{config.linuxdoc_htmldoc}" \\ 57 | --size universal \\ 58 | --firstpage p1 \\ 59 | --format pdf \\ 60 | --footer c.1 \\ 61 | --outfile "{output.name_pdf}" \\ 62 | "{output.name_htmls}"''' 63 | return self.shellscript(s, **kwargs) 64 | 65 | @depends(make_name_htmls) 66 | def make_name_html(self, **kwargs): 67 | '''create chunked output''' 68 | s = '"{config.linuxdoc_sgml2html}" "{source.filename}"' 69 | return self.shellscript(s, **kwargs) 70 | 71 | @depends(make_name_html) 72 | def make_name_indexhtml(self, **kwargs): 73 | '''create final index.html symlink''' 74 | s = 'ln -svr -- "{output.name_html}" "{output.name_indexhtml}"' 75 | return self.shellscript(s, **kwargs) 76 | 77 | @classmethod 78 | def argparse(cls, p): 79 | descrip = 'executables and data files for %s' % (cls.formatname,) 80 | g = p.add_argument_group(title=cls.__name__, description=descrip) 81 | g.add_argument('--linuxdoc-sgmlcheck', type=arg_isexecutable, 82 | default=which('sgmlcheck'), 83 | help='full path to sgmlcheck 
[%(default)s]') 84 | g.add_argument('--linuxdoc-sgml2html', type=arg_isexecutable, 85 | default=which('sgml2html'), 86 | help='full path to sgml2html [%(default)s]') 87 | g.add_argument('--linuxdoc-html2text', type=arg_isexecutable, 88 | default=which('html2text'), 89 | help='full path to html2text [%(default)s]') 90 | g.add_argument('--linuxdoc-htmldoc', type=arg_isexecutable, 91 | default=which('htmldoc'), 92 | help='full path to htmldoc [%(default)s]') 93 | 94 | # 95 | # -- end of file 96 | -------------------------------------------------------------------------------- /tests/test_inventory.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import unicode_literals 7 | 8 | 9 | import random 10 | 11 | from tldptesttools import TestInventoryBase 12 | 13 | # -- Test Data 14 | import example 15 | 16 | # -- SUT 17 | from tldp.inventory import Inventory 18 | 19 | 20 | class TestInventoryUsage(TestInventoryBase): 21 | 22 | def test_inventory_repr(self): 23 | c = self.config 24 | ex = random.choice(example.sources) 25 | self.add_published('Frobnitz-HOWTO', ex) 26 | i = Inventory(c.pubdir, c.sourcedir) 27 | self.assertTrue('1 published' in str(i)) 28 | 29 | def test_status_class_accessors(self): 30 | c = self.config 31 | ex = random.choice(example.sources) 32 | self.add_published('Published-HOWTO', ex) 33 | self.add_new('New-HOWTO', ex) 34 | self.add_stale('Stale-HOWTO', ex) 35 | self.add_orphan('Orphan-HOWTO', ex) 36 | self.add_broken('Broken-HOWTO', ex) 37 | i = Inventory(c.pubdir, c.sourcedir) 38 | self.assertTrue('Orphan-HOWTO' in i.orphans.keys()) 39 | self.assertTrue('Orphan-HOWTO' in i.orphaned.keys()) 40 | self.assertTrue(3, len(i.problems.keys())) 41 | self.assertTrue(4, len(i.work.keys())) 42 | self.assertTrue(5, len(i.all.keys())) 43 | self.assertTrue(5, len(i.sources.keys())) 44 | self.assertTrue(5, len(i.outputs.keys())) 45 | 46 | def test_detect_status_published(self): 47 | c = self.config 48 | ex = random.choice(example.sources) 49 | self.add_published('Frobnitz-Published-HOWTO', ex) 50 | i = Inventory(c.pubdir, c.sourcedir) 51 | self.assertEqual(0, len(i.stale)) 52 | self.assertEqual(1, len(i.published)) 53 | self.assertEqual(0, len(i.new)) 54 | self.assertEqual(0, len(i.orphan)) 55 | self.assertEqual(0, len(i.broken)) 56 | 57 | def test_detect_status_new(self): 58 | c = self.config 59 | ex = random.choice(example.sources) 60 | self.add_new('Frobnitz-New-HOWTO', ex) 61 | i = Inventory(c.pubdir, c.sourcedir) 62 | self.assertEqual(0, len(i.stale)) 63 | self.assertEqual(0, len(i.published)) 64 | self.assertEqual(1, len(i.new)) 65 | self.assertEqual(0, len(i.orphan)) 66 | self.assertEqual(0, len(i.broken)) 67 | 68 | def test_detect_status_orphan(self): 69 | c = self.config 70 | ex = random.choice(example.sources) 71 | self.add_orphan('Frobnitz-Orphan-HOWTO', ex) 72 | i = Inventory(c.pubdir, c.sourcedir) 73 | self.assertEqual(0, len(i.stale)) 74 | self.assertEqual(0, len(i.published)) 75 | self.assertEqual(0, len(i.new)) 76 | self.assertEqual(1, len(i.orphan)) 77 | self.assertEqual(0, len(i.broken)) 78 | 79 | def test_detect_status_stale(self): 80 | c = self.config 81 | ex = random.choice(example.sources) 82 | self.add_stale('Frobnitz-Stale-HOWTO', ex) 83 | i = Inventory(c.pubdir, c.sourcedir) 84 | self.assertEqual(1, len(i.stale)) 85 | self.assertEqual(1, len(i.published)) 
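# -- note: a stale document still has an output directory with a matching
#    stem, so the Inventory reports it as both published and stale; only a
#    source document with no matching output at all is counted as new.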
86 | self.assertEqual(0, len(i.new)) 87 | self.assertEqual(0, len(i.orphan)) 88 | self.assertEqual(0, len(i.broken)) 89 | 90 | def test_detect_status_broken(self): 91 | c = self.config 92 | ex = random.choice(example.sources) 93 | self.add_broken('Frobnitz-Broken-HOWTO', ex) 94 | i = Inventory(c.pubdir, c.sourcedir) 95 | self.assertEqual(0, len(i.stale)) 96 | self.assertEqual(1, len(i.published)) 97 | self.assertEqual(0, len(i.new)) 98 | self.assertEqual(0, len(i.orphan)) 99 | self.assertEqual(1, len(i.broken)) 100 | 101 | # 102 | # -- end of file 103 | -------------------------------------------------------------------------------- /tests/sample-documents/asciidoc-complete.txt: -------------------------------------------------------------------------------- 1 | The Article Title 2 | ================= 3 | Author's Name 4 | v1.0, 2003-12 5 | 6 | 7 | This is the optional preamble (an untitled section body). Useful for 8 | writing simple sectionless documents consisting only of a preamble. 9 | 10 | NOTE: The abstract, preface, appendix, bibliography, glossary and 11 | index section titles are significant ('specialsections'). 12 | 13 | 14 | [abstract] 15 | Example Abstract 16 | ---------------- 17 | The optional abstract (one or more paragraphs) goes here. 18 | 19 | This document is an AsciiDoc article skeleton containing briefly 20 | annotated element placeholders plus a couple of example index entries 21 | and footnotes. 22 | 23 | :numbered: 24 | 25 | The First Section 26 | ----------------- 27 | Article sections start at level 1 and can be nested up to four levels 28 | deep. 29 | footnote:[An example footnote.] 30 | indexterm:[Example index entry] 31 | 32 | And now for something completely different: ((monkeys)), lions and 33 | tigers (Bengal and Siberian) using the alternative syntax index 34 | entries. 35 | (((Big cats,Lions))) 36 | (((Big cats,Tigers,Bengal Tiger))) 37 | (((Big cats,Tigers,Siberian Tiger))) 38 | Note that multi-entry terms generate separate index entries. 39 | 40 | Here are a couple of image examples: an image:images/smallnew.png[] 41 | example inline image followed by an example block image: 42 | 43 | .Tiger block image 44 | image::images/tiger.png[Tiger image] 45 | 46 | Followed by an example table: 47 | 48 | .An example table 49 | [width="60%",options="header"] 50 | |============================================== 51 | | Option | Description 52 | | -a 'USER| GROUP' | Add 'USER' to 'GROUP'. 53 | | -R 'GROUP' | Disables access to 'GROUP'. 54 | |============================================== 55 | 56 | .An example example 57 | =============================================== 58 | Lorum ipum... 59 | =============================================== 60 | 61 | [[X1]] 62 | Sub-section with Anchor 63 | ~~~~~~~~~~~~~~~~~~~~~~~ 64 | Sub-section at level 2. 65 | 66 | A Nested Sub-section 67 | ^^^^^^^^^^^^^^^^^^^^ 68 | Sub-section at level 3. 69 | 70 | Yet another nested Sub-section 71 | ++++++++++++++++++++++++++++++ 72 | Sub-section at level 4. 73 | 74 | This is the maximum sub-section depth supported by the distributed 75 | AsciiDoc configuration. 76 | footnote:[A second example footnote.] 77 | 78 | 79 | The Second Section 80 | ------------------ 81 | Article sections are at level 1 and can contain sub-sections nested up 82 | to four deep. 83 | 84 | An example link to anchor at start of the <>. 85 | indexterm:[Second example index entry] 86 | 87 | An example link to a bibliography entry <>. 
88 | 89 | 90 | :numbered!: 91 | 92 | [appendix] 93 | Example Appendix 94 | ---------------- 95 | AsciiDoc article appendices are just just article sections with 96 | 'specialsection' titles. 97 | 98 | Appendix Sub-section 99 | ~~~~~~~~~~~~~~~~~~~~ 100 | Appendix sub-section at level 2. 101 | 102 | 103 | [bibliography] 104 | Example Bibliography 105 | -------------------- 106 | The bibliography list is a style of AsciiDoc bulleted list. 107 | 108 | [bibliography] 109 | - [[[taoup]]] Eric Steven Raymond. 'The Art of Unix 110 | Programming'. Addison-Wesley. ISBN 0-13-142901-9. 111 | - [[[walsh-muellner]]] Norman Walsh & Leonard Muellner. 112 | 'DocBook - The Definitive Guide'. O'Reilly & Associates. 1999. 113 | ISBN 1-56592-580-7. 114 | 115 | 116 | [glossary] 117 | Example Glossary 118 | ---------------- 119 | Glossaries are optional. Glossaries entries are an example of a style 120 | of AsciiDoc labeled lists. 121 | 122 | [glossary] 123 | A glossary term:: 124 | The corresponding (indented) definition. 125 | 126 | A second glossary term:: 127 | The corresponding (indented) definition. 128 | 129 | 130 | ifdef::backend-docbook[] 131 | [index] 132 | Example Index 133 | ------------- 134 | //////////////////////////////////////////////////////////////// 135 | The index is normally left completely empty, it's contents being 136 | generated automatically by the DocBook toolchain. 137 | //////////////////////////////////////////////////////////////// 138 | endif::backend-docbook[] 139 | -------------------------------------------------------------------------------- /tldp/typeguesser.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | 8 | import os 9 | import codecs 10 | import inspect 11 | import logging 12 | 13 | import tldp.doctypes 14 | 15 | logger = logging.getLogger(__name__) 16 | 17 | 18 | def getDoctypeMembers(membertype): 19 | '''returns a list of tldp.doctypes; convenience function''' 20 | found = list() 21 | for name, member in inspect.getmembers(tldp.doctypes, membertype): 22 | logger.debug("Located %s %s (%r).", membertype.__name__, name, member) 23 | found.append(member) 24 | return found 25 | 26 | 27 | def getDoctypeClasses(): 28 | '''returns a list of the classes known in tldp.doctypes 29 | 30 | This is the canonical list of doctypes which are recognized and capable of 31 | being processed into outputs. See tldp.doctypes for more information. 32 | ''' 33 | return getDoctypeMembers(inspect.isclass) 34 | 35 | 36 | def guess(fname): 37 | '''return a tldp.doctype class which is a best guess for document type 38 | 39 | :parama fname: A filename. 40 | 41 | The guess function will try to guess the document type (doctype) from the 42 | file extension. If extension matching produces multiple possible doctype 43 | matches (e.g. .xml or .sgml), the guess function will then use signature 44 | matching to find the earliest match in the file for a signature. 45 | 46 | If there are multiple signature matches, it will choose the signature 47 | matching at the earliest position in the file. 48 | 49 | Bugs/shortcomings: 50 | 51 | * This is only a guesser. 52 | * When signature matching, it reports first signature it discovers in 53 | any input file. 54 | * It could/should read more than 1024 bytes (cf. SignatureChecker) 55 | especially if it cannot return any result. 
56 | * It could/should use heuristics or something richer than signatures. 57 | ''' 58 | try: 59 | stem, ext = os.path.splitext(fname) 60 | except (AttributeError, TypeError): 61 | return None 62 | 63 | if not ext: 64 | logger.debug("%s no file extension, skipping %s.", stem, ext) 65 | return None 66 | 67 | possible = [t for t in knowndoctypes if ext in t.extensions] 68 | logger.debug("Possible: %r", possible) 69 | if not possible: 70 | logger.debug("%s unknown extension %s.", stem, ext) 71 | return None 72 | 73 | if len(possible) == 1: 74 | doctype = possible.pop() 75 | return doctype 76 | 77 | # -- for this extension, multiple document types, probably SGML, XML 78 | # 79 | logger.debug("%s multiple possible doctypes for extension %s on file %s.", 80 | stem, ext, fname) 81 | for doctype in possible: 82 | logger.debug("%s extension %s could be %s.", stem, ext, doctype) 83 | 84 | try: 85 | with codecs.open(fname, encoding='utf-8') as f: 86 | buf = f.read(1024) 87 | except UnicodeDecodeError: 88 | # -- a wee bit ugly, but many SGML docs used iso-8859-1, so fall back 89 | with codecs.open(fname, encoding='iso-8859-1') as f: 90 | buf = f.read(1024) 91 | 92 | guesses = list() 93 | for doctype in possible: 94 | sindex = doctype.signatureLocation(buf, fname) 95 | if sindex is not None: 96 | guesses.append((sindex, doctype)) 97 | 98 | if not guesses: 99 | logger.warning("%s no matching signature found for %s.", 100 | stem, fname) 101 | return None 102 | if len(guesses) == 1: 103 | _, doctype = guesses.pop() 104 | return doctype 105 | 106 | # -- OK, this is unusual; we still found multiple document type 107 | # signatures. Seems rare but unlikely, so we should choose the 108 | # first signature in the file as the more likely document type. 109 | # 110 | guesses.sort() 111 | logger.info("%s multiple doctype guesses for file %s", stem, fname) 112 | for sindex, doctype in guesses: 113 | logger.info("%s could be %s (sig at pos %s)", stem, doctype, sindex) 114 | logger.info("%s going to guess %s for %s", stem, doctype, fname) 115 | _, doctype = guesses.pop(0) 116 | return doctype 117 | 118 | 119 | knowndoctypes = getDoctypeClasses() 120 | knownextensions = set() 121 | for x in knowndoctypes: 122 | knownextensions.update(x.extensions) 123 | 124 | # 125 | # -- end of file 126 | -------------------------------------------------------------------------------- /etc/ldptool.ini: -------------------------------------------------------------------------------- 1 | # -- System Wide configuration file for the ldptool, a command-line utility 2 | # for building DocBook XML, DocBook SGML and Linuxdoc (SGML) documents into 3 | # a variety of output formats. 4 | # 5 | [ldptool] 6 | 7 | # -- source dir is a comma-separated list of directories containing LDP 8 | # documents; a document is either a plain file of a supported document type 9 | # or a directory containing a file of a supported document type. 10 | # 11 | # For example, in a sourcedir, the following will be detected and 12 | # classified as source documents. 
Note, in particular, that, in the naming 13 | # convention, the file stem must match the directory base name: 14 | # 15 | # Frobnitz-HOWTO.sgml 16 | # Wascally-Wabbit-HOWTO/Wascally-Wabbit-HOWTO.xml 17 | # 18 | # sourcedir = /path/to/checkout/LDP/LDP/faq/linuxdoc/, 19 | # /path/to/checkout/LDP/LDP/guide/linuxdoc/, 20 | # /path/to/checkout/LDP/LDP/howto/linuxdoc/, 21 | # /path/to/checkout/LDP/LDP/howto/docbook/, 22 | # /path/to/checkout/LDP/LDP/guide/docbook/, 23 | # /path/to/checkout/LDP/LDP/ref/docbook/, 24 | # /path/to/checkout/LDP/LDP/faq/docbook/ 25 | 26 | # -- the pubdir is the location where the output directories will be found 27 | # and/or created; this is the publication directory 28 | # 29 | # pubdir = /path/to/publication/directory/ 30 | 31 | # -- if you need to skip a particular (problematic?) document during build 32 | # the skip option is available; this parameter holds comma-separated 33 | # document STEM names (HOWTO-INDEX is broken as of 2016-03-04) 34 | # 35 | skip = HOWTO-INDEX 36 | 37 | # -- the ldptool utility can be very chatty, if you wish; loglevel accepts the 38 | # standard set of Python loglevel identifiers (or numeric values), e.g. 39 | # 40 | # loglevel = DEBUG 41 | # loglevel = INFO 42 | # loglevel = WARNING 43 | # loglevel = ERROR 44 | # loglevel = CRITICAL 45 | # 46 | # -- the default loglevel is ERROR (40); setting the loglevel as low as INFO 47 | # (20) will produce only a moderate amount of output, and is probably 48 | # suitable for automated processing; setting the loglevel to DEBUG will 49 | # generate quite a bit of logging 50 | # 51 | loglevel = ERROR 52 | 53 | # -- Used only by the 'detail' command-line, you can get more verbose 54 | # descriptions of the source and output documents by throwing the verbose 55 | # flag 56 | # 57 | verbose = False 58 | 59 | # -- These are the main actions and they are mutually exclusive. Pick any 60 | # of them that you would like: 61 | # 62 | # publish = False 63 | # build = False 64 | # script = False 65 | # detail = False 66 | # summary = False 67 | # doctypes = False 68 | # statustypes = False 69 | # 70 | 71 | # -- Each of the document types may require different executables and/or data 72 | # files to support processing of the specific document type. The below 73 | # configuration file section fragments allow each document type processor 74 | # to keep its own configurables separate from other document processors. 75 | # 76 | # -- The ldptool code uses $PATH (from the environment) to locate the 77 | # executables, by default. If the utilities are not installed in the 78 | # system path, then it is possible to configure the full path to each 79 | # executable in your own configuration file or in a system-wide 80 | # configuration file (/etc/ldptool/ldptool.ini). 81 | # 82 | # -- If specific data files are not discoverable, e.g. the DocBook DSSSL and 83 | # DocBook XSL stylesheets, the ldptool will skip processing that document 84 | # type. 
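# -- For example, a minimal local override (illustrative paths only; use the
#    locations that match your own system) could look like this:
#
#    [ldptool]
#    sourcedir = /path/to/checkout/LDP/LDP/howto/docbook/
#    pubdir = /path/to/publication/directory/
#    loglevel = INFO
#
#    [ldptool-linuxdoc]
#    sgml2html = /usr/bin/sgml2html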
85 | # 86 | 87 | [ldptool-linuxdoc] 88 | # htmldoc = /usr/bin/htmldoc 89 | # html2text = /usr/bin/html2text 90 | # sgml2html = /usr/bin/sgml2html 91 | # sgmlcheck = /usr/bin/sgmlcheck 92 | 93 | [ldptool-docbooksgml] 94 | # collateindex = /home/mabrown/bin/collateindex.pl 95 | # dblatex = /usr/bin/dblatex 96 | # docbookdsl = /usr/share/sgml/docbook/dsssl-stylesheets/html/docbook.dsl 97 | # html2text = /usr/bin/html2text 98 | # jw = /usr/bin/jw 99 | # ldpdsl = /usr/share/sgml/docbook/stylesheet/dsssl/ldp/ldp.dsl 100 | # openjade = /usr/bin/openjade 101 | 102 | [ldptool-docbook4xml] 103 | # fop = /usr/bin/fop 104 | # dblatex = /usr/bin/dblatex 105 | # html2text = /usr/bin/html2text 106 | # xsltproc = /usr/bin/xsltproc 107 | # xslchunk = /usr/share/xml/docbook/stylesheet/ldp/html/tldp-sections.xsl 108 | # xslprint = /usr/share/xml/docbook/stylesheet/ldp/fo/tldp-print.xsl 109 | # xslsingle = /usr/share/xml/docbook/stylesheet/ldp/html/tldp-one-page.xsl 110 | 111 | [ldptool-docbook5xml] 112 | # dblatex = /usr/bin/dblatex 113 | # fop = /usr/bin/fop 114 | # jing = /usr/bin/jing 115 | # html2text = /usr/bin/html2text 116 | # rngfile = /usr/share/xml/docbook/schema/rng/5.0/docbook.rng 117 | # xmllint = /usr/bin/xmllint 118 | # xslchunk = /usr/share/xml/docbook/stylesheet/docbook-xsl-ns/html/chunk.xsl 119 | # xslprint = /usr/share/xml/docbook/stylesheet/docbook-xsl-ns/fo/docbook.xsl 120 | # xslsingle = /usr/share/xml/docbook/stylesheet/docbook-xsl-ns/html/docbook.xsl 121 | # xsltproc = /usr/bin/xsltproc 122 | 123 | [ldptool-asciidoc] 124 | # asciidoc = /usr/bin/asciidoc 125 | # xmllint = /usr/bin/xmllint 126 | 127 | # -- end of file 128 | -------------------------------------------------------------------------------- /tests/long_driver.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | 7 | import os 8 | 9 | from tldptesttools import TestInventoryBase 10 | 11 | from tldp.sources import SourceDocument 12 | 13 | # -- Test Data 14 | import example 15 | 16 | # -- SUT 17 | import tldp.driver 18 | 19 | 20 | class TestDriverRun(TestInventoryBase): 21 | 22 | def test_run_status_selection(self): 23 | self.add_docbook4xml_xsl_to_config() 24 | c = self.config 25 | self.add_stale('Asciidoc-Stale-HOWTO', example.ex_asciidoc) 26 | self.add_new('DocBook4XML-New-HOWTO', example.ex_docbook4xml) 27 | argv = self.argv 28 | argv.extend(['--publish', 'stale']) 29 | argv.extend(['--docbook4xml-xslprint', c.docbook4xml_xslprint]) 30 | argv.extend(['--docbook4xml-xslchunk', c.docbook4xml_xslchunk]) 31 | argv.extend(['--docbook4xml-xslsingle', c.docbook4xml_xslsingle]) 32 | exitcode = tldp.driver.run(argv) 33 | self.assertEqual(exitcode, os.EX_OK) 34 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 35 | self.assertEqual(1, len(inv.published.keys())) 36 | 37 | 38 | class TestDriverBuild(TestInventoryBase): 39 | 40 | def test_build_one_broken(self): 41 | self.add_docbook4xml_xsl_to_config() 42 | c = self.config 43 | c.build = True 44 | self.add_new('Frobnitz-DocBook-XML-4-HOWTO', example.ex_docbook4xml) 45 | # -- mangle the content of a valid DocBook XML file 46 | borked = example.ex_docbook4xml.content[:-12] 47 | self.add_new('Frobnitz-Borked-XML-4-HOWTO', 48 | example.ex_docbook4xml, content=borked) 49 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 50 | self.assertEqual(2, len(inv.all.keys())) 51 | docs = 
inv.all.values() 52 | result = tldp.driver.build(c, docs) 53 | self.assertTrue('Build failed' in result) 54 | 55 | def test_build_only_requested_stem(self): 56 | c = self.config 57 | ex = example.ex_linuxdoc 58 | self.add_published('Published-HOWTO', ex) 59 | self.add_new('New-HOWTO', ex) 60 | argv = ['--pubdir', c.pubdir, '--sourcedir', c.sourcedir[0]] 61 | argv.extend(['--build', 'Published-HOWTO']) 62 | tldp.driver.run(argv) 63 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 64 | self.assertEqual(1, len(inv.published.keys())) 65 | self.assertEqual(1, len(inv.work.keys())) 66 | 67 | 68 | class TestDriverPublish(TestInventoryBase): 69 | 70 | def test_publish_fail_because_broken(self): 71 | c = self.config 72 | c.publish = True 73 | self.add_new('Frobnitz-DocBook-XML-4-HOWTO', example.ex_docbook4xml) 74 | self.add_stale('Broken-DocBook-XML-4-HOWTO', example.broken_docbook4xml) 75 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 76 | self.assertEqual(2, len(inv.all.keys())) 77 | docs = inv.all.values() 78 | exitcode = tldp.driver.publish(c, docs) 79 | self.assertNotEqual(exitcode, os.EX_OK) 80 | 81 | def test_publish_docbook5xml(self): 82 | c = self.config 83 | c.publish = True 84 | self.add_new('Frobnitz-DocBook-XML-5-HOWTO', example.ex_docbook5xml) 85 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 86 | self.assertEqual(1, len(inv.all.keys())) 87 | docs = inv.all.values() 88 | exitcode = tldp.driver.publish(c, docs) 89 | self.assertEqual(exitcode, os.EX_OK) 90 | doc = docs.pop(0) 91 | self.assertTrue(doc.output.iscomplete) 92 | 93 | def test_publish_docbook4xml(self): 94 | self.add_docbook4xml_xsl_to_config() 95 | c = self.config 96 | c.publish = True 97 | self.add_new('Frobnitz-DocBook-XML-4-HOWTO', example.ex_docbook4xml) 98 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 99 | self.assertEqual(1, len(inv.all.keys())) 100 | docs = inv.all.values() 101 | exitcode = tldp.driver.publish(c, docs) 102 | self.assertEqual(exitcode, os.EX_OK) 103 | doc = docs.pop(0) 104 | self.assertTrue(doc.output.iscomplete) 105 | 106 | def test_publish_asciidoc(self): 107 | self.add_docbook4xml_xsl_to_config() 108 | c = self.config 109 | c.publish = True 110 | self.add_new('Frobnitz-Asciidoc-HOWTO', example.ex_asciidoc) 111 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 112 | self.assertEqual(1, len(inv.all.keys())) 113 | docs = inv.all.values() 114 | c.skip = [] 115 | exitcode = tldp.driver.publish(c, docs) 116 | self.assertEqual(exitcode, os.EX_OK) 117 | doc = docs.pop(0) 118 | self.assertTrue(doc.output.iscomplete) 119 | 120 | def test_publish_linuxdoc(self): 121 | c = self.config 122 | c.publish = True 123 | self.add_new('Frobnitz-Linuxdoc-HOWTO', example.ex_linuxdoc) 124 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 125 | self.assertEqual(1, len(inv.all.keys())) 126 | docs = inv.all.values() 127 | c.skip = [] 128 | exitcode = tldp.driver.publish(c, docs) 129 | self.assertEqual(exitcode, os.EX_OK) 130 | doc = docs.pop(0) 131 | self.assertTrue(doc.output.iscomplete) 132 | 133 | def test_publish_docbooksgml(self): 134 | self.add_docbooksgml_support_to_config() 135 | c = self.config 136 | c.publish = True 137 | self.add_new('Frobnitz-DocBookSGML-HOWTO', example.ex_docbooksgml) 138 | inv = tldp.inventory.Inventory(c.pubdir, c.sourcedir) 139 | self.assertEqual(1, len(inv.all.keys())) 140 | docs = inv.all.values() 141 | exitcode = tldp.driver.publish(c, docs) 142 | self.assertEqual(exitcode, os.EX_OK) 143 | doc = docs.pop(0) 144 | 
self.assertTrue(doc.output.iscomplete) 145 | 146 | def test_publish_docbooksgml_larger(self): 147 | self.add_docbooksgml_support_to_config() 148 | c = self.config 149 | c.publish = True 150 | doc = SourceDocument(example.ex_docbooksgml_dir.filename) 151 | exitcode = tldp.driver.publish(c, [doc]) 152 | self.assertEqual(exitcode, os.EX_OK) 153 | self.assertTrue(doc.output.iscomplete) 154 | outputimages = os.path.join(doc.output.dirname, 'images') 155 | self.assertTrue(os.path.exists(outputimages)) 156 | 157 | # 158 | # -- end of file 159 | -------------------------------------------------------------------------------- /tldp/config.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import os 10 | import argparse 11 | 12 | import logging 13 | 14 | from tldp.utils import arg_isloglevel, arg_isreadablefile 15 | from tldp.cascadingconfig import CascadingConfig, DefaultFreeArgumentParser 16 | 17 | import tldp.typeguesser 18 | 19 | logger = logging.getLogger(__name__) 20 | 21 | DEFAULT_CONFIGFILE = '/etc/ldptool/ldptool.ini' 22 | 23 | 24 | class DirectoriesExist(argparse._AppendAction): 25 | 26 | def __call__(self, parser, namespace, values, option_string=None): 27 | if not os.path.isdir(values): 28 | message = "No such directory: %r for option %r, aborting..." 29 | message = message % (values, option_string) 30 | logger.critical(message) 31 | raise ValueError(message) 32 | items = getattr(namespace, self.dest, []) 33 | items.append(values) 34 | setattr(namespace, self.dest, items) 35 | 36 | 37 | class DirectoryExists(argparse._StoreAction): 38 | 39 | def __call__(self, parser, namespace, values, option_string=None): 40 | if not os.path.isdir(values): 41 | message = "No such directory: %r for option %r, aborting..." 42 | message = message % (values, option_string) 43 | logger.critical(message) 44 | raise ValueError(message) 45 | setattr(namespace, self.dest, values) 46 | 47 | 48 | class StoreTrueOrNargBool(argparse._StoreAction): 49 | 50 | _boolean_states = {'1': True, 'yes': True, 'true': True, 'on': True, 51 | '0': False, 'no': False, 'false': False, 'off': False} 52 | 53 | def __init__(self, *args, **kwargs): 54 | super(argparse._StoreAction, self).__init__(*args, **kwargs) 55 | 56 | def __call__(self, parser, namespace, values, option_string=None): 57 | if values is None: 58 | setattr(namespace, self.dest, True) 59 | else: 60 | boolval = self._boolean_states.get(values.lower(), None) 61 | if boolval is None: 62 | message = "Non-boolean value: %r for option %r, aborting..." 
63 | message = message % (values, option_string) 64 | logger.critical(message) 65 | raise ValueError(message) 66 | else: 67 | setattr(namespace, self.dest, boolval) 68 | 69 | 70 | def collectconfiguration(tag, argv): 71 | '''main specification of command-line (and config file) shape''' 72 | 73 | ap = DefaultFreeArgumentParser() 74 | ap.add_argument('--sourcedir', '--source-dir', '--source-directory', 75 | '-s', 76 | default=[], action=DirectoriesExist, 77 | help='a directory containing LDP source documents') 78 | 79 | ap.add_argument('--pubdir', '--output', '--outputdir', '--outdir', 80 | '-o', 81 | default=None, action=DirectoryExists, 82 | help='a directory containing LDP output documents') 83 | 84 | ap.add_argument('--builddir', '--build-dir', '--build-directory', 85 | '-d', 86 | default=None, action=DirectoryExists, 87 | help='a scratch directory used for building') 88 | 89 | ap.add_argument('--configfile', '--config-file', '--cfg', 90 | '-c', 91 | default=DEFAULT_CONFIGFILE, 92 | type=arg_isreadablefile, 93 | help='a configuration file') 94 | 95 | ap.add_argument('--loglevel', 96 | default=logging.ERROR, type=arg_isloglevel, 97 | help='set the loglevel') 98 | 99 | ap.add_argument('--verbose', 100 | action=StoreTrueOrNargBool, nargs='?', default=False, 101 | help='more info in --list/--detail [%(default)s]') 102 | 103 | ap.add_argument('--skip', 104 | default=[], action='append', type=str, 105 | help='skip this stem during processing') 106 | 107 | ap.add_argument('--resources', 108 | default=['images', 'resources'], action='append', type=str, 109 | help='subdirs to copy during build [%(default)s]') 110 | 111 | # -- and the distinct, mutually exclusive actions this script can perform 112 | # 113 | g = ap.add_mutually_exclusive_group() 114 | g.add_argument('--publish', 115 | '-p', 116 | action='store_true', default=False, 117 | help='build and publish LDP documentation [%(default)s]') 118 | 119 | g.add_argument('--build', 120 | '-b', 121 | action='store_true', default=False, 122 | help='build LDP documentation [%(default)s]') 123 | 124 | g.add_argument('--script', 125 | '-S', 126 | action='store_true', default=False, 127 | help='dump runnable script [%(default)s]') 128 | 129 | g.add_argument('--detail', '--list', 130 | '-l', 131 | action='store_true', default=False, 132 | help='list elements of LDP system [%(default)s]') 133 | 134 | g.add_argument('--summary', 135 | '-t', 136 | action='store_true', default=False, 137 | help='dump inventory summary report [%(default)s]') 138 | 139 | g.add_argument('--doctypes', '--formats', '--format', 140 | '--list-doctypes', '--list-formats', 141 | '-T', 142 | action='store_true', default=False, 143 | help='show supported doctypes [%(default)s]') 144 | 145 | g.add_argument('--statustypes', '--list-statustypes', 146 | action='store_true', default=False, 147 | help='show status types and classes [%(default)s]') 148 | 149 | g.add_argument('--version', 150 | '-V', 151 | action='store_true', default=False, 152 | help='print out the version number [%(default)s]') 153 | 154 | # -- collect up the distributed configuration fragments 155 | # 156 | for cls in tldp.typeguesser.knowndoctypes: 157 | argparse_method = getattr(cls, 'argparse', None) 158 | if argparse_method: 159 | argparse_method(ap) 160 | 161 | cc = CascadingConfig(tag, ap, argv) 162 | config, args = cc.parse() 163 | return config, args 164 | 165 | # 166 | # -- end of file 167 | -------------------------------------------------------------------------------- /tests/test_cascadingconfig.py: 
-------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import unicode_literals 7 | 8 | import unittest 9 | import argparse 10 | from argparse import Namespace 11 | 12 | from tldptesttools import TestToolsFilesystem 13 | from tldptesttools import CCTestTools 14 | 15 | # -- SUT 16 | from tldp.cascadingconfig import CascadingConfig 17 | from tldp.cascadingconfig import DefaultFreeArgumentParser 18 | 19 | 20 | class Test_argv_from_env(unittest.TestCase): 21 | 22 | def test_argv_from_env(self): 23 | pass 24 | 25 | 26 | class Test_argv_from_cfg(TestToolsFilesystem): 27 | 28 | def setUp(self): 29 | self.makeTempdir() 30 | 31 | def tearDown(self): 32 | self.removeTempdir() 33 | 34 | def test_argv_from_env(self): 35 | pass 36 | 37 | 38 | class TestDefaultFreeArgumentParser(unittest.TestCase): 39 | 40 | def test_basic(self): 41 | ap = DefaultFreeArgumentParser() 42 | self.assertIsInstance(ap, argparse.ArgumentParser) 43 | self.assertIsInstance(ap, DefaultFreeArgumentParser) 44 | 45 | 46 | class TestCascadingConfig(unittest.TestCase): 47 | 48 | def setUp(self): 49 | self.ap = DefaultFreeArgumentParser() 50 | 51 | def test_constructor(self): 52 | cc = CascadingConfig('tag', self.ap, []) 53 | self.assertIsInstance(cc, CascadingConfig) 54 | 55 | def test_parse(self): 56 | cc = CascadingConfig('tag', self.ap, []) 57 | config, args = cc.parse() 58 | self.assertIsInstance(config, Namespace) 59 | self.assertIsInstance(args, list) 60 | 61 | 62 | class TestCascadingConfigBasic(TestToolsFilesystem): 63 | 64 | def setUp(self): 65 | self.makeTempdir() 66 | self.ap = DefaultFreeArgumentParser() 67 | self.ap.add_argument('--sneeze', action='store_true', default=False) 68 | self.ap.add_argument('--eructate', default=[], type=str) 69 | 70 | def test_reading_env(self): 71 | argv = [] 72 | env = {'EFFLUVIA_SNEEZE': 'True'} 73 | cc = CascadingConfig('effluvia', self.ap, argv=argv, env=env) 74 | config, args = cc.parse() 75 | self.assertTrue(config.sneeze) 76 | 77 | 78 | class CascadingConfigBasicTest(CCTestTools): 79 | 80 | def test_defaults_returned(self): 81 | ap = DefaultFreeArgumentParser() 82 | ap.add_argument('--configfile', default=None, type=str) 83 | ap.add_argument('--size', default=9, type=int) 84 | 85 | c = Namespace( 86 | tag='tag', 87 | argparser=ap, 88 | argv=''.split(), 89 | env=dict(), 90 | cfg='', 91 | exp_config=Namespace(size=9), 92 | exp_args=[],) 93 | 94 | cc = CascadingConfig(c.tag, c.argparser, argv=c.argv, env=c.env) 95 | config, args = cc.parse() 96 | self.assertEqual(c.exp_config.size, config.size) 97 | 98 | def test_cfg_is_read_passed_by_env(self): 99 | ap = DefaultFreeArgumentParser() 100 | ap.add_argument('--configfile', default=None, type=str) 101 | ap.add_argument('--size', default=9, type=int) 102 | 103 | c = Namespace( 104 | tag='tag', 105 | argparser=ap, 106 | argv=''.split(), 107 | env=dict(), 108 | cfg='[tag]\nsize = 8', 109 | exp_config=Namespace(size=8), 110 | exp_args=[],) 111 | 112 | self.writeconfig(c) 113 | c.env.setdefault('TAG_CONFIGFILE', c.configfile) 114 | cc = CascadingConfig(c.tag, c.argparser, argv=c.argv, env=c.env) 115 | config, args = cc.parse() 116 | self.assertEqual(c.exp_config.size, config.size) 117 | 118 | def test_cfg_is_read_passed_by_argv(self): 119 | ap = DefaultFreeArgumentParser() 120 | ap.add_argument('--configfile', default=None, type=str) 121 | 
ap.add_argument('--size', default=9, type=int) 122 | 123 | import logging 124 | logging.getLogger().setLevel(logging.DEBUG) 125 | c = Namespace( 126 | tag='tag', 127 | argparser=ap, 128 | argv=''.split(), 129 | env=dict(), 130 | cfg='[tag]\nsize = 8', 131 | exp_config=Namespace(size=8), 132 | exp_args=[],) 133 | self.writeconfig(c) 134 | c.argv.extend(['--configfile', c.configfile]) 135 | cc = CascadingConfig(c.tag, c.argparser, argv=c.argv, env=c.env) 136 | config, args = cc.parse() 137 | self.assertEqual(c.exp_config.size, config.size) 138 | 139 | def test_precedence_env_cfg(self): 140 | ap = DefaultFreeArgumentParser() 141 | ap.add_argument('--configfile', default=None, type=str) 142 | ap.add_argument('--size', default=9, type=int) 143 | 144 | import logging 145 | logging.getLogger().setLevel(logging.DEBUG) 146 | c = Namespace( 147 | tag='tag', 148 | argparser=ap, 149 | argv=''.split(), 150 | env=dict(TAG_SIZE=7, ), 151 | cfg='[tag]\nsize = 8', 152 | exp_config=Namespace(size=7), 153 | exp_args=[],) 154 | self.writeconfig(c) 155 | c.argv.extend(['--configfile', c.configfile]) 156 | cc = CascadingConfig(c.tag, c.argparser, argv=c.argv, env=c.env) 157 | config, args = cc.parse() 158 | self.assertEqual(c.exp_config.size, config.size) 159 | 160 | def test_precedence_argv_env_cfg(self): 161 | ap = DefaultFreeArgumentParser() 162 | ap.add_argument('--configfile', default=None, type=str) 163 | ap.add_argument('--size', default=9, type=int) 164 | 165 | import logging 166 | logging.getLogger().setLevel(logging.DEBUG) 167 | c = Namespace( 168 | tag='tag', 169 | argparser=ap, 170 | argv='--size 6'.split(), 171 | env=dict(TAG_SIZE=7, ), 172 | cfg='[tag]\nsize = 8', 173 | exp_config=Namespace(size=6), 174 | exp_args=[],) 175 | self.writeconfig(c) 176 | c.argv.extend(['--configfile', c.configfile]) 177 | cc = CascadingConfig(c.tag, c.argparser, argv=c.argv, env=c.env) 178 | config, args = cc.parse() 179 | self.assertEqual(c.exp_config.size, config.size) 180 | 181 | def test_basic_emptydefault(self): 182 | ap = DefaultFreeArgumentParser() 183 | ap.add_argument('--source', default='', action='append', type=str) 184 | 185 | c = Namespace( 186 | tag='tag', 187 | argparser=ap, 188 | argv=''.split(), 189 | env=dict(), 190 | cfg='', 191 | exp_config=Namespace(source=''), 192 | exp_args=[],) 193 | cc = CascadingConfig(c.tag, c.argparser, argv=c.argv, env=c.env) 194 | config, args = cc.parse() 195 | self.assertEqual(c.exp_config, config) 196 | self.assertEqual(c.exp_args, args) 197 | 198 | def test_basic_argv(self): 199 | ap = DefaultFreeArgumentParser() 200 | ap.add_argument('--source', default='', action='append', type=str) 201 | 202 | c = Namespace( 203 | tag='tag', 204 | argparser=ap, 205 | argv='--source /some/path'.split(), 206 | env=dict(), 207 | cfg='', 208 | exp_config=Namespace(source=['/some/path']), 209 | exp_args=[],) 210 | cc = CascadingConfig(c.tag, c.argparser, argv=c.argv, env=c.env) 211 | config, args = cc.parse() 212 | self.assertEqual(c.exp_config, config) 213 | self.assertEqual(c.exp_args, args) 214 | 215 | # -- end of file 216 | -------------------------------------------------------------------------------- /tests/test_sources.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import unicode_literals 7 | 8 | import os 9 | import errno 10 | import random 11 | import unittest 12 
| from argparse import Namespace 13 | from io import StringIO 14 | 15 | from tldptesttools import TestToolsFilesystem 16 | 17 | # -- Test Data 18 | import example 19 | 20 | # -- SUT 21 | from tldp.sources import SourceCollection, SourceDocument 22 | from tldp.sources import scansourcedirs, sourcedoc_fromdir 23 | from tldp.sources import arg_issourcedoc 24 | 25 | sampledocs = os.path.join(os.path.dirname(__file__), 'sample-documents') 26 | 27 | 28 | class TestFileSourceCollectionMultiDir(TestToolsFilesystem): 29 | 30 | def test_multidir_finding_singlefiles(self): 31 | ex = random.choice(example.sources) 32 | doc0 = Namespace(reldir='LDP/howto', stem="A-Unique-Stem") 33 | doc1 = Namespace(reldir='LDP/guide', stem="A-Different-Stem") 34 | documents = (doc0, doc1) 35 | for d in documents: 36 | d.reldir, d.absdir = self.adddir(d.reldir) 37 | _, _ = self.addfile(d.reldir, ex.filename, stem=d.stem) 38 | s = scansourcedirs([x.absdir for x in documents]) 39 | self.assertEqual(2, len(s)) 40 | expected = set([x.stem for x in documents]) 41 | found = set(s.keys()) 42 | self.assertEqual(expected, found) 43 | 44 | def test_multidir_finding_namecollision(self): 45 | ex = random.choice(example.sources) 46 | doc0 = Namespace(reldir='LDP/howto', stem="A-Non-Unique-Stem") 47 | doc1 = Namespace(reldir='LDP/guide', stem="A-Non-Unique-Stem") 48 | documents = (doc0, doc1) 49 | for d in documents: 50 | d.reldir, d.absdir = self.adddir(d.reldir) 51 | _, _ = self.addfile(d.reldir, ex.filename, stem=d.stem) 52 | s = scansourcedirs([x.absdir for x in documents]) 53 | self.assertEqual(1, len(s)) 54 | expected = set([x.stem for x in documents]) 55 | found = set(s.keys()) 56 | self.assertEqual(expected, found) 57 | 58 | 59 | class TestFileSourceCollectionOneDir(TestToolsFilesystem): 60 | 61 | def test_finding_nonfile(self): 62 | maindir = 'LDP/LDP/howto' 63 | reldir, absdir = self.adddir(maindir) 64 | os.mkfifo(os.path.join(absdir, 'non-dir-non-file.xml')) 65 | s = scansourcedirs([absdir]) 66 | self.assertEqual(0, len(s)) 67 | 68 | def test_finding_singlefile(self): 69 | ex = random.choice(example.sources) 70 | maindir = 'LDP/LDP/howto' 71 | reldir, absdir = self.adddir(maindir) 72 | _, _ = self.addfile(reldir, ex.filename) 73 | s = scansourcedirs([absdir]) 74 | self.assertEqual(1, len(s)) 75 | 76 | def test_skipping_misnamed_singlefile(self): 77 | ex = random.choice(example.sources) 78 | maindir = 'LDP/LDP/howto' 79 | reldir, absdir = self.adddir(maindir) 80 | self.addfile(reldir, ex.filename, ext=".mis") 81 | s = scansourcedirs([absdir]) 82 | self.assertEqual(1, len(s)) 83 | 84 | def test_multiple_stems_of_different_extensions(self): 85 | ex = random.choice(example.sources) 86 | stem = 'A-Non-Unique-Stem' 87 | maindir = os.path.join('LDP/LDP/howto', stem) 88 | reldir, absdir = self.adddir(maindir) 89 | self.addfile(reldir, ex.filename, stem=stem, ext=".xml") 90 | self.addfile(reldir, ex.filename, stem=stem, ext=".md") 91 | s = scansourcedirs([absdir]) 92 | self.assertEqual(1, len(s)) 93 | 94 | 95 | class TestNullSourceCollection(TestToolsFilesystem): 96 | 97 | def test_SourceCollection_no_dirnames(self): 98 | s = SourceCollection() 99 | self.assertIsInstance(s, SourceCollection) 100 | self.assertTrue('docs' in str(s)) 101 | 102 | 103 | class TestInvalidSourceCollection(TestToolsFilesystem): 104 | 105 | def test_validateDirs_onebad(self): 106 | invalid0 = os.path.join(self.tempdir, 'unique', 'rabbit') 107 | with self.assertRaises(IOError) as ecm: 108 | scansourcedirs([invalid0]) 109 | e = ecm.exception 110 | 
self.assertTrue('unique/rabbit' in e.filename) 111 | 112 | def test_validateDirs_multibad(self): 113 | invalid0 = os.path.join(self.tempdir, 'unique', 'rabbit') 114 | invalid1 = os.path.join(self.tempdir, 'affable', 'elephant') 115 | with self.assertRaises(IOError) as ecm: 116 | scansourcedirs([invalid0, invalid1]) 117 | e = ecm.exception 118 | self.assertTrue('affable/elephant' in e.filename) 119 | 120 | def testEmptyDir(self): 121 | s = scansourcedirs([self.tempdir]) 122 | self.assertEqual(0, len(s)) 123 | 124 | 125 | class Test_sourcedoc_fromdir(unittest.TestCase): 126 | 127 | def test_sourcedoc_fromdir_missingdir(self): 128 | dirname = os.path.dirname('/frobnitz/path/to/extremely/unlikely/file') 129 | self.assertIsNone(sourcedoc_fromdir(dirname)) 130 | 131 | def test_sourcedoc_fromdir_withdots(self): 132 | dirname = os.path.dirname(example.ex_docbook4xml_dir.filename) 133 | doc = sourcedoc_fromdir(dirname) 134 | self.assertIsNotNone(doc) 135 | 136 | 137 | class Test_arg_issourcedoc(unittest.TestCase): 138 | 139 | def test_arg_issourcedoc_fromdir(self): 140 | fname = example.ex_linuxdoc_dir.filename 141 | dirname = os.path.dirname(fname) 142 | self.assertTrue(fname, arg_issourcedoc(dirname)) 143 | 144 | 145 | class TestSourceDocument(TestToolsFilesystem): 146 | 147 | def test_init(self): 148 | for ex in example.sources: 149 | fullpath = ex.filename 150 | fn = os.path.relpath(fullpath, start=example.sampledocs) 151 | doc = SourceDocument(fullpath) 152 | self.assertIsInstance(doc, SourceDocument) 153 | self.assertTrue(fn in str(doc)) 154 | self.assertTrue(fn in doc.md5sums) 155 | 156 | def test_fromfifo_should_fail(self): 157 | fifo = os.path.join(self.tempdir, 'fifofile') 158 | os.mkfifo(fifo) 159 | with self.assertRaises(ValueError) as ecm: 160 | SourceDocument(fifo) 161 | e = ecm.exception 162 | self.assertTrue('not identifiable' in e.args[0]) 163 | 164 | def test_fromdir(self): 165 | dirname = os.path.dirname(example.ex_linuxdoc_dir.filename) 166 | doc = SourceDocument(dirname) 167 | self.assertIsInstance(doc, SourceDocument) 168 | 169 | def test_detail(self): 170 | ex = example.ex_linuxdoc_dir 171 | s = SourceDocument(ex.filename) 172 | fout = StringIO() 173 | widths = Namespace(status=20, doctype=20, stem=50) 174 | s.detail(widths, False, file=fout) 175 | fout.seek(0) 176 | result = fout.read() 177 | fout.close() 178 | self.assertTrue(ex.stem in result) 179 | self.assertTrue('source' in result) 180 | 181 | def test_bad_dir_multiple_doctypes(self): 182 | fullpath = os.path.join(sampledocs, 'Bad-Dir-Multiple-Doctypes') 183 | with self.assertRaises(Exception) as ecm: 184 | SourceDocument(fullpath) 185 | e = ecm.exception 186 | self.assertTrue('multiple document choices' in e.args[0]) 187 | 188 | 189 | class TestMissingSourceDocuments(TestToolsFilesystem): 190 | 191 | def test_init_missing(self): 192 | missing = os.path.join(self.tempdir, 'vanishing') 193 | with self.assertRaises(IOError) as ecm: 194 | SourceDocument(missing) 195 | e = ecm.exception 196 | self.assertEqual(errno.ENOENT, e.errno) 197 | 198 | def test_init_wrongtype(self): 199 | with self.assertRaises(ValueError) as ecm: 200 | SourceDocument(self.tempdir) 201 | e = ecm.exception 202 | self.assertTrue('not identifiable' in e.args[0]) 203 | 204 | # 205 | # -- end of file 206 | -------------------------------------------------------------------------------- /tldp/inventory.py: -------------------------------------------------------------------------------- 1 | #! 
/usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import copy 10 | import logging 11 | from collections import OrderedDict 12 | 13 | from tldp.sources import SourceCollection 14 | from tldp.outputs import OutputCollection 15 | 16 | logger = logging.getLogger(__name__) 17 | 18 | # -- any individual document (source or output) will have a status 19 | # from the following list of status_types 20 | # 21 | stypes = OrderedDict() 22 | stypes['source'] = 'found in source repository' 23 | stypes['output'] = 'found in output repository' 24 | stypes['published'] = 'matching stem in source/output; doc is up to date' 25 | stypes['stale'] = 'matching stem in source/output; but source is newer' 26 | stypes['orphan'] = 'stem located in output, but no source found (i.e. old?)' 27 | stypes['broken'] = 'output is missing an expected output format (e.g. PDF)' 28 | stypes['new'] = 'stem located in source, but missing in output; unpublished' 29 | 30 | status_types = stypes.keys() 31 | 32 | # -- the user probably doesn't usually care (too much) about listing 33 | # every single published document and source document, but is probably 34 | # mostly interested in specific documents grouped by status; so the 35 | # status_classes are just sets of status_types 36 | # 37 | status_classes = OrderedDict(zip(status_types, [[x] for x in status_types])) 38 | status_classes['outputs'] = ['output'] 39 | status_classes['sources'] = ['source'] 40 | status_classes['orphans'] = ['orphan'] 41 | status_classes['orphaned'] = ['orphan'] 42 | status_classes['problems'] = ['orphan', 'broken', 'stale'] 43 | status_classes['work'] = ['new', 'orphan', 'broken', 'stale'] 44 | status_classes['all'] = ['published', 'new', 'orphan', 'broken', 'stale'] 45 | 46 | 47 | class Inventory(object): 48 | '''a container for classifying documents by their status 49 | 50 | Every SourceDocument has no more than one matching OutputDirectory. 51 | 52 | The Inventory class encodes the logic for identifying the following 53 | different status possibilities for an arbitrary set of SourceDocuments and 54 | OutputDirectorys. 55 | 56 | The following are possible values for status: 57 | - 'source': a source document before any status detection 58 | - 'output': an output document before any status detection 59 | - 'new': a source document without any matching output stem 60 | - 'published': a pair of source/output documents with matching stems 61 | - 'orphan': an output document without any matching source stem 62 | - 'broken': a published document with missing output files 63 | - 'stale': a published document with new(er) source files 64 | 65 | The Inventory object is intended to be used to identify work that needs to 66 | be done on individual source documents to produce up-to-date output 67 | documents. 
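    A minimal usage sketch (the paths are placeholders, not defaults):

        inv = Inventory('/path/to/pubdir', ['/path/to/sourcedir'])
        for stem, doc in inv.work.items():
            print(stem, doc.status)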
68 | ''' 69 | def __repr__(self): 70 | return '<%s: %d published, %d orphan, %d new, %d stale, %d broken>' % ( 71 | self.__class__.__name__, 72 | len(self.published), 73 | len(self.orphan), 74 | len(self.new), 75 | len(self.stale), 76 | len(self.broken),) 77 | 78 | def __init__(self, pubdir, sourcedirs): 79 | '''construct an Inventory 80 | 81 | pubdir: path to the OutputCollection 82 | 83 | sourcedirs: a list of directories which could be passed to the 84 | SourceCollection object; essentially a directory containing 85 | SourceDocuments; for example LDP/LDP/howto/linuxdoc and 86 | LDP/LDP/guide/docbook 87 | ''' 88 | self.output = OutputCollection(pubdir) 89 | self.source = SourceCollection(sourcedirs) 90 | s = copy.deepcopy(self.source) 91 | o = copy.deepcopy(self.output) 92 | sset = set(s.keys()) 93 | oset = set(o.keys()) 94 | 95 | # -- orphan identification 96 | # 97 | self.orphan = OutputCollection() 98 | for doc in oset.difference(sset): 99 | self.orphan[doc] = o[doc] 100 | del o[doc] 101 | self.orphan[doc].status = 'orphan' 102 | logger.debug("Identified %d orphan documents: %r.", len(self.orphan), 103 | self.orphan.keys()) 104 | 105 | # -- unpublished ('new') identification 106 | # 107 | self.new = SourceCollection() 108 | for doc in sset.difference(oset): 109 | self.new[doc] = s[doc] 110 | del s[doc] 111 | self.new[doc].status = 'new' 112 | logger.debug("Identified %d new documents: %r.", len(self.new), 113 | self.new.keys()) 114 | 115 | # -- published identification; source and output should be same size 116 | assert len(s) == len(o) 117 | for stem, odoc in o.items(): 118 | sdoc = s[stem] 119 | sdoc.output = odoc 120 | odoc.source = sdoc 121 | sdoc.status = sdoc.output.status = 'published' 122 | self.published = s 123 | logger.debug("Identified %d published documents.", len(self.published)) 124 | 125 | # -- broken identification 126 | # 127 | self.broken = SourceCollection() 128 | for stem, sdoc in s.items(): 129 | if not sdoc.output.iscomplete: 130 | self.broken[stem] = sdoc 131 | sdoc.status = sdoc.output.status = 'broken' 132 | logger.debug("Identified %d broken documents: %r.", len(self.broken), 133 | self.broken.keys()) 134 | 135 | # -- stale identification 136 | # 137 | self.stale = SourceCollection() 138 | for stem, sdoc in s.items(): 139 | odoc = sdoc.output 140 | omd5, smd5 = odoc.md5sums, sdoc.md5sums 141 | if omd5 != smd5: 142 | logger.debug("%s differing MD5 sets %r %r", stem, smd5, omd5) 143 | changed = set() 144 | for gone in set(omd5.keys()).difference(smd5.keys()): 145 | logger.debug("%s gone %s", stem, gone) 146 | changed.add(('gone', gone)) 147 | for new in set(smd5.keys()).difference(omd5.keys()): 148 | changed.add(('new', new)) 149 | for sfn in set(smd5.keys()).intersection(omd5.keys()): 150 | if smd5[sfn] != omd5[sfn]: 151 | changed.add(('changed', sfn)) 152 | for why, sfn in changed: 153 | logger.debug("%s differing source %s (%s)", stem, sfn, why) 154 | odoc.status = sdoc.status = 'stale' 155 | sdoc.differing = changed 156 | self.stale[stem] = sdoc 157 | logger.debug("Identified %d stale documents: %r.", len(self.stale), 158 | self.stale.keys()) 159 | 160 | def getByStatusClass(self, status_class): 161 | desired = status_classes.get(status_class, None) 162 | assert isinstance(desired, list) 163 | collection = SourceCollection() 164 | for status_type in desired: 165 | collection.update(getattr(self, status_type)) 166 | return collection 167 | 168 | @property 169 | def outputs(self): 170 | return self.getByStatusClass('outputs') 171 | 172 | @property 173 | def 
sources(self): 174 | return self.getByStatusClass('sources') 175 | 176 | @property 177 | def problems(self): 178 | return self.getByStatusClass('problems') 179 | 180 | @property 181 | def work(self): 182 | return self.getByStatusClass('work') 183 | 184 | @property 185 | def orphans(self): 186 | return self.getByStatusClass('orphans') 187 | 188 | @property 189 | def orphaned(self): 190 | return self.getByStatusClass('orphaned') 191 | 192 | @property 193 | def all(self): 194 | return self.getByStatusClass('all') 195 | 196 | # 197 | # -- end of file 198 | -------------------------------------------------------------------------------- /tldp/outputs.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import os 10 | import sys 11 | import errno 12 | import codecs 13 | import logging 14 | 15 | from tldp.ldpcollection import LDPDocumentCollection 16 | from tldp.utils import logdir 17 | 18 | logger = logging.getLogger(__name__) 19 | 20 | 21 | class OutputNamingConvention(object): 22 | '''A base class inherited by OutputDirectory to ensure consistent 23 | naming of files across the output collection of documents, 24 | regardless of the source document type and processing toolchain 25 | choice. 26 | 27 | Sets a list of names for documents that are expected to be present 28 | in order to report that the directory iscomplete. 29 | ''' 30 | expected = ['name_txt', 'name_pdf', 'name_htmls', 'name_html', 31 | 'name_indexhtml'] 32 | 33 | def __init__(self, dirname, stem): 34 | self.dirname = dirname 35 | self.stem = stem 36 | 37 | @property 38 | def MD5SUMS(self): 39 | return os.path.join(self.dirname, '.LDP-source-MD5SUMS') 40 | 41 | @property 42 | def name_txt(self): 43 | return os.path.join(self.dirname, self.stem + '.txt') 44 | 45 | @property 46 | def name_fo(self): 47 | return os.path.join(self.dirname, self.stem + '.fo') 48 | 49 | @property 50 | def name_pdf(self): 51 | return os.path.join(self.dirname, self.stem + '.pdf') 52 | 53 | @property 54 | def name_html(self): 55 | return os.path.join(self.dirname, self.stem + '.html') 56 | 57 | @property 58 | def name_htmls(self): 59 | return os.path.join(self.dirname, self.stem + '-single.html') 60 | 61 | @property 62 | def name_epub(self): 63 | return os.path.join(self.dirname, self.stem + '.epub') 64 | 65 | @property 66 | def name_indexhtml(self): 67 | return os.path.join(self.dirname, 'index.html') 68 | 69 | @property 70 | def validsource(self): 71 | return os.path.join(self.dirname, self.stem + '.xml') # -- burp 72 | 73 | @property 74 | def iscomplete(self): 75 | '''True if the output directory contains all expected documents''' 76 | present = list() 77 | for prop in self.expected: 78 | name = getattr(self, prop, None) 79 | assert name is not None 80 | present.append(os.path.exists(name)) 81 | return all(present) 82 | 83 | @property 84 | def missing(self): 85 | '''returns a set of missing files''' 86 | missing = set() 87 | for prop in self.expected: 88 | name = getattr(self, prop, None) 89 | assert name is not None 90 | if not os.path.isfile(name): 91 | missing.add(name) 92 | return missing 93 | 94 | @property 95 | def md5sums(self): 96 | d = dict() 97 | try: 98 | with codecs.open(self.MD5SUMS, encoding='utf-8') as f: 99 | for line in f: 100 | if line.startswith('#'): 101 | continue 102 | 
hashval, fname = line.strip().split() 103 | d[fname] = hashval 104 | except IOError as e: 105 | if e.errno != errno.ENOENT: 106 | raise 107 | return d 108 | 109 | 110 | class OutputDirectory(OutputNamingConvention): 111 | '''A class providing a container for each set of output documents 112 | for a given source document and general methods for operating on 113 | and preparing the output directory for a document processor. 114 | For example, the process of generating each document type for a single 115 | source (e.g. 'Unicode-HOWTO') would be managed by this object. 116 | 117 | An important element of the OutputDirectory is the stem, determined 118 | from the directory name when __init__() is called. 119 | ''' 120 | def __repr__(self): 121 | return '<%s:%s>' % (self.__class__.__name__, self.dirname) 122 | 123 | @classmethod 124 | def fromsource(cls, dirname, source): 125 | newname = os.path.join(dirname, source.stem) 126 | return cls(newname, source=source) 127 | 128 | def __init__(self, dirname, source=None): 129 | '''constructor 130 | :param dirname: directory name for all output documents 131 | 132 | This directory name is expected to end with the document stem name, 133 | for example '/path/to/the/collection/Unicode-HOWTO'. The parent 134 | directory (e.g. '/path/to/the/collection' must exist already. The 135 | output directory itself will be created, or emptied and cleared if 136 | the document needs to be rebuilt. 137 | ''' 138 | self.dirname = os.path.abspath(dirname) 139 | self.stem = os.path.basename(self.dirname) 140 | super(OutputDirectory, self).__init__(self.dirname, self.stem) 141 | parent = os.path.dirname(self.dirname) 142 | if not os.path.isdir(parent): 143 | logger.critical("Missing output collection directory %s.", parent) 144 | raise IOError(errno.ENOENT, os.strerror(errno.ENOENT), parent) 145 | self.status = 'output' 146 | self.source = source 147 | self.logdir = os.path.join(self.dirname, logdir) 148 | 149 | def detail(self, widths, verbose, file=sys.stdout): 150 | template = ' '.join(('{s.status:{w.status}}', 151 | '{u:{w.doctype}}', 152 | '{s.stem:{w.stem}}')) 153 | outstr = template.format(s=self, w=widths, u="") 154 | print(outstr, file=file) 155 | if verbose: 156 | print(' missing source', file=file) 157 | 158 | 159 | class OutputCollection(LDPDocumentCollection): 160 | '''a dict-like container for OutputDirectory objects 161 | 162 | The key of an OutputCollection is the stem name of the document, which 163 | allows convenient access and guaranteed non-collision. 164 | 165 | The use of the stem as a key works conveniently with the 166 | SourceCollection which uses the same strategy on SourceDocuments. 167 | ''' 168 | def __init__(self, dirname=None): 169 | '''construct an OutputCollection 170 | 171 | If dirname is not supplied, OutputCollection is basically, a dict(). 172 | If dirname is supplied, then OutputCollection scans the filesystem for 173 | subdirectories of dirname and creates an OutputDirectory for each 174 | subdir. Each subdir name is used as the stem (or key) for holding the 175 | OutputDirectory in the OutputCollection. 
176 | 177 | For example, consider the following directory tree: 178 | 179 | en 180 | ├── Latvian-HOWTO 181 | ├── Scanner-HOWTO 182 | ├── UUCP-HOWTO 183 | └── Wireless-HOWTO 184 | 185 | If called like OutputCollection("en"), the result in memory would be 186 | a structure resembling this: 187 | 188 | OutputCollection("/path/en") = { 189 | "Latvian-HOWTO": OutputDirectory("/path/en/Latvian-HOWTO") 190 | "Scanner-HOWTO": OutputDirectory("/path/en/Scanner-HOWTO") 191 | "UUCP-HOWTO": OutputDirectory("/path/en/UUCP-HOWTO") 192 | "Wireless-HOWTO": OutputDirectory("/path/en/Wireless-HOWTO") 193 | } 194 | 195 | ''' 196 | if dirname is None: 197 | return 198 | elif not os.path.isdir(dirname): 199 | logger.critical("Output collection dir %s must already exist.", 200 | dirname) 201 | raise IOError(errno.ENOENT, os.strerror(errno.ENOENT), dirname) 202 | for fname in sorted(os.listdir(dirname), key=lambda x: x.lower()): 203 | name = os.path.join(dirname, fname) 204 | if not os.path.isdir(name): 205 | logger.info("Skipping non-directory %s (in %s)", name, dirname) 206 | continue 207 | logger.debug("Found directory %s (in %s)", name, dirname) 208 | o = OutputDirectory(name) 209 | assert o.stem not in self 210 | self[o.stem] = o 211 | 212 | 213 | # 214 | # -- end of file 215 | -------------------------------------------------------------------------------- /tldp/doctypes/docbook4xml.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import logging 10 | 11 | from tldp.utils import which, firstfoundfile 12 | from tldp.utils import arg_isexecutable, isexecutable 13 | from tldp.utils import arg_isreadablefile, isreadablefile 14 | from tldp.utils import arg_isstr, isstr 15 | 16 | from tldp.doctypes.common import BaseDoctype, SignatureChecker, depends 17 | 18 | logger = logging.getLogger(__name__) 19 | 20 | 21 | def xslchunk_finder(): 22 | l = ['/usr/share/xml/docbook/stylesheet/ldp/html/tldp-sections.xsl', 23 | 'http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl', 24 | ] 25 | return firstfoundfile(l) 26 | 27 | 28 | def xslsingle_finder(): 29 | l = ['/usr/share/xml/docbook/stylesheet/ldp/html/tldp-one-page.xsl', 30 | 'http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl', 31 | ] 32 | return firstfoundfile(l) 33 | 34 | 35 | def xslprint_finder(): 36 | l = ['http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl', 37 | # '/usr/share/xml/docbook/stylesheet/ldp/fo/tldp-print.xsl', 38 | ] 39 | return l[0] 40 | # return firstfoundfile(l) 41 | 42 | 43 | class Docbook4XML(BaseDoctype, SignatureChecker): 44 | formatname = 'DocBook XML 4.x' 45 | extensions = ['.xml'] 46 | signatures = ['-//OASIS//DTD DocBook XML V4.1.2//EN', 47 | '-//OASIS//DTD DocBook XML V4.2//EN', 48 | '-//OASIS//DTD DocBook XML V4.2//EN', 49 | '-//OASIS//DTD DocBook XML V4.4//EN', 50 | '-//OASIS//DTD DocBook XML V4.5//EN', ] 51 | required = {'docbook4xml_xsltproc': isexecutable, 52 | 'docbook4xml_html2text': isexecutable, 53 | 'docbook4xml_dblatex': isexecutable, 54 | 'docbook4xml_fop': isexecutable, 55 | 'docbook4xml_xmllint': isexecutable, 56 | 'docbook4xml_xslchunk': isreadablefile, 57 | 'docbook4xml_xslsingle': isreadablefile, 58 | 'docbook4xml_xslprint': isstr, 59 | } 60 | 61 | def make_validated_source(self, **kwargs): 62 | s = 
'''"{config.docbook4xml_xmllint}" > "{output.validsource}" \\ 63 | --nonet \\ 64 | --noent \\ 65 | --xinclude \\ 66 | --postvalid \\ 67 | "{source.filename}"''' 68 | return self.shellscript(s, **kwargs) 69 | 70 | @depends(make_validated_source) 71 | def make_name_htmls(self, **kwargs): 72 | '''create a single page HTML output''' 73 | s = '''"{config.docbook4xml_xsltproc}" > "{output.name_htmls}" \\ 74 | --nonet \\ 75 | --stringparam admon.graphics.path images/ \\ 76 | --stringparam base.dir . \\ 77 | "{config.docbook4xml_xslsingle}" \\ 78 | "{output.validsource}"''' 79 | return self.shellscript(s, **kwargs) 80 | 81 | @depends(make_name_htmls) 82 | def make_name_txt(self, **kwargs): 83 | '''create text output''' 84 | s = '''"{config.docbook4xml_html2text}" > "{output.name_txt}" \\ 85 | -style pretty \\ 86 | -nobs \\ 87 | "{output.name_htmls}"''' 88 | return self.shellscript(s, **kwargs) 89 | 90 | @depends(make_validated_source) 91 | def make_fo(self, **kwargs): 92 | '''generate the Formatting Objects intermediate output''' 93 | s = '''"{config.docbook4xml_xsltproc}" > "{output.name_fo}" \\ 94 | --stringparam fop.extensions 0 \\ 95 | --stringparam fop1.extensions 1 \\ 96 | "{config.docbook4xml_xslprint}" \\ 97 | "{output.validsource}"''' 98 | if not self.config.script: 99 | self.removals.add(self.output.name_fo) 100 | return self.shellscript(s, **kwargs) 101 | 102 | # -- this is conditionally built--see logic in make_name_pdf() below 103 | # @depends(make_fo) 104 | def make_pdf_with_fop(self, **kwargs): 105 | '''use FOP to create a PDF''' 106 | s = '''"{config.docbook4xml_fop}" \\ 107 | -fo "{output.name_fo}" \\ 108 | -pdf "{output.name_pdf}"''' 109 | return self.shellscript(s, **kwargs) 110 | 111 | # -- this is conditionally built--see logic in make_name_pdf() below 112 | # @depends(make_validated_source) 113 | def make_pdf_with_dblatex(self, **kwargs): 114 | '''use dblatex (fallback) to create a PDF''' 115 | s = '''"{config.docbook4xml_dblatex}" \\ 116 | -F xml \\ 117 | -t pdf \\ 118 | -o "{output.name_pdf}" \\ 119 | "{output.validsource}"''' 120 | return self.shellscript(s, **kwargs) 121 | 122 | @depends(make_validated_source, make_fo) 123 | def make_name_pdf(self, **kwargs): 124 | stem = self.source.stem 125 | classname = self.__class__.__name__ 126 | logger.info("%s calling method %s.%s", 127 | stem, classname, 'make_pdf_with_fop') 128 | if self.make_pdf_with_fop(**kwargs): 129 | return True 130 | logger.error("%s %s failed creating PDF, falling back to dblatex...", 131 | stem, self.config.docbook4xml_fop) 132 | logger.info("%s calling method %s.%s", 133 | stem, classname, 'make_pdf_with_dblatex') 134 | return self.make_pdf_with_dblatex(**kwargs) 135 | 136 | @depends(make_validated_source) 137 | def make_chunked_html(self, **kwargs): 138 | '''create chunked HTML output''' 139 | s = '''"{config.docbook4xml_xsltproc}" \\ 140 | --nonet \\ 141 | --stringparam admon.graphics.path images/ \\ 142 | --stringparam base.dir . 
\\ 143 | "{config.docbook4xml_xslchunk}" \\ 144 | "{output.validsource}"''' 145 | return self.shellscript(s, **kwargs) 146 | 147 | @depends(make_chunked_html) 148 | def make_name_html(self, **kwargs): 149 | '''rename DocBook XSL's index.html to LDP standard STEM.html''' 150 | s = 'mv -v --no-clobber -- "{output.name_indexhtml}" "{output.name_html}"' 151 | return self.shellscript(s, **kwargs) 152 | 153 | @depends(make_name_html) 154 | def make_name_indexhtml(self, **kwargs): 155 | '''create final index.html symlink''' 156 | s = 'ln -svr -- "{output.name_html}" "{output.name_indexhtml}"' 157 | return self.shellscript(s, **kwargs) 158 | 159 | @depends(make_name_html, make_name_pdf, make_name_htmls, make_name_txt) 160 | def remove_validated_source(self, **kwargs): 161 | '''create final index.html symlink''' 162 | s = 'rm --verbose -- "{output.validsource}"' 163 | return self.shellscript(s, **kwargs) 164 | 165 | @classmethod 166 | def argparse(cls, p): 167 | descrip = 'executables and data files for %s' % (cls.formatname,) 168 | g = p.add_argument_group(title=cls.__name__, description=descrip) 169 | gadd = g.add_argument 170 | gadd('--docbook4xml-xslchunk', type=arg_isreadablefile, 171 | default=xslchunk_finder(), 172 | help='full path to LDP HTML chunker XSL [%(default)s]') 173 | gadd('--docbook4xml-xslsingle', type=arg_isreadablefile, 174 | default=xslsingle_finder(), 175 | help='full path to LDP HTML single-page XSL [%(default)s]') 176 | gadd('--docbook4xml-xslprint', type=arg_isstr, 177 | default=xslprint_finder(), 178 | help='full path to LDP FO print XSL [%(default)s]') 179 | gadd('--docbook4xml-xmllint', type=arg_isexecutable, 180 | default=which('xmllint'), 181 | help='full path to xmllint [%(default)s]') 182 | gadd('--docbook4xml-xsltproc', type=arg_isexecutable, 183 | default=which('xsltproc'), 184 | help='full path to xsltproc [%(default)s]') 185 | gadd('--docbook4xml-html2text', type=arg_isexecutable, 186 | default=which('html2text'), 187 | help='full path to html2text [%(default)s]') 188 | gadd('--docbook4xml-fop', type=arg_isexecutable, 189 | default=which('fop'), 190 | help='full path to fop [%(default)s]') 191 | gadd('--docbook4xml-dblatex', type=arg_isexecutable, 192 | default=which('dblatex'), 193 | help='full path to dblatex [%(default)s]') 194 | 195 | # 196 | # -- end of file 197 | -------------------------------------------------------------------------------- /docs/conf.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # tox documentation build configuration file, created by 4 | # sphinx-quickstart on Fri Nov 9 19:00:14 2012. 5 | # 6 | # This file is execfile()d with the current directory set to its containing dir. 7 | # 8 | # Note that not all possible configuration values are present in this 9 | # autogenerated file. 10 | # 11 | # All configuration values have a default; values that are commented out 12 | # serve to show the default. 13 | 14 | import sys, os 15 | 16 | # If extensions (or modules to document with autodoc) are in another directory, 17 | # add these directories to sys.path here. If the directory is relative to the 18 | # documentation root, use os.path.abspath to make it absolute, like shown here. 19 | #sys.path.insert(0, os.path.abspath('.')) 20 | 21 | # -- General configuration ----------------------------------------------------- 22 | 23 | # If your documentation needs a minimal Sphinx version, state it here. 
24 | #needs_sphinx = '1.0' 25 | 26 | # Add any Sphinx extension module names here, as strings. They can be extensions 27 | # coming with Sphinx (named 'sphinx.ext.*') or your custom ones. 28 | extensions = ['sphinx.ext.autodoc'] 29 | 30 | # Add any paths that contain templates here, relative to this directory. 31 | templates_path = ['_templates'] 32 | 33 | # The suffix of source filenames. 34 | source_suffix = '.rst' 35 | 36 | # The encoding of source files. 37 | #source_encoding = 'utf-8-sig' 38 | 39 | # The master toctree document. 40 | master_doc = 'ldptool-man' 41 | 42 | # General information about the project. 43 | project = u'ldptool' 44 | copyright = u'Manual page (C) 2016, Linux Documentation Project' 45 | 46 | # The version info for the project you're documenting, acts as replacement for 47 | # |version| and |release|, also used in various other places throughout the 48 | # built documents. 49 | # 50 | # The short X.Y version. 51 | version = '1.9.2' 52 | # The full version, including alpha/beta/rc tags. 53 | release = '1.9.2' 54 | 55 | # The language for content autogenerated by Sphinx. Refer to documentation 56 | # for a list of supported languages. 57 | #language = None 58 | 59 | # There are two options for replacing |today|: either, you set today to some 60 | # non-false value, then it is used: 61 | #today = '' 62 | # Else, today_fmt is used as the format for a strftime call. 63 | #today_fmt = '%B %d, %Y' 64 | 65 | # List of patterns, relative to source directory, that match files and 66 | # directories to ignore when looking for source files. 67 | exclude_patterns = ['_build'] 68 | 69 | # The reST default role (used for this markup: `text`) to use for all documents. 70 | #default_role = None 71 | 72 | # If true, '()' will be appended to :func: etc. cross-reference text. 73 | #add_function_parentheses = True 74 | 75 | # If true, the current module name will be prepended to all description 76 | # unit titles (such as .. function::). 77 | #add_module_names = True 78 | 79 | # If true, sectionauthor and moduleauthor directives will be shown in the 80 | # output. They are ignored by default. 81 | #show_authors = False 82 | 83 | # The name of the Pygments (syntax highlighting) style to use. 84 | pygments_style = 'sphinx' 85 | 86 | # A list of ignored prefixes for module index sorting. 87 | #modindex_common_prefix = [] 88 | 89 | 90 | # -- Options for HTML output --------------------------------------------------- 91 | 92 | # The theme to use for HTML and HTML Help pages. See the documentation for 93 | # a list of builtin themes. 94 | html_theme = 'default' 95 | 96 | # Theme options are theme-specific and customize the look and feel of a theme 97 | # further. For a list of options available for each theme, see the 98 | # documentation. 99 | #html_theme_options = {} 100 | 101 | # Add any paths that contain custom themes here, relative to this directory. 102 | #html_theme_path = [] 103 | 104 | # The name for this set of Sphinx documents. If None, it defaults to 105 | # " v documentation". 106 | #html_title = None 107 | 108 | # A shorter title for the navigation bar. Default is the same as html_title. 109 | #html_short_title = None 110 | 111 | # The name of an image file (relative to this directory) to place at the top 112 | # of the sidebar. 113 | #html_logo = None 114 | 115 | # The name of an image file (within the static path) to use as favicon of the 116 | # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 117 | # pixels large. 
118 | #html_favicon = None 119 | 120 | # Add any paths that contain custom static files (such as style sheets) here, 121 | # relative to this directory. They are copied after the builtin static files, 122 | # so a file named "default.css" will overwrite the builtin "default.css". 123 | html_static_path = ['_static'] 124 | 125 | # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, 126 | # using the given strftime format. 127 | #html_last_updated_fmt = '%b %d, %Y' 128 | 129 | # If true, SmartyPants will be used to convert quotes and dashes to 130 | # typographically correct entities. 131 | #html_use_smartypants = True 132 | 133 | # Custom sidebar templates, maps document names to template names. 134 | #html_sidebars = {} 135 | 136 | # Additional templates that should be rendered to pages, maps page names to 137 | # template names. 138 | #html_additional_pages = {} 139 | 140 | # If false, no module index is generated. 141 | #html_domain_indices = True 142 | 143 | # If false, no index is generated. 144 | #html_use_index = True 145 | 146 | # If true, the index is split into individual pages for each letter. 147 | #html_split_index = False 148 | 149 | # If true, links to the reST sources are added to the pages. 150 | #html_show_sourcelink = True 151 | 152 | # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. 153 | #html_show_sphinx = True 154 | 155 | # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. 156 | #html_show_copyright = True 157 | 158 | # If true, an OpenSearch description file will be output, and all pages will 159 | # contain a tag referring to it. The value of this option must be the 160 | # base URL from which the finished HTML is served. 161 | #html_use_opensearch = '' 162 | 163 | # This is the file name suffix for HTML files (e.g. ".xhtml"). 164 | #html_file_suffix = None 165 | 166 | # Output file base name for HTML help builder. 167 | htmlhelp_basename = 'ldptooldoc' 168 | 169 | 170 | # -- Options for LaTeX output -------------------------------------------------- 171 | 172 | latex_elements = { 173 | # The paper size ('letterpaper' or 'a4paper'). 174 | #'papersize': 'letterpaper', 175 | 176 | # The font size ('10pt', '11pt' or '12pt'). 177 | #'pointsize': '10pt', 178 | 179 | # Additional stuff for the LaTeX preamble. 180 | #'preamble': '', 181 | } 182 | 183 | # Grouping the document tree into LaTeX files. List of tuples 184 | # (source start file, target name, title, author, documentclass [howto/manual]). 185 | latex_documents = [] 186 | 187 | # The name of an image file (relative to this directory) to place at the top of 188 | # the title page. 189 | #latex_logo = None 190 | 191 | # For "manual" documents, if this is true, then toplevel headings are parts, 192 | # not chapters. 193 | #latex_use_parts = False 194 | 195 | # If true, show page references after internal links. 196 | #latex_show_pagerefs = False 197 | 198 | # If true, show URL addresses after external links. 199 | #latex_show_urls = False 200 | 201 | # Documents to append as an appendix to all manuals. 202 | #latex_appendices = [] 203 | 204 | # If false, no module index is generated. 205 | #latex_domain_indices = True 206 | 207 | 208 | # -- Options for manual page output -------------------------------------------- 209 | 210 | # One entry per manual page. List of tuples 211 | # (source start file, name, description, authors, manual section). 
212 | man_pages = [ 213 | ('ldptool-man', 'ldptool', u'DocBook, Linuxdoc and Asciidoc build/publishing tool.', 214 | [u'Martin A. Brown ',], 1) 215 | ] 216 | 217 | # If true, show URL addresses after external links. 218 | #man_show_urls = False 219 | 220 | 221 | # -- Options for Texinfo output ------------------------------------------------ 222 | 223 | # Grouping the document tree into Texinfo files. List of tuples 224 | # (source start file, target name, title, author, 225 | # dir menu entry, description, category) 226 | texinfo_documents = [ 227 | ('ldptool-man', 'ldptool', u'ldptool(1)', 228 | u'Martin A. Brown', 'ldptool', 'DocBook, Linuxdoc and Asciidoc build/publishing tool.', 229 | 'Miscellaneous'), 230 | ] 231 | 232 | # Documents to append as an appendix to all manuals. 233 | #texinfo_appendices = [] 234 | 235 | # If false, no module index is generated. 236 | #texinfo_domain_indices = True 237 | 238 | # How to display URL addresses: 'footnote', 'no', or 'inline'. 239 | #texinfo_show_urls = 'footnote' 240 | -------------------------------------------------------------------------------- /tests/tldptesttools.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import unicode_literals 7 | 8 | import os 9 | import codecs 10 | import random 11 | import shutil 12 | import unittest 13 | from tempfile import mkdtemp 14 | from tempfile import NamedTemporaryFile as ntf 15 | 16 | import tldp.config 17 | from tldp.outputs import OutputNamingConvention 18 | from tldp.utils import writemd5sums, md5file 19 | 20 | # -- short names 21 | # 22 | opa = os.path.abspath 23 | opb = os.path.basename 24 | opd = os.path.dirname 25 | opj = os.path.join 26 | 27 | extras = opa(opj(opd(opd(__file__)), 'extras')) 28 | 29 | 30 | def stem_and_ext(name): 31 | stem, ext = os.path.splitext(os.path.basename(name)) 32 | assert ext != '' 33 | return stem, ext 34 | 35 | 36 | def dir_to_components(reldir): 37 | reldir, basename = os.path.split(os.path.normpath(reldir)) 38 | components = [basename] 39 | while reldir != '': 40 | reldir, basename = os.path.split(reldir) 41 | components.append(basename) 42 | assert len(components) >= 1 43 | components.reverse() 44 | return components 45 | 46 | 47 | class TestToolsFilesystem(unittest.TestCase): 48 | 49 | def setUp(self): 50 | self.makeTempdir() 51 | 52 | def tearDown(self): 53 | self.removeTempdir() 54 | 55 | def makeTempdir(self): 56 | self.tempdir = mkdtemp(prefix='tldp-test-') 57 | 58 | def removeTempdir(self): 59 | shutil.rmtree(self.tempdir) 60 | 61 | def adddir(self, reldir): 62 | components = dir_to_components(reldir) 63 | absdir = self.tempdir 64 | while components: 65 | absdir = os.path.join(absdir, components.pop(0)) 66 | if not os.path.isdir(absdir): 67 | os.mkdir(absdir) 68 | self.assertTrue(os.path.isdir(absdir)) 69 | relpath = os.path.relpath(absdir, self.tempdir) 70 | return relpath, absdir 71 | 72 | def addfile(self, reldir, filename, stem=None, ext=None): 73 | dirname = os.path.join(self.tempdir, reldir) 74 | assert os.path.isdir(dirname) 75 | if stem is None: 76 | stem, _ = stem_and_ext(filename) 77 | if ext is None: 78 | _, ext = stem_and_ext(filename) 79 | newname = os.path.join(dirname, stem + ext) 80 | if os.path.isfile(filename): 81 | shutil.copy(filename, newname) 82 | else: 83 | with open(newname, 'w'): 84 | pass 85 | relname = 
os.path.relpath(newname, self.tempdir) 86 | return relname, newname 87 | 88 | 89 | class CCTestTools(unittest.TestCase): 90 | 91 | def setUp(self): 92 | self.makeTempdir() 93 | 94 | def tearDown(self): 95 | self.removeTempdir() 96 | 97 | def makeTempdir(self): 98 | self.tempdir = mkdtemp(prefix='tldp-test-') 99 | 100 | def removeTempdir(self): 101 | shutil.rmtree(self.tempdir) 102 | 103 | def writeconfig(self, case): 104 | tf = ntf(prefix=case.tag, suffix='.cfg', dir=self.tempdir, delete=False) 105 | tf.close() 106 | with codecs.open(tf.name, 'w', encoding='utf-8') as f: 107 | f.write(case.cfg) 108 | case.configfile = tf.name 109 | 110 | 111 | class TestOutputDirSkeleton(OutputNamingConvention): 112 | 113 | def mkdir(self): 114 | if not os.path.isdir(self.dirname): 115 | os.mkdir(self.dirname) 116 | 117 | def create_md5sum_file(self, md5s): 118 | writemd5sums(self.MD5SUMS, md5s) 119 | 120 | def create_expected_docs(self): 121 | for name in self.expected: 122 | fname = getattr(self, name) 123 | with open(fname, 'w'): 124 | pass 125 | 126 | 127 | class TestSourceDocSkeleton(object): 128 | 129 | def __init__(self, dirname): 130 | if isinstance(dirname, list): 131 | dirname = dirname[0] 132 | if not os.path.abspath(dirname): 133 | raise Exception("Please use absolute path in unit tests....") 134 | self.dirname = dirname 135 | if not os.path.isdir(self.dirname): 136 | os.mkdir(self.dirname) 137 | self.md5s = dict() 138 | 139 | def copytree(self, source): 140 | dst = opj(self.dirname, opb(source)) 141 | shutil.copytree(source, dst) 142 | 143 | def create_stale(self, fname): 144 | l = list(self.md5s[fname]) 145 | random.shuffle(l) 146 | if l == self.md5s[fname]: 147 | self.invalidate_checksum(fname) 148 | self.md5s[fname] = ''.join(l) 149 | 150 | @property 151 | def md5sums(self): 152 | return self.md5s 153 | 154 | def addsourcefile(self, filename, content): 155 | fname = os.path.join(self.dirname, filename) 156 | if os.path.isfile(content): 157 | shutil.copy(content, fname) 158 | else: 159 | with codecs.open(fname, 'w', encoding='utf-8') as f: 160 | f.write(content) 161 | relpath = os.path.relpath(fname, start=self.dirname) 162 | self.md5s[relpath] = md5file(fname) 163 | 164 | 165 | class TestInventoryBase(unittest.TestCase): 166 | 167 | def setUp(self): 168 | self.makeTempdir() 169 | tldp.config.DEFAULT_CONFIGFILE = None 170 | self.config, _ = tldp.config.collectconfiguration('ldptool', []) 171 | c = self.config 172 | c.pubdir = os.path.join(self.tempdir, 'outputs') 173 | c.builddir = os.path.join(self.tempdir, 'builddir') 174 | c.sourcedir = os.path.join(self.tempdir, 'sources') 175 | argv = list() 176 | argv.extend(['--builddir', c.builddir]) 177 | argv.extend(['--pubdir', c.pubdir]) 178 | argv.extend(['--sourcedir', c.sourcedir]) 179 | self.argv = argv 180 | # -- and make some directories 181 | for d in (c.sourcedir, c.pubdir, c.builddir): 182 | if not os.path.isdir(d): 183 | os.mkdir(d) 184 | c.sourcedir = [c.sourcedir] 185 | 186 | def tearDown(self): 187 | self.removeTempdir() 188 | 189 | def makeTempdir(self): 190 | self.tempdir = mkdtemp(prefix='tldp-test-') 191 | 192 | def removeTempdir(self): 193 | shutil.rmtree(self.tempdir) 194 | 195 | def add_stale(self, stem, ex): 196 | c = self.config 197 | mysource = TestSourceDocSkeleton(c.sourcedir) 198 | fname = stem + ex.ext 199 | mysource.addsourcefile(fname, ex.filename) 200 | mysource.create_stale(fname) 201 | myoutput = TestOutputDirSkeleton(os.path.join(c.pubdir, stem), stem) 202 | myoutput.mkdir() 203 | myoutput.create_expected_docs() 
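        # -- the checksums were shuffled above, so the MD5SUMS file written
        #    into the output directory will not match the current source;
        #    the document should therefore be detected as stale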
204 | myoutput.create_md5sum_file(mysource.md5sums) 205 | 206 | def add_broken(self, stem, ex): 207 | c = self.config 208 | mysource = TestSourceDocSkeleton(c.sourcedir) 209 | fname = stem + ex.ext 210 | mysource.addsourcefile(fname, ex.filename) 211 | myoutput = TestOutputDirSkeleton(os.path.join(c.pubdir, stem), stem) 212 | myoutput.mkdir() 213 | myoutput.create_expected_docs() 214 | myoutput.create_md5sum_file(mysource.md5sums) 215 | prop = random.choice(myoutput.expected) 216 | fname = getattr(myoutput, prop, None) 217 | assert fname is not None 218 | os.unlink(fname) 219 | 220 | def add_new(self, stem, ex, content=None): 221 | c = self.config 222 | mysource = TestSourceDocSkeleton(c.sourcedir) 223 | if content: 224 | mysource.addsourcefile(stem + ex.ext, content) 225 | else: 226 | mysource.addsourcefile(stem + ex.ext, ex.filename) 227 | 228 | def add_unknown(self, stem, ext, content=None): 229 | c = self.config 230 | mysource = TestSourceDocSkeleton(c.sourcedir) 231 | if content: 232 | mysource.addsourcefile(stem + ext, content) 233 | else: 234 | mysource.addsourcefile(stem + ext, '') 235 | 236 | def add_orphan(self, stem, ex): 237 | c = self.config 238 | myoutput = TestOutputDirSkeleton(os.path.join(c.pubdir, stem), stem) 239 | myoutput.mkdir() 240 | myoutput.create_expected_docs() 241 | 242 | def add_published(self, stem, ex): 243 | c = self.config 244 | mysource = TestSourceDocSkeleton(c.sourcedir) 245 | mysource.addsourcefile(stem + ex.ext, ex.filename) 246 | myoutput = TestOutputDirSkeleton(os.path.join(c.pubdir, stem), stem) 247 | myoutput.mkdir() 248 | myoutput.create_expected_docs() 249 | myoutput.create_md5sum_file(mysource.md5sums) 250 | 251 | def add_docbooksgml_support_to_config(self): 252 | c = self.config 253 | c.docbooksgml_collateindex = opj(extras, 'collateindex.pl') 254 | c.docbooksgml_ldpdsl = opj(extras, 'dsssl', 'ldp.dsl') 255 | 256 | def add_docbook4xml_xsl_to_config(self): 257 | c = self.config 258 | c.docbook4xml_xslprint = opj(extras, 'xsl', 'tldp-print.xsl') 259 | c.docbook4xml_xslsingle = opj(extras, 'xsl', 'tldp-one-page.xsl') 260 | c.docbook4xml_xslchunk = opj(extras, 'xsl', 'tldp-chapters.xsl') 261 | 262 | # 263 | # -- end of file 264 | -------------------------------------------------------------------------------- /tldp/doctypes/docbook5xml.py: -------------------------------------------------------------------------------- 1 | #! 
/usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import logging 10 | 11 | from tldp.utils import which, firstfoundfile 12 | from tldp.utils import arg_isexecutable, isexecutable 13 | from tldp.utils import arg_isreadablefile, isreadablefile 14 | 15 | from tldp.doctypes.common import BaseDoctype, SignatureChecker, depends 16 | 17 | logger = logging.getLogger(__name__) 18 | 19 | 20 | def rngfile_finder(): 21 | l = ['/usr/share/xml/docbook/schema/rng/5.0/docbook.rng', 22 | ] 23 | return firstfoundfile(l) 24 | 25 | 26 | def xslchunk_finder(): 27 | l = ['/usr/share/xml/docbook/stylesheet/nwalsh5/current/html/chunk.xsl', 28 | '/usr/share/xml/docbook/stylesheet/docbook-xsl-ns/html/chunk.xsl', 29 | 'http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl', 30 | ] 31 | return firstfoundfile(l) 32 | 33 | 34 | def xslsingle_finder(): 35 | l = ['/usr/share/xml/docbook/stylesheet/nwalsh5/current/html/docbook.xsl', 36 | '/usr/share/xml/docbook/stylesheet/docbook-xsl-ns/html/docbook.xsl', 37 | 'http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl', 38 | ] 39 | return firstfoundfile(l) 40 | 41 | 42 | def xslprint_finder(): 43 | l = ['/usr/share/xml/docbook/stylesheet/nwalsh5/current/fo/docbook.xsl', 44 | '/usr/share/xml/docbook/stylesheet/docbook-xsl-ns/fo/docbook.xsl', 45 | 'http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl', 46 | ] 47 | return firstfoundfile(l) 48 | 49 | 50 | class Docbook5XML(BaseDoctype, SignatureChecker): 51 | formatname = 'DocBook XML 5.x' 52 | extensions = ['.xml'] 53 | signatures = ['-//OASIS//DTD DocBook V5.0/EN', 54 | 'http://docbook.org/ns/docbook', ] 55 | 56 | required = {'docbook5xml_xsltproc': isexecutable, 57 | 'docbook5xml_xmllint': isexecutable, 58 | 'docbook5xml_html2text': isexecutable, 59 | 'docbook5xml_dblatex': isexecutable, 60 | 'docbook5xml_fop': isexecutable, 61 | 'docbook5xml_jing': isexecutable, 62 | 'docbook5xml_rngfile': isreadablefile, 63 | 'docbook5xml_xslprint': isreadablefile, 64 | 'docbook5xml_xslchunk': isreadablefile, 65 | 'docbook5xml_xslsingle': isreadablefile, 66 | } 67 | 68 | def make_xincluded_source(self, **kwargs): 69 | s = '''"{config.docbook5xml_xmllint}" > "{output.validsource}" \\ 70 | --nonet \\ 71 | --noent \\ 72 | --xinclude \\ 73 | "{source.filename}"''' 74 | return self.shellscript(s, **kwargs) 75 | 76 | @depends(make_xincluded_source) 77 | def validate_source(self, **kwargs): 78 | '''consider lxml.etree and other validators''' 79 | s = '''"{config.docbook5xml_jing}" \\ 80 | "{config.docbook5xml_rngfile}" \\ 81 | "{output.validsource}"''' 82 | return self.shellscript(s, **kwargs) 83 | 84 | @depends(validate_source) 85 | def make_name_htmls(self, **kwargs): 86 | '''create a single page HTML output''' 87 | s = '''"{config.docbook5xml_xsltproc}" > "{output.name_htmls}" \\ 88 | --nonet \\ 89 | --stringparam admon.graphics.path images/ \\ 90 | --stringparam base.dir . 
\\ 91 | "{config.docbook5xml_xslsingle}" \\ 92 | "{output.validsource}"''' 93 | return self.shellscript(s, **kwargs) 94 | 95 | @depends(make_name_htmls) 96 | def make_name_txt(self, **kwargs): 97 | '''create text output''' 98 | s = '''"{config.docbook5xml_html2text}" > "{output.name_txt}" \\ 99 | -style pretty \\ 100 | -nobs \\ 101 | "{output.name_htmls}"''' 102 | return self.shellscript(s, **kwargs) 103 | 104 | @depends(validate_source) 105 | def make_fo(self, **kwargs): 106 | '''generate the Formatting Objects intermediate output''' 107 | s = '''"{config.docbook5xml_xsltproc}" > "{output.name_fo}" \\ 108 | --stringparam fop.extensions 0 \\ 109 | --stringparam fop1.extensions 1 \\ 110 | "{config.docbook5xml_xslprint}" \\ 111 | "{output.validsource}"''' 112 | if not self.config.script: 113 | self.removals.add(self.output.name_fo) 114 | return self.shellscript(s, **kwargs) 115 | 116 | # -- this is conditionally built--see logic in make_name_pdf() below 117 | # @depends(make_fo) 118 | def make_pdf_with_fop(self, **kwargs): 119 | '''use FOP to create a PDF''' 120 | s = '''"{config.docbook5xml_fop}" \\ 121 | -fo "{output.name_fo}" \\ 122 | -pdf "{output.name_pdf}"''' 123 | return self.shellscript(s, **kwargs) 124 | 125 | # -- this is conditionally built--see logic in make_name_pdf() below 126 | # @depends(validate_source) 127 | def make_pdf_with_dblatex(self, **kwargs): 128 | '''use dblatex (fallback) to create a PDF''' 129 | s = '''"{config.docbook5xml_dblatex}" \\ 130 | -F xml \\ 131 | -t pdf \\ 132 | -o "{output.name_pdf}" \\ 133 | "{output.validsource}"''' 134 | return self.shellscript(s, **kwargs) 135 | 136 | @depends(make_fo, validate_source) 137 | def make_name_pdf(self, **kwargs): 138 | stem = self.source.stem 139 | classname = self.__class__.__name__ 140 | logger.info("%s calling method %s.%s", 141 | stem, classname, 'make_pdf_with_fop') 142 | if self.make_pdf_with_fop(**kwargs): 143 | return True 144 | logger.error("%s %s failed creating PDF, falling back to dblatex...", 145 | stem, self.config.docbook5xml_fop) 146 | logger.info("%s calling method %s.%s", 147 | stem, classname, 'make_pdf_with_dblatex') 148 | return self.make_pdf_with_dblatex(**kwargs) 149 | 150 | @depends(make_name_htmls, validate_source) 151 | def make_chunked_html(self, **kwargs): 152 | '''create chunked HTML output''' 153 | s = '''"{config.docbook5xml_xsltproc}" \\ 154 | --nonet \\ 155 | --stringparam admon.graphics.path images/ \\ 156 | --stringparam base.dir . 
\\ 157 | "{config.docbook5xml_xslchunk}" \\ 158 | "{output.validsource}"''' 159 | return self.shellscript(s, **kwargs) 160 | 161 | @depends(make_chunked_html) 162 | def make_name_html(self, **kwargs): 163 | '''rename DocBook XSL's index.html to LDP standard STEM.html''' 164 | s = 'mv -v --no-clobber -- "{output.name_indexhtml}" "{output.name_html}"' 165 | return self.shellscript(s, **kwargs) 166 | 167 | @depends(make_name_html) 168 | def make_name_indexhtml(self, **kwargs): 169 | '''create final index.html symlink''' 170 | s = 'ln -svr -- "{output.name_html}" "{output.name_indexhtml}"' 171 | return self.shellscript(s, **kwargs) 172 | 173 | @depends(make_name_htmls, make_name_html, make_name_pdf, make_name_txt) 174 | def remove_xincluded_source(self, **kwargs): 175 | '''remove the xincluded source file''' 176 | s = 'rm --verbose -- "{output.validsource}"' 177 | return self.shellscript(s, **kwargs) 178 | 179 | @classmethod 180 | def argparse(cls, p): 181 | descrip = 'executables for %s' % (cls.formatname,) 182 | g = p.add_argument_group(title=cls.__name__, description=descrip) 183 | gadd = g.add_argument 184 | gadd('--docbook5xml-xslchunk', type=arg_isreadablefile, 185 | default=xslchunk_finder(), 186 | help='full path to LDP HTML chunker XSL [%(default)s]') 187 | gadd('--docbook5xml-xslsingle', type=arg_isreadablefile, 188 | default=xslsingle_finder(), 189 | help='full path to LDP HTML single-page XSL [%(default)s]') 190 | gadd('--docbook5xml-xslprint', type=arg_isreadablefile, 191 | default=xslprint_finder(), 192 | help='full path to LDP FO print XSL [%(default)s]') 193 | 194 | gadd('--docbook5xml-rngfile', type=arg_isreadablefile, 195 | default=rngfile_finder(), 196 | help='full path to docbook.rng [%(default)s]') 197 | gadd('--docbook5xml-xmllint', type=arg_isexecutable, 198 | default=which('xmllint'), 199 | help='full path to xmllint [%(default)s]') 200 | gadd('--docbook5xml-xsltproc', type=arg_isexecutable, 201 | default=which('xsltproc'), 202 | help='full path to xsltproc [%(default)s]') 203 | gadd('--docbook5xml-html2text', type=arg_isexecutable, 204 | default=which('html2text'), 205 | help='full path to html2text [%(default)s]') 206 | gadd('--docbook5xml-fop', type=arg_isexecutable, 207 | default=which('fop'), 208 | help='full path to fop [%(default)s]') 209 | gadd('--docbook5xml-dblatex', type=arg_isexecutable, 210 | default=which('dblatex'), 211 | help='full path to dblatex [%(default)s]') 212 | gadd('--docbook5xml-jing', type=arg_isexecutable, 213 | default=which('jing'), 214 | help='full path to jing [%(default)s]') 215 | 216 | 217 | # 218 | # -- end of file 219 | -------------------------------------------------------------------------------- /tldp/sources.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import os 10 | import sys 11 | import errno 12 | import logging 13 | 14 | from tldp.ldpcollection import LDPDocumentCollection 15 | 16 | from tldp.utils import md5files, stem_and_ext 17 | from tldp.typeguesser import guess, knownextensions 18 | 19 | logger = logging.getLogger(__name__) 20 | 21 | IGNORABLE_SOURCE = ('index.sgml') 22 | 23 | 24 | def scansourcedirs(dirnames): 25 | '''return a dict() of all SourceDocuments discovered in dirnames 26 | dirnames: a list of directories containing SourceDocuments. 
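    An illustrative call (the directory names are hypothetical):

        from tldp.sources import scansourcedirs
        docs = scansourcedirs(['/path/LDP/LDP/howto', '/path/LDP/LDP/guide'])
        for stem, doc in sorted(docs.items()):
            print(stem, doc.doctype)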
27 | 28 | scansourcedirs ensures it is operating on the absolute filesystem path for 29 | each of the source directories. 30 | 31 | If any of the supplied dirnames does not exist as a directory, the function 32 | will log the missing source directory names and then will raise an IOError 33 | and quit. 34 | 35 | For each document that it finds in a source directory, it creates a 36 | SourceDocument entry using the stem name as a key. 37 | 38 | The rules for identifying possible SourceDocuments go as follows. 39 | 40 | - Within any source directory, a source document can consist of a single 41 | file with an extension or a directory. 42 | 43 | - If the candidate entry is a directory, then, the stem is the full 44 | directory name, e.g. Masquerading-Simple-HOWTO 45 | 46 | - If the candidate entry is a file, the stem is the filename minus 47 | extension, e.g. Encrypted-Root-Filesystem-HOWTO 48 | 49 | Because the function accepts (and will scan) many source directories, it 50 | is possible that there will be stem name collisions. If it discovers a 51 | stem collision, SourceCollection will issue a warning and skip the 52 | duplicated stem(s). [It also tries to process the source directories and 53 | candidates in a stable order between runs.] 54 | ''' 55 | found = dict() 56 | dirs = [os.path.abspath(x) for x in dirnames] 57 | results = [os.path.exists(x) for x in dirs] 58 | 59 | if not all(results): 60 | for result, sdir in zip(results, dirs): 61 | logger.critical("Source collection dir must already exist: %s", 62 | sdir) 63 | raise IOError(errno.ENOENT, os.strerror(errno.ENOENT), sdir) 64 | 65 | for sdir in sorted(dirs): 66 | logger.debug("Scanning for source documents in %s.", sdir) 67 | for fname in sorted(os.listdir(sdir)): 68 | candidates = list() 69 | possible = arg_issourcedoc(os.path.join(sdir, fname)) 70 | if possible: 71 | candidates.append(SourceDocument(possible)) 72 | else: 73 | logger.warning("Skipping non-document %s", fname) 74 | continue 75 | for candy in candidates: 76 | if candy.stem in found: 77 | dup = found[candy.stem].filename 78 | logger.warning("Ignoring duplicate is %s", candy.filename) 79 | logger.warning("Existing dup-entry is %s", dup) 80 | else: 81 | found[candy.stem] = candy 82 | logger.debug("Discovered %s source documents", len(found)) 83 | return found 84 | 85 | 86 | def arg_issourcedoc(filename): 87 | filename = os.path.abspath(filename) 88 | if os.path.isfile(filename): 89 | if os.path.basename(filename) in IGNORABLE_SOURCE: 90 | return None 91 | return filename 92 | elif os.path.isdir(filename): 93 | return sourcedoc_fromdir(filename) 94 | return None 95 | 96 | 97 | def sourcedoc_fromdir(name): 98 | candidates = list() 99 | if not os.path.isdir(name): 100 | return None 101 | stem = os.path.basename(name) 102 | for ext in knownextensions: 103 | possible = os.path.join(name, stem + ext) 104 | if os.path.isfile(possible): 105 | candidates.append(possible) 106 | if len(candidates) > 1: 107 | logger.warning("%s multiple document choices in dir %s, bailing....", 108 | stem, name) 109 | raise Exception("multiple document choices in " + name) 110 | elif len(candidates) == 0: 111 | return None 112 | else: 113 | doc = candidates.pop() 114 | logger.debug("%s identified main document %s.", stem, doc) 115 | return doc 116 | 117 | 118 | class SourceCollection(LDPDocumentCollection): 119 | '''a dict-like container for SourceDocument objects 120 | 121 | The key in the SourceCollection is the stem name of the document, which 122 | allows convenient access and guarantees 
non-collision. 123 | 124 | The use of the stem as a key works conveniently with the 125 | OutputCollection which uses the same strategy on OutputDirectory. 126 | ''' 127 | def __init__(self, dirnames=None): 128 | '''construct a SourceCollection 129 | 130 | delegates most responsibility to function scansourcedirs 131 | ''' 132 | if dirnames is None: 133 | return 134 | self.update(scansourcedirs(dirnames)) 135 | 136 | 137 | class SourceDocument(object): 138 | '''a class providing a container for each set of source documents 139 | ''' 140 | def __repr__(self): 141 | return '<%s:%s (%s)>' % \ 142 | (self.__class__.__name__, self.filename, self.doctype) 143 | 144 | def __init__(self, filename): 145 | '''construct a SourceDocument 146 | 147 | filename is a required parameter 148 | 149 | The filename is the main (and sometimes sole) document representing 150 | the source of the LDP HOWTO or Guide. It is the document that is 151 | passed by name to be handled by any document processing toolchains 152 | (see also tldp.doctypes). 153 | 154 | Each instantiation will raise an IOERror if the supplied filename does 155 | not exist or if the filename isn't a file (symlink is fine, directory 156 | or fifo is not). 157 | 158 | The remainder of the instantiation will set attributes that are useful 159 | later in the processing phase, for example, stem, status, enclosing 160 | directory name and file extension. 161 | 162 | There are two important attributes. First, the document type guesser 163 | will try to infer the doctype (from file extension and signature). 164 | Note that it is not a fatal error if document type cannot be guessed, 165 | but the document will not be able to be processed. Second, it is 166 | useful during the decision-making process to know if any of the source 167 | files are newer than the output files. Thus, the stat() information 168 | for every file in the source document directory (or just the single 169 | source document file) will be collected. 
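        A short sketch of constructing one document and inspecting the
        attributes described above (the filename is hypothetical):

            doc = SourceDocument('/path/sources/Unicode-HOWTO.xml')
            print(doc.stem, doc.ext, doc.doctype)
            print(sorted(doc.md5sums))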
170 | ''' 171 | self.filename = os.path.abspath(filename) 172 | 173 | if not os.path.exists(self.filename): 174 | fn = self.filename 175 | logger.critical("Missing source document: %s", fn) 176 | raise IOError(errno.ENOENT, os.strerror(errno.ENOENT), fn) 177 | 178 | if os.path.isdir(self.filename): 179 | self.filename = sourcedoc_fromdir(self.filename) 180 | elif os.path.isfile(self.filename): 181 | pass 182 | else: 183 | # -- we did not receive a useable document file or directory name 184 | self.filename = None 185 | 186 | if self.filename is None: 187 | fn = filename 188 | logger.critical("Source document is not a plain file: %s", fn) 189 | raise ValueError(fn + " not identifiable as a document") 190 | 191 | self.doctype = guess(self.filename) 192 | self.status = 'source' 193 | self.output = None 194 | self.working = None 195 | self.differing = set() 196 | self.dirname, self.basename = os.path.split(self.filename) 197 | self.stem, self.ext = stem_and_ext(self.basename) 198 | parentbase = os.path.basename(self.dirname) 199 | logger.debug("%s found source %s", self.stem, self.filename) 200 | if parentbase == self.stem: 201 | parentdir = os.path.dirname(self.dirname) 202 | self.md5sums = md5files(self.dirname, relative=parentdir) 203 | else: 204 | self.md5sums = md5files(self.filename, relative=self.dirname) 205 | 206 | def detail(self, widths, verbose, file=sys.stdout): 207 | '''produce a small tabular output about the document''' 208 | template = ' '.join(('{s.status:{w.status}}', 209 | '{s.doctype.__name__:{w.doctype}}', 210 | '{s.stem:{w.stem}}')) 211 | outstr = template.format(s=self, w=widths) 212 | print(outstr, file=file) 213 | if verbose: 214 | print(' doctype {}'.format(self.doctype), file=file) 215 | if self.output: 216 | print(' output dir {}'.format(self.output.dirname), 217 | file=file) 218 | print(' source file {}'.format(self.filename), file=file) 219 | for why, f in sorted(self.differing): 220 | fname = os.path.join(self.dirname, f) 221 | print(' {:>7} source {}'.format(why, fname), file=file) 222 | if self.output: 223 | for f in sorted(self.output.missing): 224 | print(' missing output {}'.format(f), file=file) 225 | 226 | # 227 | # -- end of file 228 | -------------------------------------------------------------------------------- /tests/test_utils.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf8 -*- 2 | # 3 | # Copyright (c) 2016 Linux Documentation Project 4 | 5 | from __future__ import absolute_import, division, print_function 6 | from __future__ import unicode_literals 7 | 8 | import os 9 | import stat 10 | import uuid 11 | import errno 12 | import posix 13 | import unittest 14 | from tempfile import mkdtemp 15 | from tempfile import NamedTemporaryFile as ntf 16 | 17 | from tldptesttools import TestToolsFilesystem 18 | 19 | # -- SUT 20 | from tldp.utils import which, execute 21 | from tldp.utils import statfile, statfiles, stem_and_ext 22 | from tldp.utils import arg_isexecutable, isexecutable 23 | from tldp.utils import arg_isreadablefile, isreadablefile 24 | from tldp.utils import arg_isdirectory, arg_isloglevel 25 | from tldp.utils import arg_isstr 26 | from tldp.utils import swapdirs 27 | 28 | 29 | class Test_isexecutable_and_friends(unittest.TestCase): 30 | 31 | def test_isexecutable(self): 32 | f = ntf(prefix='executable-file') 33 | self.assertFalse(isexecutable(f.name)) 34 | mode = stat.S_IXUSR | stat.S_IRUSR | stat.S_IWUSR 35 | os.chmod(f.name, mode) 36 | self.assertTrue(isexecutable(f.name)) 37 | 38 
| def test_arg_isexecutable(self): 39 | f = ntf(prefix='executable-file') 40 | self.assertIsNone(arg_isexecutable(f.name)) 41 | mode = stat.S_IXUSR | stat.S_IRUSR | stat.S_IWUSR 42 | os.chmod(f.name, mode) 43 | self.assertEqual(f.name, arg_isexecutable(f.name)) 44 | 45 | 46 | class Test_isreadablefile_and_friends(unittest.TestCase): 47 | 48 | def test_isreadablefile(self): 49 | f = ntf(prefix='readable-file') 50 | self.assertTrue(isreadablefile(f.name)) 51 | mode = os.stat(f.name).st_mode 52 | os.chmod(f.name, 0) 53 | if 0 == os.getuid(): 54 | self.assertTrue(isreadablefile(f.name)) 55 | else: 56 | self.assertFalse(isreadablefile(f.name)) 57 | os.chmod(f.name, mode) 58 | 59 | def test_arg_isreadablefile(self): 60 | f = ntf(prefix='readable-file') 61 | self.assertEqual(f.name, arg_isreadablefile(f.name)) 62 | mode = os.stat(f.name).st_mode 63 | os.chmod(f.name, 0) 64 | if 0 == os.getuid(): 65 | self.assertEqual(f.name, arg_isreadablefile(f.name)) 66 | else: 67 | self.assertIsNone(arg_isreadablefile(f.name)) 68 | os.chmod(f.name, mode) 69 | 70 | 71 | class Test_arg_isstr(unittest.TestCase): 72 | 73 | def test_arg_isstr(self): 74 | self.assertEqual('s', arg_isstr('s')) 75 | self.assertEqual(None, arg_isstr(7)) 76 | 77 | 78 | class Test_arg_isloglevel(unittest.TestCase): 79 | 80 | def test_arg_isloglevel_integer(self): 81 | self.assertEqual(7, arg_isloglevel(7)) 82 | self.assertEqual(40, arg_isloglevel('frobnitz')) 83 | self.assertEqual(20, arg_isloglevel('INFO')) 84 | self.assertEqual(10, arg_isloglevel('DEBUG')) 85 | 86 | 87 | class Test_arg_isdirectory(TestToolsFilesystem): 88 | 89 | def test_arg_isdirectory(self): 90 | self.assertTrue(arg_isdirectory(self.tempdir)) 91 | f = ntf(dir=self.tempdir) 92 | self.assertFalse(arg_isdirectory(f.name)) 93 | 94 | 95 | class Test_execute(TestToolsFilesystem): 96 | 97 | def test_execute_returns_zero(self): 98 | exe = which('true') 99 | result = execute([exe], logdir=self.tempdir) 100 | self.assertEqual(0, result) 101 | 102 | def test_execute_stdout_to_devnull(self): 103 | exe = which('cat') 104 | cmd = [exe, '/etc/hosts'] 105 | devnull = open('/dev/null', 'w') 106 | result = execute(cmd, stdout=devnull, logdir=self.tempdir) 107 | devnull.close() 108 | self.assertEqual(0, result) 109 | 110 | def test_execute_stderr_to_devnull(self): 111 | exe = which('cat') 112 | cmd = [exe, '/etc/hosts'] 113 | devnull = open('/dev/null', 'w') 114 | result = execute(cmd, stderr=devnull, logdir=self.tempdir) 115 | devnull.close() 116 | self.assertEqual(0, result) 117 | 118 | def test_execute_returns_nonzero(self): 119 | exe = which('false') 120 | result = execute([exe], logdir=self.tempdir) 121 | self.assertEqual(1, result) 122 | 123 | def test_execute_exception_when_logdir_none(self): 124 | exe = which('true') 125 | with self.assertRaises(ValueError) as ecm: 126 | execute([exe], logdir=None) 127 | e = ecm.exception 128 | self.assertTrue('logdir must be a directory' in e.args[0]) 129 | 130 | def test_execute_exception_when_logdir_enoent(self): 131 | exe = which('true') 132 | logdir = os.path.join(self.tempdir, 'nonexistent-directory') 133 | with self.assertRaises(IOError) as ecm: 134 | execute([exe], logdir=logdir) 135 | e = ecm.exception 136 | self.assertTrue('nonexistent' in e.filename) 137 | 138 | 139 | class Test_which(unittest.TestCase): 140 | 141 | def test_good_which_python(self): 142 | python = which('python') 143 | self.assertIsNotNone(python) 144 | self.assertTrue(os.path.isfile(python)) 145 | qualified_python = which(python) 146 | self.assertEqual(python, 
qualified_python) 147 | 148 | def test_bad_silly_name(self): 149 | silly = which('silliest-executable-name-which-may-yet-be-possible') 150 | self.assertIsNone(silly) 151 | 152 | def test_fq_executable(self): 153 | f = ntf(prefix='tldp-which-test', delete=False) 154 | f.close() 155 | notfound = which(f.name) 156 | self.assertIsNone(notfound) 157 | mode = stat.S_IRWXU | stat.S_IRGRP | stat.S_IROTH 158 | os.chmod(f.name, mode) 159 | found = which(f.name) 160 | self.assertEqual(f.name, found) 161 | os.unlink(f.name) 162 | 163 | 164 | class Test_statfiles(unittest.TestCase): 165 | 166 | def test_statfiles_dir_in_result(self): 167 | '''Assumes that directory ./sample-documents/ exists here''' 168 | here = os.path.dirname(os.path.abspath(__file__)) 169 | statinfo = statfiles(here, relative=here) 170 | self.assertIsInstance(statinfo, dict) 171 | adoc = 'sample-documents/asciidoc-complete.txt' 172 | self.assertTrue(adoc in statinfo) 173 | 174 | def test_statfiles_dir_rel(self): 175 | here = os.path.dirname(os.path.abspath(__file__)) 176 | statinfo = statfiles(here, relative=here) 177 | self.assertIsInstance(statinfo, dict) 178 | self.assertTrue(os.path.basename(__file__) in statinfo) 179 | 180 | def test_statfiles_dir_abs(self): 181 | here = os.path.dirname(os.path.abspath(__file__)) 182 | statinfo = statfiles(here) 183 | self.assertIsInstance(statinfo, dict) 184 | self.assertTrue(__file__ in statinfo) 185 | 186 | def test_statfiles_file_rel(self): 187 | here = os.path.dirname(os.path.abspath(__file__)) 188 | statinfo = statfiles(__file__, relative=here) 189 | self.assertIsInstance(statinfo, dict) 190 | self.assertTrue(os.path.basename(__file__) in statinfo) 191 | 192 | def test_statfiles_file_abs(self): 193 | statinfo = statfiles(__file__) 194 | self.assertIsInstance(statinfo, dict) 195 | self.assertTrue(__file__ in statinfo) 196 | 197 | def test_statfiles_nonexistent_file(self): 198 | here = os.path.dirname(os.path.abspath(__file__)) 199 | this = os.path.join(here, str(uuid.uuid4())) 200 | statinfo = statfiles(this) 201 | self.assertIsInstance(statinfo, dict) 202 | self.assertEqual(0, len(statinfo)) 203 | 204 | 205 | class Test_statfile(TestToolsFilesystem): 206 | 207 | def test_statfile_bogustype(self): 208 | with self.assertRaises(TypeError): 209 | statfile(0) 210 | 211 | def test_statfile_enoent(self): 212 | f = ntf(dir=self.tempdir) 213 | self.assertIsNone(statfile(f.name + '-ENOENT_TEST')) 214 | 215 | def test_statfile_exception(self): 216 | f = ntf(dir=self.tempdir) 217 | omode = os.stat(self.tempdir).st_mode 218 | os.chmod(self.tempdir, 0) 219 | if 0 != os.getuid(): 220 | with self.assertRaises(Exception) as ecm: 221 | statfile(f.name) 222 | e = ecm.exception 223 | self.assertIn(e.errno, (errno.EPERM, errno.EACCES)) 224 | os.chmod(self.tempdir, omode) 225 | stbuf = statfile(f.name) 226 | self.assertIsInstance(stbuf, posix.stat_result) 227 | 228 | 229 | class Test_stem_and_ext(unittest.TestCase): 230 | 231 | def test_stem_and_ext_final_slash(self): 232 | r0 = stem_and_ext('/h/q/t/z/Frobnitz-HOWTO') 233 | r1 = stem_and_ext('/h/q/t/z/Frobnitz-HOWTO/') 234 | self.assertEqual(r0, r1) 235 | 236 | def test_stem_and_ext_rel_abs(self): 237 | r0 = stem_and_ext('/h/q/t/z/Frobnitz-HOWTO') 238 | r1 = stem_and_ext('Frobnitz-HOWTO/') 239 | self.assertEqual(r0, r1) 240 | 241 | 242 | class Test_swapdirs(TestToolsFilesystem): 243 | 244 | def test_swapdirs_bogusarg(self): 245 | with self.assertRaises(OSError) as ecm: 246 | swapdirs('/path/to/frickin/impossible/dir', None) 247 | e = ecm.exception 248 | 
self.assertTrue(errno.ENOENT is e.errno) 249 | 250 | def test_swapdirs_b_missing(self): 251 | a = mkdtemp(dir=self.tempdir) 252 | b = a + '-B' 253 | self.assertFalse(os.path.exists(b)) 254 | swapdirs(a, b) 255 | self.assertTrue(os.path.exists(b)) 256 | 257 | def test_swapdirs_with_file(self): 258 | a = mkdtemp(dir=self.tempdir) 259 | afile = os.path.join(a, 'silly') 260 | b = mkdtemp(dir=self.tempdir) 261 | bfile = os.path.join(b, 'silly') 262 | with open(afile, 'w'): 263 | pass 264 | self.assertTrue(os.path.exists(a)) 265 | self.assertTrue(os.path.exists(afile)) 266 | self.assertTrue(os.path.exists(b)) 267 | self.assertFalse(os.path.exists(bfile)) 268 | swapdirs(a, b) 269 | self.assertTrue(os.path.exists(a)) 270 | self.assertFalse(os.path.exists(afile)) 271 | self.assertTrue(os.path.exists(b)) 272 | self.assertTrue(os.path.exists(bfile)) 273 | 274 | # 275 | # -- end of file 276 | -------------------------------------------------------------------------------- /ChangeLog: -------------------------------------------------------------------------------- 1 | 2 | 2016-05-13 Martin A. Brown 3 | * bumping version to tldp-0.7.13 4 | * accommodate root-run tests (used by Deb-O-Matic) 5 | 6 | 2016-04-30 Martin A. Brown 7 | * bumping version to tldp-0.7.12 8 | * adding ChangeLog (this file) 9 | * cosmetic changes; deduplication of test data, copyright in many files 10 | * add contrib/debian-release.sh 11 | * put version number in tldp/__init__.py 12 | * generate specfile after tagging, using contrib/rpm-release.py 13 | * Debian packaging issues larger addressed 14 | 15 | 2016-04-21 Martin A. Brown 16 | * bumping version to tldp-0.7.7 17 | * Debian packaging attempt #1, created build with 'native' source format 18 | which will not be accepted 19 | * add debian/copyright file 20 | * ldptool manpage (sphinx-generated for Debian; statically installed in RPM) 21 | * switch --detail reporting to use predictable DOCTYPE and STATUSTYPE names 22 | 23 | 2016-04-09 Martin A. Brown 24 | * bumping version to 0.7.5 25 | * remove 'random' text from .LDP-source-MD5SUMS 26 | * remove the --builddir if empty after complete run 27 | 28 | 2016-04-02 Martin A. Brown 29 | * bumping version to 0.7.2 30 | * using filesystem age for determining build need will not work; switch 31 | to using content hash (MD5) to determine whether a rebuild is necessary or 32 | not 33 | * create .LDP-source-MD5SUMS in each output directory that lists all of 34 | the hashes of the source files used to create that output directory 35 | * remove testing and references to statfiles() and supporting friends 36 | * add a 'lifecycle' test to the testing suite 37 | * report on running success and failure counts during the run (to allow 38 | interruptability if the user wishes) 39 | 40 | 2016-03-28 Martin A. Brown 41 | * bumping version to 0.7.0 42 | * support better handling of --verbose; --verbose yes, --verbose false 43 | * update and improve documentation in stock configuration file 44 | * provide better feedback on directory existence (or not) rather than 45 | silently doing something unpredicable 46 | 47 | 2016-03-27 Martin A. Brown 48 | * bumping version to 0.6.7 49 | * correct situation where publish() was not propagating errors returned 50 | from the build() function; add test 51 | * add broken example Docbook 4 XML file to test suite 52 | * use unicode_literals in all testing code, too 53 | 54 | 2016-03-24 Martin A. 
Brown 55 | * bumping version to 0.6.2 56 | * fix all sorts of runtime requirements to build under Ubuntu 57 | and run the full test suite on Travis CI 58 | 59 | 2016-03-15 Martin A. Brown 60 | * bumping version to 0.6.0 61 | * full support for Python3, all unicode-ified and happy 62 | * add test to fall back to iso-8859-1 for SGML docs 63 | * success testing with tox under Python 2.7 and 3.4 64 | 65 | 2016-03-14 Martin A. Brown 66 | * bumping version to 0.5.5 67 | * use sgmlcheck for Linuxdoc sources 68 | * adjust reporting of discovered documents 69 | * use context to prevent more FD leakage 70 | * begin changes to support Python3; e.g. io.StringIO, absolute_import 71 | unicode changes, lots of codecs.open(), unicode_literals, 72 | 73 | 2016-03-11 Martin A. Brown 74 | * handle EPIPE and INT with signal.SIG_DFL 75 | 76 | 2016-03-10 Martin A. Brown 77 | * bumping version to 0.5.3 78 | * create long running tests that exercise more of the code in the likely 79 | way that a user would use the utility 80 | * add testing for Docbook 5 XML 81 | * improve look and consistency for --list (--detail) output 82 | * improve README.rst 83 | 84 | 2016-03-09 Martin A. Brown 85 | * remove unused markdown and rst skeleton processors 86 | * pass **kwargs through all processor tools 87 | 88 | 2016-03-07 Martin A. Brown 89 | * add support for --builddir, ensure that --builddir is on the same 90 | filesystem as --pubdir 91 | * add new option --publish; can't replace a directory atomically, but 92 | get as close as possible by swapping the newly built output (from 93 | --builddir) with the old one (formerly in --pubdir) 94 | * switch to using 'return os.EX_OK' from functions in driver.py that 95 | can be tested and/or wrapped in sys.exit(function(args)) 96 | * testing improvements for Asciidoc and driver.py 97 | 98 | 2016-03-06 Martin A. Brown 99 | * provide user-discoverable support for --doctypes and --statustypes 100 | * correct removal of Docbook4XML generated source document during build 101 | 102 | 2016-03-05 Martin A. Brown 103 | * use a simplified technique (arbitrary attributes on function objects) 104 | to generate the DAG used for topological sorting and build order 105 | generation (thanks to Python mailing lists for the idea) 106 | 107 | 2016-03-04 Martin A. Brown 108 | * bumping version to 0.4.8 109 | * add FO generation XSL 110 | * do not set a system default for --sourcedir / --pubdir (user must 111 | specify, somehow) 112 | * DocBook5/DocBook4: process xincludes before validation with xmllint 113 | * add support for AsciiDoc detection and processing 114 | 115 | 2016-03-03 Martin A. Brown 116 | * validate all documents (where possible) before processing 117 | * provide support for DocBook 5.0 (XML) 118 | * correct --loglevel handling in driver.py (finally works properly!) 119 | * complete support for --script output 120 | 121 | 2016-03-02 Martin A. Brown 122 | * bumping version to 0.4.5 123 | * fix handling of STEMs which contain a '.' in the name 124 | * review signature identification in each DOCTYPE processor and 125 | validate and reconcile errors with PUBLIC / SYSTEM identifiers 126 | for the SGML and XML declarations 127 | * make sure that build() exits non-zero if ANY build fails 128 | 129 | 2016-03-01 Martin A. 
Brown 130 | * bumping version to 0.4.2 131 | * support a system configuration file /etc/ldptool 132 | * add entry points and make first full installable build 133 | * allow empty OutputDirectory() object 134 | * begin overhauling the porcelain in driver.py 135 | 136 | 2016-02-29 Martin A. Brown 137 | * overhaul generation of inventory object from sources/outputs 138 | * add command-line features and options; actions in particular 139 | * continue improving coverage, at 100% on utils.py 140 | * complete CascadingConfig object creation 141 | 142 | 2016-02-26 Martin A. Brown 143 | * generate a DAG for each processor class, so dependencies can 144 | be localized (controlled, abstracted) to each processor class 145 | * use topological sort of the DAG to drive generation of the shellscript, 146 | which leads to massive simplification of the generate() method 147 | * user can specify explicit file to process 148 | * better PDF generation logic (relying on jw) 149 | * provide support for --script outputs (logical equiv. of --dryrun) 150 | * if a document processor is missing prerequisites, gripe to logging 151 | and skip to the next document 152 | * support a SourceDocument named by its directory 153 | * add timing to each processor (some documents take minutes to process, 154 | others just a few seconds; good for users trying to understand which...) 155 | 156 | 2016-02-25 Martin A. Brown 157 | * overhaul where and how logging module gets called; driver.py is main 158 | * adding --skip feature; can skip STEM, DOCTYPE or STATUSTYPE 159 | * automatically detect configuration fragments in document processors 160 | with object inspection 161 | 162 | 2016-02-23 Martin A. Brown 163 | * add support for --detail (and --verbose) for both source and output docs 164 | * pass args into all driver functions 165 | * get rid of platform.py and references (not necessary any longer) 166 | * fix FD leakage in function execute() and add test case (prevent reversion) 167 | (and start switching to contextlib 'with' usage to avoid in future) 168 | * start generalizing the build process for all doctypes in common.py 169 | * move all generic functionality into BaseDoctype object 170 | * revise fundamental execution approach; generate a shellscript (which can 171 | be executed or simply printed) 172 | * make logging readability improvements: clarity, relevance and alignment 173 | 174 | 2016-02-22 Martin A. Brown 175 | * adding ArgumentParser wrapper so it can support config file + envars 176 | * all sorts of work to support cascading configuration 177 | * allow each processor to have its own configuration fragment, e.g. 178 | --docbook4xml-xmllint; owned by the Docbook4XML object 179 | * add support for --dump-cfg, --dump-env, --dump-cli, --debug-options 180 | * adding the license text (MIT) and all of that stuff 181 | * creating and fixing the setup.py 182 | 183 | 2016-02-19 Martin A. Brown 184 | 185 | 2016-02-18 Martin A. Brown 186 | * process and report on documents in case-insensitive stem-sorted order 187 | * add many docstrings for internal usage 188 | * move all source directory scanning logic out of the SourceCollection 189 | object; easier to test and simpler to understand 190 | 191 | 2016-02-17 Martin A.
Brown 192 | * add logic for testing file age, assuming a fresh checkout of the 193 | source documents; use filesystem age to determine whether or not 194 | a document rebuild is necessary 195 | * initial support for driver.py (eventually, the main user entry point) 196 | and inventory.py (for managing the identification and pairing of 197 | source and output documents) 198 | 199 | 2016-02-16 Martin A. Brown 200 | * adding tons of testing for document types, edge cases, duplicate 201 | stems, sample valid and broken documents 202 | 203 | 2016-02-15 Martin A. Brown 204 | * first processor, Linuxdoc, reaches success 205 | * provide better separation between a SourceCollection and the 206 | individual SourceDocuments; analogously, between OutputDirectory 207 | and OutputCollection 208 | * provide similar dict-like behaviour for SourceCollection and 209 | OutputCollection (which is known to the user as --pubdir) 210 | 211 | 2016-02-12 Martin A. Brown 212 | * first processor, Linuxdoc, fleshed out, created (failed) 213 | * generate skeletons for other supported source document formats 214 | * automate detection of source document format; add initial testing tools 215 | 216 | 2016-02-11 Martin A. Brown 217 | * core source collection and output directory scanning complete 218 | 219 | 2016-02-10 Martin A. Brown 220 | * initial commit and basic beginnings 221 | -------------------------------------------------------------------------------- /tldp/doctypes/docbooksgml.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import os 10 | import logging 11 | 12 | from tldp.utils import which, firstfoundfile 13 | from tldp.utils import arg_isexecutable, isexecutable 14 | from tldp.utils import arg_isreadablefile, isreadablefile 15 | 16 | from tldp.doctypes.common import BaseDoctype, SignatureChecker, depends 17 | 18 | logger = logging.getLogger(__name__) 19 | 20 | 21 | def docbookdsl_finder(): 22 | locations = [ 23 | '/usr/share/sgml/docbook/stylesheet/dsssl/modular/html/docbook.dsl', 24 | '/usr/share/sgml/docbook/dsssl-stylesheets/html/docbook.dsl', 25 | ] 26 | return firstfoundfile(locations) 27 | 28 | 29 | def ldpdsl_finder(): 30 | locations = [ 31 | '/usr/share/sgml/docbook/stylesheet/dsssl/ldp/ldp.dsl', 32 | ] 33 | return firstfoundfile(locations) 34 | 35 | 36 | class DocbookSGML(BaseDoctype, SignatureChecker): 37 | formatname = 'DocBook SGML 3.x/4.x' 38 | extensions = ['.sgml'] 39 | signatures = ['-//Davenport//DTD DocBook V3.0//EN', 40 | '-//OASIS//DTD DocBook V3.1//EN', 41 | '-//OASIS//DTD DocBook V4.1//EN', 42 | '-//OASIS//DTD DocBook V4.2//EN', ] 43 | 44 | required = {'docbooksgml_jw': isexecutable, 45 | 'docbooksgml_openjade': isexecutable, 46 | 'docbooksgml_dblatex': isexecutable, 47 | 'docbooksgml_html2text': isexecutable, 48 | 'docbooksgml_collateindex': isexecutable, 49 | 'docbooksgml_ldpdsl': isreadablefile, 50 | 'docbooksgml_docbookdsl': isreadablefile, 51 | } 52 | 53 | def make_blank_indexsgml(self, **kwargs): 54 | '''generate an empty index.sgml file (in output dir)''' 55 | indexsgml = os.path.join(self.source.dirname, 'index.sgml') 56 | self.indexsgml = os.path.isfile(indexsgml) 57 | if self.indexsgml: 58 | return True 59 | s = '''"{config.docbooksgml_collateindex}" \\ 60 | -N \\ 61 | -o \\ 62 | "index.sgml"''' 63 | return
self.shellscript(s, **kwargs) 64 | 65 | @depends(make_blank_indexsgml) 66 | def move_blank_indexsgml_into_source(self, **kwargs): 67 | '''move a blank index.sgml file into the source tree''' 68 | if self.indexsgml: 69 | return True 70 | s = '''mv \\ 71 | --no-clobber \\ 72 | --verbose \\ 73 | -- "index.sgml" "{source.dirname}/index.sgml"''' 74 | indexsgml = os.path.join(self.source.dirname, 'index.sgml') 75 | if not self.config.script: 76 | self.removals.add(indexsgml) 77 | return self.shellscript(s, **kwargs) 78 | 79 | @depends(move_blank_indexsgml_into_source) 80 | def make_data_indexsgml(self, **kwargs): 81 | '''collect document's index entries into a data file (HTML.index)''' 82 | if self.indexsgml: 83 | return True 84 | s = '''"{config.docbooksgml_openjade}" \\ 85 | -t sgml \\ 86 | -V html-index \\ 87 | -d "{config.docbooksgml_docbookdsl}" \\ 88 | "{source.filename}"''' 89 | return self.shellscript(s, **kwargs) 90 | 91 | @depends(make_data_indexsgml) 92 | def make_indexsgml(self, **kwargs): 93 | '''generate the final document index file (index.sgml)''' 94 | if self.indexsgml: 95 | return True 96 | s = '''"{config.docbooksgml_collateindex}" \\ 97 | -g \\ 98 | -t Index \\ 99 | -i doc-index \\ 100 | -o "index.sgml" \\ 101 | "HTML.index" \\ 102 | "{source.filename}"''' 103 | return self.shellscript(s, **kwargs) 104 | 105 | @depends(make_indexsgml) 106 | def move_indexsgml_into_source(self, **kwargs): 107 | '''move the generated index.sgml file into the source tree''' 108 | if self.indexsgml: 109 | return True 110 | indexsgml = os.path.join(self.source.dirname, 'index.sgml') 111 | s = '''mv \\ 112 | --verbose \\ 113 | --force \\ 114 | -- "index.sgml" "{source.dirname}/index.sgml"''' 115 | logger.debug("%s creating %s", self.source.stem, indexsgml) 116 | if not self.config.script: 117 | self.removals.add(indexsgml) 118 | return self.shellscript(s, **kwargs) 119 | 120 | @depends(move_indexsgml_into_source) 121 | def cleaned_indexsgml(self, **kwargs): 122 | '''clean the junk from the output dir after building the index.sgml''' 123 | # -- be super cautious before removing a bunch of files 124 | if not self.config.script: 125 | cwd = os.getcwd() 126 | if not os.path.samefile(cwd, self.output.dirname): 127 | logger.error("%s (cowardly) refusing to clean directory %s", 128 | self.source.stem, cwd) 129 | logger.error("%s expected to find %s", 130 | self.source.stem, self.output.dirname) 131 | return False 132 | preserve = os.path.basename(self.output.MD5SUMS) 133 | s = '''find . -mindepth 1 -maxdepth 1 -not -type d -not -name {} -delete -print''' 134 | s = s.format(preserve) 135 | return self.shellscript(s, **kwargs) 136 | 137 | @depends(cleaned_indexsgml) 138 | def make_htmls(self, **kwargs): 139 | '''create a single page HTML output (with incorrect name)''' 140 | s = '''"{config.docbooksgml_jw}" \\ 141 | -f docbook \\ 142 | -b html \\ 143 | --dsl "{config.docbooksgml_ldpdsl}#html" \\ 144 | -V nochunks \\ 145 | -V '%callout-graphics-path%=images/callouts/' \\ 146 | -V '%stock-graphics-extension%=.png' \\ 147 | --output . 
\\ 148 | "{source.filename}"''' 149 | return self.shellscript(s, **kwargs) 150 | 151 | @depends(make_htmls) 152 | def make_name_htmls(self, **kwargs): 153 | '''correct the single page HTML output name''' 154 | s = 'mv -v --no-clobber -- "{output.name_html}" "{output.name_htmls}"' 155 | return self.shellscript(s, **kwargs) 156 | 157 | @depends(make_name_htmls) 158 | def make_name_txt(self, **kwargs): 159 | '''create text output (from single-page HTML)''' 160 | s = '''"{config.docbooksgml_html2text}" > "{output.name_txt}" \\ 161 | -style pretty \\ 162 | -nobs \\ 163 | "{output.name_htmls}"''' 164 | return self.shellscript(s, **kwargs) 165 | 166 | def make_pdf_with_jw(self, **kwargs): 167 | '''use jw (openjade) to create a PDF''' 168 | s = '''"{config.docbooksgml_jw}" \\ 169 | -f docbook \\ 170 | -b pdf \\ 171 | --output . \\ 172 | "{source.filename}"''' 173 | return self.shellscript(s, **kwargs) 174 | 175 | def make_pdf_with_dblatex(self, **kwargs): 176 | '''use dblatex (fallback) to create a PDF''' 177 | s = '''"{config.docbooksgml_dblatex}" \\ 178 | -F sgml \\ 179 | -t pdf \\ 180 | -o "{output.name_pdf}" \\ 181 | "{source.filename}"''' 182 | return self.shellscript(s, **kwargs) 183 | 184 | @depends(cleaned_indexsgml) 185 | def make_name_pdf(self, **kwargs): 186 | stem = self.source.stem 187 | classname = self.__class__.__name__ 188 | logger.info("%s calling method %s.%s", 189 | stem, classname, 'make_pdf_with_jw') 190 | if self.make_pdf_with_jw(**kwargs): 191 | return True 192 | logger.error("%s jw failed creating PDF, falling back to dblatex...", 193 | stem) 194 | logger.info("%s calling method %s.%s", 195 | stem, classname, 'make_pdf_with_dblatex') 196 | return self.make_pdf_with_dblatex(**kwargs) 197 | 198 | @depends(make_name_htmls) 199 | def make_html(self, **kwargs): 200 | '''create chunked HTML outputs''' 201 | s = '''"{config.docbooksgml_jw}" \\ 202 | -f docbook \\ 203 | -b html \\ 204 | --dsl "{config.docbooksgml_ldpdsl}#html" \\ 205 | -V '%callout-graphics-path%=images/callouts/' \\ 206 | -V '%stock-graphics-extension%=.png' \\ 207 | --output . 
\\ 208 | "{source.filename}"''' 209 | return self.shellscript(s, **kwargs) 210 | 211 | @depends(make_html) 212 | def make_name_html(self, **kwargs): 213 | '''rename openjade's index.html to LDP standard name STEM.html''' 214 | s = 'mv -v --no-clobber -- "{output.name_indexhtml}" "{output.name_html}"' 215 | return self.shellscript(s, **kwargs) 216 | 217 | @depends(make_name_html) 218 | def make_name_indexhtml(self, **kwargs): 219 | '''create final index.html symlink''' 220 | s = 'ln -svr -- "{output.name_html}" "{output.name_indexhtml}"' 221 | return self.shellscript(s, **kwargs) 222 | 223 | @classmethod 224 | def argparse(cls, p): 225 | descrip = 'executables and data files for %s' % (cls.formatname,) 226 | g = p.add_argument_group(title=cls.__name__, description=descrip) 227 | g.add_argument('--docbooksgml-docbookdsl', type=arg_isreadablefile, 228 | default=docbookdsl_finder(), 229 | help='full path to html/docbook.dsl [%(default)s]') 230 | g.add_argument('--docbooksgml-ldpdsl', type=arg_isreadablefile, 231 | default=ldpdsl_finder(), 232 | help='full path to ldp/ldp.dsl [%(default)s]') 233 | g.add_argument('--docbooksgml-jw', type=arg_isexecutable, 234 | default=which('jw'), 235 | help='full path to jw [%(default)s]') 236 | g.add_argument('--docbooksgml-html2text', type=arg_isexecutable, 237 | default=which('html2text'), 238 | help='full path to html2text [%(default)s]') 239 | g.add_argument('--docbooksgml-openjade', type=arg_isexecutable, 240 | default=which('openjade'), 241 | help='full path to openjade [%(default)s]') 242 | g.add_argument('--docbooksgml-dblatex', type=arg_isexecutable, 243 | default=which('dblatex'), 244 | help='full path to dblatex [%(default)s]') 245 | g.add_argument('--docbooksgml-collateindex', type=arg_isexecutable, 246 | default=which('collateindex.pl'), 247 | help='full path to collateindex [%(default)s]') 248 | 249 | # 250 | # -- end of file 251 | -------------------------------------------------------------------------------- /tldp/utils.py: -------------------------------------------------------------------------------- 1 | #! 
/usr/bin/python 2 | # -*- coding: utf8 -*- 3 | # 4 | # Copyright (c) 2016 Linux Documentation Project 5 | 6 | from __future__ import absolute_import, division, print_function 7 | from __future__ import unicode_literals 8 | 9 | import os 10 | import time 11 | import errno 12 | import codecs 13 | import hashlib 14 | import subprocess 15 | import functools 16 | from functools import wraps 17 | from tempfile import mkstemp, mkdtemp 18 | import logging 19 | logger = logging.getLogger(__name__) 20 | 21 | opa = os.path.abspath 22 | opb = os.path.basename 23 | opd = os.path.dirname 24 | opj = os.path.join 25 | 26 | logdir = 'tldp-document-build-logs' 27 | 28 | 29 | def logtimings(logmethod): 30 | def anon(f): 31 | @wraps(f) 32 | def timing(*args, **kwargs): 33 | s = time.time() 34 | result = f(*args, **kwargs) 35 | e = time.time() 36 | logmethod('running %s(%r, %r) took %.3f s', 37 | f.__name__, args, kwargs, e - s) 38 | return result 39 | return timing 40 | return anon 41 | 42 | 43 | def firstfoundfile(locations): 44 | '''return the first existing file from a list of filenames (or None)''' 45 | for option in locations: 46 | if isreadablefile(option): 47 | return option 48 | return None 49 | 50 | 51 | def arg_isloglevel(l, defaultlevel=logging.ERROR): 52 | try: 53 | level = int(l) 54 | return level 55 | except ValueError: 56 | pass 57 | level = getattr(logging, l.upper(), None) 58 | if not level: 59 | level = defaultlevel 60 | return level 61 | 62 | 63 | def arg_isstr(s): 64 | if isstr(s): 65 | return s 66 | return None 67 | 68 | 69 | def arg_isreadablefile(f): 70 | if isreadablefile(f): 71 | return f 72 | return None 73 | 74 | 75 | def arg_isdirectory(d): 76 | if os.path.isdir(d): 77 | return d 78 | return None 79 | 80 | 81 | def arg_isexecutable(f): 82 | if isexecutable(f): 83 | return f 84 | return None 85 | 86 | 87 | def sameFilesystem(d0, d1): 88 | return os.stat(d0).st_dev == os.stat(d1).st_dev 89 | 90 | 91 | def stem_and_ext(name): 92 | '''return (stem, ext) for any relative or absolute filename''' 93 | return os.path.splitext(os.path.basename(os.path.normpath(name))) 94 | 95 | 96 | def swapdirs(a, b): 97 | '''use os.rename() to make "a" become "b"''' 98 | if not os.path.isdir(a): 99 | raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), a) 100 | tname = None 101 | if os.path.exists(b): 102 | tdir = mkdtemp(prefix='swapdirs-', dir=opd(opa(a))) 103 | logger.debug("Created tempdir %s.", tdir) 104 | tname = opj(tdir, opb(b)) 105 | logger.debug("About to rename %s to %s.", b, tname) 106 | os.rename(b, tname) 107 | logger.debug("About to rename %s to %s.", a, b) 108 | os.rename(a, b) 109 | if tname: 110 | logger.debug("About to rename %s to %s.", tname, a) 111 | os.rename(tname, a) 112 | logger.debug("About to remove %s.", tdir) 113 | os.rmdir(tdir) 114 | 115 | 116 | def logfilecontents(logmethod, prefix, fname): 117 | '''log all lines of a file with a prefix ''' 118 | with codecs.open(fname, encoding='utf-8') as f: 119 | for line in f: 120 | logmethod("%s: %s", prefix, line.rstrip()) 121 | 122 | 123 | def conditionallogging(result, prefix, fname): 124 | if logger.isEnabledFor(logging.DEBUG): 125 | logfilecontents(logger.debug, prefix, fname) # -- always 126 | elif logger.isEnabledFor(logging.INFO): 127 | if result != 0: 128 | logfilecontents(logger.info, prefix, fname) # -- error 129 | 130 | 131 | def execute(cmd, stdin=None, stdout=None, stderr=None, 132 | logdir=None, env=os.environ): 133 | '''(yet another) wrapper around subprocess.Popen() 134 | 135 | The processing tools for handling 
DocBook SGML, DocBook XML and Linuxdoc 136 | all use different conventions for writing outputs. Some write into the 137 | working directory. Others write to STDOUT. Others accept the output file 138 | as a required option. 139 | 140 | To allow for automation and flexibility, this wrapper function does what 141 | most other synchronous subprocess.Popen() wrappers do, but it adds a 142 | feature to record the STDOUT and STDERR of the executable. This is 143 | helpful when trying to diagnose build failures of individual documents. 144 | 145 | Required: 146 | 147 | - cmd: (list form only; the paranoid prefer shell=False) 148 | this must include the whole command-line 149 | - logdir: an existing directory in which temporary log files 150 | will be created 151 | 152 | Optional: 153 | 154 | - stdin: if not supplied, STDIN (FD 0) will be left as is 155 | - stdout: if not supplied, STDOUT (FD 1) will be connected 156 | to a named file in the logdir (and left for later inspection) 157 | - stderr: if not supplied, STDERR (FD 2) will be connected 158 | to a named file in the logdir (and left for later inspection) 159 | - env: if not supplied, just use current environment 160 | 161 | Returns: the numeric exit code of the process 162 | 163 | Side effects: 164 | 165 | * will probably create temporary files in logdir 166 | * function calls wait(); process execution will intentionally block 167 | until the child process terminates 168 | 169 | Possible exceptions: 170 | 171 | * if the first element of list cmd is not an executable, 172 | this function will raise an AssertionError 173 | * if logdir is not a directory, this function will raise ValueError or 174 | IOError 175 | * and, of course, any exceptions passed up from calling subprocess.Popen 176 | 177 | ''' 178 | prefix = os.path.basename(cmd[0]) + '.'
+ str(os.getpid()) + '-' 179 | 180 | assert isexecutable(cmd[0]) 181 | 182 | if logdir is None: 183 | raise ValueError("logdir must be a directory, cannot be None.") 184 | 185 | if not os.path.isdir(logdir): 186 | raise IOError(errno.ENOENT, os.strerror(errno.ENOENT), logdir) 187 | 188 | # -- not remapping STDIN, because that doesn't make sense here 189 | mytfile = functools.partial(mkstemp, prefix=prefix, dir=logdir) 190 | if stdout is None: 191 | stdout, stdoutname = mytfile(suffix='.stdout') 192 | else: 193 | stdoutname = None 194 | 195 | if stderr is None: 196 | stderr, stderrname = mytfile(suffix='.stderr') 197 | else: 198 | stderrname = None 199 | 200 | logger.debug("About to execute: %r", cmd) 201 | proc = subprocess.Popen(cmd, shell=False, close_fds=True, 202 | stdin=stdin, stdout=stdout, stderr=stderr, 203 | env=env, preexec_fn=os.setsid) 204 | result = proc.wait() 205 | if result != 0: 206 | logger.error("Non-zero exit (%s) for process: %r", result, cmd) 207 | logger.error("Find STDOUT/STDERR in %s/%s*", logdir, prefix) 208 | if isinstance(stdout, int) and stdoutname: 209 | os.close(stdout) 210 | conditionallogging(result, 'STDOUT', stdoutname) 211 | if isinstance(stderr, int) and stderrname: 212 | os.close(stderr) 213 | conditionallogging(result, 'STDERR', stderrname) 214 | return result 215 | 216 | 217 | def isexecutable(f): 218 | '''True if argument is executable''' 219 | return os.path.isfile(f) and os.access(f, os.X_OK) 220 | 221 | 222 | def isreadablefile(f): 223 | '''True if argument is readable file''' 224 | return os.path.isfile(f) and os.access(f, os.R_OK) 225 | 226 | 227 | def isstr(s): 228 | '''True if argument is stringy (unicode or string)''' 229 | try: 230 | unicode 231 | stringy = (str, unicode) 232 | except NameError: 233 | stringy = (str,) # -- python3 234 | return isinstance(s, stringy) 235 | 236 | 237 | def which(program): 238 | '''return None or the full path to an executable (respecting $PATH) 239 | http://stackoverflow.com/questions/377017/test-if-executable-exists-in-python/377028#377028 240 | ''' 241 | fpath, fname = os.path.split(program) 242 | if fpath and isexecutable(program): 243 | return program 244 | else: 245 | for path in os.environ["PATH"].split(os.pathsep): 246 | path = path.strip('"') 247 | sut = os.path.join(path, program) 248 | if isexecutable(sut): 249 | return sut 250 | return None 251 | 252 | 253 | def writemd5sums(fname, md5s, header=None): 254 | '''write an MD5SUM file from [(filename, MD5), ...]''' 255 | with codecs.open(fname, 'w', encoding='utf-8') as file: 256 | if header: 257 | print(header, file=file) 258 | for fname, hashval in sorted(md5s.items()): 259 | print(hashval + ' ' + fname, file=file) 260 | 261 | 262 | def md5file(name): 263 | '''return MD5 hash for a single file name''' 264 | with open(name, 'rb') as f: 265 | bs = f.read() 266 | md5 = hashlib.md5(bs).hexdigest() 267 | try: 268 | md5 = unicode(md5) 269 | except NameError: 270 | pass # -- python3 271 | return md5 272 | 273 | 274 | def statfile(name): 275 | '''return posix.stat_result (or None) for a single file name''' 276 | try: 277 | st = os.lstat(name) 278 | except OSError as e: 279 | if e.errno != errno.ENOENT: 280 | raise e 281 | st = None 282 | return st 283 | 284 | 285 | def md5files(name, relative=None): 286 | '''get all of the MD5s for files from here downtree''' 287 | return fileinfo(name, relative=relative, func=md5file) 288 | 289 | 290 | def statfiles(name, relative=None): 291 | ''' 292 | >>> statfiles('./docs/x509').keys() 293 | ['./docs/x509/tutorial.rst', 
'./docs/x509/reference.rst', './docs/x509/index.rst'] 294 | >>> statfiles('./docs/x509', relative='./').keys() 295 | ['docs/x509/reference.rst', 'docs/x509/tutorial.rst', 'docs/x509/index.rst'] 296 | >>> statfiles('./docs/x509', relative='./docs/x509/').keys() 297 | ['index.rst', 'tutorial.rst', 'reference.rst'] 298 | ''' 299 | return fileinfo(name, relative=relative, func=statfile) 300 | 301 | 302 | def fileinfo(name, relative=None, func=statfile): 303 | '''return a dict() with keys being filenames and posix.stat_result values 304 | 305 | Required: 306 | 307 | name: the name should be an existing file, but accessing filesystems 308 | can be a racy proposition, so if the name is ENOENT, returns an 309 | empty dict() 310 | if name is a directory, os.walk() over the entire subtree and 311 | record and return all stat() results 312 | 313 | Optional: 314 | 315 | relative: if the filenames in the keys should be relative some other 316 | directory, then supply that path here (see examples) 317 | 318 | 319 | Bugs: 320 | Dealing with filesystems is always potentially a racy affair. They go 321 | out for lunch sometimes. They don't call. They don't write. But, at 322 | least we can try to rely on them as best we can--mostly, by just 323 | excluding any files (in the output dict()) which did not return a valid 324 | posix.stat_result. 325 | ''' 326 | info = dict() 327 | if not os.path.exists(name): 328 | return info 329 | if not os.path.isdir(name): 330 | if relative: 331 | relpath = os.path.relpath(name, start=relative) 332 | else: 333 | relpath = name 334 | info[relpath] = func(name) 335 | if info[relpath] is None: 336 | del info[relpath] 337 | else: 338 | for root, dirs, files in os.walk(name): 339 | inodes = list() 340 | inodes.extend(dirs) 341 | inodes.extend(files) 342 | for x in inodes: 343 | foundpath = os.path.join(root, x) 344 | if os.path.isdir(foundpath): 345 | continue 346 | if relative: 347 | relpath = os.path.relpath(foundpath, start=relative) 348 | else: 349 | relpath = foundpath 350 | info[relpath] = func(foundpath) 351 | if info[relpath] is None: 352 | del info[relpath] 353 | return info 354 | 355 | # 356 | # -- end of file 357 | --------------------------------------------------------------------------------
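
Usage sketch for tldp.utils.execute() (not part of the repository): a minimal, hedged example of how the wrapper documented above might be called. It assumes only that the tldp package is importable and that xmllint is installed; the command, document name and logdir prefix below are illustrative assumptions, not values taken from this codebase.

#! /usr/bin/python
# Minimal usage sketch for tldp.utils.execute() -- illustrative only.
from __future__ import print_function
from tempfile import mkdtemp
from tldp.utils import execute, which

xmllint = which('xmllint')              # resolve via $PATH; returns None if not found
logdir = mkdtemp(prefix='build-logs-')  # execute() requires an existing log directory

if xmllint is not None:
    # Hypothetical document name; any list-form command line would do.
    cmd = [xmllint, '--noout', '--nonet', 'Example-HOWTO.xml']
    # Blocks until the child exits; when stdout/stderr are not supplied,
    # they are captured into named files under logdir for later inspection.
    status = execute(cmd, logdir=logdir)
    print('exit status:', status)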