├── .gitignore ├── BUILDING.md ├── CONTRIBUTING.md ├── LICENSE.md ├── Makefile ├── README.md ├── TODO ├── images-source ├── dom.svg └── xmllibxml-by-example.xcf ├── images ├── dom-full.png └── dom.png ├── make-pl-out ├── publish ├── source ├── _static │ ├── PerlXMLLibXMLbyExample.epub │ ├── PerlXMLLibXMLbyExample.pdf │ ├── cc-by-sa.png │ ├── cover.jpg │ ├── metacpan-tiny.png │ ├── plxbe.css │ ├── plxbe.js │ └── xpath-sandbox │ │ ├── xpath-sandbox.css │ │ ├── xpath-sandbox.html │ │ └── xpath-sandbox.js ├── _templates │ ├── customsidebar.html │ ├── epub-cover.html │ ├── page.html │ └── search.html ├── _themes │ └── sphinx_rtd_theme │ │ ├── __init__.py │ │ ├── breadcrumbs.html │ │ ├── footer.html │ │ ├── layout.html │ │ ├── layout_old.html │ │ ├── search.html │ │ ├── searchbox.html │ │ ├── static │ │ ├── css │ │ │ ├── badge_only.css │ │ │ ├── badge_only.css.map │ │ │ ├── theme.css │ │ │ └── theme.css.map │ │ ├── fonts │ │ │ ├── FontAwesome.otf │ │ │ ├── Inconsolata-Bold.ttf │ │ │ ├── Inconsolata-Regular.ttf │ │ │ ├── Lato-Bold.ttf │ │ │ ├── Lato-Regular.ttf │ │ │ ├── RobotoSlab-Bold.ttf │ │ │ ├── RobotoSlab-Regular.ttf │ │ │ ├── fontawesome-webfont.eot │ │ │ ├── fontawesome-webfont.svg │ │ │ ├── fontawesome-webfont.ttf │ │ │ └── fontawesome-webfont.woff │ │ └── js │ │ │ ├── modernizr.min.js │ │ │ └── theme.js │ │ ├── theme.conf │ │ └── versions.html ├── basics.rst ├── code │ ├── 010-list-titles.pl │ ├── 030-parse-from-fh.pl │ ├── 040-movie-details.pl │ ├── 050-attributes.pl │ ├── 060-parse-error.pl │ ├── 100-xpath-examples │ ├── 110-case-insensitive-xpath-1 │ ├── 200-dom-document.pl │ ├── 210-dom-elements.pl │ ├── 211-dom-elements-no-blanks.pl │ ├── 220-dom-text-nodes.pl │ ├── 230-dom-attributes.pl │ ├── 240-dom-attr.pl │ ├── 250-dom-nodelist.pl │ ├── 260-dom-modification.pl │ ├── 270-dom-from-scratch.pl │ ├── 271-dom-from-scratch-latin1.pl │ ├── 500-html-tidy.pl │ ├── 501-html-tidy-no-err.pl │ ├── 510-html-no-stderr.pl │ ├── 520-html-xpath-simple.pl │ ├── 
530-html-xpath-complex.pl │ ├── 531-html-xpath-no-semantic.pl │ ├── 540-html-xpath-classes.pl │ ├── 580-html-css-selectors.pl │ ├── 590-ignore-words.pws │ ├── 590-spell-check.pl │ ├── 600-ns-no-context.pl │ ├── 610-ns-xpc.pl │ ├── 620-ns-child-nodes.pl │ ├── 700-reader-events.pl │ ├── 710-reader-named-events.pl │ ├── 720-seek-controversy.pl │ ├── 721-seek-controversy-variants.pl │ ├── 730-reader-parse-error.pl │ ├── 750-titles-only.pl │ ├── book-borkened.xml │ ├── book.xml │ ├── carte-latin1.xml │ ├── country.xml │ ├── css-zen-garden.html │ ├── enwiki-latest-abstract1-abridged.xml.gz │ ├── enwiki-latest-abstract1-structure.xml │ ├── fish-and-chips.xml │ ├── people.html │ ├── playlist.xml │ ├── untidy.html │ └── xml-libxml.svg ├── conf.py ├── dom.rst ├── html.rst ├── index.rst ├── installation.rst ├── large-docs.rst ├── namespaces.rst ├── sphinx-ext │ └── plxbe.py └── xpath.rst └── xpath-sandbox /.gitignore: -------------------------------------------------------------------------------- 1 | build/* 2 | source/_output 3 | *.pyc 4 | -------------------------------------------------------------------------------- /BUILDING.md: -------------------------------------------------------------------------------- 1 | Build Instructions 2 | ================== 3 | 4 | If you wish to contribute to this project this document describes (briefly) 5 | how to re-generate the HTML pages from the .rst (reStructuredText) source. 6 | 7 | The HTML is generated using [Sphinx](http://www.sphinx-doc.org/) which in turn 8 | uses [Docutils](http://docutils.sourceforge.net/), the 9 | [Jinja2](http://jinja.pocoo.org/) templating engine and 'make'. 
On my Ubuntu 10 | system I installed the dependencies with: 11 | 12 | sudo apt-get install python3-sphinx docutils-doc python-pil-doc python3-pil-dbg sphinx-doc build-essential 13 | 14 | Some texlive dependencies will also be required in order to make the 'latexpdf' 15 | build target work: 16 | 17 | sudo apt-get install texlive-latex-recommended texlive-latex-extra 18 | 19 | If you're not using a Debian/Ubuntu system, refer to those projects for manual 20 | installation instructions. 21 | 22 | The Sphinx site includes a [reStructuredText 23 | Primer](http://www.sphinx-doc.org/rest.html) which should help you get up to 24 | speed with the markup. 25 | 26 | To generate the HTML, run the command: 27 | 28 | make html 29 | 30 | You can then view the resulting generated files under `build/html`. 31 | 32 | The PDF and EPUB formats are both very rough. The formatter doesn't handle 33 | the links to static files so different build options are required: 34 | 35 | SPHINXOPTS="-a" make -e latexpdf 36 | SPHINXOPTS="-a" make -e epub 37 | 38 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | By contributing you agree to the [LICENSE](LICENSE.md) of this repository. 
4 | 5 | 6 | ## Issue Tracker 7 | 8 | - before submitting a new issue, please check for existing related issues 9 | 10 | - please comment with a "+1" to help vote for issues that are important to you 11 | 12 | - please keep discussions on-topic, and respect the opinions of others 13 | 14 | 15 | ### Bug Reports 16 | 17 | - please report bugs in the issue tracker 18 | 19 | 20 | ### Feature Requests 21 | 22 | - please suggest new features and improvements in the issue tracker 23 | 24 | 25 | ## Pull Requests / Merge Requests 26 | 27 | - **IMPORTANT**: by submitting a patch, you agree to allow the project owners 28 | to license your work under this [LICENSE](LICENSE.md) 29 | 30 | - Please add your name to the "Contributors" section in source/index.rst 31 | 32 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | 2 | "Perl XML::LibXML by Example" by Grant McLean is licensed under a [Creative 3 | Commons Attribution-ShareAlike 4.0 International License][license] (see also 4 | [human-readable summary][human-license]). 5 | 6 | The preferred form of attribution is via a link to 7 | . 8 | 9 | The source code for this work is at: 10 | . 11 | 12 | [license]: https://creativecommons.org/licenses/by-sa/4.0/legalcode 13 | [human-license]: https://creativecommons.org/licenses/by-sa/4.0/ 14 | 15 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | # Makefile for Sphinx documentation 2 | # 3 | 4 | # You can set these variables from the command line. 5 | SPHINXOPTS = -a -W 6 | SPHINXBUILD = sphinx-build 7 | PAPER = 8 | BUILDDIR = build 9 | 10 | # User-friendly check for sphinx-build 11 | ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1) 12 | $(error The '$(SPHINXBUILD)' command was not found. 
Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/) 13 | endif 14 | 15 | # Internal variables. 16 | PAPEROPT_a4 = -D latex_paper_size=a4 17 | PAPEROPT_letter = -D latex_paper_size=letter 18 | ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source 19 | # the i18n builder cannot share the environment and doctrees with the others 20 | I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source 21 | 22 | .PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext 23 | 24 | help: 25 | @echo "Please use \`make <target>' where <target> is one of" 26 | @echo " html to make standalone HTML files" 27 | @echo " dirhtml to make HTML files named index.html in directories" 28 | @echo " singlehtml to make a single large HTML file" 29 | @echo " pickle to make pickle files" 30 | @echo " json to make JSON files" 31 | @echo " htmlhelp to make HTML files and a HTML help project" 32 | @echo " qthelp to make HTML files and a qthelp project" 33 | @echo " devhelp to make HTML files and a Devhelp project" 34 | @echo " epub to make an epub" 35 | @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" 36 | @echo " latexpdf to make LaTeX files and run them through pdflatex" 37 | @echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx" 38 | @echo " text to make text files" 39 | @echo " man to make manual pages" 40 | @echo " texinfo to make Texinfo files" 41 | @echo " info to make Texinfo files and run them through makeinfo" 42 | @echo " gettext to make PO message catalogs" 43 | @echo " changes to make an overview of all changed/added/deprecated items" 44 | @echo " xml to make Docutils-native XML files" 45 | @echo " pseudoxml to make 
pseudoxml-XML files for display purposes" 46 | @echo " linkcheck to check all external links for integrity" 47 | @echo " doctest to run all doctests embedded in the documentation (if enabled)" 48 | 49 | clean: 50 | rm -rf $(BUILDDIR)/* 51 | rm -rf source/_output 52 | 53 | html: pl-out 54 | $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html 55 | @echo 56 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." 57 | @echo "file://$(realpath $(BUILDDIR))/html/index.html" 58 | 59 | dirhtml: 60 | $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml 61 | @echo 62 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml." 63 | 64 | singlehtml: 65 | $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml 66 | @echo 67 | @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml." 68 | 69 | pickle: 70 | $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle 71 | @echo 72 | @echo "Build finished; now you can process the pickle files." 73 | 74 | json: 75 | $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json 76 | @echo 77 | @echo "Build finished; now you can process the JSON files." 78 | 79 | htmlhelp: 80 | $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp 81 | @echo 82 | @echo "Build finished; now you can run HTML Help Workshop with the" \ 83 | ".hhp project file in $(BUILDDIR)/htmlhelp." 84 | 85 | qthelp: 86 | $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp 87 | @echo 88 | @echo "Build finished; now you can run "qcollectiongenerator" with the" \ 89 | ".qhcp project file in $(BUILDDIR)/qthelp, like this:" 90 | @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/PerlXMLLibXMLbyExample.qhcp" 91 | @echo "To view the help file:" 92 | @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/PerlXMLLibXMLbyExample.qhc" 93 | 94 | devhelp: 95 | $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp 96 | @echo 97 | @echo "Build finished." 
98 | @echo "To view the help file:" 99 | @echo "# mkdir -p $$HOME/.local/share/devhelp/PerlXMLLibXMLbyExample" 100 | @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/PerlXMLLibXMLbyExample" 101 | @echo "# devhelp" 102 | 103 | epub: 104 | $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub 105 | @echo 106 | @echo "Build finished. The epub file is in $(BUILDDIR)/epub." 107 | 108 | latex: 109 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex 110 | @echo 111 | @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex." 112 | @echo "Run \`make' in that directory to run these through (pdf)latex" \ 113 | "(use \`make latexpdf' here to do that automatically)." 114 | 115 | latexpdf: 116 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex 117 | @echo "Running LaTeX files through pdflatex..." 118 | $(MAKE) -C $(BUILDDIR)/latex all-pdf 119 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." 120 | 121 | latexpdfja: 122 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex 123 | @echo "Running LaTeX files through platex and dvipdfmx..." 124 | $(MAKE) -C $(BUILDDIR)/latex all-pdf-ja 125 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." 126 | 127 | text: 128 | $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text 129 | @echo 130 | @echo "Build finished. The text files are in $(BUILDDIR)/text." 131 | 132 | man: 133 | $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man 134 | @echo 135 | @echo "Build finished. The manual pages are in $(BUILDDIR)/man." 136 | 137 | texinfo: 138 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo 139 | @echo 140 | @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo." 141 | @echo "Run \`make' in that directory to run these through makeinfo" \ 142 | "(use \`make info' here to do that automatically)." 
143 | 144 | info: 145 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo 146 | @echo "Running Texinfo files through makeinfo..." 147 | make -C $(BUILDDIR)/texinfo info 148 | @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo." 149 | 150 | gettext: 151 | $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale 152 | @echo 153 | @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale." 154 | 155 | changes: 156 | $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes 157 | @echo 158 | @echo "The overview file is in $(BUILDDIR)/changes." 159 | 160 | linkcheck: 161 | $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck 162 | @echo 163 | @echo "Link check complete; look for any errors in the above output " \ 164 | "or in $(BUILDDIR)/linkcheck/output.txt." 165 | 166 | doctest: 167 | $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest 168 | @echo "Testing of doctests in the sources finished, look at the " \ 169 | "results in $(BUILDDIR)/doctest/output.txt." 170 | 171 | xml: 172 | $(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml 173 | @echo 174 | @echo "Build finished. The XML files are in $(BUILDDIR)/xml." 175 | 176 | pseudoxml: 177 | $(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml 178 | @echo 179 | @echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml." 180 | 181 | pl-out: 182 | ./make-pl-out 183 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Read the HTML version at: [Perl XML::LibXML by 2 | Example](http://grantm.github.io/perl-libxml-by-example/) 3 | 4 | This is a documentation project to introduce the Perl 5 | [XML::LibXML](https://metacpan.org/release/XML-LibXML) module through example 6 | scripts and line-by-line explanations. 
7 | 8 | One of the features of this project is an [interactive XPath 9 | sandbox](http://grantm.github.io/perl-libxml-by-example/_static/xpath-sandbox/xpath-sandbox.html?q=%2F%2Fmovie[%40id%3D%22tt0307479%22]%2F%2Fsynopsis) 10 | which you can use to try out different XPath expressions and see which parts of 11 | the XML document are matched. 12 | 13 | --- 14 | 15 | Creative Commons License
16 | Perl XML::LibXML by Example by Grant McLean is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. 17 | -------------------------------------------------------------------------------- /TODO: -------------------------------------------------------------------------------- 1 | * validating XML ? 2 | - handling parse errors 3 | * URI retrieval (e.g. DTD) and the XML catalog 4 | 5 | * customise the sphinx theme for larger font 6 | * add an index: 7 | - http://www.sphinx-doc.org/en/stable/markup/misc.html#index-generating-markup 8 | 9 | * XPath sandbox 10 | - update the URL on submit (allows going back to previous queries) 11 | 12 | Glossary Terms 13 | 14 | * DOM 15 | * element 16 | * node 17 | * Clarkian notation 18 | * document fragment 19 | -------------------------------------------------------------------------------- /images-source/xmllibxml-by-example.xcf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/images-source/xmllibxml-by-example.xcf -------------------------------------------------------------------------------- /images/dom-full.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/images/dom-full.png -------------------------------------------------------------------------------- /images/dom.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/images/dom.png -------------------------------------------------------------------------------- /make-pl-out: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | # 3 | # Runs each of the .pl scripts in source/code and 
captures the output into a 4 | # .pl-out file. 5 | # 6 | # The script first chdir's into the source/code directory so that the scripts 7 | # can simply assume any .xml file is in the current directory. 8 | # 9 | 10 | use 5.010; 11 | use strict; 12 | use warnings; 13 | use autodie; 14 | 15 | 16 | chdir('source/code'); 17 | 18 | my $out_dir = '../_output'; 19 | mkdir($out_dir) unless -d $out_dir; 20 | 21 | my $zip_file = "$out_dir/perl-libxml-examples.zip"; 22 | my $rezip = ! -e $zip_file; 23 | 24 | say "Running example scripts"; 25 | 26 | foreach my $script (sort glob '*.pl') { 27 | my $out_file = "$out_dir/$script-out"; 28 | if(-e $out_file) { 29 | my $script_age = -M $script; 30 | my $output_age = -M $out_file; 31 | next if $script_age > $output_age; 32 | } 33 | say " $script > $out_dir/$script-out 2> $out_dir/$script-err"; 34 | system("./$script > $out_dir/$script-out 2> $out_dir/$script-err") == 0 35 | or exit $? >> 8; 36 | } 37 | 38 | $rezip ||= zip_is_stale(); 39 | if($rezip) { 40 | say "Creating perl-libxml-examples.zip"; 41 | unlink($zip_file) if -e $zip_file; 42 | system("zip $zip_file *") == 0 43 | or exit $? >> 8; 44 | } 45 | 46 | exit; 47 | 48 | sub zip_is_stale { 49 | my $zip_age = -M $zip_file; 50 | foreach my $file (glob '*') { 51 | return 1 if $zip_age > -M $file; 52 | } 53 | return 0; 54 | } 55 | -------------------------------------------------------------------------------- /publish: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | ############################################################################## 3 | # 4 | # Publish latest version to GitHub pages site. 
5 | # 6 | # Note: To generate the HTML locally and proofread the changes, just use: 7 | # 8 | # make html 9 | # 10 | # Use --help option for more details 11 | # 12 | 13 | use 5.010; 14 | use strict; 15 | use warnings; 16 | use autodie; 17 | 18 | use Pod::Usage; 19 | use Getopt::Long qw(GetOptions); 20 | use POSIX qw(strftime); 21 | 22 | $| = 1; # turn off output buffering for prompt 23 | 24 | 25 | my $generated_dir = 'build/html'; 26 | my $target_dir = '../perl-libxml-by-example-pages'; 27 | 28 | my(%opt); 29 | 30 | if(!GetOptions(\%opt, 'help|?', 'message|m=s', 'dryrun|dry-run')) { 31 | pod2usage(-exitval => 1, -verbose => 0); 32 | } 33 | 34 | pod2usage(-exitstatus => 0, -verbose => 2) if $opt{help}; 35 | 36 | my $message = commit_message(); 37 | check_uncommitted(); 38 | build_all(); 39 | publish_to_gh_pages($message); 40 | 41 | exit; 42 | 43 | 44 | sub commit_message { 45 | if(my $message = $opt{message}) { 46 | my $log = `git log --pretty=oneline -1`; 47 | my($commit_id) = $log =~ m{\A(\w{12})}; 48 | return "$message\n\n Source commit: $commit_id"; 49 | } 50 | elsif($opt{dryrun}) { 51 | return; 52 | } 53 | else { 54 | say "You must specify a commit message with -m"; 55 | say "Recent commits:"; 56 | my $mtime = (stat("$target_dir/.git/index"))[9]; 57 | $mtime = strftime('%F-%T', localtime($mtime)); 58 | my $log = `git log --pretty=format:'%h %s' \@{$mtime}..`; 59 | if($log =~ /\S/) { 60 | say $log; 61 | } 62 | else { 63 | say "No changes in source repo"; 64 | } 65 | exit 1; 66 | } 67 | } 68 | 69 | 70 | sub check_uncommitted { 71 | my $git_status = `git status -s`; 72 | return unless $git_status =~ /\S/; 73 | 74 | say "WARNING: Uncommitted changes ...\n\n$git_status"; 75 | print "Publish anyway? 
(y/N) "; 76 | 77 | my $response = <STDIN>; 78 | return if $response =~ /^y(es)?/i; 79 | 80 | exit; 81 | } 82 | 83 | 84 | sub build_all { 85 | system('make', 'html') == 0 86 | or die "Failed to generate HTML from source\n"; 87 | 88 | local($ENV{SPHINXOPTS}) = '-a'; 89 | system('make', '-e', 'latexpdf') == 0 90 | or die "Failed to generate PDF from source\n"; 91 | 92 | system( 93 | 'cp', 'build/latex/PerlXMLLibXMLbyExample.pdf', 94 | 'build/html/_downloads/' 95 | ) == 0 96 | or die "Failed to copy PDF into _downloads\n"; 97 | 98 | local($ENV{SPHINXOPTS}) = '-a'; 99 | system('make', '-e', 'epub') == 0 100 | or die "Failed to generate epub from source\n"; 101 | 102 | system( 103 | 'cp', 'build/epub/PerlXMLLibXMLbyExample.epub', 104 | 'build/html/_downloads/' 105 | ) == 0 106 | or die "Failed to copy epub into _downloads\n"; 107 | } 108 | 109 | 110 | sub publish_to_gh_pages { 111 | my($commit_message) = @_; 112 | 113 | system( 114 | 'rsync', '-rav', '--checksum', '--delete', 115 | '--exclude=.git', 116 | '--exclude=.nojekyll', 117 | '--exclude=.buildinfo', 118 | '--exclude=.*.swp', 119 | $generated_dir . '/', 120 | $target_dir . 
'/', 121 | ) == 0 or die "Failed to sync files to gh-pages repo\n"; 122 | 123 | # Don't commit/push to GitHub pages if --dry-run was specified 124 | 125 | if($opt{dryrun}) { 126 | warn "Bailing out before publishing to GitHub pages due to --dry-run option\n"; 127 | return; 128 | } 129 | 130 | chdir($target_dir); 131 | 132 | system('git', 'add', '--all') == 0 133 | or die "Failed to add files to gh-pages repo\n"; 134 | 135 | system('git', 'commit', '-m', $commit_message) == 0 136 | or die "Failed to commit changes to gh-pages repo\n"; 137 | 138 | system('git', 'push', 'origin') == 0 139 | or die "Failed to push update to GitHub\n"; 140 | 141 | say "Site updated successfully"; 142 | } 143 | 144 | __END__ 145 | 146 | =head1 NAME 147 | 148 | publish - Push current version of Perl XML::LibXML by Example site to gh-pages 149 | 150 | =head1 SYNOPSIS 151 | 152 | publish -m 153 | 154 | Options: 155 | 156 | -m set the commit message to be used 157 | --dry-run do everything except commit and push to GitHub pages 158 | -? detailed help message 159 | 160 | =head1 DESCRIPTION 161 | 162 | Generate and publish the current version of the "Perl XML::LibXML by Example" 163 | documentation project to GitHub pages: 164 | 165 | =over 4 166 | 167 | =item * 168 | 169 | check for uncommitted changes 170 | 171 | =item * 172 | 173 | regenerate HTML 174 | 175 | =item * 176 | 177 | rsync generated files to parallel checkout dir on gh-pages branch 178 | 179 | =item * 180 | 181 | add and commit all changes 182 | 183 | =item * 184 | 185 | push to GitHub 186 | 187 | =back 188 | 189 | =head1 OPTIONS 190 | 191 | =over 4 192 | 193 | =item B<< --message >> (alias: -m) 194 | 195 | Set the text of the commit message to use on the gh-pages branch. 196 | 197 | =item B<--dry-run> 198 | 199 | Do everything except commit and push to GitHub pages. 200 | 201 | =item B<--help> (alias: -?) 202 | 203 | Display this documentation. 
204 | 205 | =back 206 | 207 | =cut 208 | 209 | 210 | 211 | -------------------------------------------------------------------------------- /source/_static/PerlXMLLibXMLbyExample.epub: -------------------------------------------------------------------------------- 1 | ## Placeholder ## 2 | -------------------------------------------------------------------------------- /source/_static/PerlXMLLibXMLbyExample.pdf: -------------------------------------------------------------------------------- 1 | ## Placeholder ## 2 | -------------------------------------------------------------------------------- /source/_static/cc-by-sa.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_static/cc-by-sa.png -------------------------------------------------------------------------------- /source/_static/cover.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_static/cover.jpg -------------------------------------------------------------------------------- /source/_static/metacpan-tiny.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_static/metacpan-tiny.png -------------------------------------------------------------------------------- /source/_static/plxbe.css: -------------------------------------------------------------------------------- 1 | .document a:hover, 2 | footer a:hover { 3 | color: #2980B9; 4 | border-bottom: 1px dotted #2980B9; 5 | } 6 | 7 | .document a:visited, 8 | footer a:visited { 9 | color: #203d90; 10 | } 11 | 12 | .document p tt.literal, 13 | .document li tt.literal { 14 | border: none; 15 | color: #D11A1A; 16 | 
background-color: transparent; 17 | padding: 0 2px; 18 | font-size: 80%; 19 | font-weight: bold; 20 | } 21 | 22 | .document p a tt.literal, 23 | .document li a tt.literal { 24 | color: inherit; 25 | } 26 | 27 | a.metacpan { 28 | padding-left: 14px; 29 | background-image: url(); 30 | background-repeat: no-repeat; 31 | background-position: 2px 5px; 32 | } 33 | 34 | .wy-nav-side .custom-sidebar { 35 | margin-top: 50px; 36 | padding: 0.8em 1.6em; 37 | border-top: 1px solid #444444; 38 | } 39 | 40 | .wy-nav-side .custom-sidebar h3 { 41 | display: none; 42 | } 43 | 44 | .wy-nav-side .custom-sidebar a, 45 | .wy-nav-side .custom-sidebar a:visited { 46 | color: #b3b3b3; 47 | } 48 | 49 | .wy-nav-side .custom-sidebar a:hover { 50 | color: #2980B9; 51 | } 52 | 53 | .cc-img { 54 | margin: 10px 0px 40px 0; 55 | } 56 | 57 | .highlight-xml { 58 | max-height: 60vh; 59 | overflow-y: auto; 60 | } 61 | .xpath, 62 | .xpath-try { 63 | display: block; 64 | margin-top: 2.4em; 65 | font-size: 120%; 66 | padding: 0.3em; 67 | } 68 | 69 | a.xpath-try-it { 70 | display: inline-block; 71 | float: right; 72 | color: #ffffff; 73 | background-color: #4c9ed9; 74 | border: none; 75 | border-radius: 8px; 76 | cursor: pointer; 77 | font-family: sans-serif; 78 | font-size: 10px; 79 | font-weight: bold; 80 | line-height: 1; 81 | padding: 4px 8px 3px 8px; 82 | text-align: center; 83 | text-transform: uppercase; 84 | margin: 3px 4px 0 16px; 85 | } 86 | 87 | .xpath-try a.xpath-try-it { 88 | text-decoration: none; 89 | } 90 | 91 | .xpath-try-it:focus { 92 | background-color: #d12828; 93 | } 94 | 95 | .xpath-try:hover .xpath-try-it { 96 | background-color: #4080b0; 97 | } 98 | 99 | #xpath-expressions > blockquote { 100 | margin: 0; 101 | } 102 | 103 | /* Keep 'Try It' buttons on the same line as XPath */ 104 | code.xpath-try { 105 | white-space: normal; 106 | } 107 | 108 | .document p tt.literal.xpath, 109 | .document p tt.literal.xpath-try { 110 | border: 1px solid #e1e1e5; 111 | border-radius: 3px; 112 | 
background-color: #fafafa; 113 | white-space: normal; 114 | padding: 2px 4px; 115 | } 116 | 117 | .document a.xpath-try-it:visited, 118 | .document a.xpath-try-it:hover { 119 | color: #ffffff; 120 | text-decoration: none; 121 | border: none; 122 | } 123 | 124 | #a-basic-example .sidebar blockquote { 125 | font-family: Consolas,"Andale Mono WT","Andale Mono","Lucida Console","Lucida Sans Typewriter","DejaVu Sans Mono","Bitstream Vera Sans Mono","Liberation Mono","Nimbus Mono L",Monaco,"Courier New",Courier,monospace; 126 | font-size: 14px; 127 | line-height: 16px; 128 | margin: 0 0 1.6em 14px; 129 | } 130 | 131 | p.caption { 132 | text-align: center; 133 | } 134 | 135 | .admonition-title { 136 | border-top-left-radius: 0.3em; 137 | border-top-right-radius: 0.3em; 138 | } 139 | 140 | ol.arabic.simple li { 141 | padding-left: 8px; 142 | } 143 | 144 | #linked-events pre, 145 | #linked-nodes pre { 146 | background-color: #e0e0ff; 147 | font-size: 100%; 148 | line-height: 1.2; 149 | } 150 | 151 | #linked-events pre span, 152 | #linked-nodes pre span { 153 | padding: 2px 3px; 154 | } 155 | 156 | .show-event-1 .event-1 { background-color: #ffffcc; } 157 | .show-event-2 .event-2 { background-color: #ffffcc; } 158 | .show-event-3 .event-3 { background-color: #ffffcc; } 159 | .show-event-4 .event-4 { background-color: #ffffcc; } 160 | .show-event-5 .event-5 { background-color: #ffffcc; } 161 | .show-event-6 .event-6 { background-color: #ffffcc; } 162 | .show-event-7 .event-7 { background-color: #ffffcc; } 163 | .show-event-8 .event-8 { background-color: #ffffcc; } 164 | .show-event-9 .event-9 { background-color: #ffffcc; } 165 | .show-event-10 .event-10 { background-color: #ffffcc; } 166 | .show-event-11 .event-11 { background-color: #ffffcc; } 167 | -------------------------------------------------------------------------------- /source/_static/plxbe.js: -------------------------------------------------------------------------------- 1 | /* 2 | * Custom Javascript code for 
the 'perl-libxml-by-example' project 3 | */ 4 | 5 | jQuery(function($) { 6 | 'use strict'; 7 | 8 | // Wrap the img tag for each illustration in a link to just the image 9 | 10 | $('.illustration').each(function() { 11 | var $img = $(this); 12 | $img.wrap( $('<a>').attr('href', $img.attr('src')) ); 13 | }); 14 | 15 | // Add a class to every metacpan link 16 | 17 | $('a').each(function() { 18 | var $this = $(this); 19 | var href = $this.attr('href') || ''; 20 | if(href.match(/^https?:\/\/metacpan\.org/)) { 21 | $this.addClass('metacpan') 22 | .attr('title', 'View docs for ' + $this.text() + ' on metacpan.org'); 23 | } 24 | }); 25 | 26 | // Add the hover-effect linkages on the XML::LibXML::Reader page 27 | 28 | var $linked_section = $('#the-reader-loop'); 29 | if($linked_section.length > 0) { 30 | var add_link_handlers = function($el, i) { 31 | var cls = 'show-event-' + i; 32 | $el.mouseover(function() { $linked_section.addClass(cls); }) 33 | .mouseout (function() { $linked_section.removeClass(cls); }); 34 | }; 35 | 36 | var $events = $('#linked-events pre'); 37 | var lines = $events.text().split('\n'); 38 | $events.empty(); 39 | $(lines).each(function(i, text){ 40 | if(i > 0) { 41 | $events.append('\n'); 42 | } 43 | var match = text.match(/^ *(\d+)/); 44 | var $span = $('<span>').text(text); 45 | if(match) { 46 | var cls = 'event-' + match[1]; 47 | $span.addClass(cls).text(text); 48 | add_link_handlers($span, match[1]); 49 | } 50 | $events.append($span); 51 | }); 52 | 53 | var $nodes = $('#linked-nodes .code pre'); 54 | var xml_source = $nodes.text().replace(/\n$/, '').replace(/\n/g, '\u21b5\n'); 55 | var chunks = xml_source.match(/(?:<[^>]+>|[^<]+)/g); 56 | $nodes.empty(); 57 | $(chunks).each(function(i, text) { 58 | var cls = 'event-' + (1 + i); 59 | var $span = $('<span>').addClass(cls).text(text); 60 | add_link_handlers($span, i + 1); 61 | $nodes.append($span); 62 | }); 63 | 64 | $linked_section.find('li').each(function() { 65 | var $li = $(this); 66 | var match = 
$li.text().match(/^At step (\d+)/); 67 | if(match && match.length === 2) { 68 | $li.addClass('event-' + match[1]); 69 | add_link_handlers($li, match[1]); 70 | } 71 | }); 72 | 73 | $('span.linked-prompt').text( 74 | '(try mousing over to see the relationship between the events and the parsed XML source)' 75 | ); 76 | } 77 | 78 | }); 79 | -------------------------------------------------------------------------------- /source/_static/xpath-sandbox/xpath-sandbox.css: -------------------------------------------------------------------------------- 1 | body { 2 | font-family: sans-serif; 3 | margin: 0; 4 | padding: 6.5em 3em; 5 | } 6 | 7 | #navbar { 8 | position: absolute; 9 | top: 0; 10 | left: 0; 11 | box-sizing: border-box; 12 | width: 100%; 13 | color: #eeeeee; 14 | background-color: #333333; 15 | padding: 0.5em 3em; 16 | } 17 | 18 | #navbar h1 { 19 | float: left; 20 | color: #2c2c2c; 21 | opacity: 0.8; 22 | margin: 0.2em 0; 23 | padding: 0; 24 | white-space: nowrap; 25 | text-shadow: -1px -1px 1px rgba(255,255,255,0.13), 1px 1px 1px rgba(0,0,0,0.9); 26 | transition: opacity 0.3s; 27 | } 28 | 29 | #navbar:hover h1 { 30 | opacity: 1.0; 31 | } 32 | 33 | #links { 34 | position: absolute; 35 | right: 3em; 36 | list-style: none; 37 | } 38 | 39 | #links a { 40 | display: block; 41 | padding: 3px 6px; 42 | border-radius: 3px; 43 | color: #888888; 44 | text-decoration: none; 45 | outline: none; 46 | } 47 | 48 | #links a:hover { 49 | background-color: #4080b0; 50 | color: #eeeeee; 51 | } 52 | 53 | #query-xpath { 54 | font-size: 150%; 55 | width: calc(100% - 10.5em); 56 | padding: 4px 10px 3px; 57 | border: 1px solid #cccccc; 58 | border-radius: 0.1em; 59 | } 60 | 61 | #query-xpath:focus { 62 | background-color: #ffffdd; 63 | } 64 | 65 | #buttons { 66 | display: inline-block; 67 | vertical-align: bottom; 68 | } 69 | 70 | .btn { 71 | font-size: 130%; 72 | background-color: #062873; 73 | color: #eeeeee; 74 | border: none; 75 | border-radius: 0.2em; 76 | font-weight: bold; 77 | 
padding: 5px 7px; 78 | margin-top: 0.2em; 79 | cursor: pointer; 80 | } 81 | 82 | #registered-namespaces { 83 | margin: 1em 0; 84 | } 85 | 86 | #message-box { 87 | margin: 2.4em 0; 88 | border-radius: 0.2em; 89 | } 90 | 91 | #message-box.error, 92 | #file-parser-error { 93 | padding: 0.5em 1em; 94 | border: 1px solid #ddaaaa; 95 | background-color: #ffcccc; 96 | color: #aa3333; 97 | } 98 | 99 | #message-box.success { 100 | padding: 0.5em 1em; 101 | border: 1px solid #aaddaa; 102 | background-color: #ccffcc; 103 | color: #227722; 104 | } 105 | 106 | .hidden { 107 | display: none; 108 | } 109 | 110 | #file-dialog { 111 | position: fixed; 112 | top: 0; 113 | left: 0; 114 | width: 100%; 115 | height: 100%; 116 | text-align: center; 117 | z-index: 20; 118 | } 119 | 120 | .no-scroll { 121 | overflow: hidden; 122 | } 123 | 124 | #file-dialog .overlay { 125 | position: fixed; 126 | top: 0; 127 | right: 0; 128 | bottom: 0; 129 | left: 0; 130 | height: 100%; 131 | width: 100%; 132 | background-color: rgba(0, 0, 0, .55); 133 | } 134 | 135 | #file-dialog-content { 136 | position: relative; 137 | display: inline-block; 138 | box-shadow: 0 0 10px rgba(0,0,0,1); 139 | text-align: initial; 140 | width: auto; 141 | margin: 10vh auto; 142 | padding: 0.2em 1.2em 1.0em; 143 | max-height: 80vh; 144 | max-width: 80vw; 145 | background: #ffffff; 146 | overflow: auto; 147 | -webkit-overflow-scrolling: touch; 148 | border-radius: 0.1em; 149 | outline: none; 150 | } 151 | 152 | #file-selector-input { 153 | display: block; 154 | height: 1px; 155 | overflow: hidden; 156 | margin: 0; 157 | padding: 0; 158 | opacity: 0.01; 159 | } 160 | 161 | #file-selector-filename { 162 | margin-left: 1.2em; 163 | color: #888888; 164 | font-style: italic; 165 | } 166 | 167 | #file-dialog label { 168 | display: block; 169 | } 170 | 171 | #file-parser-error { 172 | font-family: monospace; 173 | font-size: 90%; 174 | border-radius: 0.3em; 175 | margin-bottom: 1em; 176 | } 177 | 178 | .dialog-buttons { 179 | 
text-align: right; 180 | padding: 0.8em 0 0.2em; 181 | } 182 | 183 | .btn:disabled { 184 | background-color: #666666; 185 | color: #cccccc; 186 | } 187 | 188 | #doc-tree-wrapper { 189 | position: relative; 190 | } 191 | 192 | #nav-buttons { 193 | z-index: 10; 194 | position: fixed; 195 | top: -45px; 196 | left: 50%; 197 | padding: 45px 8px 4px; 198 | background-color: #062873; 199 | border-bottom-left-radius: 12px; 200 | border-bottom-right-radius: 12px; 201 | transition: padding-top 1.2s; 202 | } 203 | 204 | #nav-buttons.hidden { 205 | display: block; 206 | padding-top: 0; 207 | } 208 | 209 | #nav-buttons a { 210 | color: #d7d7d7; 211 | display: inline-block; 212 | padding: 4px; 213 | margin: 0 2px; 214 | cursor: pointer; 215 | } 216 | 217 | #nav-buttons a:hover { 218 | color: #ffffff; 219 | } 220 | 221 | #nav-button-up::before { 222 | content: "\25B2"; 223 | } 224 | 225 | #nav-button-down::before { 226 | content: "\25BC"; 227 | } 228 | 229 | .metric-ruler { 230 | border: none; 231 | padding: 0; 232 | visibility: hidden; 233 | } 234 | 235 | #change-file-btn { 236 | display: inline-block; 237 | position: absolute; 238 | right: 4px; 239 | top: 4px; 240 | background-color: #062873; 241 | color: #eeeeee; 242 | border: none; 243 | border-radius: 0.15em; 244 | font-size: 90%; 245 | font-weight: bold; 246 | padding: 5px 7px; 247 | cursor: pointer; 248 | } 249 | 250 | #change-file-btn::before { 251 | content: "✚"; 252 | display: inline-block; 253 | padding: 0; 254 | width: 1.0em; 255 | height: 1.2em; 256 | white-space: nowrap; 257 | overflow: hidden; 258 | text-align: center; 259 | line-height: 1.3; 260 | transition: width 0.1s ease-in-out; 261 | } 262 | 263 | #change-file-btn:hover::before { 264 | content: "Change XML File"; 265 | width: 9.5em; 266 | } 267 | 268 | .no-source, 269 | .no-ns { 270 | color: #888888; 271 | font-style: italic; 272 | } 273 | 274 | .ns-table { 275 | border-collapse: collapse; 276 | } 277 | 278 | .ns-table th { 279 | background-color: #666666; 
280 | color: #dddddd; 281 | padding: 2px 8px; 282 | border: 1px solid #ffffff; 283 | } 284 | 285 | .ns-table td { 286 | background-color: #eeeeee; 287 | padding: 2px 8px; 288 | border: 1px solid #ffffff; 289 | } 290 | 291 | .ns-table input { 292 | width: 4.0em; 293 | } 294 | 295 | #doc-tree { 296 | border: 1px solid #cccccc; 297 | border-radius: 0.2em; 298 | background-color: #f4f4f4; 299 | margin: 2.4em 0; 300 | padding: 1.1em; 301 | } 302 | 303 | span.element, span.text, span.comment { 304 | padding: 2px; 305 | line-height: 1.5; 306 | } 307 | 308 | .tag-name { 309 | color: #062873; 310 | font-weight: bold; 311 | } 312 | 313 | .attr-name { 314 | color: #4070A0; 315 | font-weight: bold; 316 | } 317 | 318 | .attr-value { 319 | color: #4070A0; 320 | } 321 | 322 | .comment { 323 | color: #7f7f7f; 324 | font-style: italic; 325 | } 326 | 327 | .xp-match { 328 | background-color: #ffffcc; 329 | outline: 1px solid #ddddaa; 330 | margin-top: 2px; 331 | z-index: -1; 332 | } 333 | 334 | .xp-match > span { 335 | background-color: #ffffcc; 336 | position: relative; 337 | } 338 | 339 | .xp-match > span:hover { 340 | background-color: #e0ffc5; 341 | } 342 | 343 | .xp-match:hover > .element > .tag-name { 344 | color: #880000; 345 | } 346 | 347 | .not-implemented { 348 | color: #ff0000; 349 | background-color: #ffcccc; 350 | } 351 | 352 | @media only screen and (max-width: 800px) { 353 | 354 | #query-xpath { 355 | width: calc(100% - 1em); 356 | } 357 | 358 | } 359 | -------------------------------------------------------------------------------- /source/_templates/customsidebar.html: -------------------------------------------------------------------------------- 1 | 9 | -------------------------------------------------------------------------------- /source/_templates/epub-cover.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Perl XML::LibXML by Example - Grant McLean 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 
-------------------------------------------------------------------------------- /source/_templates/page.html: -------------------------------------------------------------------------------- 1 | {# 2 | basic/page.html 3 | ~~~~~~~~~~~~~~~ 4 | 5 | Master template for simple pages. 6 | 7 | :copyright: Copyright 2007-2014 by the Sphinx team, see AUTHORS. 8 | :license: BSD, see LICENSE for details. 9 | #} 10 | {%- extends "layout.html" %} 11 | {%- set script_files = script_files + ["_static/plxbe.js"] %} 12 | {%- set css_files = css_files + ["_static/plxbe.css"] %} 13 | {%- set customsidebar = "customsidebar.html" %} 14 | {% block body %} 15 | {{ body }} 16 | {% endblock %} 17 | -------------------------------------------------------------------------------- /source/_templates/search.html: -------------------------------------------------------------------------------- 1 | {# 2 | basic/search.html 3 | ~~~~~~~~~~~~~~~~~ 4 | 5 | Template for the search page. 6 | 7 | :copyright: Copyright 2007-2013 by the Sphinx team, see AUTHORS. 8 | :license: BSD, see LICENSE for details. 9 | #} 10 | {%- extends "layout.html" %} 11 | {%- set script_files = script_files + ["_static/plxbe.js"] %} 12 | {%- set css_files = css_files + ["_static/plxbe.css"] %} 13 | {%- set customsidebar = "customsidebar.html" %} 14 | {% set title = _('Search') %} 15 | {% set script_files = script_files + ['_static/searchtools.js'] %} 16 | {% block footer %} 17 | 20 | {# this is used when loading the search index using $.ajax fails, 21 | such as on Chrome for documents on localhost #} 22 | 23 | {{ super() }} 24 | {% endblock %} 25 | {% block body %} 26 | 34 | 35 | {% if search_performed %} 36 |

{{ _('Search Results') }}

37 | {% if not search_results %} 38 |

{{ _('Your search did not match any documents. Please make sure that all words are spelled correctly and that you\'ve selected enough categories.') }}

39 | {% endif %} 40 | {% endif %} 41 |
42 | {% if search_results %} 43 |
    44 | {% for href, caption, context in search_results %} 45 |
  • 46 | {{ caption }} 47 |

    {{ context|e }}

    48 |
  • 49 | {% endfor %} 50 |
51 | {% endif %} 52 |
53 | {% endblock %} 54 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/__init__.py: -------------------------------------------------------------------------------- 1 | """Sphinx ReadTheDocs theme. 2 | 3 | From https://github.com/ryan-roemer/sphinx-bootstrap-theme. 4 | 5 | """ 6 | import os 7 | 8 | VERSION = (0, 1, 9) 9 | 10 | __version__ = ".".join(str(v) for v in VERSION) 11 | __version_full__ = __version__ 12 | 13 | 14 | def get_html_theme_path(): 15 | """Return list of HTML theme paths.""" 16 | cur_dir = os.path.abspath(os.path.dirname(os.path.dirname(__file__))) 17 | return cur_dir 18 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/breadcrumbs.html: -------------------------------------------------------------------------------- 1 | {# Support for Sphinx 1.3+ page_source_suffix, but don't break old builds. #} 2 | 3 | {% if page_source_suffix %} 4 | {% set suffix = page_source_suffix %} 5 | {% else %} 6 | {% set suffix = source_suffix %} 7 | {% endif %} 8 | 9 |
10 |
    11 |
  • Docs »
  • 12 | {% for doc in parents %} 13 |
  • {{ doc.title }} »
  • 14 | {% endfor %} 15 |
  • {{ title }}
  • 16 |
  • 17 | {% if pagename != "search" %} 18 | {% if display_github %} 19 | Edit on GitHub 20 | {% elif display_bitbucket %} 21 | Edit on Bitbucket 22 | {% elif show_source and source_url_prefix %} 23 | View page source 24 | {% elif show_source and has_source and sourcename %} 25 | View page source 26 | {% endif %} 27 | {% endif %} 28 |
  • 29 |
30 |
31 |
32 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/footer.html: -------------------------------------------------------------------------------- 1 |
2 | {% if next or prev %} 3 | 11 | {% endif %} 12 | 13 |
14 | 15 |
16 |

17 | {%- if show_copyright %} 18 | {%- if hasdoc('copyright') %} 19 | {% trans path=pathto('copyright'), copyright=copyright|e %}© Copyright {{ copyright }}.{% endtrans %} 20 | {%- else %} 21 | {% trans copyright=copyright|e %}© Copyright {{ copyright }}.{% endtrans %} 22 | {%- endif %} 23 | {%- endif %} 24 | 25 | {%- if build_id and build_url %} 26 | {% trans build_url=build_url, build_id=build_id %} 27 | 28 | Build 29 | {{ build_id }}. 30 | 31 | {% endtrans %} 32 | {%- elif commit %} 33 | {% trans commit=commit %} 34 | 35 | Revision {{ commit }}. 36 | 37 | {% endtrans %} 38 | {%- elif last_updated %} 39 | {% trans last_updated=last_updated|e %}Last updated on {{ last_updated }}.{% endtrans %} 40 | {%- endif %} 41 | 42 |

43 |
44 | 45 | {%- if show_sphinx %} 46 | {% trans %}Built with Sphinx using a theme provided by Read the Docs{% endtrans %}. 47 | {%- endif %} 48 | 49 | {%- block extrafooter %} {% endblock %} 50 | 51 |
52 | 53 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/layout.html: -------------------------------------------------------------------------------- 1 | {# TEMPLATE VAR SETTINGS #} 2 | {%- set url_root = pathto('', 1) %} 3 | {%- if url_root == '#' %}{% set url_root = '' %}{% endif %} 4 | {%- if not embedded and docstitle %} 5 | {%- set titlesuffix = " — "|safe + docstitle|e %} 6 | {%- else %} 7 | {%- set titlesuffix = "" %} 8 | {%- endif %} 9 | 10 | 11 | 12 | 13 | 14 | 15 | {{ metatags }} 16 | 17 | {% block htmltitle %} 18 | {{ title|striptags|e }}{{ titlesuffix }} 19 | {% endblock %} 20 | 21 | {# FAVICON #} 22 | {% if favicon %} 23 | 24 | {% endif %} 25 | 26 | {# CSS #} 27 | 28 | {# OPENSEARCH #} 29 | {% if not embedded %} 30 | {% if use_opensearch %} 31 | 32 | {% endif %} 33 | 34 | {% endif %} 35 | 36 | {# RTD hosts this file, so just load on non RTD builds #} 37 | {% if not READTHEDOCS %} 38 | 39 | {% endif %} 40 | 41 | {% for cssfile in css_files %} 42 | 43 | {% endfor %} 44 | 45 | {% for cssfile in extra_css_files %} 46 | 47 | {% endfor %} 48 | 49 | {%- block linktags %} 50 | {%- if hasdoc('about') %} 51 | 53 | {%- endif %} 54 | {%- if hasdoc('genindex') %} 55 | 57 | {%- endif %} 58 | {%- if hasdoc('search') %} 59 | 60 | {%- endif %} 61 | {%- if hasdoc('copyright') %} 62 | 63 | {%- endif %} 64 | 65 | {%- if parents %} 66 | 67 | {%- endif %} 68 | {%- if next %} 69 | 70 | {%- endif %} 71 | {%- if prev %} 72 | 73 | {%- endif %} 74 | {%- endblock %} 75 | {%- block extrahead %} {% endblock %} 76 | 77 | {# Keep modernizr in head - http://modernizr.com/docs/#installing #} 78 | 79 | 80 | 81 | 82 | 83 | 84 | {% block extrabody %} {% endblock %} 85 |
86 | 87 | {# SIDE NAV, TOGGLES ON MOBILE #} 88 | 138 | 139 |
140 | 141 | {# MOBILE NAV, TRIGGERS SIDE NAV ON TOGGLE #} 142 | 146 | 147 | 148 | {# PAGE CONTENT #} 149 |
150 |
151 | {% include "breadcrumbs.html" %} 152 |
153 |
154 | {% block body %}{% endblock %} 155 |
156 |
157 | {% include "footer.html" %} 158 |
159 |
160 | 161 |
162 | 163 |
164 | {% include "versions.html" %} 165 | 166 | {% if not embedded %} 167 | 168 | 177 | {%- for scriptfile in script_files %} 178 | 179 | {%- endfor %} 180 | 181 | {% endif %} 182 | 183 | {# RTD hosts this file, so just load on non RTD builds #} 184 | {% if not READTHEDOCS %} 185 | 186 | {% endif %} 187 | 188 | {# STICKY NAVIGATION #} 189 | {% if theme_sticky_navigation %} 190 | 195 | {% endif %} 196 | 197 | {%- block footer %} {% endblock %} 198 | 199 | 200 | 201 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/layout_old.html: -------------------------------------------------------------------------------- 1 | {# 2 | basic/layout.html 3 | ~~~~~~~~~~~~~~~~~ 4 | 5 | Master layout template for Sphinx themes. 6 | 7 | :copyright: Copyright 2007-2013 by the Sphinx team, see AUTHORS. 8 | :license: BSD, see LICENSE for details. 9 | #} 10 | {%- block doctype -%} 11 | 13 | {%- endblock %} 14 | {%- set reldelim1 = reldelim1 is not defined and ' »' or reldelim1 %} 15 | {%- set reldelim2 = reldelim2 is not defined and ' |' or reldelim2 %} 16 | {%- set render_sidebar = (not embedded) and (not theme_nosidebar|tobool) and 17 | (sidebars != []) %} 18 | {%- set url_root = pathto('', 1) %} 19 | {# XXX necessary? #} 20 | {%- if url_root == '#' %}{% set url_root = '' %}{% endif %} 21 | {%- if not embedded and docstitle %} 22 | {%- set titlesuffix = " — "|safe + docstitle|e %} 23 | {%- else %} 24 | {%- set titlesuffix = "" %} 25 | {%- endif %} 26 | 27 | {%- macro relbar() %} 28 | 46 | {%- endmacro %} 47 | 48 | {%- macro sidebar() %} 49 | {%- if render_sidebar %} 50 |
51 |
52 | {%- block sidebarlogo %} 53 | {%- if logo %} 54 | 57 | {%- endif %} 58 | {%- endblock %} 59 | {%- if sidebars != None %} 60 | {#- new style sidebar: explicitly include/exclude templates #} 61 | {%- for sidebartemplate in sidebars %} 62 | {%- include sidebartemplate %} 63 | {%- endfor %} 64 | {%- else %} 65 | {#- old style sidebars: using blocks -- should be deprecated #} 66 | {%- block sidebartoc %} 67 | {%- include "localtoc.html" %} 68 | {%- endblock %} 69 | {%- block sidebarrel %} 70 | {%- include "relations.html" %} 71 | {%- endblock %} 72 | {%- block sidebarsourcelink %} 73 | {%- include "sourcelink.html" %} 74 | {%- endblock %} 75 | {%- if customsidebar %} 76 | {%- include customsidebar %} 77 | {%- endif %} 78 | {%- block sidebarsearch %} 79 | {%- include "searchbox.html" %} 80 | {%- endblock %} 81 | {%- endif %} 82 |
83 |
84 | {%- endif %} 85 | {%- endmacro %} 86 | 87 | {%- macro script() %} 88 | 97 | {%- for scriptfile in script_files %} 98 | 99 | {%- endfor %} 100 | {%- endmacro %} 101 | 102 | {%- macro css() %} 103 | 104 | 105 | {%- for cssfile in css_files %} 106 | 107 | {%- endfor %} 108 | {%- endmacro %} 109 | 110 | 111 | 112 | 113 | {{ metatags }} 114 | {%- block htmltitle %} 115 | {{ title|striptags|e }}{{ titlesuffix }} 116 | {%- endblock %} 117 | {{ css() }} 118 | {%- if not embedded %} 119 | {{ script() }} 120 | {%- if use_opensearch %} 121 | 124 | {%- endif %} 125 | {%- if favicon %} 126 | 127 | {%- endif %} 128 | {%- endif %} 129 | {%- block linktags %} 130 | {%- if hasdoc('about') %} 131 | 132 | {%- endif %} 133 | {%- if hasdoc('genindex') %} 134 | 135 | {%- endif %} 136 | {%- if hasdoc('search') %} 137 | 138 | {%- endif %} 139 | {%- if hasdoc('copyright') %} 140 | 141 | {%- endif %} 142 | 143 | {%- if parents %} 144 | 145 | {%- endif %} 146 | {%- if next %} 147 | 148 | {%- endif %} 149 | {%- if prev %} 150 | 151 | {%- endif %} 152 | {%- endblock %} 153 | {%- block extrahead %} {% endblock %} 154 | 155 | 156 | {%- block header %}{% endblock %} 157 | 158 | {%- block relbar1 %}{{ relbar() }}{% endblock %} 159 | 160 | {%- block content %} 161 | {%- block sidebar1 %} {# possible location for sidebar #} {% endblock %} 162 | 163 |
164 | {%- block document %} 165 |
166 | {%- if render_sidebar %} 167 |
168 | {%- endif %} 169 |
170 | {% block body %} {% endblock %} 171 |
172 | {%- if render_sidebar %} 173 |
174 | {%- endif %} 175 |
176 | {%- endblock %} 177 | 178 | {%- block sidebar2 %}{{ sidebar() }}{% endblock %} 179 |
180 |
181 | {%- endblock %} 182 | 183 | {%- block relbar2 %}{{ relbar() }}{% endblock %} 184 | 185 | {%- block footer %} 186 | 201 |


202 | {%- endblock %} 203 | 204 | 205 | 206 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/search.html: -------------------------------------------------------------------------------- 1 | {# 2 | basic/search.html 3 | ~~~~~~~~~~~~~~~~~ 4 | 5 | Template for the search page. 6 | 7 | :copyright: Copyright 2007-2013 by the Sphinx team, see AUTHORS. 8 | :license: BSD, see LICENSE for details. 9 | #} 10 | {%- extends "layout.html" %} 11 | {% set title = _('Search') %} 12 | {% set script_files = script_files + ['_static/searchtools.js'] %} 13 | {% block footer %} 14 | 17 | {# this is used when loading the search index using $.ajax fails, 18 | such as on Chrome for documents on localhost #} 19 | 20 | {{ super() }} 21 | {% endblock %} 22 | {% block body %} 23 | 31 | 32 | {% if search_performed %} 33 |

{{ _('Search Results') }}

34 | {% if not search_results %} 35 |

{{ _('Your search did not match any documents. Please make sure that all words are spelled correctly and that you\'ve selected enough categories.') }}

36 | {% endif %} 37 | {% endif %} 38 |
39 | {% if search_results %} 40 |
    41 | {% for href, caption, context in search_results %} 42 |
  • 43 | {{ caption }} 44 |

    {{ context|e }}

    45 |
  • 46 | {% endfor %} 47 |
48 | {% endif %} 49 |
50 | {% endblock %} 51 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/searchbox.html: -------------------------------------------------------------------------------- 1 | {%- if builder != 'singlehtml' %} 2 |
3 |
4 | 5 | 6 | 7 |
8 |
9 | {%- endif %} 10 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/css/badge_only.css: -------------------------------------------------------------------------------- 1 | .fa:before{-webkit-font-smoothing:antialiased}.clearfix{*zoom:1}.clearfix:before,.clearfix:after{display:table;content:""}.clearfix:after{clear:both}@font-face{font-family:FontAwesome;font-weight:normal;font-style:normal;src:url("../font/fontawesome_webfont.eot");src:url("../font/fontawesome_webfont.eot?#iefix") format("embedded-opentype"),url("../font/fontawesome_webfont.woff") format("woff"),url("../font/fontawesome_webfont.ttf") format("truetype"),url("../font/fontawesome_webfont.svg#FontAwesome") format("svg")}.fa:before{display:inline-block;font-family:FontAwesome;font-style:normal;font-weight:normal;line-height:1;text-decoration:inherit}a .fa{display:inline-block;text-decoration:inherit}li .fa{display:inline-block}li .fa-large:before,li .fa-large:before{width:1.875em}ul.fas{list-style-type:none;margin-left:2em;text-indent:-0.8em}ul.fas li .fa{width:0.8em}ul.fas li .fa-large:before,ul.fas li .fa-large:before{vertical-align:baseline}.fa-book:before{content:""}.icon-book:before{content:""}.fa-caret-down:before{content:""}.icon-caret-down:before{content:""}.fa-caret-up:before{content:""}.icon-caret-up:before{content:""}.fa-caret-left:before{content:""}.icon-caret-left:before{content:""}.fa-caret-right:before{content:""}.icon-caret-right:before{content:""}.rst-versions{position:fixed;bottom:0;left:0;width:300px;color:#fcfcfc;background:#1f1d1d;border-top:solid 10px #343131;font-family:"Lato","proxima-nova","Helvetica Neue",Arial,sans-serif;z-index:400}.rst-versions a{color:#2980B9;text-decoration:none}.rst-versions .rst-badge-small{display:none}.rst-versions .rst-current-version{padding:12px;background-color:#272525;display:block;text-align:right;font-size:90%;cursor:pointer;color:#27AE60;*zoom:1}.rst-versions 
.rst-current-version:before,.rst-versions .rst-current-version:after{display:table;content:""}.rst-versions .rst-current-version:after{clear:both}.rst-versions .rst-current-version .fa{color:#fcfcfc}.rst-versions .rst-current-version .fa-book{float:left}.rst-versions .rst-current-version .icon-book{float:left}.rst-versions .rst-current-version.rst-out-of-date{background-color:#E74C3C;color:#fff}.rst-versions .rst-current-version.rst-active-old-version{background-color:#F1C40F;color:#000}.rst-versions.shift-up .rst-other-versions{display:block}.rst-versions .rst-other-versions{font-size:90%;padding:12px;color:gray;display:none}.rst-versions .rst-other-versions hr{display:block;height:1px;border:0;margin:20px 0;padding:0;border-top:solid 1px #413d3d}.rst-versions .rst-other-versions dd{display:inline-block;margin:0}.rst-versions .rst-other-versions dd a{display:inline-block;padding:6px;color:#fcfcfc}.rst-versions.rst-badge{width:auto;bottom:20px;right:20px;left:auto;border:none;max-width:300px}.rst-versions.rst-badge .icon-book{float:none}.rst-versions.rst-badge .fa-book{float:none}.rst-versions.rst-badge.shift-up .rst-current-version{text-align:right}.rst-versions.rst-badge.shift-up .rst-current-version .fa-book{float:left}.rst-versions.rst-badge.shift-up .rst-current-version .icon-book{float:left}.rst-versions.rst-badge .rst-current-version{width:auto;height:30px;line-height:30px;padding:0 6px;display:block;text-align:center}@media screen and (max-width: 768px){.rst-versions{width:85%;display:none}.rst-versions.shift{display:block}img{width:100%;height:auto}} 2 | /*# sourceMappingURL=badge_only.css.map */ 3 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/css/badge_only.css.map: -------------------------------------------------------------------------------- 1 | { 2 | "version": 3, 3 | "mappings": 
"CAyDA,SAAY,EACV,qBAAsB,EAAE,UAAW,EAqDrC,QAAS,EARP,IAAK,EAAE,AAAC,EACR,+BAAS,EAEP,MAAO,EAAE,IAAK,EACd,MAAO,EAAE,CAAE,EACb,cAAO,EACL,IAAK,EAAE,GAAI,EC1Gb,SAkBC,EAjBC,UAAW,ECFJ,UAAW,EDGlB,UAAW,EAHqC,KAAM,EAItD,SAAU,EAJsD,KAAM,EAapE,EAAG,EAAE,qCAAwB,EAC7B,EAAG,EAAE,0PAAyE,ECZpF,SAAU,EACR,MAAO,EAAE,WAAY,EACrB,UAAW,EAAE,UAAW,EACxB,SAAU,EAAE,KAAM,EAClB,UAAW,EAAE,KAAM,EACnB,UAAW,EAAE,AAAC,EACd,cAAe,EAAE,MAAO,EAG1B,IAAK,EACH,MAAO,EAAE,WAAY,EACrB,cAAe,EAAE,MAAO,EAIxB,KAAG,EACD,MAAO,EAAE,WAAY,EACvB,sCAAiB,EAGf,IAAK,EAAE,MAAY,EAEvB,KAAM,EACJ,cAAe,EAAE,GAAI,EACrB,UAAW,EAAE,EAAG,EAChB,UAAW,EAAE,KAAM,EAEjB,YAAG,EACD,IAAK,EAAE,IAAI,EACb,oDAAiB,EAGf,aAAc,EAAE,OAAQ,EAG9B,cAAe,EACb,MAAO,EAAE,EAAO,EAElB,gBAAiB,EACf,MAAO,EAAE,EAAO,EAElB,oBAAqB,EACnB,MAAO,EAAE,EAAO,EAElB,sBAAuB,EACrB,MAAO,EAAE,EAAO,EAElB,kBAAmB,EACjB,MAAO,EAAE,EAAO,EAElB,oBAAqB,EACnB,MAAO,EAAE,EAAO,EAElB,oBAAqB,EACnB,MAAO,EAAE,EAAO,EAElB,sBAAuB,EACrB,MAAO,EAAE,EAAO,EAElB,qBAAsB,EACpB,MAAO,EAAE,EAAO,EAElB,uBAAwB,EACtB,MAAO,EAAE,EAAO,ECnElB,YAAa,EACX,OAAQ,EAAE,IAAK,EACf,KAAM,EAAE,AAAC,EACT,GAAI,EAAE,AAAC,EACP,IAAK,EC6E+B,IAAK,ED5EzC,IAAK,EEoC+B,MAAyB,EFnC7D,SAAU,EAAE,MAAkC,EAC9C,SAAU,EAAE,iBAAiC,EAC7C,UAAW,EE+CyB,sDAAM,EF9C1C,MAAO,EC+E6B,EAAG,ED9EvC,cAAC,EACC,IAAK,EE+B6B,MAAK,EF9BvC,cAAe,EAAE,GAAI,EACvB,6BAAgB,EACd,MAAO,EAAE,GAAI,EACf,iCAAoB,EAClB,MAAO,EAAE,GAAqB,EAC9B,eAAgB,EAAE,MAAkC,EACpD,MAAO,EAAE,IAAK,EACd,SAAU,EAAE,IAAK,EACjB,QAAS,EAAE,EAAG,EACd,KAAM,EAAE,MAAO,EACf,IAAK,EEX6B,MAAM,EL4F1C,IAAK,EAAE,AAAC,EACR,iFAAS,EAEP,MAAO,EAAE,IAAK,EACd,MAAO,EAAE,CAAE,EACb,uCAAO,EACL,IAAK,EAAE,GAAI,EGrFX,qCAAG,EACD,IAAK,EEgB2B,MAAyB,EFf3D,0CAAQ,EACN,IAAK,EAAE,GAAI,EACb,4CAAU,EACR,IAAK,EAAE,GAAI,EACb,iDAAiB,EACf,eAAgB,ECQgB,MAAI,EDPpC,IAAK,EEI2B,GAAM,EFHxC,wDAAwB,EACtB,eAAgB,EEmBgB,MAAO,EFlBvC,IAAK,ECzB2B,GAAI,ED0BxC,yCAA8B,EAC5B,MAAO,EAAE,IAAK,EAChB,gCAAmB,EACjB,QAAS,EAAE,EAAG,EACd,MAAO,EAAE,GAAqB,EAC9B,IAAK,EEP6B,GAAY,EFQ9C,MAAO,EAAE,GAAI,EACb,mCAAE,EACA,MAAO,EAAE,IAAK,EACd,KAAM,EAAE,EAAG,EACX,KAAM,EAAE,AAAC,EACT,KAAM,EAAE,KAAM,EA
Cd,MAAO,EAAE,AAAC,EACV,SAAU,EAAE,gBAA6C,EAC3D,mCAAE,EACA,MAAO,EAAE,WAAY,EACrB,KAAM,EAAE,AAAC,EACT,qCAAC,EACC,MAAO,EAAE,WAAY,EACrB,MAAO,EAAE,EAAqB,EAC9B,IAAK,EEfyB,MAAyB,EFgB7D,sBAAW,EACT,IAAK,EAAE,GAAI,EACX,KAAM,EAAE,GAAI,EACZ,IAAK,EAAE,GAAI,EACX,GAAI,EAAE,GAAI,EACV,KAAM,EAAE,GAAI,EACZ,QAAS,ECkByB,IAAK,EDjBvC,iCAAU,EACR,IAAK,EAAE,GAAI,EACb,+BAAQ,EACN,IAAK,EAAE,GAAI,EACb,oDAA+B,EAC7B,SAAU,EAAE,IAAK,EACjB,6DAAQ,EACN,IAAK,EAAE,GAAI,EACb,+DAAU,EACR,IAAK,EAAE,GAAI,EACf,2CAAoB,EAClB,IAAK,EAAE,GAAI,EACX,KAAM,EAAE,GAAI,EACZ,UAAW,EAAE,GAAI,EACjB,MAAO,EAAE,IAAuB,EAChC,MAAO,EAAE,IAAK,EACd,SAAU,EAAE,KAAM,EGhDpB,mCAAsB,EHmDxB,YAAa,EACX,IAAK,EAAE,EAAG,EACV,MAAO,EAAE,GAAI,EACb,kBAAO,EACL,MAAO,EAAE,IAAK,EAClB,EAAG,EACD,IAAK,EAAE,GAAI,EACX,KAAM,EAAE,GAAI", 4 | "sources": ["../../../bower_components/wyrm/sass/wyrm_core/_mixin.sass","../../../bower_components/bourbon/dist/css3/_font-face.scss","../../../sass/_theme_badge_fa.sass","../../../sass/_theme_badge.sass","../../../bower_components/wyrm/sass/wyrm_core/_wy_variables.sass","../../../sass/_theme_variables.sass","../../../bower_components/neat/app/assets/stylesheets/grid/_media.scss"], 5 | "names": [], 6 | "file": "badge_only.css" 7 | } 8 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/FontAwesome.otf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/FontAwesome.otf -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/Inconsolata-Bold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/Inconsolata-Bold.ttf 
-------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/Inconsolata-Regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/Inconsolata-Regular.ttf -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/Lato-Bold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/Lato-Bold.ttf -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/Lato-Regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/Lato-Regular.ttf -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/RobotoSlab-Bold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/RobotoSlab-Bold.ttf -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/RobotoSlab-Regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/RobotoSlab-Regular.ttf 
-------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/fontawesome-webfont.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/fontawesome-webfont.eot -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/fontawesome-webfont.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/fontawesome-webfont.ttf -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/fonts/fontawesome-webfont.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/_themes/sphinx_rtd_theme/static/fonts/fontawesome-webfont.woff -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/static/js/modernizr.min.js: -------------------------------------------------------------------------------- 1 | /* Modernizr 2.6.2 (Custom Build) | MIT & BSD 2 | * Build: 
http://modernizr.com/download/#-fontface-backgroundsize-borderimage-borderradius-boxshadow-flexbox-hsla-multiplebgs-opacity-rgba-textshadow-cssanimations-csscolumns-generatedcontent-cssgradients-cssreflections-csstransforms-csstransforms3d-csstransitions-applicationcache-canvas-canvastext-draganddrop-hashchange-history-audio-video-indexeddb-input-inputtypes-localstorage-postmessage-sessionstorage-websockets-websqldatabase-webworkers-geolocation-inlinesvg-smil-svg-svgclippaths-touch-webgl-shiv-mq-cssclasses-addtest-prefixed-teststyles-testprop-testallprops-hasevent-prefixes-domprefixes-load 3 | */ 4 | ;window.Modernizr=function(a,b,c){function D(a){j.cssText=a}function E(a,b){return D(n.join(a+";")+(b||""))}function F(a,b){return typeof a===b}function G(a,b){return!!~(""+a).indexOf(b)}function H(a,b){for(var d in a){var e=a[d];if(!G(e,"-")&&j[e]!==c)return b=="pfx"?e:!0}return!1}function I(a,b,d){for(var e in a){var f=b[a[e]];if(f!==c)return d===!1?a[e]:F(f,"function")?f.bind(d||b):f}return!1}function J(a,b,c){var d=a.charAt(0).toUpperCase()+a.slice(1),e=(a+" "+p.join(d+" ")+d).split(" ");return F(b,"string")||F(b,"undefined")?H(e,b):(e=(a+" "+q.join(d+" ")+d).split(" "),I(e,b,c))}function K(){e.input=function(c){for(var d=0,e=c.length;d',a,""].join(""),l.id=h,(m?l:n).innerHTML+=f,n.appendChild(l),m||(n.style.background="",n.style.overflow="hidden",k=g.style.overflow,g.style.overflow="hidden",g.appendChild(n)),i=c(l,a),m?l.parentNode.removeChild(l):(n.parentNode.removeChild(n),g.style.overflow=k),!!i},z=function(b){var c=a.matchMedia||a.msMatchMedia;if(c)return c(b).matches;var d;return y("@media "+b+" { #"+h+" { position: absolute; } }",function(b){d=(a.getComputedStyle?getComputedStyle(b,null):b.currentStyle)["position"]=="absolute"}),d},A=function(){function d(d,e){e=e||b.createElement(a[d]||"div"),d="on"+d;var f=d in e;return 
f||(e.setAttribute||(e=b.createElement("div")),e.setAttribute&&e.removeAttribute&&(e.setAttribute(d,""),f=F(e[d],"function"),F(e[d],"undefined")||(e[d]=c),e.removeAttribute(d))),e=null,f}var a={select:"input",change:"input",submit:"form",reset:"form",error:"img",load:"img",abort:"img"};return d}(),B={}.hasOwnProperty,C;!F(B,"undefined")&&!F(B.call,"undefined")?C=function(a,b){return B.call(a,b)}:C=function(a,b){return b in a&&F(a.constructor.prototype[b],"undefined")},Function.prototype.bind||(Function.prototype.bind=function(b){var c=this;if(typeof c!="function")throw new TypeError;var d=w.call(arguments,1),e=function(){if(this instanceof e){var a=function(){};a.prototype=c.prototype;var f=new a,g=c.apply(f,d.concat(w.call(arguments)));return Object(g)===g?g:f}return c.apply(b,d.concat(w.call(arguments)))};return e}),s.flexbox=function(){return J("flexWrap")},s.canvas=function(){var a=b.createElement("canvas");return!!a.getContext&&!!a.getContext("2d")},s.canvastext=function(){return!!e.canvas&&!!F(b.createElement("canvas").getContext("2d").fillText,"function")},s.webgl=function(){return!!a.WebGLRenderingContext},s.touch=function(){var c;return"ontouchstart"in a||a.DocumentTouch&&b instanceof DocumentTouch?c=!0:y(["@media (",n.join("touch-enabled),("),h,")","{#modernizr{top:9px;position:absolute}}"].join(""),function(a){c=a.offsetTop===9}),c},s.geolocation=function(){return"geolocation"in navigator},s.postmessage=function(){return!!a.postMessage},s.websqldatabase=function(){return!!a.openDatabase},s.indexedDB=function(){return!!J("indexedDB",a)},s.hashchange=function(){return A("hashchange",a)&&(b.documentMode===c||b.documentMode>7)},s.history=function(){return!!a.history&&!!history.pushState},s.draganddrop=function(){var a=b.createElement("div");return"draggable"in a||"ondragstart"in a&&"ondrop"in a},s.websockets=function(){return"WebSocket"in a||"MozWebSocket"in a},s.rgba=function(){return 
D("background-color:rgba(150,255,150,.5)"),G(j.backgroundColor,"rgba")},s.hsla=function(){return D("background-color:hsla(120,40%,100%,.5)"),G(j.backgroundColor,"rgba")||G(j.backgroundColor,"hsla")},s.multiplebgs=function(){return D("background:url(https://),url(https://),red url(https://)"),/(url\s*\(.*?){3}/.test(j.background)},s.backgroundsize=function(){return J("backgroundSize")},s.borderimage=function(){return J("borderImage")},s.borderradius=function(){return J("borderRadius")},s.boxshadow=function(){return J("boxShadow")},s.textshadow=function(){return b.createElement("div").style.textShadow===""},s.opacity=function(){return E("opacity:.55"),/^0.55$/.test(j.opacity)},s.cssanimations=function(){return J("animationName")},s.csscolumns=function(){return J("columnCount")},s.cssgradients=function(){var a="background-image:",b="gradient(linear,left top,right bottom,from(#9f9),to(white));",c="linear-gradient(left top,#9f9, white);";return D((a+"-webkit- ".split(" ").join(b+a)+n.join(c+a)).slice(0,-a.length)),G(j.backgroundImage,"gradient")},s.cssreflections=function(){return J("boxReflect")},s.csstransforms=function(){return!!J("transform")},s.csstransforms3d=function(){var a=!!J("perspective");return a&&"webkitPerspective"in g.style&&y("@media (transform-3d),(-webkit-transform-3d){#modernizr{left:9px;position:absolute;height:3px;}}",function(b,c){a=b.offsetLeft===9&&b.offsetHeight===3}),a},s.csstransitions=function(){return J("transition")},s.fontface=function(){var a;return y('@font-face {font-family:"font";src:url("https://")}',function(c,d){var e=b.getElementById("smodernizr"),f=e.sheet||e.styleSheet,g=f?f.cssRules&&f.cssRules[0]?f.cssRules[0].cssText:f.cssText||"":"";a=/src/i.test(g)&&g.indexOf(d.split(" ")[0])===0}),a},s.generatedcontent=function(){var a;return y(["#",h,"{font:0/0 a}#",h,':after{content:"',l,'";visibility:hidden;font:3px/1 a}'].join(""),function(b){a=b.offsetHeight>=3}),a},s.video=function(){var 
a=b.createElement("video"),c=!1;try{if(c=!!a.canPlayType)c=new Boolean(c),c.ogg=a.canPlayType('video/ogg; codecs="theora"').replace(/^no$/,""),c.h264=a.canPlayType('video/mp4; codecs="avc1.42E01E"').replace(/^no$/,""),c.webm=a.canPlayType('video/webm; codecs="vp8, vorbis"').replace(/^no$/,"")}catch(d){}return c},s.audio=function(){var a=b.createElement("audio"),c=!1;try{if(c=!!a.canPlayType)c=new Boolean(c),c.ogg=a.canPlayType('audio/ogg; codecs="vorbis"').replace(/^no$/,""),c.mp3=a.canPlayType("audio/mpeg;").replace(/^no$/,""),c.wav=a.canPlayType('audio/wav; codecs="1"').replace(/^no$/,""),c.m4a=(a.canPlayType("audio/x-m4a;")||a.canPlayType("audio/aac;")).replace(/^no$/,"")}catch(d){}return c},s.localstorage=function(){try{return localStorage.setItem(h,h),localStorage.removeItem(h),!0}catch(a){return!1}},s.sessionstorage=function(){try{return sessionStorage.setItem(h,h),sessionStorage.removeItem(h),!0}catch(a){return!1}},s.webworkers=function(){return!!a.Worker},s.applicationcache=function(){return!!a.applicationCache},s.svg=function(){return!!b.createElementNS&&!!b.createElementNS(r.svg,"svg").createSVGRect},s.inlinesvg=function(){var a=b.createElement("div");return a.innerHTML="",(a.firstChild&&a.firstChild.namespaceURI)==r.svg},s.smil=function(){return!!b.createElementNS&&/SVGAnimate/.test(m.call(b.createElementNS(r.svg,"animate")))},s.svgclippaths=function(){return!!b.createElementNS&&/SVGClipPath/.test(m.call(b.createElementNS(r.svg,"clipPath")))};for(var L in s)C(s,L)&&(x=L.toLowerCase(),e[x]=s[L](),v.push((e[x]?"":"no-")+x));return e.input||K(),e.addTest=function(a,b){if(typeof a=="object")for(var d in a)C(a,d)&&e.addTest(d,a[d]);else{a=a.toLowerCase();if(e[a]!==c)return e;b=typeof b=="function"?b():b,typeof f!="undefined"&&f&&(g.className+=" "+(b?"":"no-")+a),e[a]=b}return e},D(""),i=k=null,function(a,b){function k(a,b){var c=a.createElement("p"),d=a.getElementsByTagName("head")[0]||a.documentElement;return 
c.innerHTML="x",d.insertBefore(c.lastChild,d.firstChild)}function l(){var a=r.elements;return typeof a=="string"?a.split(" "):a}function m(a){var b=i[a[g]];return b||(b={},h++,a[g]=h,i[h]=b),b}function n(a,c,f){c||(c=b);if(j)return c.createElement(a);f||(f=m(c));var g;return f.cache[a]?g=f.cache[a].cloneNode():e.test(a)?g=(f.cache[a]=f.createElem(a)).cloneNode():g=f.createElem(a),g.canHaveChildren&&!d.test(a)?f.frag.appendChild(g):g}function o(a,c){a||(a=b);if(j)return a.createDocumentFragment();c=c||m(a);var d=c.frag.cloneNode(),e=0,f=l(),g=f.length;for(;e",f="hidden"in a,j=a.childNodes.length==1||function(){b.createElement("a");var a=b.createDocumentFragment();return typeof a.cloneNode=="undefined"||typeof a.createDocumentFragment=="undefined"||typeof a.createElement=="undefined"}()}catch(c){f=!0,j=!0}})();var r={elements:c.elements||"abbr article aside audio bdi canvas data datalist details figcaption figure footer header hgroup mark meter nav output progress section summary time video",shivCSS:c.shivCSS!==!1,supportsUnknownElements:j,shivMethods:c.shivMethods!==!1,type:"default",shivDocument:q,createElement:n,createDocumentFragment:o};a.html5=r,q(b)}(this,b),e._version=d,e._prefixes=n,e._domPrefixes=q,e._cssomPrefixes=p,e.mq=z,e.hasEvent=A,e.testProp=function(a){return H([a])},e.testAllProps=J,e.testStyles=y,e.prefixed=function(a,b,c){return b?J(a,b,c):J(a,"pfx")},g.className=g.className.replace(/(^|\s)no-js(\s|$)/,"$1$2")+(f?" 
js "+v.join(" "):""),e}(this,this.document),function(a,b,c){function d(a){return"[object Function]"==o.call(a)}function e(a){return"string"==typeof a}function f(){}function g(a){return!a||"loaded"==a||"complete"==a||"uninitialized"==a}function h(){var a=p.shift();q=1,a?a.t?m(function(){("c"==a.t?B.injectCss:B.injectJs)(a.s,0,a.a,a.x,a.e,1)},0):(a(),h()):q=0}function i(a,c,d,e,f,i,j){function k(b){if(!o&&g(l.readyState)&&(u.r=o=1,!q&&h(),l.onload=l.onreadystatechange=null,b)){"img"!=a&&m(function(){t.removeChild(l)},50);for(var d in y[c])y[c].hasOwnProperty(d)&&y[c][d].onload()}}var j=j||B.errorTimeout,l=b.createElement(a),o=0,r=0,u={t:d,s:c,e:f,a:i,x:j};1===y[c]&&(r=1,y[c]=[]),"object"==a?l.data=c:(l.src=c,l.type=a),l.width=l.height="0",l.onerror=l.onload=l.onreadystatechange=function(){k.call(this,r)},p.splice(e,0,u),"img"!=a&&(r||2===y[c]?(t.insertBefore(l,s?null:n),m(k,j)):y[c].push(l))}function j(a,b,c,d,f){return q=0,b=b||"j",e(a)?i("c"==b?v:u,a,b,this.i++,c,d,f):(p.splice(this.i++,0,a),1==p.length&&h()),this}function k(){var a=B;return a.loader={load:j,i:0},a}var l=b.documentElement,m=a.setTimeout,n=b.getElementsByTagName("script")[0],o={}.toString,p=[],q=0,r="MozAppearance"in l.style,s=r&&!!b.createRange().compareNode,t=s?l:n.parentNode,l=a.opera&&"[object Opera]"==o.call(a.opera),l=!!b.attachEvent&&!l,u=r?"object":l?"script":"img",v=l?"script":u,w=Array.isArray||function(a){return"[object Array]"==o.call(a)},x=[],y={},z={timeout:function(a,b){return b.length&&(a.timeout=b[0]),a}},A,B;B=function(a){function b(a){var a=a.split("!"),b=x.length,c=a.pop(),d=a.length,c={url:c,origUrl:c,prefixes:a},e,f,g;for(f=0;f"); 80 | 81 | // Add expand links to all parents of nested ul 82 | $('.wy-menu-vertical ul').not('.simple').siblings('a').each(function () { 83 | var link = $(this); 84 | expand = $(''); 85 | expand.on('click', function (ev) { 86 | self.toggleCurrent(link); 87 | ev.stopPropagation(); 88 | return false; 89 | }); 90 | link.prepend(expand); 91 | }); 92 | }; 
93 | 94 | nav.reset = function () { 95 | // Get anchor from URL and open up nested nav 96 | var anchor = encodeURI(window.location.hash); 97 | if (anchor) { 98 | try { 99 | var link = $('.wy-menu-vertical') 100 | .find('[href="' + anchor + '"]'); 101 | $('.wy-menu-vertical li.toctree-l1 li.current') 102 | .removeClass('current'); 103 | link.closest('li.toctree-l2').addClass('current'); 104 | link.closest('li.toctree-l3').addClass('current'); 105 | link.closest('li.toctree-l4').addClass('current'); 106 | } 107 | catch (err) { 108 | console.log("Error expanding nav for anchor", err); 109 | } 110 | } 111 | }; 112 | 113 | nav.onScroll = function () { 114 | this.winScroll = false; 115 | var newWinPosition = this.win.scrollTop(), 116 | winBottom = newWinPosition + this.winHeight, 117 | navPosition = this.navBar.scrollTop(), 118 | newNavPosition = navPosition + (newWinPosition - this.winPosition); 119 | if (newWinPosition < 0 || winBottom > this.docHeight) { 120 | return; 121 | } 122 | this.navBar.scrollTop(newNavPosition); 123 | this.winPosition = newWinPosition; 124 | }; 125 | 126 | nav.onResize = function () { 127 | this.winResize = false; 128 | this.winHeight = this.win.height(); 129 | this.docHeight = $(document).height(); 130 | }; 131 | 132 | nav.hashChange = function () { 133 | this.linkScroll = true; 134 | this.win.one('hashchange', function () { 135 | this.linkScroll = false; 136 | }); 137 | }; 138 | 139 | nav.toggleCurrent = function (elem) { 140 | var parent_li = elem.closest('li'); 141 | parent_li.siblings('li.current').removeClass('current'); 142 | parent_li.siblings().find('li.current').removeClass('current'); 143 | parent_li.find('> ul li.current').removeClass('current'); 144 | parent_li.toggleClass('current'); 145 | } 146 | 147 | return nav; 148 | }; 149 | 150 | module.exports.ThemeNav = ThemeNav(); 151 | 152 | if (typeof(window) != 'undefined') { 153 | window.SphinxRtdTheme = { StickyNav: module.exports.ThemeNav }; 154 | } 155 | 156 | 
},{"jquery":"jquery"}]},{},["sphinx-rtd-theme"]); 157 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/theme.conf: -------------------------------------------------------------------------------- 1 | [theme] 2 | inherit = basic 3 | stylesheet = css/theme.css 4 | 5 | [options] 6 | typekit_id = hiw1hhg 7 | analytics_id = 8 | sticky_navigation = False 9 | logo_only = 10 | collapse_navigation = False 11 | display_version = True 12 | -------------------------------------------------------------------------------- /source/_themes/sphinx_rtd_theme/versions.html: -------------------------------------------------------------------------------- 1 | {% if READTHEDOCS %} 2 | {# Add rst-badge after rst-versions for small badge style. #} 3 |
4 | 5 | Read the Docs 6 | v: {{ current_version }} 7 | 8 | 9 |
10 |
11 |
Versions
12 | {% for slug, url in versions %} 13 |
{{ slug }}
14 | {% endfor %} 15 |
16 |
17 |
Downloads
18 | {% for type, url in downloads %} 19 |
{{ type }}
20 | {% endfor %} 21 |
22 |
23 |
On Read the Docs
24 |
25 | Project Home 26 |
27 |
28 | Builds 29 |
30 |
31 |
32 | Free document hosting provided by Read the Docs. 33 | 34 |
35 |
36 | {% endif %} 37 | 38 | -------------------------------------------------------------------------------- /source/basics.rst: -------------------------------------------------------------------------------- 1 | .. highlight:: none 2 | :linenothreshold: 1 3 | 4 | A Basic Example 5 | =============== 6 | 7 | The first thing you'll need is an XML document. The example programs in this 8 | section will use the :download:`playlist.xml ` 9 | file shown below. This file contains details of five different movies: 10 | 11 | .. literalinclude:: /code/playlist.xml 12 | :language: xml 13 | :linenos: 14 | 15 | .. note:: 16 | 17 | Although this XML document contains details which came from the fabulous 18 | `IMDb.com `_ web site, the file structure was created 19 | specifically for this example and does not represent an actual API for 20 | querying movie details. 21 | 22 | Once you have the sample XML document, you can use :download:`this script 23 | ` to extract and print the title of each movie, 24 | in the order they appear in the XML: 25 | 26 | .. literalinclude:: /code/010-list-titles.pl 27 | :language: perl 28 | 29 | and will produce the following output: 30 | 31 | .. literalinclude:: /_output/010-list-titles.pl-out 32 | :language: none 33 | 34 | .. sidebar:: Is XML::LibXML installed? 35 | 36 | If you try running this example script but you don't have the 37 | ``XML::LibXML`` module installed on your system, then you'll get an error 38 | like this: 39 | 40 | Can't locate XML/LibXML.pm in @INC ... at ./010-list-titles.pl line 7. 41 | 42 | If you do get this error, then refer to :doc:`installation` for help on 43 | installing ``XML::LibXML``. 44 | 45 | If we break the example down line-by-line we see that after a standard 46 | boilerplate section, the script loads the ``XML::LibXML`` module: 47 | 48 | .. 
literalinclude:: /code/010-list-titles.pl 49 | :language: perl 50 | :lines: 7 51 | 52 | Next, the ``load_xml()`` class method is called to parse the XML file and 53 | return a document object: 54 | 55 | .. literalinclude:: /code/010-list-titles.pl 56 | :language: perl 57 | :lines: 11 58 | 59 | The ``$dom`` variable now contains an object representing all the elements of 60 | the XML document arranged in a tree structure known as a 61 | :doc:`Document Object Model ` or 'DOM'. 62 | 63 | Finally we get to the guts of the script where the ``findnodes()`` method is 64 | called to search the DOM for the elements we're interested in and a ``foreach`` 65 | loop is used to iterate through the matching elements: 66 | 67 | .. literalinclude:: /code/010-list-titles.pl 68 | :language: perl 69 | :lines: 13-15 70 | 71 | The ``findnodes()`` method takes one argument - an **XPath expression**. This 72 | is a string describing the location and characteristics of the elements we want 73 | to find. XPath is a query language and the way we use it to select elements 74 | from the DOM is similar to the way we use SQL to select records from a 75 | relational database. The next section (:doc:`xpath`) will include examples of 76 | more complex queries. 77 | 78 | The ``findnodes()`` method returns a list of objects from the DOM that match 79 | the XPath expression. Each time through the loop, ``$title`` will contain an 80 | object representing the next matching element. This object provides a number 81 | of properties and methods that you can use to access the element and its 82 | attributes, as well as any text content and 'child' elements. 83 | 84 | Inside the loop, this example simply calls the ``to_literal()`` method to get 85 | the text content of the element. The string returned by ``to_literal()`` will 86 | not include any of the attributes but will include the text content of any 87 | child elements. 
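The behaviour of ``to_literal()`` described above can be checked with a small self-contained sketch. The inline XML string and the ``<em>`` child element are hypothetical and are not part of ``playlist.xml``; they exist only to show that attribute values are left out while the text of child elements is kept:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical inline document: one attribute and one nested child element
my $xml = '<movie id="m1"><title>Apollo <em>13</em></title></movie>';
my $dom = XML::LibXML->load_xml(string => $xml);

my ($movie) = $dom->findnodes('/movie');

# The 'id' attribute value is not part of the result,
# but the text inside the nested <em> element is.
print $movie->to_literal(), "\n";   # prints: Apollo 13
```

Parsing from a string keeps the sketch independent of any file on disk; the same call works on nodes returned from a document loaded with ``location``.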
88 | 89 | Other XML sources 90 | ----------------- 91 | 92 | The first example script called ``XML::LibXML->load_xml()`` with the 93 | ``location`` argument set to the name of a file. The ``location`` argument 94 | also accepts a URL: 95 | 96 | .. code-block:: perl 97 | 98 | $dom = XML::LibXML->load_xml(location => 'http://techcrunch.com/feed/'); 99 | 100 | .. note:: 101 | 102 | Not all versions of ``libxml2`` can retrieve documents over SSL/TLS. So if 103 | the URL is an 'https' URL (or if it redirects to one), you may need to use 104 | a module like `LWP `_ to retrieve 105 | the document and pass the response body to the XML parser as a string as 106 | shown below. 107 | 108 | If you have the XML in a string, instead of ``location``, use ``string``: 109 | 110 | .. code-block:: perl 111 | 112 | $dom = XML::LibXML->load_xml(string => $xml_string); 113 | 114 | Or, you can provide a Perl file handle to parse from an open file or socket, 115 | using ``IO``: 116 | 117 | .. code-block:: perl 118 | 119 | $dom = XML::LibXML->load_xml(IO => $fh); 120 | 121 | When providing a string or a file handle, it's crucial that you **do not** 122 | decode the bytes of the source data (for example by using ``':utf8'`` when 123 | opening a file). The underlying ``libxml2`` library is written in C and expects to decode the 124 | raw bytes itself; it does not understand Perl's character strings. If you have assembled 125 | your XML document by concatenating Perl character strings, you will need to 126 | encode it to a byte string (for example using ``Encode::encode_utf8()``) and 127 | then pass the byte string to the parser. 128 | 129 | If you have enabled UTF-8 globally with something like this in your script: 130 | 131 | .. code-block:: perl 132 | 133 | use open ':encoding(utf8)'; 134 | 135 | then you'll need to turn **off** the encoding IO layers for any file handle 136 | that you pass to XML::LibXML: 137 | 138 | .. 
code-block:: perl 139 | 140 | open my $fh, '<', $filename; 141 | binmode $fh, ':raw'; 142 | $dom = XML::LibXML->load_xml(IO => $fh); 143 | 144 | A more complex example 145 | ---------------------- 146 | 147 | Now let's look at a slightly more complex example. :download:`This script 148 | ` takes the same XML input and extracts more 149 | details from each ``<movie>`` element: 150 | 151 | .. literalinclude:: /code/040-movie-details.pl 152 | :language: perl 153 | 154 | and will produce the following output: 155 | 156 | .. literalinclude:: /_output/040-movie-details.pl-out 157 | :language: none 158 | 159 | Let's compare the main loop of the first script: 160 | 161 | .. literalinclude:: /code/010-list-titles.pl 162 | :language: perl 163 | :lines: 13-15 164 | 165 | with the main loop of the second script: 166 | 167 | .. literalinclude:: /code/040-movie-details.pl 168 | :language: perl 169 | :lines: 13-23 170 | 171 | The structure of the main loop is very similar but the XPath expression 172 | passed to ``findnodes()`` is different in each case: 173 | 174 | ``'/playlist/movie/title'`` 175 | | Will match every ``<title>`` element which is the child of ... 176 | | a ``<movie>`` element which is the child of ... 177 | | a ``<playlist>`` element which is ... 178 | | the top-level element in the document. 179 | 180 | Or, to phrase it a different way, the search will start at the top of the 181 | document and look for a ``<playlist>`` element; if one is found, the search 182 | will continue for child ``<movie>`` elements; and for each one that is 183 | found the search will continue for child ``<title>`` elements. 184 | 185 | ``'//movie'`` 186 | Will match every ``<movie>`` element at any level of nesting. 187 | 188 | In both cases, the XPath expression starts with a '/' which means the search 189 | will start at the top of the document. 190 | 191 | Inside the second script's loop are a number of calls to ``findvalue()``. 
This 192 | is a handy shortcut method that is typically used when you expect the XPath 193 | expression to match *exactly one node*. It combines the functionality of 194 | ``findnodes()`` and ``to_literal()`` into a single method. So this code: 195 | 196 | .. code-block:: perl 197 | 198 | $movie->findvalue('./title'); 199 | 200 | is equivalent to: 201 | 202 | .. code-block:: perl 203 | 204 | $movie->findnodes('./title')->to_literal(); 205 | 206 | There are a couple of other interesting differences with the XPath searches in 207 | the loop compared to previous examples. Firstly, the ``findvalue()`` method is 208 | being called on ``$movie`` (which represents one ``<movie>`` element) rather 209 | than on ``$dom`` (which represents the whole document). This means that the 210 | ``$movie`` element is the **context element**. Secondly, the XPath expression 211 | starts with a '.' which means: start the search at the context element rather 212 | than at the top of the document. 213 | 214 | This second script illustrates a common pattern when working with ``XML::LibXML``: 215 | 216 | #. find 'interesting' elements using an XPath query starting with '/' or '//' 217 | 218 | #. iterate through those elements in a ``foreach`` loop 219 | 220 | #. get additional data from child elements using XPath queries starting with '.' 221 | 222 | 223 | Accessing attributes 224 | -------------------- 225 | 226 | When listing cast members in the main loop of the script above, this code ... 227 | 228 | .. literalinclude:: /code/040-movie-details.pl 229 | :language: perl 230 | :lines: 18-21 231 | 232 | is used to transform this XML ... 233 | 234 | .. code-block:: xml 235 | :linenos: 236 | 237 | <cast> 238 | <person name="Matt Damon" role="Mark Watney" /> 239 | <person name="Jessica Chastain" role="Melissa Lewis" /> 240 | <person name="Kristen Wiig" role="Annie Montrose" /> 241 | </cast> 242 | 243 | into this output: 244 | 245 | .. 
literalinclude:: /_output/040-movie-details.pl-out 246 | :language: none 247 | :lines: 29 248 | 249 | In an XPath expression, a name that starts with ``@`` will match an attribute 250 | rather than an element, so ``'person/@name'`` refers to an attribute called 251 | ``name`` on a ``<person>`` element. In this case, the call to 252 | ``findnodes('./cast/person/@name')`` will return three DOM nodes representing 253 | attribute values which are then transformed into plain strings using 254 | ``to_literal()``, as we've seen for element nodes, inside a `map 255 | <http://perldoc.perl.org/functions/map.html>`_ block. 256 | 257 | Another approach is to select the *element* with XPath and then call a DOM 258 | method on the element node to get the attribute value: 259 | 260 | .. literalinclude:: /code/050-attributes.pl 261 | :language: perl 262 | :lines: 31-34 263 | 264 | .. _tied-attribute-hash: 265 | 266 | Attributes via tied hash 267 | ------------------------ 268 | 269 | There's a shortcut syntax you can use to make this even easier, simply treat 270 | the element node as a hashref: 271 | 272 | .. literalinclude:: /code/050-attributes.pl 273 | :language: perl 274 | :lines: 42-45 275 | 276 | You might be a bit wary of poking around directly inside the element object, 277 | rather than using accessor methods. But don't worry, that's **not** what this 278 | shortcut syntax is doing. Instead, every `XML::LibXML::Element 279 | <https://metacpan.org/pod/XML::LibXML::Element>`_ object returned from the 280 | XPath query has been `'tied' 281 | <https://metacpan.org/pod/distribution/perl/pod/perltie.pod>`_ using 282 | `XML::LibXML::AttributeHash 283 | <https://metacpan.org/pod/XML::LibXML::AttributeHash>`_ so that hash lookups 284 | 'inside' the object actually get proxied to ``getAttribute()`` method calls. 
285 | 286 | This technique is less efficient than calling ``getAttribute()`` directly but 287 | it is very convenient when you want to access more than one attribute of an 288 | element or when you want to interpolate an attribute value into a string: 289 | 290 | .. literalinclude:: /code/050-attributes.pl 291 | :language: perl 292 | :lines: 53-56 293 | 294 | Which will produce this output: 295 | 296 | .. literalinclude:: /_output/050-attributes.pl-out 297 | :language: none 298 | :lines: 5-8 299 | 300 | .. note:: 301 | 302 | Overloading 'Element' nodes to support tied hash access to attribute values 303 | was added in version 1.91 of XML::LibXML. If the examples above don't work 304 | for you then it may be because you have a very old version installed. 305 | 306 | Parsing Errors 307 | -------------- 308 | 309 | One of the advantages of XML is that it has a few strict rules that every 310 | document must comply with to be considered "well-formed". If a document is not 311 | well-formed, it should be rejected in its entirety and no part of the XML 312 | document content should be used. Examples of things that would cause a 313 | document to be not well-formed include: 314 | 315 | * missing or mismatched closing tag 316 | * missing or mismatched quotes around attribute values 317 | * whitespace before the initial XML declaration section 318 | * byte sequences that do not match the document's declared character encoding 319 | * any non-whitespace characters after the closing tag for the first top-level 320 | element 321 | 322 | Like pretty much all XML parser modules, ``libxml`` will throw an exception 323 | if it encounters any violations of these rules. Since the whole of the XML 324 | document is processed when ``load_xml`` is called, an error at any point in 325 | the document will cause an exception to be raised. 
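To see such an exception in practice, here is a minimal sketch that feeds a deliberately broken inline document to ``load_xml`` and traps the failure with ``eval``. The XML string is hypothetical. The sketch assumes the exception is normally an ``XML::LibXML::Error`` object offering ``line`` and ``message`` accessors, and falls back to plain stringification if it is not:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);
use XML::LibXML;

# Hypothetical inline document: the <title> element is never closed
my $bad_xml = '<book><title>Unclosed</book>';

# eval traps the exception; $dom stays undefined when parsing fails
my $dom = eval { XML::LibXML->load_xml(string => $bad_xml) };
my $err = $@;

if ($err) {
    if (blessed($err) && $err->isa('XML::LibXML::Error')) {
        # Structured error object: report where the parser gave up
        print 'Parse failed at line ', $err->line, ': ', $err->message, "\n";
    }
    else {
        # Anything else: fall back to the stringified error
        print "Parse failed: $err\n";
    }
}
```

Copying ``$@`` into a lexical variable straight after the ``eval`` avoids it being overwritten by a later ``eval`` before the error is reported.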
326 | 327 | If you wish to handle exceptions gracefully, you must use an ``eval`` block or 328 | one of the "try/catch" syntax extension modules to catch the error. For 329 | example, this document contains an error: 330 | 331 | .. literalinclude:: /code/book-borkened.xml 332 | :language: xml 333 | :linenos: 334 | 335 | This script will attempt to parse the bad input: 336 | 337 | .. literalinclude:: /code/060-parse-error.pl 338 | :language: perl 339 | :lines: 9-22 340 | 341 | and will instead produce this output: 342 | 343 | .. literalinclude:: /_output/060-parse-error.pl-out 344 | :language: none 345 | 346 | Note that although the script is only looking for ``<author>`` elements and the 347 | error in the ``<isbn>`` element comes *after* all the ``<author>`` elements, an 348 | exception is still raised by the ``load_xml`` call inside the eval block, 349 | before the DOM has been fully constructed. 350 | 351 | That's it for the basic examples. The next topic will look more closely at 352 | :doc:`XPath expressions <xpath>`. 
353 | -------------------------------------------------------------------------------- /source/code/010-list-titles.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'playlist.xml'; 10 | 11 | my $dom = XML::LibXML->load_xml(location => $filename); 12 | 13 | foreach my $title ($dom->findnodes('/playlist/movie/title')) { 14 | say $title->to_literal(); 15 | } 16 | 17 | -------------------------------------------------------------------------------- /source/code/030-parse-from-fh.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | use open ':encoding(utf8)'; # would mess up XML file 7 | binmode(STDOUT, ':utf8'); 8 | 9 | use XML::LibXML; 10 | 11 | my $filename = 'carte-latin1.xml'; 12 | open my $fh, '<', $filename; # affected by 'use open' above 13 | binmode $fh, ':raw'; # turn off effects of 'use open' 14 | 15 | my $dom = XML::LibXML->load_xml(IO => $fh); 16 | 17 | foreach my $course ($dom->findnodes('//cours')) { 18 | say $course->{nom}; 19 | foreach my $dish ($course->findnodes('./plat')) { 20 | say "* " . 
$dish->to_literal(); 21 | } 22 | say ''; 23 | } 24 | 25 | -------------------------------------------------------------------------------- /source/code/040-movie-details.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'playlist.xml'; 10 | 11 | my $dom = XML::LibXML->load_xml(location => $filename); 12 | 13 | foreach my $movie ($dom->findnodes('//movie')) { 14 | say 'Title: ', $movie->findvalue('./title'); 15 | say 'Director: ', $movie->findvalue('./director'); 16 | say 'Rating: ', $movie->findvalue('./mpaa-rating'); 17 | say 'Duration: ', $movie->findvalue('./running-time'), " minutes"; 18 | my $cast = join ', ', map { 19 | $_->to_literal(); 20 | } $movie->findnodes('./cast/person/@name'); 21 | say 'Starring: ', $cast; 22 | say ""; 23 | } 24 | 25 | -------------------------------------------------------------------------------- /source/code/050-attributes.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'playlist.xml'; 10 | 11 | my $dom = XML::LibXML->load_xml(location => $filename); 12 | 13 | my($movie) = $dom->findnodes('//movie[@id="tt3659388"]'); 14 | 15 | # Three alternative ways to access attribute values 16 | 17 | # Select the attribute value with XPath 18 | { 19 | 20 | my $cast = join ', ', map { 21 | $_->to_literal(); 22 | } $movie->findnodes('./cast/person/@name'); 23 | say 'Starring: ', $cast; 24 | 25 | } 26 | 27 | 28 | # Select the element with XPath and extract the attribute value via a DOM method 29 | { 30 | 31 | my $cast = join ', ', map { 32 | $_->getAttribute('name'); 33 | } $movie->findnodes('./cast/person'); 34 | say 'Starring: ', $cast; 35 | 36 | } 37 | 38 | 39 | # Select the element with XPath and extract the attribute value via tied hash 
40 | { 41 | 42 | my $cast = join ', ', map { 43 | $_->{name}; 44 | } $movie->findnodes('./cast/person'); 45 | say 'Starring: ', $cast; 46 | 47 | } 48 | 49 | 50 | # Same as above but access multiple attributes 51 | { 52 | 53 | my $cast = join "\n", map { 54 | " * $_->{name} (as $_->{role})"; 55 | } $movie->findnodes('./cast/person'); 56 | say "\nStarring:\n", $cast; 57 | 58 | } 59 | -------------------------------------------------------------------------------- /source/code/060-parse-error.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | #my $filename = 'book.xml'; 10 | my $filename = 'book-borkened.xml'; 11 | 12 | my $dom = eval { 13 | XML::LibXML->load_xml(location => $filename); 14 | }; 15 | if($@) { 16 | # Log failure and exit 17 | print "Error parsing '$filename':\n$@"; 18 | exit 0; 19 | } 20 | 21 | foreach my $author ($dom->findnodes('//author')) { 22 | say $author->to_literal(); 23 | } 24 | 25 | -------------------------------------------------------------------------------- /source/code/100-xpath-examples: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'playlist.xml'; 10 | 11 | my $dom = XML::LibXML->load_xml(location => $filename); 12 | 13 | foreach my $movie ($dom->findnodes('//movie')) { 14 | say 'Title: ', $movie->findvalue('./title'); 15 | } 16 | 17 | -------------------------------------------------------------------------------- /source/code/110-case-insensitive-xpath-1: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'playlist.xml'; 10 | 11 | my $dom = XML::LibXML->load_xml(location => $filename); 12 | 13 | 
my $query = q{ 14 | //person[ 15 | contains( 16 | translate( 17 | @name, 18 | 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 19 | 'abcdefghijklmnopqrstuvwxyz' 20 | ), 21 | 'matt' 22 | ) 23 | ] 24 | }; 25 | 26 | foreach my $person ($dom->findnodes($query)) { 27 | say "Person: $person->{name}"; 28 | } 29 | 30 | -------------------------------------------------------------------------------- /source/code/200-dom-document.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $dom = XML::LibXML->load_xml(location => 'book.xml'); 10 | 11 | say '$dom is a ', ref($dom); 12 | say '$dom->nodeName is: ', $dom->nodeName; 13 | say 'XML Version is: ', $dom->version; 14 | say 'Document encoding is: ', $dom->encoding; 15 | my $is_or_not = $dom->standalone ? 'is' : 'is not'; 16 | say "Document $is_or_not standalone"; 17 | 18 | say "DOM as XML:\n", $dom->toString; 19 | 20 | say "DOM as a string:\n", $dom; 21 | 22 | my $book = $dom->documentElement; 23 | say '$dom->documentElement is a ', ref($book); 24 | say '$dom->documentElement->nodeName = ', $book->nodeName; 25 | 26 | -------------------------------------------------------------------------------- /source/code/210-dom-elements.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML qw(:libxml); 8 | 9 | my $dom = XML::LibXML->load_xml(location => 'book.xml'); 10 | 11 | my $book = $dom->documentElement; 12 | say '$book is a ', ref($book); 13 | say '$book->nodeName is: ', $book->nodeName; 14 | 15 | my($isbn) = $book->getChildrenByTagName('isbn'); 16 | say '$isbn is a ', ref($isbn); 17 | say '$isbn->nodeName is: ', $isbn->nodeName; 18 | say '$isbn->to_literal returns: ', $isbn->to_literal; 19 | say '$isbn stringifies to: ', $isbn; 20 | 21 | my @children = $book->childNodes; 22 | my $count 
= @children; 23 | say "\$book has $count child nodes:"; 24 | my $i = 0; 25 | foreach my $child (@children) { 26 | say $i++, ": is a ", ref($child), ', name = ', $child->nodeName; 27 | } 28 | 29 | my @elements = grep { $_->nodeType == XML_ELEMENT_NODE } $book->childNodes; 30 | $count = @elements; 31 | say "\$book has $count child elements:"; 32 | $i = 0; 33 | foreach my $child (@elements) { 34 | say $i++, ": is a ", ref($child), ', name = ', $child->nodeName; 35 | } 36 | -------------------------------------------------------------------------------- /source/code/211-dom-elements-no-blanks.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML qw(:libxml); 8 | 9 | my $dom = XML::LibXML->load_xml(location => 'book.xml', no_blanks => 1); 10 | 11 | my $book = $dom->documentElement; 12 | 13 | my @children = $book->childNodes; 14 | my $count = @children; 15 | say "\$book has $count child nodes:"; 16 | my $i = 0; 17 | foreach my $child (@children) { 18 | say $i++, ": is a ", ref($child), ', name = ', $child->nodeName; 19 | } 20 | 21 | -------------------------------------------------------------------------------- /source/code/220-dom-text-nodes.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $dom = XML::LibXML->load_xml(location => 'fish-and-chips.xml'); 10 | 11 | my $item = $dom->documentElement; 12 | my($text) = $item->childNodes(); 13 | 14 | say '$text is a ', ref($text); 15 | say '$text->data = ', $text->data; 16 | say '$text->nodeValue = ', $text->nodeValue; 17 | say '$text->to_literal = ', $text->to_literal; 18 | say '$text->toString = ', $text->toString; 19 | say '$text as a string: ', $text; 20 | -------------------------------------------------------------------------------- 
/source/code/230-dom-attributes.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML qw(:libxml); 8 | 9 | my $dom = XML::LibXML->load_xml(location => 'book.xml'); 10 | 11 | my $book = $dom->documentElement; 12 | my($dim) = $book->getChildrenByTagName('dimensions'); 13 | 14 | say '$dim->getAttribute("width") = ', $dim->getAttribute("width"); 15 | say "\$dim->{width} = $dim->{width}"; 16 | -------------------------------------------------------------------------------- /source/code/240-dom-attr.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML qw(:libxml); 8 | 9 | my $dom = XML::LibXML->load_xml(location => 'book.xml'); 10 | 11 | # You probably don't need this object interface for attributes at all. 12 | # The previous example showed how to access attributes directly via 13 | # the Element object. 
14 | 15 | my $book = $dom->documentElement; 16 | my($dim) = $book->getChildrenByTagName('dimensions'); 17 | my($width_attr) = $dim->getAttributeNode('width'); 18 | 19 | say '$width_attr is a ', ref($width_attr); 20 | say '$width_attr->nodeName: ', $width_attr->nodeName; 21 | say '$width_attr->value: ', $width_attr->value; 22 | say '$width_attr as a string: ', $width_attr; 23 | -------------------------------------------------------------------------------- /source/code/250-dom-nodelist.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML qw(:libxml); 8 | 9 | my $dom = XML::LibXML->load_xml(location => 'book.xml'); 10 | 11 | my $book = $dom->documentElement; 12 | 13 | my $result = $book->childNodes; 14 | say '$result is a ', ref($result); 15 | my $i = 1; 16 | foreach my $i (1..$result->size) { 17 | my $node = $result->get_node($i); 18 | say $node->nodeName if $node->nodeType == XML_ELEMENT_NODE; 19 | } 20 | 21 | say ''; 22 | 23 | foreach my $node ($book->childNodes) { 24 | say $node->nodeName if $node->nodeType == XML_ELEMENT_NODE; 25 | } 26 | 27 | say ''; 28 | 29 | my($dim) = $book->findnodes('./dimensions'); 30 | say '$dim is a ', ref($dim); 31 | say 'Page count: ', $dim->{pages}; 32 | 33 | say ''; 34 | 35 | say 'Authors: ', join ', ', $book->findnodes('.//author')->to_literal_list; 36 | 37 | -------------------------------------------------------------------------------- /source/code/260-dom-modification.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML qw(:libxml); 8 | 9 | my $xml = q{ 10 | <record> 11 | <event>Men's 100m</event> 12 | </record> 13 | }; 14 | my $dom = XML::LibXML->load_xml(string => $xml); 15 | 16 | my $record = $dom->documentElement; 17 | my($event) = 
$record->getChildrenByTagName('event'); 18 | my $text = $event->firstChild; 19 | $text->setData("Men's 100 metres"); 20 | $event->{type} = 'sprint'; 21 | say $dom->toString; 22 | 23 | say ''; 24 | 25 | my $country = $dom->createElement('country'); 26 | $country->appendText('Jamaica'); 27 | $record->appendChild($country); 28 | 29 | my $athlete = $dom->createElement('athlete'); 30 | $athlete->appendText('Usain Bolt'); 31 | $record->insertBefore($athlete, $country); 32 | 33 | say $dom->toString; 34 | 35 | say $dom->toString(1); 36 | 37 | my $dom2 = $dom->cloneNode(1); 38 | 39 | foreach my $node ($record->childNodes()) { 40 | $record->removeChild($node) if $node->nodeType != XML_ELEMENT_NODE; 41 | } 42 | 43 | say $dom->toString(1); 44 | 45 | $dom = $dom2; 46 | $record = $dom->documentElement; 47 | 48 | foreach ($dom->findnodes('//text()')) { 49 | $_->parentNode->removeChild($_) unless /\S/; 50 | } 51 | 52 | say $dom->toString(1); 53 | 54 | $record->appendWellBalancedChunk( 55 | '<time>9.58s</time><date>2009-08-16</date><location>Berlin, Germany</location>' 56 | ); 57 | 58 | say $dom->toString(1); 59 | 60 | 61 | my $perl_string = "<time>9.58s</time><date>2009-08-16</date><location>B\x{e9}rlin, Germany</location>"; 62 | my $byte_string = Encode::encode_utf8($perl_string); 63 | $record->appendWellBalancedChunk($byte_string, 'UTF-8'); 64 | 65 | say $dom->toString(1); 66 | 67 | -------------------------------------------------------------------------------- /source/code/270-dom-from-scratch.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | use autodie; 7 | 8 | use XML::LibXML; 9 | 10 | my $dom = XML::LibXML::Document->new('1.0', 'UTF-8'); 11 | my $title = $dom->createElement('title'); 12 | $title->appendText("Caf\x{e9} lunch: \x{20ac}12.50"); 13 | $dom->setDocumentElement($title); 14 | 15 | my $filename = 'temp-utf8.xml'; 16 | open my $out, '>:raw', $filename; 17 
| print $out $dom->toString(1); 18 | 19 | say $dom->toString(1); 20 | system("xxd $filename"); 21 | unlink($filename); 22 | say ''; 23 | 24 | -------------------------------------------------------------------------------- /source/code/271-dom-from-scratch-latin1.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | use autodie; 7 | 8 | use XML::LibXML; 9 | 10 | my $dom = XML::LibXML::Document->new('1.0', 'ISO8859-1'); 11 | my $title = $dom->createElement('title'); 12 | $title->appendText("Caf\x{e9} lunch: \x{20ac}12.50"); 13 | $dom->setDocumentElement($title); 14 | 15 | my $filename = 'temp-utf8.xml'; 16 | open my $out, '>:raw', $filename; 17 | print $out $dom->toString(1); 18 | 19 | system("xxd $filename"); 20 | unlink($filename); 21 | 22 | -------------------------------------------------------------------------------- /source/code/500-html-tidy.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'untidy.html'; 10 | 11 | my $dom = XML::LibXML->load_html( 12 | location => $filename, 13 | recover => 1, 14 | ); 15 | 16 | say $dom->toStringHTML(); 17 | 18 | -------------------------------------------------------------------------------- /source/code/501-html-tidy-no-err.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'untidy.html'; 10 | 11 | my $dom = XML::LibXML->load_html( 12 | location => $filename, 13 | recover => 1, 14 | suppress_errors => 1, 15 | ); 16 | 17 | say $dom->toStringHTML(); 18 | 19 | -------------------------------------------------------------------------------- /source/code/510-html-no-stderr.pl: 
-------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use File::Spec; 8 | use XML::LibXML; 9 | 10 | warn "message to STDERR before parsing\n"; 11 | my $dom = parse_html_file('untidy.html'); 12 | warn "message to STDERR after parsing\n"; 13 | 14 | say $dom->toStringHTML(); 15 | 16 | exit; 17 | 18 | sub parse_html_file { 19 | my($filename) = @_; 20 | 21 | local(*STDERR); 22 | open STDERR, '>>', File::Spec->devnull(); 23 | return XML::LibXML->load_html( 24 | location => $filename, 25 | recover => 1, 26 | suppress_errors => 1, 27 | ); 28 | }; 29 | -------------------------------------------------------------------------------- /source/code/520-html-xpath-simple.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'css-zen-garden.html'; 10 | 11 | my $dom = XML::LibXML->load_html( 12 | location => $filename, 13 | recover => 1, 14 | suppress_errors => 1, 15 | ); 16 | 17 | my $xpath = '//div[@id="zen-supporting"]//h3'; 18 | say "$_" foreach $dom->findnodes($xpath)->to_literal_list; 19 | 20 | -------------------------------------------------------------------------------- /source/code/530-html-xpath-complex.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | use URI::URL; 9 | use JSON qw(to_json); 10 | 11 | my $base_url = 'http://csszengarden.com/'; 12 | my $filename = 'css-zen-garden.html'; 13 | 14 | my $dom = XML::LibXML->load_html( 15 | location => $filename, 16 | recover => 1, 17 | suppress_errors => 1, 18 | ); 19 | 20 | my @designs; 21 | my $xpath = '//div[@id="design-selection"]//li'; 22 | foreach my $design ($dom->findnodes($xpath)) { 23 | my($name, $designer) = 
$design->findnodes('./a')->to_literal_list; 24 | my($url) = $design->findnodes('./a/@href')->to_literal_list; 25 | $url = URI::URL->new($url, $base_url)->abs; 26 | push @designs, { 27 | name => $name, 28 | designer => $designer, 29 | url => "$url", 30 | }; 31 | } 32 | 33 | say to_json(\@designs, {pretty => 1}); 34 | -------------------------------------------------------------------------------- /source/code/531-html-xpath-no-semantic.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | use URI::URL; 9 | use JSON qw(to_json); 10 | 11 | my $base_url = 'http://csszengarden.com/'; 12 | my $filename = 'css-zen-garden.html'; 13 | 14 | my $dom = XML::LibXML->load_html( 15 | location => $filename, 16 | recover => 1, 17 | suppress_errors => 1, 18 | ); 19 | 20 | my @designs; 21 | my $xpath = '//h3[contains(.,"Select a Design")]/..//li'; 22 | foreach my $design ($dom->findnodes($xpath)) { 23 | my($name, $designer) = $design->findnodes('./a')->to_literal_list; 24 | my($url) = $design->findnodes('./a/@href')->to_literal_list; 25 | $url = URI::URL->new($url, $base_url)->abs; 26 | push @designs, { 27 | name => $name, 28 | designer => $designer, 29 | url => "$url", 30 | }; 31 | } 32 | 33 | say to_json(\@designs, {pretty => 1}); 34 | -------------------------------------------------------------------------------- /source/code/540-html-xpath-classes.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $dom = XML::LibXML->load_html( 10 | location => 'people.html', 11 | recover => 1, 12 | ); 13 | 14 | my($xpath); 15 | 16 | # Match <li> elements whose class attribute contains 'member' 17 | 18 | $xpath = '//li[contains(@class, "member")]'; 19 | say "$_" foreach $dom->findnodes($xpath)->to_literal_list; 20 | 21 | say 
''; 22 | 23 | # Match <li> elements with the class 'member' 24 | 25 | $xpath = '//li[contains(concat(" ", @class, " "), " member ")]'; 26 | say "$_" foreach $dom->findnodes($xpath)->to_literal_list; 27 | 28 | -------------------------------------------------------------------------------- /source/code/580-html-css-selectors.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | use HTML::Selector::XPath qw(selector_to_xpath); 9 | 10 | my $show_matches = ($ARGV[0] || '') eq '-v'; 11 | 12 | my $dom = XML::LibXML->load_html( 13 | location => 'css-zen-garden.html', 14 | recover => 1, 15 | suppress_errors => 1, 16 | ); 17 | 18 | my($xpath); 19 | 20 | select_nodes($dom, '#zen-supporting h3'); 21 | select_nodes($dom, '.designer-name'); 22 | select_nodes($dom, '.preamble abbr'); 23 | select_nodes($dom, '.preamble h3, .requirements h3'); 24 | 25 | exit; 26 | 27 | sub select_nodes { 28 | my($dom, $selector) = @_; 29 | 30 | my $xpath = selector_to_xpath($selector); 31 | say "\nSelector: $selector"; 32 | say "XPath: $xpath"; 33 | return unless $show_matches; 34 | say "$_" foreach find_by_css($dom, $selector)->to_literal_list; 35 | } 36 | 37 | sub find_by_css { 38 | my($dom, $selector) = @_; 39 | my $xpath = selector_to_xpath($selector); 40 | return $dom->findnodes($xpath); 41 | } 42 | 43 | -------------------------------------------------------------------------------- /source/code/590-ignore-words.pws: -------------------------------------------------------------------------------- 1 | personal_ws-1.1 en 0 2 | accessor 3 | ActivePerl 4 | ActiveState 5 | API 6 | Attr 7 | AttributeHash 8 | CentOS 9 | classname 10 | CPAN 11 | cpanm 12 | CSS 13 | DCMI 14 | DocumentFragment 15 | DOM 16 | dpkg 17 | GitHub 18 | hashref 19 | IMDb 20 | Javascript 21 | JSON 22 | libxml 23 | LibXML 24 | lookups 25 | metadata 26 | multi 27 | namespace 28 | namespaces 29 | 
Namespaces 30 | NodeList 31 | Permalink 32 | pre 33 | proxied 34 | README 35 | RedHat 36 | regex 37 | serialise 38 | serialised 39 | serialising 40 | Solaris 41 | SQL 42 | STDERR 43 | stringification 44 | Stringification 45 | SVG 46 | unclosed 47 | URI 48 | URIs 49 | utf 50 | UTF 51 | whitespace 52 | XHTML 53 | XPath 54 | XPathContext 55 | -------------------------------------------------------------------------------- /source/code/590-spell-check.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | # 3 | # Spell check the generated HTML for this project using aspell - after 4 | # stripping out elements that don't need to be checked. 5 | # 6 | 7 | use 5.010; 8 | use strict; 9 | use warnings; 10 | use autodie; 11 | 12 | use FindBin; 13 | use XML::LibXML; 14 | use HTML::Selector::XPath qw(selector_to_xpath); 15 | 16 | chdir($FindBin::Bin); 17 | my $ignore_list = './590-ignore-words.pws'; # an aspell personal dictionary 18 | 19 | open my $pipe, "| aspell --personal=$ignore_list list | sort -u "; 20 | binmode $pipe, ':utf8'; 21 | 22 | my $pattern = "../../build/html/*.html"; 23 | my @files = glob($pattern); 24 | foreach my $filename (@files) { 25 | my $dom = XML::LibXML->load_html( 26 | location => $filename, 27 | recover => 1, 28 | suppress_errors => 1, 29 | ); 30 | 31 | my($body) = $dom->findnodes('/html/body') or next; 32 | 33 | foreach my $selector ( 34 | 'script', 35 | 'nav', 36 | 'ul.wy-breadcrumbs', 37 | 'tt.literal', 38 | 'table.highlighttable', 39 | 'div.highlight', 40 | 'footer', 41 | ) { 42 | my $xpath = selector_to_xpath($selector); 43 | foreach my $node ($body->findnodes($xpath)) { 44 | $node->parentNode->removeChild($node); 45 | } 46 | } 47 | 48 | say $pipe $body->to_literal(); 49 | 50 | # Also include specific snippets of text not included in the body text 51 | 52 | foreach my $xpath ('//title', '//@title', '//@alt') { 53 | say $pipe $_ foreach $dom->findnodes($xpath)->to_literal_list; 54 | } 55 | } 56 
| 57 | -------------------------------------------------------------------------------- /source/code/600-ns-no-context.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | 9 | my $filename = 'xml-libxml.svg'; 10 | 11 | my $dom = XML::LibXML->load_xml(location => $filename); 12 | 13 | my $match_count = $dom->findnodes('//title')->size; 14 | say "XPath: //title Matching node count: $match_count"; 15 | 16 | -------------------------------------------------------------------------------- /source/code/610-ns-xpc.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | use XML::LibXML::XPathContext; 9 | 10 | my $filename = 'xml-libxml.svg'; 11 | my $dom = XML::LibXML->load_xml(location => $filename); 12 | 13 | my $xpc = XML::LibXML::XPathContext->new($dom); 14 | $xpc->registerNs('vg', 'http://www.w3.org/2000/svg'); 15 | $xpc->registerNs('dub', 'http://purl.org/dc/elements/1.1/'); 16 | 17 | my($match1) = $xpc->findnodes('//vg:title'); 18 | say 'XPath: //vg:title Matched: ', $match1; 19 | 20 | my($match2) = $xpc->findnodes('//dub:title'); 21 | say 'XPath: //dub:title Matched: ', $match2; 22 | 23 | -------------------------------------------------------------------------------- /source/code/620-ns-child-nodes.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML; 8 | use XML::LibXML::XPathContext; 9 | 10 | my $filename = 'xml-libxml.svg'; 11 | my $dom = XML::LibXML->load_xml(location => $filename, no_blanks => 1); 12 | 13 | my $xpc = XML::LibXML::XPathContext->new($dom); 14 | $xpc->registerNs('svg', 'http://www.w3.org/2000/svg'); 15 | $xpc->registerNs('dc', 
'http://purl.org/dc/elements/1.1/'); 16 | 17 | my($metadata) = $xpc->findnodes('//svg:metadata') or die "No metadata"; 18 | 19 | foreach my $el ($xpc->findnodes('.//dc:*', $metadata)) { 20 | my $name = $el->localname; 21 | my $value = $el->to_literal or next; 22 | say "$name=$value"; 23 | } 24 | 25 | -------------------------------------------------------------------------------- /source/code/700-reader-events.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML::Reader; 8 | 9 | my $filename = 'country.xml'; 10 | 11 | my $reader = XML::LibXML::Reader->new(location => $filename) 12 | or die "cannot read file '$filename': $!\n"; 13 | 14 | while($reader->read) { 15 | printf( 16 | "Node type: %2u Depth: %2u Name: %s\n", 17 | $reader->nodeType, 18 | $reader->depth, 19 | $reader->name 20 | ); 21 | } 22 | 23 | 24 | -------------------------------------------------------------------------------- /source/code/710-reader-named-events.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML::Reader; 8 | 9 | my $filename = 'country.xml'; 10 | 11 | my $reader = XML::LibXML::Reader->new(location => $filename) 12 | or die "cannot read file '$filename': $!\n"; 13 | 14 | my %type_name = ( 15 | &XML_READER_TYPE_ELEMENT => 'ELEMENT', 16 | &XML_READER_TYPE_ATTRIBUTE => 'ATTRIBUTE', 17 | &XML_READER_TYPE_TEXT => 'TEXT', 18 | &XML_READER_TYPE_CDATA => 'CDATA', 19 | &XML_READER_TYPE_ENTITY_REFERENCE => 'ENTITY_REFERENCE', 20 | &XML_READER_TYPE_ENTITY => 'ENTITY', 21 | &XML_READER_TYPE_PROCESSING_INSTRUCTION => 'PROCESSING_INSTRUCTION', 22 | &XML_READER_TYPE_COMMENT => 'COMMENT', 23 | &XML_READER_TYPE_DOCUMENT => 'DOCUMENT', 24 | &XML_READER_TYPE_DOCUMENT_TYPE => 'DOCUMENT_TYPE', 25 | &XML_READER_TYPE_DOCUMENT_FRAGMENT => 
'DOCUMENT_FRAGMENT', 26 | &XML_READER_TYPE_NOTATION => 'NOTATION', 27 | &XML_READER_TYPE_WHITESPACE => 'WHITESPACE', 28 | &XML_READER_TYPE_SIGNIFICANT_WHITESPACE => 'SIGNIFICANT_WHITESPACE', 29 | &XML_READER_TYPE_END_ELEMENT => 'END_ELEMENT', 30 | ); 31 | 32 | say " Step | Node Type | Depth | Name"; 33 | say "------+-------------------------+-------+-------"; 34 | 35 | my $step = 1; 36 | while($reader->read) { 37 | printf( 38 | " %3u | %-22s | %4u | %s\n", 39 | $step++, 40 | $type_name{$reader->nodeType}, 41 | $reader->depth, 42 | $reader->name 43 | ); 44 | } 45 | 46 | 47 | -------------------------------------------------------------------------------- /source/code/720-seek-controversy.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | use autodie; 7 | 8 | use PerlIO::gzip; 9 | use XML::LibXML::Reader; 10 | 11 | binmode(STDOUT, ':utf8'); 12 | 13 | my $filename = 'enwiki-latest-abstract1-abridged.xml.gz'; 14 | open my $fh, '<:gzip', $filename; 15 | 16 | my $reader = XML::LibXML::Reader->new(IO => $fh); 17 | 18 | my $controversy_xpath = q{./links/sublink[contains(./anchor, 'Controvers')]}; 19 | 20 | while($reader->read) { 21 | next unless $reader->nodeType == XML_READER_TYPE_ELEMENT; 22 | next unless $reader->name eq 'doc'; 23 | my $doc = $reader->copyCurrentNode(1); 24 | if(my($target) = $doc->findnodes($controversy_xpath)) { 25 | say 'Title: ', $doc->findvalue('./title'); 26 | say ' ', $target->findvalue('./anchor'); 27 | say ' ', $target->findvalue('./link'); 28 | say ''; 29 | } 30 | $reader->next; 31 | } 32 | 33 | -------------------------------------------------------------------------------- /source/code/721-seek-controversy-variants.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | use autodie; 7 | 8 | use PerlIO::gzip; 9 | use 
XML::LibXML::Reader; 10 | 11 | binmode(STDOUT, ':utf8'); 12 | 13 | my $filename = 'enwiki-latest-abstract1-abridged.xml.gz'; 14 | open my $fh, '<:gzip', $filename; 15 | 16 | my $reader = XML::LibXML::Reader->new(IO => $fh); 17 | 18 | my $controversy_xpath = q{/doc/links/sublink[contains(./anchor, 'Controvers')]}; 19 | 20 | while($reader->read) { 21 | next unless $reader->nodeType == XML_READER_TYPE_ELEMENT; 22 | next unless $reader->name eq 'doc'; 23 | my $xml = $reader->readOuterXml; 24 | if($xml =~ /Controvers/) { 25 | my $doc = XML::LibXML->load_xml(string => $xml); 26 | if(my($target) = $doc->findnodes($controversy_xpath)) { 27 | say 'Title: ', $doc->findvalue('/doc/title'); 28 | say ' ', $target->findvalue('./anchor'); 29 | say ' ', $target->findvalue('./link'); 30 | say ''; 31 | } 32 | } 33 | $reader->next; 34 | } 35 | 36 | 37 | __END__ 38 | 39 | my $xml = $reader->readOuterXml; 40 | my $doc = XML::LibXML->load_xml(string => $xml); 41 | 42 | my $doc_pattern = XML::LibXML::Pattern->new('/feed/doc'); 43 | while($reader->read) { 44 | next unless $reader->matchesPattern($doc_pattern); 45 | 46 | $reader->nextPatternMatch($pattern); 47 | 48 | -------------------------------------------------------------------------------- /source/code/730-reader-parse-error.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | 7 | use XML::LibXML::Reader; 8 | 9 | my $filename = 'book-borkened.xml'; 10 | open my $fh, '<', $filename or die "Cannot open '$filename': $!\n"; 11 | 12 | eval { 13 | my $reader = XML::LibXML::Reader->new(IO => $fh); 14 | $reader->finish; 15 | }; 16 | if($@) { 17 | say "Error during parse: '$@'"; 18 | } 19 | else { 20 | say "No parse errors were encountered"; 21 | } 22 | 23 | exit 0; 24 | -------------------------------------------------------------------------------- /source/code/750-titles-only.pl: -------------------------------------------------------------------------------- 1 | 
#!/usr/bin/perl 2 | 3 | use 5.010; 4 | use strict; 5 | use warnings; 6 | use autodie; 7 | 8 | use XML::LibXML::Reader; 9 | 10 | binmode(STDOUT, ':utf8'); 11 | 12 | my $filename = 'enwiki-latest-abstract1-structure.xml'; 13 | 14 | my $reader = XML::LibXML::Reader->new(location => $filename); 15 | $reader->preservePattern('/feed/doc/title'); 16 | $reader->finish; 17 | 18 | say $reader->document->toString(1); 19 | 20 | -------------------------------------------------------------------------------- /source/code/book-borkened.xml: -------------------------------------------------------------------------------- 1 | <?xml version='1.0' encoding='UTF-8' standalone="yes" ?> 2 | <book edition="2"> 3 | <title>Training Your Pet Ferret 4 | 5 | Gerry Bucsis 6 | Barbara Somerville 7 | 8 | 9780764142239 9 | 10 | 11 | -------------------------------------------------------------------------------- /source/code/book.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | Training Your Pet Ferret 4 | 5 | Gerry Bucsis 6 | Barbara Somerville 7 | 8 | 9780764142239 9 | 10 | 11 | -------------------------------------------------------------------------------- /source/code/carte-latin1.xml: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/code/carte-latin1.xml -------------------------------------------------------------------------------- /source/code/country.xml: -------------------------------------------------------------------------------- 1 | 2 | Ireland 3 | 4761657 4 | 5 | -------------------------------------------------------------------------------- /source/code/css-zen-garden.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | CSS Zen Garden: The Beauty of CSS Design 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 19 | 20 | 21 | 50 | 51 | 
52 |
53 | 54 |
55 |
56 |

CSS Zen Garden

57 |

The Beauty of CSS Design

58 |
59 | 60 |
61 |

A demonstration of what can be accomplished through CSS-based design. Select any style sheet from the list to load it into this page.

62 |

Download the example html file and css file

63 |
64 | 65 |
66 |

The Road to Enlightenment

67 |

Littering a dark and dreary road lay the past relics of browser-specific tags, incompatible DOMs, broken CSS support, and abandoned browsers.

68 |

We must clear the mind of the past. Web enlightenment has been achieved thanks to the tireless efforts of folk like the W3C, WaSP, and the major browser creators.

69 |

The CSS Zen Garden invites you to relax and meditate on the important lessons of the masters. Begin to see with clarity. Learn to use the time-honored techniques in new and invigorating fashion. Become one with the web.

70 |
71 |
72 | 73 |
74 |
75 |

So What is This About?

76 |

There is a continuing need to show the power of CSS. The Zen Garden aims to excite, inspire, and encourage participation. To begin, view some of the existing designs in the list. Clicking on any one will load the style sheet into this very page. The HTML remains the same, the only thing that has changed is the external CSS file. Yes, really.

77 |

CSS allows complete and total control over the style of a hypertext document. The only way this can be illustrated in a way that gets people excited is by demonstrating what it can truly be, once the reins are placed in the hands of those able to create beauty from structure. Designers and coders alike have contributed to the beauty of the web; we can always push it further.

78 |
79 | 80 |
81 |

Participation

82 |

Strong visual design has always been our focus. You are modifying this page, so strong CSS skills are necessary too, but the example files are commented well enough that even CSS novices can use them as starting points. Please see the CSS Resource Guide for advanced tutorials and tips on working with CSS.

83 |

You may modify the style sheet in any way you wish, but not the HTML. This may seem daunting at first if you’ve never worked this way before, but follow the listed links to learn more, and use the sample files as a guide.

84 |

Download the sample HTML and CSS to work on a copy locally. Once you have completed your masterpiece (and please, don’t submit half-finished work) upload your CSS file to a web server under your control. Send us a link to an archive of that file and all associated assets, and if we choose to use it we will download it and place it on our server.

85 |
86 | 87 |
88 |

Benefits

89 |

Why participate? For recognition, inspiration, and a resource we can all refer to showing people how amazing CSS really can be. This site serves as equal parts inspiration for those working on the web today, learning tool for those who will be tomorrow, and gallery of future techniques we can all look forward to.

90 |
91 | 92 |
93 |

Requirements

94 |

Where possible, we would like to see mostly CSS 1 & 2 usage. CSS 3 & 4 should be limited to widely-supported elements only, or strong fallbacks should be provided. The CSS Zen Garden is about functional, practical CSS and not the latest bleeding-edge tricks viewable by 2% of the browsing public. The only real requirement we have is that your CSS validates.

95 |

Luckily, designing this way shows how well various browsers have implemented CSS by now. When sticking to the guidelines you should see fairly consistent results across most modern browsers. Due to the sheer number of user agents on the web these days — especially when you factor in mobile — pixel-perfect layouts may not be possible across every platform. That’s okay, but do test in as many as you can. Your design should work in at least IE9+ and the latest Chrome, Firefox, iOS and Android browsers (run by over 90% of the population).

96 |

We ask that you submit original artwork. Please respect copyright laws. Please keep objectionable material to a minimum, and try to incorporate unique and interesting visual themes to your work. We’re well past the point of needing another garden-related design.

97 |

This is a learning exercise as well as a demonstration. You retain full copyright on your graphics (with limited exceptions, see submission guidelines), but we ask you release your CSS under a Creative Commons license identical to the one on this site so that others may learn from your work.

98 |

By Dave Shea. Bandwidth graciously donated by mediatemple. Now available: Zen Garden, the book.

99 |
100 | 101 | 108 | 109 |
110 | 111 | 112 | 183 | 184 | 185 |
186 | 187 | 194 | 195 | 196 | 197 | 198 | -------------------------------------------------------------------------------- /source/code/enwiki-latest-abstract1-abridged.xml.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/grantm/perl-libxml-by-example/e2e8a4d17cd9afbcd6d57379277d709b7341b73e/source/code/enwiki-latest-abstract1-abridged.xml.gz -------------------------------------------------------------------------------- /source/code/enwiki-latest-abstract1-structure.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | Wikipedia: Anarchism 4 | https://en.wikipedia.org/wiki/Anarchism 5 | Anarchism is a political philosophy that advocates 6 | self-governed societies based on voluntary institutions. 7 | These are often described as stateless societies … 8 | 9 | 10 | History 11 | https://en.wikipedia.org/wiki/Anarchism#History 12 | 13 | 14 | Origins 15 | https://en.wikipedia.org/wiki/Anarchism#Origins 16 | 17 | 18 | 19 | 20 | 21 | Wikipedia: Autism 22 | https://en.wikipedia.org/wiki/Autism 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------- /source/code/fish-and-chips.xml: -------------------------------------------------------------------------------- 1 | Fish & Chips 2 | -------------------------------------------------------------------------------- /source/code/people.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Meeting minutes 2016-03-14 5 | 6 | 7 |

Present

8 |
    9 |
  • Catherine Trenton
  • 10 |
  • Daniel Ifflehirst
  • 11 |
  • Finlay Doyle
  • 12 |
  • Grechny Polnokov
  • 13 |
  • Heather Dalton
  • 14 |
  • Lester Strang
  • 15 |
16 | 17 | 18 | -------------------------------------------------------------------------------- /source/code/playlist.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | Apollo 13 4 | Ron Howard 5 | 1995-06-30 6 | PG 7 | 140 8 | adventure 9 | drama 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | NASA must devise a strategy to return Apollo 13 to Earth safely 20 | after the spacecraft undergoes massive internal damage putting 21 | the lives of the three astronauts on board in jeopardy. 22 | 23 | 7.6 24 | 25 | 26 | 27 | Solaris 28 | Steven Soderbergh 29 | 2002-11-27 30 | PG-13 31 | 99 32 | drama 33 | mystery 34 | romance 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | A troubled psychologist is sent to investigate the crew of an 43 | isolated research station orbiting a bizarre planet. 44 | 45 | 6.2 46 | 47 | 48 | 49 | Ender's Game 50 | Gavin Hood 51 | 2013-11-01 52 | PG-13 53 | 114 54 | action 55 | scifi 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | Young Ender Wiggin is recruited by the International Military 64 | to lead the fight against the Formics, a genocidal alien race 65 | which nearly annihilated the human race in a previous invasion. 66 | 67 | 6.7 68 | 69 | 70 | 71 | Interstellar 72 | Christopher Nolan 73 | 2014-11-07 74 | PG-13 75 | 169 76 | adventure 77 | drama 78 | scifi 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | A team of explorers travel through a wormhole in space in an 88 | attempt to ensure humanity's survival. 89 | 90 | 8.6 91 | 92 | 93 | 94 | The Martian 95 | Ridley Scott 96 | 2015-10-02 97 | PG-13 98 | 144 99 | adventure 100 | drama 101 | scifi 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | During a manned mission to Mars, Astronaut Mark Watney is 110 | presumed dead after a fierce storm and left behind by his crew. 111 | But Watney has survived and finds himself stranded and alone on 112 | the hostile planet. 
With only meager supplies, he must draw upon 113 | his ingenuity, wit and spirit to subsist and find a way to 114 | signal to Earth that he is alive. 115 | 116 | 8.1 117 | 118 | 119 | 120 | -------------------------------------------------------------------------------- /source/code/untidy.html: -------------------------------------------------------------------------------- 1 | Example (Untidy) HTML Doc 2 |

Here's a paragraph with poorly nested 3 | tags. Followed by a list of items — with unclosed tags

4 |
  • red
  • orange
  • yellow
5 | -------------------------------------------------------------------------------- /source/conf.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | # 4 | # Perl XML::LibXML by Example documentation build configuration file, created by 5 | # sphinx-quickstart on Sun Jan 3 11:33:36 2016. 6 | # 7 | # This file is execfile()d with the current directory set to its 8 | # containing dir. 9 | # 10 | # Note that not all possible configuration values are present in this 11 | # autogenerated file. 12 | # 13 | # All configuration values have a default; values that are commented out 14 | # serve to show the default. 15 | 16 | import sys 17 | import os 18 | 19 | # If extensions (or modules to document with autodoc) are in another directory, 20 | # add these directories to sys.path here. If the directory is relative to the 21 | # documentation root, use os.path.abspath to make it absolute, like shown here. 22 | #sys.path.insert(0, os.path.abspath('.')) 23 | 24 | # -- General configuration ------------------------------------------------ 25 | 26 | # If your documentation needs a minimal Sphinx version, state it here. 27 | #needs_sphinx = '1.0' 28 | 29 | # Add any Sphinx extension module names here, as strings. They can be 30 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom 31 | # ones. 32 | 33 | sys.path.append(os.path.abspath('sphinx-ext')) 34 | 35 | extensions = [ 36 | 'plxbe', 37 | ] 38 | 39 | # Add any paths that contain templates here, relative to this directory. 40 | templates_path = ['_templates'] 41 | 42 | # The suffix of source filenames. 43 | source_suffix = '.rst' 44 | 45 | # The encoding of source files. 46 | #source_encoding = 'utf-8-sig' 47 | 48 | # The master toctree document. 49 | master_doc = 'index' 50 | 51 | # General information about the project. 
52 | project = 'Perl XML::LibXML by Example' 53 | copyright = '2016-2018, Grant McLean' 54 | 55 | # The version info for the project you're documenting, acts as replacement for 56 | # |version| and |release|, also used in various other places throughout the 57 | # built documents. 58 | # 59 | # The short X.Y version. 60 | #version = '1.0' 61 | # The full version, including alpha/beta/rc tags. 62 | #release = '1.0' 63 | 64 | # The language for content autogenerated by Sphinx. Refer to documentation 65 | # for a list of supported languages. 66 | #language = None 67 | 68 | # There are two options for replacing |today|: either, you set today to some 69 | # non-false value, then it is used: 70 | #today = '' 71 | # Else, today_fmt is used as the format for a strftime call. 72 | #today_fmt = '%B %d, %Y' 73 | 74 | # List of patterns, relative to source directory, that match files and 75 | # directories to ignore when looking for source files. 76 | exclude_patterns = [] 77 | 78 | # The reST default role (used for this markup: `text`) to use for all 79 | # documents. 80 | #default_role = None 81 | 82 | # If true, '()' will be appended to :func: etc. cross-reference text. 83 | #add_function_parentheses = True 84 | 85 | # If true, the current module name will be prepended to all description 86 | # unit titles (such as .. function::). 87 | #add_module_names = True 88 | 89 | # If true, sectionauthor and moduleauthor directives will be shown in the 90 | # output. They are ignored by default. 91 | #show_authors = False 92 | 93 | # The name of the Pygments (syntax highlighting) style to use. 94 | pygments_style = 'sphinx' 95 | 96 | # A list of ignored prefixes for module index sorting. 97 | #modindex_common_prefix = [] 98 | 99 | # If true, keep warnings as "system message" paragraphs in the built documents. 
100 | #keep_warnings = False 101 | 102 | 103 | # -- Options for HTML output ---------------------------------------------- 104 | 105 | # The theme to use for HTML and HTML Help pages. See the documentation for 106 | # a list of builtin themes. 107 | html_theme = 'sphinx_rtd_theme' 108 | 109 | # Theme options are theme-specific and customize the look and feel of a theme 110 | # further. For a list of options available for each theme, see the 111 | # documentation. 112 | #html_theme_options = {} 113 | 114 | # Add any paths that contain custom themes here, relative to this directory. 115 | html_theme_path = [ '_themes' ] 116 | 117 | # The name for this set of Sphinx documents. If None, it defaults to 118 | # " v documentation". 119 | #html_title = None 120 | 121 | # A shorter title for the navigation bar. Default is the same as html_title. 122 | #html_short_title = None 123 | 124 | # The name of an image file (relative to this directory) to place at the top 125 | # of the sidebar. 126 | #html_logo = None 127 | 128 | # The name of an image file (within the static path) to use as favicon of the 129 | # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 130 | # pixels large. 131 | #html_favicon = None 132 | 133 | # Add any paths that contain custom static files (such as style sheets) here, 134 | # relative to this directory. They are copied after the builtin static files, 135 | # so a file named "default.css" will overwrite the builtin "default.css". 136 | html_static_path = ['_static'] 137 | 138 | # Add any extra paths that contain custom files (such as robots.txt or 139 | # .htaccess) here, relative to this directory. These files are copied 140 | # directly to the root of the documentation. 141 | #html_extra_path = [] 142 | 143 | # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, 144 | # using the given strftime format. 
145 | html_last_updated_fmt = '%Y-%m-%d' 146 | 147 | # If true, SmartyPants will be used to convert quotes and dashes to 148 | # typographically correct entities. 149 | #html_use_smartypants = True 150 | 151 | # Custom sidebar templates, maps document names to template names. 152 | #html_sidebars = {} 153 | 154 | # Additional templates that should be rendered to pages, maps page names to 155 | # template names. 156 | #html_additional_pages = {} 157 | 158 | # If false, no module index is generated. 159 | #html_domain_indices = True 160 | 161 | # If false, no index is generated. 162 | #html_use_index = True 163 | 164 | # If true, the index is split into individual pages for each letter. 165 | #html_split_index = False 166 | 167 | # If true, links to the reST sources are added to the pages. 168 | #html_show_sourcelink = True 169 | 170 | # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. 171 | #html_show_sphinx = True 172 | 173 | # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. 174 | #html_show_copyright = True 175 | 176 | # If true, an OpenSearch description file will be output, and all pages will 177 | # contain a tag referring to it. The value of this option must be the 178 | # base URL from which the finished HTML is served. 179 | #html_use_opensearch = '' 180 | 181 | # This is the file name suffix for HTML files (e.g. ".xhtml"). 182 | #html_file_suffix = None 183 | 184 | # Output file base name for HTML help builder. 185 | htmlhelp_basename = 'PerlXMLLibXMLbyExampledoc' 186 | 187 | 188 | # -- Options for LaTeX output --------------------------------------------- 189 | 190 | latex_elements = { 191 | # The paper size ('letterpaper' or 'a4paper'). 192 | #'papersize': 'letterpaper', 193 | 194 | # The font size ('10pt', '11pt' or '12pt'). 195 | #'pointsize': '10pt', 196 | 197 | # Additional stuff for the LaTeX preamble. 198 | #'preamble': '', 199 | } 200 | 201 | # Grouping the document tree into LaTeX files. 
List of tuples 202 | # (source start file, target name, title, 203 | # author, documentclass [howto, manual, or own class]). 204 | latex_documents = [ 205 | ('index', 'PerlXMLLibXMLbyExample.tex', 'Perl XML::LibXML by Example Documentation', 206 | 'Grant McLean', 'manual'), 207 | ] 208 | 209 | # The name of an image file (relative to this directory) to place at the top of 210 | # the title page. 211 | #latex_logo = None 212 | 213 | # For "manual" documents, if this is true, then toplevel headings are parts, 214 | # not chapters. 215 | #latex_use_parts = False 216 | 217 | # If true, show page references after internal links. 218 | #latex_show_pagerefs = False 219 | 220 | # If true, show URL addresses after external links. 221 | #latex_show_urls = False 222 | 223 | # Documents to append as an appendix to all manuals. 224 | #latex_appendices = [] 225 | 226 | # If false, no module index is generated. 227 | #latex_domain_indices = True 228 | 229 | 230 | # -- Options for manual page output --------------------------------------- 231 | 232 | # One entry per manual page. List of tuples 233 | # (source start file, name, description, authors, manual section). 234 | man_pages = [ 235 | ('index', 'perlxmllibxmlbyexample', 'Perl XML::LibXML by Example Documentation', 236 | ['Grant McLean'], 1) 237 | ] 238 | 239 | # If true, show URL addresses after external links. 240 | #man_show_urls = False 241 | 242 | 243 | # -- Options for Texinfo output ------------------------------------------- 244 | 245 | # Grouping the document tree into Texinfo files. List of tuples 246 | # (source start file, target name, title, author, 247 | # dir menu entry, description, category) 248 | texinfo_documents = [ 249 | ('index', 'PerlXMLLibXMLbyExample', 'Perl XML::LibXML by Example Documentation', 250 | 'Grant McLean', 'PerlXMLLibXMLbyExample', 'One line description of project.', 251 | 'Miscellaneous'), 252 | ] 253 | 254 | # Documents to append as an appendix to all manuals. 
255 | #texinfo_appendices = [] 256 | 257 | # If false, no module index is generated. 258 | #texinfo_domain_indices = True 259 | 260 | # How to display URL addresses: 'footnote', 'no', or 'inline'. 261 | #texinfo_show_urls = 'footnote' 262 | 263 | # If true, do not generate a @detailmenu in the "Top" node's menu. 264 | #texinfo_no_detailmenu = False 265 | 266 | 267 | # -- Options for Epub output ---------------------------------------------- 268 | 269 | # Bibliographic Dublin Core info. 270 | epub_title = 'Perl XML::LibXML by Example' 271 | epub_author = 'Grant McLean' 272 | epub_publisher = 'Grant McLean' 273 | epub_copyright = '2016, Grant McLean' 274 | 275 | # The basename for the epub file. It defaults to the project name. 276 | #epub_basename = 'Perl XML::LibXML by Example' 277 | 278 | # The HTML theme for the epub output. Since the default themes are not optimized 279 | # for small screen space, using the same theme for HTML and epub output is 280 | # usually not wise. This defaults to 'epub', a theme designed to save visual 281 | # space. 282 | #epub_theme = 'epub' 283 | 284 | # The language of the text. It defaults to the language option 285 | # or en if the language is not set. 286 | #epub_language = '' 287 | 288 | # The scheme of the identifier. Typical schemes are ISBN or URL. 289 | #epub_scheme = '' 290 | 291 | # The unique identifier of the text. This can be a ISBN number 292 | # or the project homepage. 293 | epub_identifier = 'http://grantm.github.io/perl-libxml-by-example/' 294 | 295 | # A unique identification for the text. 296 | #epub_uid = '' 297 | 298 | # A tuple containing the cover image and cover page html template filenames. 299 | epub_cover = ('_static/cover.jpg', 'epub-cover.html') 300 | 301 | # A sequence of (type, uri, title) tuples for the guide element of content.opf. 302 | #epub_guide = () 303 | 304 | # HTML files that should be inserted before the pages created by sphinx. 
305 | # The format is a list of tuples containing the path and title. 306 | #epub_pre_files = [] 307 | 308 | # HTML files that should be inserted after the pages created by sphinx. 309 | # The format is a list of tuples containing the path and title. 310 | #epub_post_files = [] 311 | 312 | # A list of files that should not be packed into the epub file. 313 | epub_exclude_files = ['search.html'] 314 | 315 | # The depth of the table of contents in toc.ncx. 316 | epub_tocdepth = 2 317 | 318 | # Allow duplicate toc entries. 319 | #epub_tocdup = True 320 | 321 | # Choose between 'default' and 'includehidden'. 322 | #epub_tocscope = 'default' 323 | 324 | # Fix unsupported image types using the PIL. 325 | #epub_fix_images = False 326 | 327 | # Scale large images. 328 | #epub_max_image_width = 0 329 | 330 | # How to display URL addresses: 'footnote', 'no', or 'inline'. 331 | #epub_show_urls = 'inline' 332 | 333 | # If false, no index is generated. 334 | #epub_use_index = True 335 | -------------------------------------------------------------------------------- /source/html.rst: -------------------------------------------------------------------------------- 1 | .. highlight:: none 2 | :linenothreshold: 1 3 | 4 | Working with HTML 5 | ================= 6 | 7 | If you ever need to extract text and data from HTML documents, the ``libxml`` 8 | parser and DOM provide very useful tools. You might imagine that ``libxml`` 9 | would only work with XHTML and even then only strictly well-formed documents. 10 | In fact, the parser has an HTML mode that handles unclosed tags like ````
`` and is even able to recover from parse errors caused by poorly 12 | formed HTML. 13 | 14 | Let's start with this mess of HTML tag soup: 15 | 16 | .. literalinclude:: /code/untidy.html 17 | :language: none 18 | 19 | To read the file in, you'd use the ``load_html()`` method rather than 20 | ``load_xml()``. You'll almost certainly want to use the ``recover => 1`` 21 | option to tell the parser to try to recover from parse errors and carry on to 22 | produce a DOM. 23 | 24 | .. literalinclude:: /code/500-html-tidy.pl 25 | :language: perl 26 | 27 | When the DOM is serialised with ``toStringHTML()``, some rudimentary formatting 28 | is applied automatically. Unfortunately there is no option to add indenting 29 | to the HTML output: 30 | 31 | .. literalinclude:: /_output/500-html-tidy.pl-out 32 | :language: none 33 | 34 | While the document is being parsed, you'll see messages like this on STDERR: 35 | 36 | .. literalinclude:: /_output/500-html-tidy.pl-err 37 | :language: none 38 | 39 | You can turn off the error output with the ``suppress_errors`` option: 40 | 41 | .. literalinclude:: /code/501-html-tidy-no-err.pl 42 | :language: perl 43 | :lines: 11-15 44 | 45 | That option doesn't seem to work with all versions of ``XML::LibXML`` so you 46 | may want to use a routine like this that sends STDERR to ``/dev/null`` during 47 | parsing, but still allows other output to STDERR when the parse function 48 | returns: 49 | 50 | .. literalinclude:: /code/510-html-no-stderr.pl 51 | :language: perl 52 | :lines: 7,17-28 53 | 54 | Querying HTML with XPath 55 | ------------------------ 56 | 57 | The main tool you'll use for extracting data from HTML is the ``findnodes()`` 58 | method that was introduced in :doc:`basics` and :doc:`xpath`. For these 59 | examples, the source HTML comes from the `CSS Zen Garden Project 60 | `_ and is in the file :download:`css-zen-garden.html 61 | `. 62 | 63 | This script locates every ``<h3>`` element inside the ``<div>`` with an ``id`` 64 | attribute value of ``"zen-supporting"``: 65 | 66 | .. literalinclude:: /code/520-html-xpath-simple.pl 67 | :language: perl 68 | :lines: 9-18 69 | 70 | Output: 71 | 72 | .. literalinclude:: /_output/520-html-xpath-simple.pl-out 73 | :language: none 74 | 75 | For a more complex example, the next script iterates through each ``<li>`` in 76 | the "Select a Design" section and extracts three items of information for each: 77 | the name of the design, the name of the designer, and a link to view the 78 | design. Once the information has been collected, it is dumped out in JSON 79 | format: 80 | 81 | .. literalinclude:: /code/530-html-xpath-complex.pl 82 | :language: perl 83 | :lines: 7-33 84 | 85 | Output: 86 | 87 | .. literalinclude:: /_output/530-html-xpath-complex.pl-out 88 | :language: json 89 | 90 | In both these examples we were fortunate to be dealing with 'semantic markup' 91 | -- where sections of the document could be readily identified using ``id`` 92 | attributes. If there were no ``id`` attributes, we could change the XPath 93 | expression to select using element text content instead: 94 | 95 | .. literalinclude:: /code/531-html-xpath-no-semantic.pl 96 | :language: perl 97 | :lines: 21 98 | 99 | This XPath expression first looks for an ``<h3>`` element that contains the 100 | text ``'Select a Design'``. It then uses ``/..`` to find that element's 101 | parent (a ``<div>`` in the example document) and then uses ``//li`` to find 102 | all ``<li>`` elements contained within the parent. 103 | 104 | Another common problem is finding that although your XPath expressions do match 105 | the content you want, they also match content you don't want -- for example 106 | from a block of navigation links. In these cases you might identify a block of 107 | uninteresting content using ``findnodes()`` and then use ``removeChild()`` to 108 | remove that whole section from the :doc:`DOM ` before running your main 109 | XPath query. Because you're only removing the nodes from the in-memory copy 110 | of the document, the original source remains unchanged. This technique is 111 | used in the :download:`spell-check script ` used 112 | to find typos in this document. 113 | 114 | Matching class names 115 | -------------------- 116 | 117 | An HTML element can have multiple classes applied to it by using a 118 | space-separated list in the ``class`` attribute. Some care is needed to ensure 119 | your XPath expressions always match one whole class name from the list. For 120 | example, if you were trying to match ``<li>`` elements with the class 121 | ``member``, you might try something like: 122 | 123 | .. literalinclude:: /code/540-html-xpath-classes.pl 124 | :language: perl 125 | :lines: 18 126 | 127 | which will match an element like this: 128 | 129 | .. literalinclude:: /code/people.html 130 | :language: html 131 | :lines: 9 132 | 133 | but it will also match an element like this: 134 | 135 | .. literalinclude:: /code/people.html 136 | :language: html 137 | :lines: 10 138 | 139 | The most common way to solve the problem is to add an extra space to the 140 | beginning and the end of the ``class`` attribute value like this: ``concat(" 141 | ", @class, " ")`` and then add spaces around the class name we're looking for: 142 | ``' member '``, giving an expression like this: 143 | 144 | .. literalinclude:: /code/540-html-xpath-classes.pl 145 | :language: perl 146 | :lines: 25 147 | 148 | Using CSS-style selectors 149 | ------------------------- 150 | 151 | The XPath expression in the last example is an effective way to select elements 152 | by class name, but the syntax is very unwieldy compared to CSS selectors. For 153 | example, the CSS selector to match elements with the class name ``member`` 154 | would simply be: ``.member`` 155 | 156 | Wouldn't it be great if there was a way to provide a CSS selector and have it 157 | converted into an XPath expression that you could pass to ``findnodes()``? 158 | Well it turns out that's exactly what the `HTML::Selector::XPath 159 | `_ module does: 160 | 161 | .. literalinclude:: /code/580-html-css-selectors.pl 162 | :language: perl 163 | :lines: 8-9,37-41 164 | 165 | Some example inputs ("Selector") and outputs ("XPath"): 166 | 167 | .. literalinclude:: /_output/580-html-css-selectors.pl-out 168 | :language: none 169 | 170 | -------------------------------------------------------------------------------- /source/index.rst: -------------------------------------------------------------------------------- 1 | ..
Perl XML::LibXML by Example documentation master file, created by 2 | sphinx-quickstart on Sun Jan 3 11:33:36 2016. 3 | 4 | Perl XML::LibXML by Example 5 | =========================== 6 | 7 | The `XML::LibXML `_ Perl module is a 8 | wrapper around the `libxml2 `_ parser library which is 9 | written in C. This tutorial uses example code to introduce the features of 10 | XML::LibXML and the ways in which you can use the module. The example 11 | scripts and XML documents are available as a :download:`ZIP file download 12 | `. 13 | 14 | Get started with :doc:`a basic example ` or jump directly to a specific 15 | topic using the Table of Contents. 16 | 17 | .. toctree:: 18 | :maxdepth: 2 19 | 20 | basics 21 | xpath 22 | dom 23 | namespaces 24 | large-docs 25 | html 26 | installation 27 | 28 | 29 | Alternate Formats 30 | ----------------- 31 | 32 | The primary target for this project is the set of HTML pages. Alternate 33 | formats are available but may be missing some elements or features which are 34 | present in the HTML: 35 | 36 | * :download:`.pdf version ` 37 | * :download:`.epub version ` 38 | 39 | Corrections and Updates 40 | ----------------------- 41 | 42 | If you spot errors in the text of this document, please `raise an issue 43 | `_ on GitHub. You are 44 | also welcome to `fork the project 45 | `_, commit a fix and raise a 46 | pull request. 47 | 48 | If you find this document useful please link to it from your blogs, tweets, 49 | Stack Overflow answers etc. The canonical URL for linking is 50 | http://grantm.github.io/perl-libxml-by-example/. 
51 | 52 | 53 | Contributors 54 | ------------ 55 | 56 | In alphabetical order: 57 | 58 | * Brandon Youngdale 59 | * Grant McLean 60 | 61 | -------------------------------------------------------------------------------- /source/installation.rst: -------------------------------------------------------------------------------- 1 | 2 | Installing XML::LibXML 3 | ====================== 4 | 5 | You *can* install the XML::LibXML module using standard tools like `cpanm 6 | `_, but there 7 | are a couple of factors to consider first. Because the module wraps a C 8 | library, to install this way you must have a C compiler installed and you must 9 | have already installed the ``libxml2`` library along with its development 10 | header files. 11 | 12 | .. note:: 13 | 14 | Since version 2.0200, the XML::LibXML distribution uses a dependency on 15 | Alien::Libxml2 to install the ``libxml2`` library if your system does not 16 | already have it. So if the easier install options listed below are not 17 | suitable for your use case, you may be able to just use the normal CPAN 18 | install process: 19 | 20 | cpan install XML::LibXML 21 | 22 | There may be easier install options for your platform. 23 | 24 | Installing on Windows 25 | --------------------- 26 | 27 | Strawberry Perl 28 | ~~~~~~~~~~~~~~~ 29 | 30 | The most popular Perl distribution for Windows is `Strawberry Perl 31 | `_, which happens to include XML::LibXML in the 32 | base Perl installer. So if you have Strawberry Perl, you already have 33 | XML::LibXML. 34 | 35 | ActivePerl 36 | ~~~~~~~~~~ 37 | 38 | Another popular Perl distribution for Windows is `ActivePerl 39 | `_ from ActiveState (who also 40 | package Perl for Mac OS X, Linux and Solaris). ActivePerl includes a tool 41 | called PPM (Perl Package Manager) for installing pre-built Perl modules. You 42 | can use the PPM graphical user interface to `search for the XML::LibXML package 43 | `_ then click to select 44 | and install it. 
A command-line interface is also available:: 45 | 46 | ppm install XML-LibXML 47 | 48 | Installing on Linux 49 | ------------------- 50 | 51 | If you are using the system Perl binary, you can install a pre-compiled version 52 | of XML::LibXML and the underlying libxml2 library from your distribution's 53 | package archive. 54 | 55 | On systems using dpkg/apt (Debian, Ubuntu, Mint, etc.):: 56 | 57 | sudo apt-get install libxml-libxml-perl 58 | 59 | On systems using rpm/yum (RedHat, CentOS, Fedora, etc.):: 60 | 61 | sudo yum install "perl(XML::LibXML)" 62 | 63 | Manual installation 64 | ~~~~~~~~~~~~~~~~~~~ 65 | 66 | If for some reason you want to compile and install a version of XML::LibXML 67 | directly from CPAN, you must first install both the ``libxml2`` library and 68 | the header files for linking against the library. The easiest way to do this 69 | is to use your distribution's packages. For example on Debian:: 70 | 71 | sudo apt-get install libxml2 libxml2-dev 72 | 73 | You can test that the library is correctly installed and your PATH is set up 74 | correctly with this command:: 75 | 76 | xml2-config --version 77 | 78 | For more information about manual builds, refer to the README file in the 79 | `XML::LibXML distribution `_. 80 | 81 | Installing on Mac OS X 82 | ---------------------- 83 | 84 | You can install the ``libxml2`` library using homebrew:: 85 | 86 | brew install libxml2 87 | 88 | If you do not have Homebrew, you can install it at the `homebrew website 89 | `_. 90 | 91 | Once you have the ``libxml2`` library installed, you can install the 92 | XML::LibXML Perl module using standard tools such as ``cpan`` or ``cpanm``. 93 | 94 | -------------------------------------------------------------------------------- /source/large-docs.rst: -------------------------------------------------------------------------------- 1 | 2 | .. 
highlight:: none 3 | :linenothreshold: 1 4 | 5 | Working With Large Documents 6 | ============================ 7 | 8 | The examples so far have all started by creating a data structure called a 9 | :doc:`Document Object Model ` to represent the whole XML document. Using 10 | :doc:`XPath expressions ` to navigate the DOM can be both powerful and 11 | convenient, but the cost in memory consumption can be quite high. For example, 12 | parsing a 50MB XML file into a DOM might need 500MB of memory. 13 | 14 | If you routinely work with very large XML documents, you might find that 15 | ``XML::LibXML``'s DOM parser wants to consume more memory than your system has 16 | installed. In such cases, you can instead use the 'pull parser' API which 17 | is accessed via the ``XML::LibXML::Reader`` interface. 18 | 19 | 20 | The Reader Loop 21 | --------------- 22 | 23 | To gain a better understanding of how the reader API is used, let's start by 24 | seeing what happens when we parse this very simple XML document: 25 | 26 | .. literalinclude:: /code/country.xml 27 | :language: xml 28 | :linenos: 29 | 30 | This script loads the reader API and parses the XML file: 31 | 32 | .. literalinclude:: /code/700-reader-events.pl 33 | :language: perl 34 | 35 | and produces the following output: 36 | 37 | .. literalinclude:: /_output/700-reader-events.pl-out 38 | :language: none 39 | 40 | We can see from the output that the ``while`` loop executes 11 times. As the 41 | XML document is parsed, the ``$reader`` object acts as a cursor advancing 42 | through the document. Each time a 'node' has been parsed, the ``read`` 43 | method returns to allow the state of the parse and the current node to be 44 | interrogated. 45 | 46 | To make sense of it we really need to turn those 'Node Type' numbers into 47 | something a bit more readable. The ``XML::LibXML::Reader`` module exports a 48 | set of constants for this purpose. Here's a modified version of the script: 49 | 50 | .. 
literalinclude:: /code/710-reader-named-events.pl 51 | :language: perl 52 | 53 | that produces the following tidier output: 54 | 55 | .. literalinclude:: /_output/710-reader-named-events.pl-out 56 | :language: none 57 | :name: linked-events 58 | 59 | .. role:: linked-prompt 60 | 61 | from the same XML :linked-prompt:`\ ` : 62 | 63 | .. literalinclude:: /code/country.xml 64 | :language: none 65 | :linenos: 66 | :name: linked-nodes 67 | 68 | Some things to note: 69 | 70 | * At step 1, when the ``read`` method returns for the first time, the cursor 71 | has advanced to the closing '>' of the ```` start tag. We could 72 | retrieve an attribute value by calling ``$reader->getAttribute('code')`` but 73 | we can't examine child elements or text nodes because the parser has not seen 74 | them yet. 75 | 76 | * At step 2, the parser has processed a chunk of text and found that it 77 | contains only whitespace (side note: all whitespace is considered to be 78 | 'significant' unless a DTD is loaded and defines which whitespace is 79 | insignificant). Although we can get access to the text, the ``$reader`` 80 | object can no longer tell us that it is a child of a ```` element - 81 | the parser has discarded that information already. 82 | 83 | * At step 3, the parser can tell us the current node is a ```` element, 84 | and the ``depth`` method can tell us that there is one ancestor element. 85 | However there is no way to determine the name of the parent element. 86 | 87 | * At step 4 a text node has been identified and we can call ``$reader->value`` 88 | to get the text string ``"Ireland"``, but the parser can no longer tell us 89 | the name of the element it belongs to. 90 | 91 | * At step 5 we have reached the end of the ```` element, but we no longer 92 | have access to the text it contained. 
93 | 94 | But now you surely get the idea - the ``XML::LibXML::Reader`` API is able to 95 | keep its memory requirements low by discarding data from one parse step before 96 | proceeding to the next. The vastly lowered memory demands come at the cost of 97 | significantly lowered convenience for the programmer. However, as we'll see in 98 | the next section, there is a middle ground that can provide the convenience of 99 | the DOM API combined with the reduced memory usage of the Reader API. 100 | 101 | Bring Back the DOM 102 | ------------------ 103 | 104 | Huge XML documents usually contain a long list of similar elements. For 105 | example Wikipedia make XML 'dumps' available 106 | `for download `_. 107 | 108 | At the time of writing, the ``enwiki-latest-abstract1.xml.gz`` file was about 109 | 100MB in size - about 800MB uncompressed. However it contained information 110 | summarising over half a million Wikipedia articles. So whilst the file is very 111 | large, the ```` elements describing each article are, on average, less 112 | than 1.5KB. The following extract is reformatted for clarity to illustrate 113 | the file structure: 114 | 115 | .. literalinclude:: /code/enwiki-latest-abstract1-structure.xml 116 | :language: xml 117 | :linenos: 118 | 119 | To process this file, we can use the Reader API to locate each ```` 120 | element and then parse that element *and all its children* into a DOM fragment. 121 | We can then use the familiar and convenient XPath tools and DOM methods to 122 | process each fragment. 123 | 124 | Another useful technique when working with large files is to leave the files in 125 | their compressed form and use a Perl IO layer to decompress them on the fly. 126 | You can achieve this using the `PerlIO::gzip 127 | `_ module from CPAN. 128 | 129 | To illustrate these techniques, the following script uses the Reader API to 130 | pick out each ```` element and slurp it into a DOM fragment. 
Then XPath 131 | queries are used to examine the child nodes and determine if the ``<doc>`` is 132 | 'interesting' - does it have a sub-heading that contains a variant of the word 133 | "controversy"? Uninteresting elements are skipped, interesting elements are 134 | reported in summary form: article title, interesting subheading, URL. 135 | 136 | .. literalinclude:: /code/720-seek-controversy.pl 137 | :language: perl 138 | 139 | In the script above, ``$doc`` is a DOM fragment that can be queried and 140 | manipulated using the DOM methods described in earlier chapters. 141 | 142 | At the start of the ``while`` loop, a couple of conditional ``next`` statements 143 | allow skipping quickly to the start of the next ``<doc>`` element. Depending 144 | on the document you're dealing with, you might need to also use the ``depth`` 145 | method to avoid deeply nested elements that also happen to be named "doc". 146 | 147 | The call to ``$reader->copyCurrentNode(1)`` creates a DOM fragment from the 148 | current element. The ``1`` passed as an argument is a boolean flag that causes 149 | all child elements to be included. 150 | 151 | In order to build the DOM fragment, the ``$reader`` has to process all content 152 | up to the matching ``XML_READER_TYPE_END_ELEMENT`` node. You may be surprised 153 | to learn that this does not advance the cursor. So the next call to 154 | ``$reader->read`` will advance to the first child node of the current 155 | ``<doc>``. In our case, that would be a waste of time - there is no need to 156 | use the Reader API to re-process the child nodes that we already processed with 157 | the DOM API. Therefore after processing a ``<doc>``, we call ``$reader->next`` 158 | to skip directly to the node following the matching ``</doc>`` end tag. When 159 | this script was used to process the full-sized file, adding this call to 160 | ``next`` reduced the run time by almost 50%. 
161 | 162 | When processing files with millions of elements, a small optimisation in the 163 | main loop can make a noticeable difference to the run time. For example, 164 | building the DOM fragment is a relatively expensive operation. The call to 165 | ``$reader->copyCurrentNode(1)`` is equivalent to: 166 | 167 | .. literalinclude:: /code/721-seek-controversy-variants.pl 168 | :language: perl 169 | :lines: 39-40 170 | 171 | As an optimisation, we can avoid the step of building the DOM fragment if a 172 | quick regex check of the source XML tells us that it doesn't contain the word 173 | we're going to look for with the XPath query. This rewritten main loop shaves 174 | about 20% off the run time: 175 | 176 | .. literalinclude:: /code/721-seek-controversy-variants.pl 177 | :language: perl 178 | :lines: 18-34 179 | 180 | Error Handling 181 | -------------- 182 | 183 | Error handling is a little different with the Reader API vs the DOM API. The 184 | DOM API will parse the whole document and throw an exception immediately if it 185 | encounters an error in the XML. So if there's an error you won't get a DOM. 186 | 187 | The Reader API on the other hand will start returning nodes to your script via 188 | ``$reader->read`` as soon as the parsing starts [#f1]_. If there is an error in your 189 | document, you won't know until your parser reaches the error - then you'll get 190 | the exception. 191 | 192 | You need to bear this in mind when parsing with the Reader API. For example if 193 | you were reading elements to populate records in a database, you might want to 194 | wrap all the database INSERT statements in a transaction so that you can roll 195 | them all back if you encounter a parse error. 196 | 197 | Another useful technique is to parse the document twice, once to check the XML 198 | is well-formed and once to actually process it. The ``finish`` method provides 199 | a quick way to parse from the current position to the end of the document: 200 | 201 | .. 
literalinclude:: /code/730-reader-parse-error.pl 202 | :language: perl 203 | :lines: 13-14 204 | 205 | You'll then need to reopen the file and create a new Reader object for the 206 | second parse. 207 | 208 | In some applications you might scan through the file looking for a specific 209 | section. Once the target has been located and the required information 210 | extracted, you might not need to look at any more elements. However as we've 211 | seen, you should call ``finish`` to ensure there are no errors in the rest of 212 | the XML. 213 | 214 | Working With Patterns 215 | --------------------- 216 | 217 | Our sample script is identifying elements at the top of the main loop by 218 | examining the node type and the node name: 219 | 220 | .. literalinclude:: /code/721-seek-controversy-variants.pl 221 | :language: perl 222 | :lines: 20-22 223 | 224 | Although these are simple checks, they do still involve two method calls and 225 | passing scalar values across the XS boundary between ``libxml`` and the Perl 226 | runtime. An alternative approach is to compile a 'pattern' (essentially a 227 | simplified subset of XPath) using `XML::LibXML::Pattern 228 | `_ and run a complex set of 229 | checks with a single method call: 230 | 231 | .. literalinclude:: /code/721-seek-controversy-variants.pl 232 | :language: perl 233 | :lines: 42-44 234 | 235 | In our example, the ```` elements that we're interested in are all 236 | adjacent, so when we finish processing one, the very next element is another 237 | ````. If your document is not structured this way, you might find it 238 | useful to skip over large sections of document to find the next element that 239 | matches a pattern, like this: 240 | 241 | .. literalinclude:: /code/721-seek-controversy-variants.pl 242 | :language: perl 243 | :lines: 46 244 | 245 | You can also use patterns with the ``preservePattern`` method to create a DOM 246 | subset of a larger document. For example: 247 | 248 | .. 
literalinclude:: /code/750-titles-only.pl 249 | :language: perl 250 | :lines: 12-18 251 | 252 | Which will produce this output: 253 | 254 | .. literalinclude:: /_output/750-titles-only.pl-out 255 | :language: none 256 | 257 | Note, this technique does construct the DOM in memory and then serialise it at 258 | the end, so if you have a huge document and many nodes match the pattern then 259 | you will consume a large amount of memory. 260 | 261 | .. rubric:: Footnotes 262 | 263 | .. [#f1] 264 | 265 | In practice, the Reader API will read the XML in chunks and check each 266 | chunk is well-formed before it starts delivering node events. This means 267 | that a short document with an error may trigger an exception before any 268 | nodes have been delivered. 269 | -------------------------------------------------------------------------------- /source/namespaces.rst: -------------------------------------------------------------------------------- 1 | .. highlight:: none 2 | :linenothreshold: 1 3 | 4 | Working with XML Namespaces 5 | =========================== 6 | 7 | Using the ``findnodes()`` method as described in the 8 | :doc:`basic examples ` section doesn't work when the XML document 9 | uses 'namespaces'. This section describes the extra steps you need to take 10 | to work with namespaces in XML. 11 | 12 | XML 'namespaces' allow you to build documents using elements from more than one 13 | vocabulary. For example one XML document might include both SVG elements to 14 | describe a drawing, as well as Dublin Core elements to define metadata *about* 15 | the drawing. The two different vocabularies are defined by separate bodies - 16 | the `W3C `_ and the `DCMI 17 | `_ respectively. Associating each 18 | element in your document with a namespace allows a processor to distinguish 19 | elements that use the same element names. 20 | 21 | The scripts in this section will use the SVG document: 22 | :download:`xml-libxml.svg `. Which starts like this: 23 | 24 | .. 
literalinclude:: /code/xml-libxml.svg 25 | :language: xml 26 | :lines: 1-21 27 | 28 | Because the top-level ```` element uses 29 | ``xmlns="http://www.w3.org/2000/svg"`` to declare a **default namespace** , 30 | every other element will be in that namespace unless the element name includes 31 | a prefix for a different namespace, or unless an element declares a different 32 | default namespace for itself and its children. 33 | 34 | The first child element in the document is a ```` element with no 35 | namespace prefix, so it is associated with the default namespace URI: 36 | ``http://www.w3.org/2000/svg``. 37 | 38 | .. literalinclude:: /code/xml-libxml.svg 39 | :language: xml 40 | :lines: 22 41 | 42 | A later section of the document includes a ```` element with the `dc:` 43 | namespace prefix, so it is associated with the URI: 44 | ``http://purl.org/dc/elements/1.1/``. 45 | 46 | .. literalinclude:: /code/xml-libxml.svg 47 | :language: xml 48 | :lines: 127 49 | 50 | You can confirm using the XPath sandbox that the XPath expression ``//title`` 51 | does not match either of the ``<title>`` elements in the test document: 52 | 53 | .. xpath-try:: //title 54 | :filename: xml-libxml.svg 55 | 56 | You can also use the following Perl code to confirm that ``findnodes()`` does 57 | not return any matches for the XPath expression ``//title``: 58 | 59 | .. literalinclude:: /code/600-ns-no-context.pl 60 | :language: perl 61 | :lines: 13-14 62 | 63 | Output: 64 | 65 | .. literalinclude:: /_output/600-ns-no-context.pl-out 66 | :language: none 67 | :lines: 1 68 | 69 | When an element in a document is associated with a namespace URI it will only 70 | match an XPath expression that includes a prefix that is also associated with 71 | the same namespace URI. However it's important to stress that it's not the 72 | prefix that is being matched, but the URI associated with the prefix. 
73 | 74 | Using the XPath sandbox, you can confirm that if we register the 'Dublin Core' 75 | namespace URI with the prefix ``dc``, the XPath expression ``//dc:title`` will 76 | match the ``<title>`` element in the ``<metadata>`` section: 77 | 78 | .. xpath-try:: //dc:title 79 | :filename: xml-libxml.svg 80 | :ns_args: xmlns:dc=http://purl.org/dc/elements/1.1/ 81 | 82 | However if we register the same URI with the prefix ``dublin`` instead then 83 | we can match the same element using the ``dublin`` prefix in our XPath: 84 | 85 | .. xpath-try:: //dublin:title 86 | :filename: xml-libxml.svg 87 | :ns_args: xmlns:dublin=http://purl.org/dc/elements/1.1/ 88 | 89 | In order to associate namespace prefixes in XPath expressions with namespace 90 | URIs, we need to use an `XML::LibXML::XPathContext 91 | <https://metacpan.org/pod/XML::LibXML::XPathContext>`_ object. This is a 92 | multi-step process: 93 | 94 | #. create an XPathContext object associated with the document you want to search 95 | 96 | #. register a prefix and associated URI for each namespace you want to include 97 | in your query 98 | 99 | #. call the ``findnodes()`` method on the XPathContext object rather than 100 | directly on the DOM object 101 | 102 | .. literalinclude:: /code/610-ns-xpc.pl 103 | :language: perl 104 | :lines: 7-21 105 | 106 | Output: 107 | 108 | .. literalinclude:: /_output/610-ns-xpc.pl-out 109 | :language: none 110 | 111 | You'll recall from earlier examples that you can search within a node by 112 | calling ``findnodes()`` on the element node (rather than the document) and 113 | using an XPath expression like ``./child`` where the dot refers to the 114 | *context* node. However when you're dealing with namespaces that won't work, 115 | because you need to call ``findnodes()`` on the XPathContext object. The 116 | solution is to pass ``findnodes()`` a second argument, after the XPath 117 | expression. The additional argument is the element to use as a context node: 118 | 119 | .. 
literalinclude:: /code/620-ns-child-nodes.pl 120 | :language: perl 121 | :lines: 7-23 122 | 123 | Output: 124 | 125 | .. literalinclude:: /_output/620-ns-child-nodes.pl-out 126 | :language: none 127 | 128 | One small feature of that script which is worth noting is the use of 129 | ``$el->localname`` to get the name of the element *without* the namespace 130 | prefix. The more commonly used ``$el->nodeName`` method does include the 131 | namespace prefix as it appears in the document. 132 | 133 | -------------------------------------------------------------------------------- /source/sphinx-ext/plxbe.py: -------------------------------------------------------------------------------- 1 | from docutils import nodes 2 | from docutils.parsers.rst import directives 3 | from sphinx.util.compat import Directive 4 | from urllib import quote 5 | 6 | def setup(app): 7 | """ 8 | Add an 'xpath-try' custom directive. For HTML output, this will render the 9 | XPath expression as a code block but with the addition of a link to try the 10 | expression in the XPath Sandbox. For non-HTML output, a standard code 11 | block is emitted with no link. 
12 | """ 13 | app.add_node(xpath_try, 14 | html=(visit_xpath_try_node_html, depart_xpath_try_node), 15 | latex=(visit_xpath_try_node, depart_xpath_try_node), 16 | text=(visit_xpath_try_node, depart_xpath_try_node)) 17 | 18 | app.add_directive('xpath-try', XPathTryDirective) 19 | 20 | return {'version': '1.0'} # version of this extension 21 | 22 | 23 | class xpath_try(nodes.General, nodes.FixedTextElement): 24 | pass 25 | 26 | def visit_xpath_try_node(self, node): 27 | self.visit_literal_block(node) 28 | 29 | def visit_xpath_try_node_html(self, node): 30 | self.body.append( 31 | '<p><code class="code xpath-try docutils literal"><span class="pre">' 32 | + self.encode(node.rawsource) 33 | + '</span><a class="xpath-try-it" href="' + node['url'] 34 | + '">Try it!</a></code></p>' 35 | ) 36 | raise nodes.SkipNode 37 | 38 | def depart_xpath_try_node(self, node): 39 | self.depart_literal_block(node) 40 | 41 | def unescape_arg(s): 42 | """ 43 | This function is used to undo backslash-escaping. While it's not clear that 44 | arguments to a directive ought to be escaped, vim syntax highlighting seems 45 | to get confused by '*' in arguments. For this reason, '\*' has been used in 46 | the rst source and this function turns '\_' into '_' for any value of '_'. 47 | The standard docutils.utils.unescape seems to do something else entirely. 
48 | """ 49 | a = [ '\\' if part == '' else part for part in s.split('\\') ] 50 | return ''.join(a) 51 | 52 | class XPathTryDirective(Directive): 53 | has_content = False 54 | required_arguments = 1 55 | optional_arguments = 0 56 | final_argument_whitespace = True 57 | option_spec = { 58 | 'filename': directives.unchanged, 59 | 'ns_args': directives.unchanged, 60 | } 61 | 62 | def run(self): 63 | xpath_expr = unescape_arg(self.arguments[0]) 64 | node = xpath_try(xpath_expr, xpath_expr) 65 | node['language'] = 'none' 66 | node['highlight_args'] = {} 67 | node['linenos'] = False 68 | url = '_static/xpath-sandbox/xpath-sandbox.html?q=' 69 | url += quote(xpath_expr, safe='/') 70 | if 'filename' in self.options: 71 | url += ';filename=' + self.options.get('filename') 72 | if 'ns_args' in self.options: 73 | url += ';' + self.options.get('ns_args') 74 | node['url'] = url 75 | return [node] 76 | 77 | -------------------------------------------------------------------------------- /source/xpath.rst: -------------------------------------------------------------------------------- 1 | .. highlight:: none 2 | :linenothreshold: 1 3 | 4 | XPath Expressions 5 | ================= 6 | 7 | As you saw in the :doc:`basic examples <basics>` section, the ``findnodes()`` 8 | method takes an XPath expression and finds nodes in the :doc:`DOM <dom>` that 9 | match the expression. There are two ways to call the ``findnodes()`` 10 | method: 11 | 12 | * on the object representing the whole document, or 13 | 14 | * on an element from the DOM - the element on which you call the method is 15 | called the context element 16 | 17 | If your XPath expression starts with a '/' then the search will start at the 18 | top-most element in the document - even if you call ``findnodes()`` on a 19 | different context element. 20 | 21 | Start your XPath expression with '.' to search down through the children of the 22 | context element. 
23 | 24 | The remainder of this section simply includes examples of XPath expressions and 25 | descriptions of what they match. 26 | 27 | .. note:: 28 | 29 | You can try out different XPath expressions in the `XPath sandbox 30 | <http://grantm.github.io/perl-libxml-by-example/_static/xpath-sandbox/xpath-sandbox.html>`_. 31 | The sandbox doesn't actually use Perl or libxml, it simply uses Javascript 32 | to access the XPath engine built into your browser. However, the 33 | expression matching should work just as it would in your Perl scripts. 34 | 35 | .. role:: xpath(code) 36 | 37 | .. xpath-try:: /playlist 38 | 39 | Match the top-most element of the document if (and *only if*) it is a 40 | ``<playlist>`` element. 41 | 42 | .. xpath-try:: //title 43 | 44 | Match every ``<title>`` element in the document. 45 | 46 | .. xpath-try:: //movie/title 47 | 48 | Match every ``<title>`` element that is the direct child of a ``<movie>`` 49 | element. 50 | 51 | :xpath:`./title` 52 | 53 | Match every ``<title>`` element that is the direct child of the context 54 | element, e.g.: 55 | 56 | .. literalinclude:: /code/100-xpath-examples 57 | :language: perl 58 | :lines: 13-15 59 | 60 | .. xpath-try:: //title/.. 61 | 62 | Match any element which is the parent of a ``<title>`` element. 63 | 64 | .. xpath-try:: /\* 65 | 66 | Match the top-most element of the document regardless of the element name. 67 | 68 | .. xpath-try:: //person/@role 69 | 70 | Match the attribute named ``role`` on every ``<person>`` element. 71 | 72 | .. xpath-try:: //person/@\* 73 | 74 | Match every attribute on every ``<person>`` element. 75 | 76 | .. xpath-try:: //person[@role] 77 | 78 | Match every ``<person>`` element *that has an attribute* named ``role``. 79 | 80 | .. xpath-try:: //\*[@url] 81 | 82 | Match every element that has an attribute named ``url``. 83 | 84 | .. xpath-try:: //\*[@\*] 85 | 86 | Match every element that has an attribute of any name. 87 | 88 | .. 
xpath-try:: /playlist//\*[not(@\*)] 89 | 90 | Match every element that is a descendant of the top-level ``<playlist>`` 91 | element and which does not have any attributes. 92 | 93 | .. xpath-try:: //movie[@id="tt0307479"] 94 | 95 | Match every ``<movie>`` element that has an attribute named ``id`` with the 96 | value ``tt0307479``. 97 | 98 | .. xpath-try:: //movie[not(@id="tt0307479")] 99 | 100 | Match every ``<movie>`` element that does not have an attribute named 101 | ``id`` with the value ``tt0307479`` (including elements that do not have 102 | an ``id`` attribute at all). 103 | 104 | .. xpath-try:: //\*[@id="tt0307479"] 105 | 106 | Match every element that has an attribute named ``id`` with the value 107 | ``tt0307479``. 108 | 109 | .. xpath-try:: //movie[@id="tt0307479"]//synopsis 110 | 111 | Match every ``synopsis`` element within every ``<movie>`` element that has 112 | an attribute named ``id`` with the value ``tt0307479``. 113 | 114 | .. xpath-try:: //person[position()=2] 115 | 116 | Match the second ``<person>`` element in each sequence of adjacent 117 | ``<person>`` elements. Note that the first element in a sequence is at 118 | position 1 not 0. 119 | 120 | .. xpath-try:: //person[2] 121 | 122 | This is simply a shorthand form of the ``position()=2`` expression above. 123 | 124 | .. xpath-try:: //person[position()<3] 125 | 126 | Match the first two ``<person>`` elements in each sequence of adjacent 127 | ``<person>`` elements. 128 | 129 | .. xpath-try:: //person[last()] 130 | 131 | Match the last ``<person>`` element in each sequence of adjacent 132 | ``<person>`` elements. 133 | 134 | .. xpath-try:: //cast[count(person)=3] 135 | 136 | Match every ``<cast>`` element which contains exactly 3 ``<person>`` 137 | elements. 138 | 139 | .. xpath-try:: //\*[name()='genre'] 140 | 141 | Match every element with the name ``genre`` - exactly equivalent to 142 | ``//genre``. 143 | 144 | .. 
xpath-try:: //\*[starts-with(name(), 'running')] 145 | 146 | Match every element with a name starting with the word ``running``. 147 | 148 | .. xpath-try:: //person[contains(@name, 'Matt')] 149 | 150 | Match every ``<person>`` element that has an attribute named ``name`` 151 | which contains the text ``Matt`` anywhere in the attribute value. 152 | 153 | .. xpath-try:: //person[contains(@name, 'matt')] 154 | 155 | Same as above except for the casing of the text to match. Matching is 156 | case-sensitive. 157 | 158 | .. xpath-try:: //person[not(contains(@name, 'e'))] 159 | 160 | Match every ``<person>`` element that has an attribute named ``name`` 161 | which does not contain the letter ``e`` anywhere in the attribute value. 162 | 163 | .. xpath-try:: //person[starts-with(@name, 'K')] 164 | 165 | Match every ``<person>`` element that has an attribute named ``name`` with 166 | a value that starts with the letter ``K``. 167 | 168 | .. xpath-try:: //director/text() 169 | 170 | Match every text node which is a direct child of a ``<director>`` element. 171 | 172 | .. xpath-try:: //cast/text() 173 | 174 | Match every text node which is a direct child of a ``<cast>`` element. 175 | You might imagine that this would not match anything, since in the sample 176 | document the ``<cast>`` elements contain only ``<person>`` elements. But 177 | if you look carefully, you'll see that in between each ``<person>`` element 178 | there is some whitespace - a newline after the preceding element and then 179 | some spaces at the start of the next line. This whitespace is text and is 180 | therefore matched. 181 | 182 | .. xpath-try:: //person[contains(@name,'Matt')]/parent::\* 183 | 184 | Match the parent of every ``<person>`` element which contains ``Matt`` in 185 | the ``name`` attribute. (You could also use ``/..`` for the parent). The 186 | syntax ``parent::*`` means any element on the parent axis. 187 | 188 | .. 
xpath-try:: //person[contains(@name,'Matt')]/ancestor::movie 189 | 190 | Match every ``<movie>`` element which is an ancestor of a ``<person>`` 191 | element which contains ``Matt`` in the ``name`` attribute. The syntax 192 | ``ancestor::*`` means any element on the ancestor axis. 193 | 194 | .. xpath-try:: //genre[text()='drama']/following-sibling::\* 195 | 196 | Match every element of any name, which is a sibling of a ``<genre>`` 197 | element whose complete text content is ``drama`` and which follows that 198 | element in document order. 199 | 200 | .. xpath-try:: //genre[text()='drama']/following-sibling::genre 201 | 202 | Match every ``<genre>`` element, which is a sibling of a ``<genre>`` 203 | element whose complete text content is ``drama`` and which follows that 204 | element in document order. 205 | 206 | .. xpath-try:: //genre[text()='drama']/preceding-sibling::genre 207 | 208 | Match every ``<genre>`` element, which is a sibling of a ``<genre>`` 209 | element whose complete text content is ``drama`` and which comes before 210 | that element in document order. 211 | 212 | .. xpath-try:: //movie[@id="tt0112384"]/following::title 213 | 214 | Match every ``<title>`` element, which comes after a ``<movie>`` element 215 | with ``tt0112384`` as the value of the ``id`` attribute. Note that 'after' 216 | means after the closing tag so a ``<title>`` element *inside* the matching 217 | ``<movie>`` would not be included. 218 | 219 | .. xpath-try:: //movie[.//score/text() < 7.5] 220 | 221 | Match every ``<movie>`` element which contains a ``<score>`` element with 222 | text content numerically less than 7.5. 223 | 224 | .. xpath-try:: //movie[.//score/text() > 8.0]//synopsis 225 | 226 | Match every ``<synopsis>`` element in every ``<movie>`` element which 227 | contains a ``<score>`` element with text content numerically greater than 228 | 8.0. 229 | 230 | .. 
xpath-try:: //director | //genre 231 | 232 | Match every element which is a ``<director>`` or a ``<genre>``. 233 | 234 | .. xpath-try:: //person[contains(@name, 'Bill') and contains(@role, 'Fred')] 235 | 236 | Match every ``<person>`` element which contains ``Bill`` in the ``name`` 237 | attribute **and** contains ``Fred`` in the ``role`` attribute. 238 | 239 | .. xpath-try:: //person[@name='Kevin Bacon']/../person[@name!='Kevin Bacon'] 240 | 241 | Find every person who has played alongside Kevin Bacon. First find every 242 | ``<person>`` element with a name attribute equal to ``Kevin Bacon``. Then 243 | find the parent of each matching element and look for its child 244 | ``<person>`` elements with a name attribute which is not equal to ``Kevin 245 | Bacon``. 246 | 247 | XPath Functions 248 | --------------- 249 | 250 | Some of the examples above used `XPath functions 251 | <https://developer.mozilla.org/en-US/docs/Web/XPath/Functions>`_. It's worth 252 | noting that the underlying libxml2 library only supports XPath version 1.0 and 253 | there are `no plans to support 2.0 254 | <http://www.mail-archive.com/xml@gnome.org/msg04082.html>`_. 255 | 256 | XPath 1.0 does not include the ``lower-case()`` or ``upper-case()`` functions, 257 | so nasty workarounds like this are required if you need case-insensitive 258 | matching: 259 | 260 | .. literalinclude:: /code/110-case-insensitive-xpath-1 261 | :language: perl 262 | :lines: 13-28 263 | 264 | Alternatively, you can use the Perl API to `register custom XPath functions 265 | <https://metacpan.org/pod/distribution/XML-LibXML/lib/XML/LibXML/XPathContext.pod#Custom-XPath-functions>`_. 266 | 267 | -------------------------------------------------------------------------------- /xpath-sandbox: -------------------------------------------------------------------------------- 1 | source/_static/xpath-sandbox --------------------------------------------------------------------------------