├── .github └── workflows │ └── test.yml ├── .gitignore ├── LICENSE ├── MANIFEST.in ├── README.md ├── djxml ├── __init__.py ├── build │ ├── __init__.py │ └── rst_converter.py ├── tests │ ├── __init__.py │ ├── data │ │ ├── atom2rss.xsl │ │ └── atom_feed.xml │ ├── settings.py │ ├── test_advanced_example.py │ ├── test_examples.py │ └── xmlmodels.py └── xmlmodels │ ├── __init__.py │ ├── base.py │ ├── decorators.py │ ├── descriptors.py │ ├── exceptions.py │ ├── fields.py │ ├── loading.py │ ├── options.py │ ├── related.py │ ├── signals.py │ └── utils.py ├── docs └── advanced_example.md ├── runtests.py ├── setup.cfg ├── setup.py └── tox.ini /.github/workflows/test.yml: -------------------------------------------------------------------------------- 1 | name: Test 2 | 3 | on: push 4 | 5 | jobs: 6 | build: 7 | runs-on: ubuntu-latest 8 | strategy: 9 | matrix: 10 | python-version: [3.7, 3.8] 11 | 12 | steps: 13 | - uses: actions/checkout@v1 14 | - name: Set up Python ${{ matrix.python-version }} 15 | uses: actions/setup-python@v2 16 | with: 17 | python-version: ${{ matrix.python-version }} 18 | - name: Install dependencies 19 | run: | 20 | python -m pip install --upgrade pip 21 | pip install tox tox-gh-actions 22 | - name: Test with tox 23 | run: tox 24 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /build 2 | /dist 3 | /*.egg-info 4 | *.pyc 5 | *.log 6 | *.egg 7 | *.db 8 | *.pid 9 | pip-log.txt 10 | .DS_Store 11 | /docs/_build 12 | /out 13 | .tox 14 | /venv 15 | .python-version 16 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | This software is published under the BSD 2-Clause License as listed below. 
2 | http://www.opensource.org/licenses/bsd-license.php 3 | 4 | Copyright (c) 2014-2019, Atlantic Media 5 | All rights reserved. 6 | 7 | Redistribution and use in source and binary forms, with or without 8 | modification, are permitted provided that the following conditions are met: 9 | 10 | * Redistributions of source code must retain the above copyright notice, this 11 | list of conditions and the following disclaimer. 12 | * Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 17 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 18 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
26 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include LICENSE 2 | include README.rst 3 | include README.md 4 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # django-xml 2 | 3 | [![Test Status](https://github.com/theatlantic/django-xml/actions/workflows/test.yml/badge.svg)](https://github.com/theatlantic/django-xml/actions) 4 | 5 | **django-xml** is a python module which provides an abstraction to 6 | [lxml](http://lxml.de/)'s XPath and XSLT functionality in a manner resembling 7 | django database models. 8 | 9 | ## Note 10 | 11 | * Version 2.0 drops support for Django < 1.11 12 | * Version 2.0.1 drops support for Python 3.4 13 | * Version 3.0 adds support for Django>=2.2, drops support for Python < 3.7 14 | 15 | 16 | ## Contents 17 | 18 | * [Installation](#installation) 19 | * [Example](#example) 20 | * [Advanced Example](#advanced-example) 21 | * [XmlModel Meta options](#xmlmodel-meta-options) 22 | * [namespaces](#namespacesoptionsnamespaces--) 23 | * [parser_opts](#parser_optsoptionsparser_opts--) 24 | * [extension_ns_uri](#extension_ns_urioptionsextension_ns_uri) 25 | * [@lxml_extension reference](#lxml_extension-reference) 26 | * [ns_uri](#ns_uri) 27 | * [name](#name) 28 | * [XPathField options](#xpathfield-options) 29 | * [xpath_query](#xpath_queryxpathfieldxpath_query) 30 | * [required](#requiredxpathfieldrequired) 31 | * [extra_namespaces](#extra_namespacesxpathfieldextra_namespaces) 32 | * [extensions](#extensionsxpathfieldextensions) 33 | * [XPathSingleNodeField options](#xpathsinglenodefield-options) 34 | * [ignore_extra_nodes](#ignore_extra_nodesxpathsinglenodefieldignore_extra_nodes--false) 35 | * [XsltField options](#xsltfield-options) 36 | * [xslt_file, 
xslt_string](#xslt_file-xslt_stringxsltfieldxslt_filexsltfieldxslt_string) 37 | * [parser](#parserxsltfieldparser) 38 | * [extensions](#extensionsxsltfieldextensions--) 39 | * [XmlModel field reference](#xmlmodel-field-reference) 40 | 41 | ## Installation 42 | 43 | To install the latest stable release of django-xml, use pip or easy_install: 44 | 45 | ```bash 46 | pip install django-xml 47 | easy_install django-xml 48 | ``` 49 | 50 | For the latest development version, install from source with pip: 51 | 52 | ```bash 53 | pip install -e git+git://github.com/theatlantic/django-xml#egg=django-xml 54 | ``` 55 | 56 | If the source is already checked out, install via setuptools: 57 | 58 | ```bash 59 | python setup.py develop 60 | ``` 61 | 62 | ## Example 63 | 64 | ```python 65 | import math 66 | from djxml import xmlmodels 67 | 68 | class NumbersExample(xmlmodels.XmlModel): 69 | 70 | class Meta: 71 | extension_ns_uri = "urn:local:number-functions" 72 | namespaces = {"fn": extension_ns_uri,} 73 | 74 | all_numbers = xmlmodels.XPathIntegerListField("//num") 75 | even_numbers = xmlmodels.XPathIntegerListField("//num[fn:is_even(.)]") 76 | sqrt_numbers = xmlmodels.XPathFloatListField("fn:sqrt(//num)") 77 | 78 | @xmlmodels.lxml_extension 79 | def is_even(self, context, number_nodes): 80 | numbers = [getattr(n, 'text', n) for n in number_nodes] 81 | return all([bool(int(num) % 2 == 0) for num in numbers]) 82 | 83 | @xmlmodels.lxml_extension 84 | def sqrt(self, context, number_nodes): 85 | sqrts = [] 86 | for number_node in number_nodes: 87 | number = getattr(number_node, 'text', number_node) 88 | sqrts.append(repr(math.sqrt(int(number)))) 89 | return sqrts 90 | 91 | 92 | def main(): 93 | numbers_xml = u""" 94 | 95 | 1 96 | 2 97 | 3 98 | 4 99 | 5 100 | 6 101 | 7 102 | """ 103 | 104 | example = NumbersExample.create_from_string(numbers_xml) 105 | 106 | print("all_numbers = %r" % example.all_numbers) 107 | print("even_numbers = %r" % example.even_numbers) 108 | print("sqrt_numbers = [%s]" % ', '.join(['%.3f' % n for n in example.sqrt_numbers])) 109 | # all_numbers = [1, 2, 3, 4, 5, 6, 7] 110 | # even_numbers = [2, 4, 6] 111 | # sqrt_numbers = [1.000, 1.414, 1.732, 2.000, 2.236, 2.449, 2.646] 112 | 113 | if __name__ == '__main__': 114 | main() 115 | ``` 116 | 117 | ## Advanced Example 118 | 119 | An example of django-xml usage which includes XsltField and @lxml_extension methods 120 | can be found [here](https://github.com/theatlantic/django-xml/blob/master/docs/advanced_example.md). 121 | 122 | ## XmlModel Meta options 123 | 124 | Metadata for an `XmlModel` is passed as attributes of an 125 | internal class named `Meta`. Listed below are the options 126 | that can be set on the `Meta` class. 127 | 128 | #### namespaces
`Options.namespaces = {}` 129 | 130 | A dict mapping namespace prefixes to namespace URIs. It is passed to 131 | [`lxml.etree.XPathEvaluator()`](http://lxml.de/api/lxml.etree-module.html#XPathEvaluator) 132 | for all XPath fields on the model. 133 | 134 | #### parser_opts
`Options.parser_opts = {}` 135 | 136 | A dict of keyword arguments to pass to 137 | [lxml.etree.XMLParser()](http://lxml.de/api/lxml.etree.XMLParser-class.html). 138 | 139 | #### extension_ns_uri
`Options.extension_ns_uri` 140 | 141 | The default namespace URI to use for extension functions created using the 142 | `@lxml_extension` decorator. 143 | 144 | ## @lxml_extension reference 145 | 146 |
def lxml_extension(method=None, ns_uri=None, name=None)
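
A signature of this shape, with the method as an optional first argument, usually means the decorator can be applied either bare or with keyword arguments. The block below is a minimal sketch of that pattern, not the library's actual implementation (`decorators.py` is not shown here). Only the `is_lxml_extension` marker is taken from the source, where `XmlModelBase.add_to_class` checks for it; the name `lxml_extension_sketch` and the `sketch_ns_uri` / `sketch_name` attributes are hypothetical.

```python
import functools


def lxml_extension_sketch(method=None, ns_uri=None, name=None):
    """Illustrative stand-in for a decorator with the signature above."""
    if method is None:
        # Used as @lxml_extension_sketch(ns_uri=..., name=...): return a
        # decorator that remembers the keyword arguments.
        return functools.partial(lxml_extension_sketch, ns_uri=ns_uri, name=name)
    # Used bare as @lxml_extension_sketch, or called by the partial above.
    method.is_lxml_extension = True               # marker checked by the metaclass
    method.sketch_ns_uri = ns_uri                 # hypothetical attribute name
    method.sketch_name = name or method.__name__  # hypothetical attribute name
    return method


class Demo:
    @lxml_extension_sketch
    def is_even(self, context, nodes):
        return True

    @lxml_extension_sketch(ns_uri="urn:local:demo", name="sq")
    def square(self, context, nodes):
        return []


print(Demo.is_even.is_lxml_extension, Demo.is_even.sketch_name)
print(Demo.square.sketch_ns_uri, Demo.square.sketch_name)
```

Both spellings leave the original function in place and only attach metadata, so a metaclass can pick up the marked methods later.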
147 | 148 | The `@lxml_extension` decorator registers model methods as 149 | lxml extensions that can be used in XPathFields and XsltFields. All of its 150 | keyword arguments are optional. 151 | 152 | #### ns_uri 153 | 154 | The namespace URI for the function. If used in an `XPathField`, this URI must 155 | be one of the values in the namespaces attribute of the XmlModel's internal 156 | `Meta` class. If used in an XSLT, the namespace must be defined in 157 | the xslt file or string. 158 | 159 | Defaults to the value of the `extension_ns_uri` attribute of the 160 | XmlModel's internal `Meta` class, if defined. If neither the 161 | `extension_ns_uri` attribute of XmlModel.Meta nor the 162 | `ns_uri` keyword argument is set, an `ExtensionNamespaceException` 163 | will be thrown. 164 | 165 | #### name 166 | 167 | The name of the function to register. Defaults to the method's name. 168 | 169 | ## XPathField options 170 | 171 | The following arguments are available to all XPath field types. All but the 172 | first are optional. 173 | 174 | 175 | #### xpath_query
`XPathField.xpath_query` 176 | 177 | The XPath query string to perform on the document. Required. 178 | 179 | #### required
`XPathField.required = True` 180 | 181 | If `True`, a `DoesNotExist` exception will be thrown if no nodes match the 182 | XPath query for the field. Defaults to `True`. 183 | 184 | #### extra_namespaces
`XPathField.extra_namespaces` 185 | 186 | A dict of extra prefix/uri namespace pairs to pass to 187 | [`lxml.etree.XPathEvaluator()`](http://lxml.de/api/lxml.etree-module.html#XPathEvaluator). 188 | 189 | #### extensions
`XPathField.extensions` 190 | 191 | Extra extensions to pass on to 192 | [`lxml.etree.XPathEvaluator`](http://lxml.de/api/lxml.etree-module.html#XPathEvaluator). 193 | See the [lxml documentation](http://lxml.de/extensions.html#evaluator-local-extensions) 194 | for details on how to form the `extensions` keyword argument. 195 | 196 | ## XPathSingleNodeField options 197 | 198 | #### ignore_extra_nodes
`XPathSingleNodeField.ignore_extra_nodes = False` 199 | 200 | If `True`, return only the first node of the XPath evaluation result, even if it 201 | evaluates to more than one node. If `False`, accessing an xpath field which 202 | evaluates to more than one node will throw a `MultipleObjectsExist` exception. 203 | Defaults to `False`. 204 | 205 | To return the full list of nodes, use an `XPathListField`. 206 | 207 | ## XsltField options 208 | 209 | #### xslt_file, xslt_string
`XsltField.xslt_file`
`XsltField.xslt_string` 210 | 211 | The first positional argument to XsltField is the path to an xslt file. 212 | Alternatively, the xslt can be passed as a string using the 213 | `xslt_string` keyword argument. One of the two must be 214 | specified. 215 | 216 | #### parser
`XsltField.parser` 217 | 218 | An instance of [lxml.etree.XMLParser](http://lxml.de/api/lxml.etree.XMLParser-class.html) 219 | to override the one created by the XmlModel class. To override parsing options 220 | for the entire class, use the [`parser_opts`](#parser_optsoptionsparser_opts--) 221 | attribute of the [XmlModel internal `Meta` class](#xmlmodel-meta-options). 222 | 223 | #### extensions
`XsltField.extensions = {}` 224 | 225 | Extra extensions to pass on to the constructor of 226 | [`lxml.etree.XSLT`](http://lxml.de/api/lxml.etree.XSLT-class.html#__init__). 227 | See the [lxml documentation](http://lxml.de/extensions.html#evaluator-local-extensions) 228 | for details on how to form the `extensions` keyword argument. 229 | 230 | ## XmlModel field reference 231 | 232 | ```python 233 | class XsltField(xslt_file=None, xslt_string=None, parser=None, extensions=None) 234 | ``` 235 | 236 | Field which abstracts the creation of 237 | [lxml.etree.XSLT](http://lxml.de/api/lxml.etree.XSLT-class.html) objects. 238 | This field's return type is a callable which accepts keyword arguments that 239 | are passed as parameters to the stylesheet. 240 | 241 | ```python 242 | class XPathField(xpath_query, required=False, extra_namespaces=None, extensions=None) 243 | ``` 244 | 245 | Base field for abstracting the retrieval of node results from the xpath 246 | evaluation of an xml etree. 247 | 255 | ```python 256 | class XPathListField(xpath_query, required=False, extra_namespaces=None, extensions=None) 257 | ``` 258 | 259 | Field which abstracts retrieving a list of nodes from the xpath evaluation 260 | of an xml etree. 261 | 262 | ```python 263 | class XPathSingleNodeField(xpath_query, required=False, extra_namespaces=None, 264 | extensions=None, ignore_extra_nodes=False) 265 | ``` 266 | 267 | Field which abstracts retrieving the first node result from the xpath 268 | evaluation of an xml etree. 269 | 270 | ```python 271 | class XPathTextField(XPathSingleNodeField) 272 | ``` 273 | 274 | Returns a unicode value when accessed. 
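
To make "returns a value when accessed" concrete, here is a minimal sketch of a field that evaluates its query lazily on attribute access, using a Python descriptor. It is not django-xml's implementation: the standard library's `xml.etree.ElementTree` stands in for lxml (it supports only a limited XPath subset), and the `TextFieldSketch` and `FeedSketch` names are hypothetical.

```python
import xml.etree.ElementTree as ET


class TextFieldSketch:
    """Hypothetical descriptor: runs a path query when the attribute is read."""

    def __init__(self, path, required=True):
        self.path = path
        self.required = required

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        node = instance.root.find(self.path)
        if node is None:
            if self.required:
                raise LookupError("no node matches %r" % self.path)
            return None
        return node.text


class FeedSketch:
    title = TextFieldSketch("title")
    subtitle = TextFieldSketch("subtitle", required=False)

    def __init__(self, xml_string):
        self.root = ET.fromstring(xml_string)


feed = FeedSketch("<feed><title>Example Feed</title></feed>")
print(feed.title)
print(feed.subtitle)
```

The typed variants in this reference (integer, float, datetime) can be pictured as the same lookup followed by a conversion of the text value.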
275 | 276 | ```python 277 | class XPathIntegerField(XPathSingleNodeField) 278 | ``` 279 | 280 | Returns an int value when accessed. 281 | 282 | ```python 283 | class XPathFloatField(XPathSingleNodeField) 284 | ``` 285 | 286 | Returns a float value when accessed. 287 | 288 | ```python 289 | class XPathDateTimeField(XPathSingleNodeField) 290 | ``` 291 | 292 | Returns a datetime.datetime value when accessed. 293 | 294 | ```python 295 | class XPathTextListField(XPathListField) 296 | ``` 297 | 298 | Returns a list of unicode values when accessed. 299 | 300 | ```python 301 | class XPathIntegerListField(XPathListField) 302 | ``` 303 | 304 | Returns a list of int values when accessed. 305 | 306 | ```python 307 | class XPathFloatListField(XPathListField) 308 | ``` 309 | 310 | Returns a list of float values when accessed. 311 | 312 | ```python 313 | class XPathDateTimeListField(XPathListField) 314 | ``` 315 | 316 | Returns a list of datetime.datetime values when accessed. 317 | -------------------------------------------------------------------------------- /djxml/__init__.py: -------------------------------------------------------------------------------- 1 | __version__ = "3.0.1" 2 | -------------------------------------------------------------------------------- /djxml/build/__init__.py: -------------------------------------------------------------------------------- 1 | """Provides the entrypoint in setup.py for the create_readme_rst command""" 2 | 3 | from __future__ import absolute_import 4 | import os 5 | import codecs 6 | import setuptools 7 | from distutils import log 8 | 9 | from .rst_converter import PandocRSTConverter 10 | 11 | 12 | class create_readme_rst(setuptools.Command): 13 | 14 | description = "Convert README.md to README.rst" 15 | user_options = [] 16 | 17 | def initialize_options(self): pass 18 | def finalize_options(self): pass 19 | 20 | def run(self): 21 | converter = PandocRSTConverter() 22 | rst = converter.convert('README.md') 23 | 24 | # Replace 
API documentation with link to github readme 25 | api_link = ("\nAPI Documentation" 26 | "\n=================\n\n" 27 | "`Read API documentation on github " 28 | "`_") 30 | rst = converter.replace_section(rst, u'XmlModel Meta options', api_link, remove_header=True) 31 | rst = converter.replace_section(rst, u'XPathField options', u'', remove_header=True) 32 | rst = converter.replace_section(rst, u'XsltField options', u'', remove_header=True) 33 | rst = converter.replace_section(rst, u'XmlModel field reference', u'', remove_header=True) 34 | rst = converter.replace_section(rst, u'XPathSingleNodeField options', u'', remove_header=True) 35 | rst = converter.replace_section(rst, u'@lxml\_extension reference', u'', remove_header=True) 36 | rst = converter.replace_section(rst, u'Contents', u'', remove_header=True) 37 | 38 | outfile = os.path.join(os.path.dirname(__file__), '..', '..', 'README.rst') 39 | with codecs.open(outfile, encoding='utf8', mode='w') as f: 40 | f.write(rst) 41 | log.info("Successfully converted README.md to README.rst") 42 | -------------------------------------------------------------------------------- /djxml/build/rst_converter.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | from __future__ import absolute_import, unicode_literals 3 | import os 4 | import re 5 | import subprocess 6 | import distutils 7 | 8 | 9 | class PandocRSTConverter(object): 10 | 11 | replacements = ( 12 | # Remove lists with internal links (effectively, the TOC in a README). 
13 | # We remove the TOC because the anchors in a github markdown file 14 | # are broken when the rst is displayed on pypi 15 | ( 16 | re.compile(r'(?ms)^\- ((?:.(?!\n[\n\-]))*?\`__\n)'), 17 | '' 18 | ), 19 | # (shorten link terminator from two underscores to one): `__ => `_ 20 | (re.compile(r'\`__'), '`_'), 21 | # Replace, for example: 22 | # 23 | # code sample: 24 | # 25 | # :: 26 | # 27 | # def example(): pass 28 | # 29 | # with: 30 | # 31 | # code sample:: 32 | # 33 | # def example(): pass 34 | (re.compile(r'(?ms)(\:)\n\n\:(\:\n\n)'), r'\1\2'), 35 | # replace 3+ line breaks with 2 36 | (re.compile(r'\n\n\n+'), '\n\n'), 37 | # Remove syntax highlighting hints, which don't work on pypi 38 | (re.compile(r'(?m)\.\. code\:\: .*$'), '::'), 39 | ) 40 | 41 | from_format = 'markdown' 42 | 43 | def convert(self, path_): 44 | path = path_ if path_[0] == '/' else os.path.join( 45 | os.path.dirname(__file__), '..', '..', path_) 46 | 47 | if not os.path.exists(path): 48 | raise distutils.errors.DistutilsSetupError("File '%s' does not exist" % path_) 49 | 50 | pandoc_path = distutils.spawn.find_executable("pandoc") 51 | if pandoc_path is None: 52 | raise distutils.errors.DistutilsSetupError( 53 | "pandoc must be installed and in PATH to convert markdown to rst") 54 | 55 | rst = subprocess.check_output([ 56 | pandoc_path, 57 | "-f", self.from_format, 58 | "-t", "rst", 59 | path]) 60 | rst = self.replace_header_chars(rst) 61 | for regex, replacement in self.replacements: 62 | rst = regex.sub(replacement, rst) 63 | return rst 64 | 65 | header_char_map = ( 66 | ('=', '#'), 67 | ('-', '='), 68 | ('^', '-'), 69 | ("'", '.'), 70 | ) 71 | 72 | def replace_header_chars(self, rst_string): 73 | """Replace the default header chars with more sensible ones""" 74 | for from_char, to_char in self.header_char_map: 75 | def replace(matchobj): 76 | return to_char * len(matchobj.group(0)) 77 | regex = r'(?m)^%(from)s%(from)s+$' % {'from': re.escape(from_char), } 78 | rst_string = re.sub(regex, 
replace, rst_string) 79 | return rst_string 80 | 81 | def replace_section(self, rst, section_name, replacement, remove_header=False): 82 | if not len(replacement): 83 | replacement = u"\n" 84 | elif replacement[-1] != u"\n": 85 | replacement = u"%s\n" % replacement 86 | if remove_header: 87 | replacement = u"%s\n" % replacement 88 | else: 89 | replacement = u"\\1\n%s\n" % replacement 90 | regex = (r"""(?msx) 91 | (\n 92 | %(section_name)s\n 93 | ([%(header_chars)s])\2[^\n]+\n 94 | ).*?\n 95 | (?=(?: 96 | ^[^\n]+\n 97 | \2\2\2 98 | | 99 | \Z 100 | )) 101 | """) % { 102 | 'section_name': re.escape(section_name), 103 | 'header_chars': re.escape('-#=.'), 104 | } 105 | return re.sub(regex, replacement, rst) 106 | -------------------------------------------------------------------------------- /djxml/tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/theatlantic/django-xml/293bd1e2673bb746d3f8d0e29b4f19798507d943/djxml/tests/__init__.py -------------------------------------------------------------------------------- /djxml/tests/data/atom2rss.xsl: -------------------------------------------------------------------------------- 1 | 2 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | -------------------------------------------------------------------------------- /djxml/tests/data/atom_feed.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | Example Feed 4 | 5 | 2012-07-05T18:30:02Z 6 | urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6 7 | 8 | An example entry 9 | 10 | urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a 11 | 2012-07-05T18:30:02Z 12 | Some text. 
13 | 14 | 15 | -------------------------------------------------------------------------------- /djxml/tests/settings.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | import django 3 | 4 | 5 | DEBUG = True 6 | TEMPLATE_DEBUG = True 7 | DATABASES = { 8 | 'default': { 9 | 'ENGINE': 'django.db.backends.sqlite3', 10 | 'NAME': ':memory:' 11 | } 12 | } 13 | SECRET_KEY = 'z-i*xqqn)r0i7leak^#clq6y5j8&tfslp^a4duaywj2$**s*0_' 14 | MIDDLEWARE_CLASSES = tuple([]) 15 | SITE_ID = 1 16 | 17 | if django.VERSION >= (1, 6): 18 | TEST_RUNNER = 'django.test.runner.DiscoverRunner' 19 | else: 20 | TEST_RUNNER = 'discover_runner.runner.DiscoverRunner' 21 | -------------------------------------------------------------------------------- /djxml/tests/test_advanced_example.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | import os 3 | from doctest import Example 4 | 5 | from lxml import etree 6 | from lxml.doctestcompare import LXMLOutputChecker 7 | 8 | from django import test 9 | 10 | from .xmlmodels import AtomFeed, AtomEntry 11 | 12 | 13 | class TestAdvancedExample(test.TestCase): 14 | 15 | @classmethod 16 | def setUpClass(cls): 17 | super(TestAdvancedExample, cls).setUpClass() 18 | cls.example = AtomFeed.create_from_file( 19 | os.path.join(os.path.dirname(__file__), 'data', 'atom_feed.xml')) 20 | 21 | def assertXmlEqual(self, got, want): 22 | checker = LXMLOutputChecker() 23 | if not checker.check_output(want, got, 0): 24 | message = checker.output_difference(Example("", want), got, 0) 25 | raise AssertionError(message) 26 | 27 | def test_feed_title(self): 28 | self.assertEqual(self.example.title, "Example Feed") 29 | 30 | def test_feed_entry_title(self): 31 | self.assertIsInstance(self.example.entries[0], AtomEntry) 32 | self.assertEqual(self.example.entries[0].title, "An example entry") 33 | 34 | def 
test_transform_to_rss(self): 35 | expected = "\n".join([ 36 | '', 37 | ' Example Feed', 38 | '', 39 | ' http://example.org/', 40 | ' Thu, 05 Jul 2012 18:30:02Z', 41 | '', 42 | ' ', 43 | '', 44 | ' http://example.org/2003/12/13/atom03', 45 | ' urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a', 46 | ' Thu, 05 Jul 2012 18:30:02Z', 47 | ' <div>Some text.</div>', 48 | ' ', 49 | '', 50 | '\n']) 51 | self.assertXmlEqual(expected, etree.tounicode(self.example.transform_to_rss())) 52 | -------------------------------------------------------------------------------- /djxml/tests/test_examples.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from django import test 3 | 4 | from .xmlmodels import NumbersExample 5 | 6 | 7 | class TestExamples(test.TestCase): 8 | 9 | numbers_xml = u""" 10 | 11 | 1 12 | 2 13 | 3 14 | 4 15 | 5 16 | 6 17 | 7 18 | """ 19 | 20 | def test_all_numbers(self): 21 | example = NumbersExample.create_from_string(self.numbers_xml) 22 | self.assertEqual(example.all_numbers, [1, 2, 3, 4, 5, 6, 7]) 23 | 24 | def test_lxml_boolean_extension(self): 25 | example = NumbersExample.create_from_string(self.numbers_xml) 26 | self.assertEqual(example.even_numbers, [2, 4, 6]) 27 | 28 | def test_lxml_list_extension(self): 29 | example = NumbersExample.create_from_string(self.numbers_xml) 30 | self.assertEqual(example.square_numbers, 31 | [1, 4, 9, 16, 25, 36, 49]) 32 | -------------------------------------------------------------------------------- /djxml/tests/xmlmodels.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | import re 3 | import time 4 | import os 5 | from datetime import datetime 6 | from lxml import etree 7 | 8 | from djxml import xmlmodels 9 | 10 | 11 | class NumbersExample(xmlmodels.XmlModel): 12 | 13 | class Meta: 14 | extension_ns_uri = "urn:local:number-functions" 15 | namespaces = {"fn": 
extension_ns_uri,} 16 | 17 | all_numbers = xmlmodels.XPathIntegerListField("//num") 18 | even_numbers = xmlmodels.XPathIntegerListField("//num[fn:is_even(.)]") 19 | square_numbers = xmlmodels.XPathIntegerListField("fn:square(//num)") 20 | 21 | @xmlmodels.lxml_extension 22 | def is_even(self, context, number_nodes): 23 | numbers = [getattr(n, 'text', n) for n in number_nodes] 24 | return all([bool(int(num) % 2 == 0) for num in numbers]) 25 | 26 | @xmlmodels.lxml_extension 27 | def square(self, context, number_nodes): 28 | squares = [] 29 | for number_node in number_nodes: 30 | number = getattr(number_node, 'text', number_node) 31 | squares.append(repr(int(number) ** 2)) 32 | return squares 33 | 34 | 35 | strip_namespaces = etree.XSLT(etree.XML(""" 36 | 38 | 39 | 40 | 41 | """)) 42 | 43 | 44 | class AtomEntry(xmlmodels.XmlModel): 45 | 46 | class Meta: 47 | extension_ns_uri = "urn:local:atom-feed-functions" 48 | namespaces = { 49 | "fn": extension_ns_uri, 50 | "atom": "http://www.w3.org/2005/Atom", 51 | } 52 | 53 | @xmlmodels.lxml_extension 54 | def escape_xhtml(self, context, nodes): 55 | return u"".join([etree.tounicode(strip_namespaces(n)) for n in nodes]) 56 | 57 | title = xmlmodels.XPathTextField('atom:title') 58 | entry_id = xmlmodels.XPathTextField('atom:id') 59 | updated = xmlmodels.XPathDateTimeField('atom:updated') 60 | summary = xmlmodels.XPathInnerHtmlField('fn:escape_xhtml(atom:summary)') 61 | 62 | 63 | class AtomFeed(xmlmodels.XmlModel): 64 | 65 | class Meta: 66 | extension_ns_uri = "urn:local:atom-feed-functions" 67 | namespaces = { 68 | "fn": extension_ns_uri, 69 | "atom": "http://www.w3.org/2005/Atom", 70 | } 71 | 72 | title = xmlmodels.XPathTextField("/atom:feed/atom:title") 73 | 74 | updated = xmlmodels.XPathDateTimeField("/atom:feed/atom:*[%s]" \ 75 | % "local-name()='updated' or (local-name()='published' and not(../atom:updated))") 76 | 77 | entries = xmlmodels.EmbeddedXPathListField(AtomEntry, 78 | "/atom:feed/atom:entry", required=False) 79 | 80 
| transform_to_rss = xmlmodels.XsltField( 81 | os.path.join(os.path.dirname(__file__), "data", "atom2rss.xsl")) 82 | 83 | @xmlmodels.lxml_extension 84 | def escape_xhtml(self, context, nodes): 85 | return u"".join([etree.tounicode(strip_namespaces(n)) for n in nodes]) 86 | 87 | @xmlmodels.lxml_extension 88 | def convert_atom_date_to_rss(self, context, rfc3339_str): 89 | try: 90 | m = re.match(r"([\d:T-]+)(?:\.\d+)?(Z|[+-][\d:]{5})", rfc3339_str) 91 | except TypeError: 92 | return "" 93 | dt_str, tz_str = m.groups() 94 | dt = datetime(*[t for t in time.strptime(dt_str, "%Y-%m-%dT%H:%M:%S")][0:6]) 95 | tz_str = 'Z' if tz_str == 'Z' else tz_str[:3] + tz_str[4:] 96 | return dt.strftime("%a, %d %b %Y %H:%M:%S") + tz_str 97 | -------------------------------------------------------------------------------- /djxml/xmlmodels/__init__.py: -------------------------------------------------------------------------------- 1 | from .loading import (get_apps, get_app, get_xml_models, get_xml_model, # noqa 2 | register_xml_models,) 3 | from . 
import signals # noqa 4 | from .base import XmlModel # noqa 5 | from .decorators import lxml_extension # noqa 6 | from .fields import (XmlElementField, XmlPrimaryElementField, # noqa 7 | XPathSingleNodeField, XPathTextField, XPathIntegerField, 8 | XPathFloatField, XPathDateTimeField, XPathListField, 9 | XPathTextListField, XPathIntegerListField, 10 | XPathFloatListField, XPathDateTimeListField, XsltField, 11 | XPathHtmlField, XPathHtmlListField, 12 | XPathInnerHtmlField, XPathInnerHtmlListField, 13 | XPathBooleanField, XPathBooleanListField, SchematronField,) 14 | from .related import (EmbeddedXPathField, EmbeddedXPathListField, # noqa 15 | EmbeddedXsltField, EmbeddedSchematronField,) 16 | -------------------------------------------------------------------------------- /djxml/xmlmodels/base.py: -------------------------------------------------------------------------------- 1 | import re 2 | import sys 3 | import codecs 4 | import functools 5 | import copy 6 | 7 | from lxml import etree 8 | 9 | from django.core.exceptions import (ObjectDoesNotExist, FieldError, 10 | MultipleObjectsReturned,) 11 | from django.db.models.base import subclass_exception 12 | from django.utils.encoding import force_text 13 | from django.utils.encoding import smart_bytes, smart_str 14 | 15 | from .signals import xmlclass_prepared 16 | from .options import Options, DEFAULT_NAMES 17 | from .loading import register_xml_models, get_xml_model 18 | 19 | 20 | class XmlModelBase(type): 21 | """ 22 | Metaclass for xml models. 23 | """ 24 | 25 | def __new__(cls, name, bases, attrs): 26 | super_new = super(XmlModelBase, cls).__new__ 27 | parents = [b for b in bases if isinstance(b, XmlModelBase)] 28 | if not parents: 29 | # If this isn't a subclass of Model, don't do anything special. 30 | return super_new(cls, name, bases, attrs) 31 | 32 | # Create the class. 
33 | module = attrs.pop('__module__') 34 | new_attrs = {'__module__': module} 35 | classcell = attrs.pop('__classcell__', None) 36 | if classcell is not None: 37 | new_attrs['__classcell__'] = classcell 38 | new_class = super_new(cls, name, bases, new_attrs) 39 | 40 | attr_meta = attrs.pop('Meta', None) 41 | if not attr_meta: 42 | meta = getattr(new_class, 'Meta', None) 43 | else: 44 | meta = attr_meta 45 | 46 | if getattr(meta, 'app_label', None) is None: 47 | # Figure out the app_label by looking one level up. 48 | # For 'django.contrib.sites.models', this would be 'sites'. 49 | model_module = sys.modules[new_class.__module__] 50 | kwargs = {"app_label": model_module.__name__.split('.')[-2]} 51 | else: 52 | kwargs = {} 53 | 54 | for attr_name in DEFAULT_NAMES: 55 | if attr_name == 'app_label': 56 | continue 57 | if getattr(meta, attr_name, None) is None: 58 | for base in parents: 59 | if not hasattr(base, '_meta'): 60 | continue 61 | attr_val = getattr(base._meta, attr_name) 62 | if attr_val is not None: 63 | kwargs[attr_name] = attr_val 64 | break 65 | 66 | new_class.add_to_class('_meta', Options(meta, **kwargs)) 67 | 68 | new_class.add_to_class( 69 | 'DoesNotExist', 70 | subclass_exception( 71 | 'DoesNotExist', 72 | tuple( 73 | x.DoesNotExist for x in parents if hasattr(x, '_meta') 74 | ) or (ObjectDoesNotExist,), 75 | module, 76 | attached_to=new_class)) 77 | new_class.add_to_class( 78 | 'MultipleObjectsReturned', 79 | subclass_exception( 80 | 'MultipleObjectsReturned', 81 | tuple( 82 | x.MultipleObjectsReturned for x in parents if hasattr(x, '_meta') 83 | ) or (MultipleObjectsReturned,), 84 | module, 85 | attached_to=new_class)) 86 | 87 | # Bail out early if we have already created this class. 88 | m = get_xml_model(new_class._meta.app_label, name, False) 89 | if m is not None: 90 | return m 91 | 92 | # Add all attributes to the class. 
93 | for obj_name, obj in attrs.items(): 94 | new_class.add_to_class(obj_name, obj) 95 | 96 | field_names = set([f.name for f in new_class._meta.local_fields]) 97 | 98 | for base in parents: 99 | if not hasattr(base, '_meta'): 100 | # Things without _meta aren't functional models, so they're 101 | # uninteresting parents 102 | continue 103 | 104 | for field in base._meta.local_fields: 105 | if field.name in field_names: 106 | raise FieldError('Local field %r in class %r clashes ' 107 | 'with field of similar name from ' 108 | 'base class %r' % 109 | (field.name, name, base.__name__)) 110 | new_class.add_to_class(field.name, copy.deepcopy(field)) 111 | 112 | new_class._meta.parents.update(base._meta.parents) 113 | 114 | new_class._prepare() 115 | register_xml_models(new_class._meta.app_label, new_class) 116 | 117 | # Because of the way imports happen (recursively), we may or may not be 118 | # the first time this model tries to register with the framework. There 119 | # should only be one class for each model, so we always return the 120 | # registered version. 121 | return get_xml_model(new_class._meta.app_label, name, False) 122 | 123 | def add_to_class(cls, name, value): 124 | if hasattr(value, 'contribute_to_class'): 125 | value.contribute_to_class(cls, name) 126 | else: 127 | setattr(cls, name, value) 128 | if getattr(value, 'is_lxml_extension', False): 129 | cls._meta.add_extension(value, extension_name=name) 130 | 131 | def _prepare(cls): 132 | """ 133 | Creates some methods once self._meta has been populated. 134 | """ 135 | opts = cls._meta 136 | opts._prepare(cls) 137 | 138 | # Give the class a docstring -- its definition. 
139 | if cls.__doc__ is None: 140 | cls.__doc__ = "%s(%s)" % (cls.__name__, ", ".join([f.attname for f in opts.fields])) 141 | 142 | xmlclass_prepared.send(sender=cls) 143 | 144 | 145 | class XmlModel(metaclass=XmlModelBase): 146 | 147 | def __init__(self, root_element_tree): 148 | fields_iter = iter(self._meta.fields) 149 | 150 | for field in fields_iter: 151 | if getattr(field, 'is_root_field', False): 152 | val = root_element_tree 153 | else: 154 | val = None 155 | setattr(self, field.attname, val) 156 | 157 | super(XmlModel, self).__init__() 158 | 159 | def _get_etree_val(self, meta=None): 160 | if not meta: 161 | meta = self._meta 162 | return getattr(self, meta.etree.attname) 163 | 164 | _default_xpath_eval = None 165 | 166 | @property 167 | def default_xpath_eval(self): 168 | if self._default_xpath_eval is None: 169 | self._default_xpath_eval = self._get_xpath_eval() 170 | return self._default_xpath_eval 171 | 172 | def _merge_xpath_kwargs(self, ns=None, ext=None): 173 | """ 174 | Merge user-provided namespace and extension keywords with the model 175 | defaults. 176 | """ 177 | opts = self._meta 178 | 179 | xpath_kwargs = { 180 | 'namespaces': dict(getattr(opts, 'namespaces', None) or {}), # copy, so Meta.namespaces is never mutated 181 | 'extensions': {k: functools.partial(method, self) 182 | for k, method in opts.extensions.items()}} 183 | 184 | if ns is not None: 185 | xpath_kwargs['namespaces'].update(ns) 186 | if ext is not None: 187 | xpath_kwargs['extensions'].update(ext) 188 | return xpath_kwargs 189 | 190 | def _get_xpath_eval(self, namespaces=None, extensions=None): 191 | xpath_kwargs = self._merge_xpath_kwargs(ns=namespaces, ext=extensions) 192 | return etree.XPathEvaluator(self._get_etree_val(), **xpath_kwargs) 193 | 194 | def xpath(self, query, namespaces=None, extensions=None): 195 | """ 196 | Evaluate and return the results of an XPath query expression on the 197 | xml model. 
198 | 199 | query: The XPath query string 200 | namespaces: (optional) dict of extra prefix/uri namespace pairs to 201 | pass to lxml.etree.XPathEvaluator() 202 | extensions: (optional) Extra extensions to pass on to 203 | lxml.etree.XPathEvaluator() 204 | """ 205 | if namespaces is None and extensions is None: 206 | xpath_eval = self.default_xpath_eval 207 | else: 208 | xpath_eval = self._get_xpath_eval(namespaces=namespaces, extensions=extensions) 209 | return xpath_eval(query) 210 | 211 | @classmethod 212 | def create_from_string(cls, xml_source, parser=None): 213 | opts = cls._meta 214 | if parser is None: 215 | parser = opts.get_parser() 216 | # lxml doesn't like it when the header has an encoding, 217 | # so we strip out encoding="utf-8" with a regex 218 | xml_source = re.sub(r'(<\?xml[^\?]*?) encoding="(?:utf-8|UTF-8)"([^\?]*?\?>)', 219 | r'\1\2', xml_source) 220 | tree = etree.XML(xml_source, parser) 221 | return cls(tree) 222 | 223 | @classmethod 224 | def create_from_file(cls, xml_file): 225 | with codecs.open(xml_file, encoding='utf-8', mode='r') as f: 226 | xml_source = f.read() 227 | return cls.create_from_string(xml_source) 228 | 229 | def __repr__(self): 230 | try: 231 | u = str(self) 232 | except (UnicodeEncodeError, UnicodeDecodeError): 233 | u = '[Bad Unicode data]' 234 | return smart_str(u'<%s: %s>' % (self.__class__.__name__, u)) 235 | 236 | def __str__(self): 237 | return '%s object' % self.__class__.__name__ 238 | 239 | def __eq__(self, other): 240 | return isinstance(other, self.__class__) \ 241 | and self._get_etree_val() == other._get_etree_val() 242 | 243 | def __ne__(self, other): 244 | return not self.__eq__(other) 245 | 246 | def __hash__(self): 247 | return hash(self._get_etree_val()) 248 | -------------------------------------------------------------------------------- /djxml/xmlmodels/decorators.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | import types 3 | 
import functools 4 | 5 | def lxml_extension(method=None, ns_uri=None, name=None): 6 | """ 7 | Decorator for registering model methods as lxml extensions to be passed 8 | to XPathFields and XsltFields. 9 | 10 | Params: 11 | 12 | ns_uri (optional): The namespace uri for the function. If used in an 13 | XPathField, this uri will need to be one of the 14 | values in the namespaces attr of the XmlModel's 15 | `Meta` class. If used in an XSLT, the namespace 16 | will need to be defined in the xslt file or string. 17 | 18 | Defaults to the value of the `extension_ns_uri` 19 | attr of the XmlModel's `Meta` class, if defined. 20 | 21 | name (optional): The name of the function to register. Defaults to 22 | the method's name. 23 | 24 | Usage: 25 | 26 | import math 27 | from pprint import pprint 28 | from djxml.xmlmodels import XmlModel, XPathFloatListField, lxml_extension 29 | 30 | class PrimeExample(XmlModel): 31 | 32 | class Meta: 33 | namespaces = { 34 | 'f': 'urn:local:myfuncs', 35 | } 36 | 37 | @lxml_extension(ns_uri='urn:local:myfuncs') 38 | def sqrt(self, context, nodes): 39 | nodelist = [] 40 | for node in nodes: 41 | nodestr = getattr(node, 'text', node) 42 | nodefloat = float(nodestr) 43 | nodelist.append(repr(math.sqrt(nodefloat))) 44 | return nodelist 45 | 46 | square_roots = XPathFloatListField('f:sqrt(/prime_numbers/num)') 47 | 48 | primes = u''' 49 | <prime_numbers> 50 | <num>2</num> 51 | <num>3</num> 52 | <num>5</num> 53 | <num>7</num> 54 | </prime_numbers>'''.strip() 55 | 56 | example = PrimeExample.create_from_string(primes) 57 | pprint(example.square_roots) 58 | ''' 59 | [1.4142135623730951, 60 | 1.7320508075688772, 61 | 2.23606797749979, 62 | 2.6457513110645907] 63 | ''' 64 | """ 65 | 66 | 67 | # If called without a method, we've been called with optional arguments. 68 | # We return a decorator with the optional arguments filled in. 69 | # Next time round we'll be decorating method. 
70 | if method is None: 71 | return functools.partial(lxml_extension, ns_uri=ns_uri, name=name) 72 | 73 | @functools.wraps(method) 74 | def wrapper(self, *args, **kwargs): 75 | return method(self, *args, **kwargs) 76 | 77 | if name is None: 78 | if isinstance(method, types.MethodType): 79 | name = method.__func__.__name__ 80 | else: 81 | name = method.__name__ 82 | 83 | wrapper.is_lxml_extension = True 84 | wrapper.lxml_ns_uri = ns_uri 85 | wrapper.lxml_extension_name = name 86 | 87 | return wrapper 88 | -------------------------------------------------------------------------------- /djxml/xmlmodels/descriptors.py: -------------------------------------------------------------------------------- 1 | import functools 2 | 3 | from lxml import etree 4 | 5 | from .exceptions import XsltException 6 | 7 | 8 | class Creator(object): 9 | """ 10 | A placeholder class that provides a way to set the attribute on the model. 11 | """ 12 | def __init__(self, field): 13 | self.field = field 14 | 15 | def __get__(self, model_instance, type=None): 16 | return model_instance.__dict__[self.field.name] 17 | 18 | def __set__(self, model_instance, value): 19 | cleaned_value = self.field.clean(value, model_instance) 20 | model_instance.__dict__[self.field.name] = cleaned_value 21 | if value is not None: 22 | model_instance.__dict__[self.cache_name] = cleaned_value 23 | 24 | 25 | class ImmutableCreator(Creator): 26 | 27 | def __init__(self, field): 28 | super(ImmutableCreator, self).__init__(field) 29 | self.field.value_initialized = False 30 | self.cache_name = field.get_cache_name() 31 | 32 | def __set__(self, model_instance, value): 33 | if '_field_inits' not in model_instance.__dict__: 34 | model_instance._field_inits = {} 35 | if model_instance._field_inits.get(self.field.name, False): 36 | raise TypeError("%s.%s is immutable" \ 37 | % (model_instance.__class__.__name__, self.field.name)) 38 | 39 | super(ImmutableCreator, self).__set__(model_instance, value) 40 | 41 | if 
model_instance.__dict__[self.field.name] is not None: 42 | model_instance._field_inits[self.field.name] = True 43 | self.field.value_initialized = True 44 | 45 | 46 | class FieldBase(type): 47 | """ 48 | A metaclass for custom Field subclasses. This ensures the model's attribute 49 | has the descriptor protocol attached to it. 50 | """ 51 | def __new__(cls, name, bases, attrs, **kwargs): 52 | new_class = super(FieldBase, cls).__new__(cls, name, bases, attrs) 53 | 54 | descriptor_cls_attr = getattr(cls, 'descriptor_cls', None) 55 | if descriptor_cls_attr is not None: 56 | kwargs['descriptor_cls'] = descriptor_cls_attr 57 | 58 | new_class.contribute_to_class = make_contrib( 59 | new_class, attrs.get('contribute_to_class'), **kwargs) 60 | return new_class 61 | 62 | 63 | class ImmutableFieldBase(FieldBase): 64 | 65 | descriptor_cls = ImmutableCreator 66 | 67 | 68 | class XPathObjectDescriptor(ImmutableCreator): 69 | 70 | def __init__(self, field): 71 | super(XPathObjectDescriptor, self).__init__(field) 72 | 73 | def __get__(self, instance, instance_type=None): 74 | if instance is None: 75 | raise AttributeError('Can only be accessed via an instance.') 76 | try: 77 | return getattr(instance, self.cache_name) 78 | except AttributeError: 79 | tree = instance._get_etree_val() 80 | query = self.field.xpath_query 81 | 82 | namespaces = {} 83 | namespaces.update(getattr(instance._meta, 'namespaces', {})) 84 | namespaces.update(getattr(self.field, 'extra_namespaces', {})) 85 | 86 | extensions = {k: functools.partial(method, instance) 87 | for k, method in instance._meta.extensions.items()} 88 | extensions.update(self.field.extensions) 89 | 90 | xpath_eval = etree.XPathEvaluator(tree, namespaces=namespaces, 91 | extensions=extensions) 92 | 93 | nodes = xpath_eval(query) 94 | nodes = self.field.clean(nodes, instance) 95 | setattr(instance, self.cache_name, nodes) 96 | return nodes 97 | 98 | 99 | class XPathFieldBase(FieldBase): 100 | 101 | descriptor_cls = XPathObjectDescriptor 
102 | 103 | 104 | class XsltObjectDescriptor(ImmutableCreator): 105 | 106 | def __init__(self, field): 107 | self.cache_name = field.get_cache_name() 108 | super(XsltObjectDescriptor, self).__init__(field) 109 | 110 | def __get__(self, instance, instance_type=None): 111 | if instance is None: 112 | raise AttributeError('Can only be accessed via an instance.') 113 | try: 114 | return getattr(instance, self.cache_name) 115 | except AttributeError: 116 | tree = instance._get_etree_val() 117 | xslt_tree = self.field.get_xslt_tree(instance) 118 | 119 | extensions = {k: functools.partial(method, instance) 120 | for k, method in instance._meta.extensions.items()} 121 | extensions.update(self.field.extensions) 122 | 123 | transform = etree.XSLT(xslt_tree, extensions=extensions) 124 | def xslt_wrapper(xslt_func): 125 | def wrapper(*args, **kwargs): 126 | try: 127 | xslt_result = xslt_func(tree, *args, **kwargs) 128 | except etree.XSLTApplyError as e: 129 | # Put this in frame locals for debugging 130 | xslt_source = etree.tostring(xslt_tree, encoding='utf8') 131 | raise XsltException(e, xslt_func) 132 | return self.field.clean(xslt_result, instance) 133 | return wrapper 134 | return xslt_wrapper(transform) 135 | 136 | 137 | class XsltFieldBase(FieldBase): 138 | 139 | descriptor_cls = XsltObjectDescriptor 140 | 141 | 142 | def make_contrib(superclass, func=None, descriptor_cls=None): 143 | """ 144 | Returns a suitable contribute_to_class() method for the Field subclass. 145 | 146 | If 'func' is passed in, it is the existing contribute_to_class() method on 147 | the subclass and it is called before anything else. It is assumed in this 148 | case that the existing contribute_to_class() calls all the necessary 149 | superclass methods. 150 | 151 | If descriptor_cls is passed, an instance of that class will be used; 152 | otherwise uses the descriptor class `Creator`. 
153 | """ 154 | if descriptor_cls is None: 155 | descriptor_cls = Creator 156 | 157 | def contribute_to_class(self, cls, name): 158 | if func: 159 | func(self, cls, name) 160 | else: 161 | super(superclass, self).contribute_to_class(cls, name) 162 | setattr(cls, self.name, descriptor_cls(self)) 163 | 164 | return contribute_to_class 165 | -------------------------------------------------------------------------------- /djxml/xmlmodels/exceptions.py: -------------------------------------------------------------------------------- 1 | from django.core.exceptions import ValidationError 2 | 3 | class XmlModelException(Exception): 4 | pass 5 | 6 | class XmlSchemaValidationError(XmlModelException): 7 | pass 8 | 9 | class XPathException(XmlModelException): 10 | pass 11 | 12 | class XsltException(XmlModelException): 13 | 14 | def __init__(self, apply_exception, xslt_func): 15 | self.apply_exception = apply_exception 16 | self.error_log = xslt_func.error_log 17 | 18 | def __str__(self): 19 | return str(self.apply_exception) 20 | 21 | def __unicode__(self): 22 | msg = str(self.apply_exception) 23 | debug_output = self.get_debug_output() 24 | if len(debug_output) > 0: 25 | msg += "\n\n" + debug_output 26 | return msg 27 | 28 | def get_debug_output(self): 29 | debug_lines = [] 30 | for entry in self.error_log: 31 | entry_filename = '%s ' % entry.filename if entry.filename != '' else '' 32 | debug_line = '%(msg)s [%(file)sline %(line)s, col %(col)s]' % { 33 | 'file': entry_filename, 34 | 'line': entry.line, 35 | 'col': entry.column, 36 | 'msg': entry.message, 37 | } 38 | debug_lines.append(debug_line) 39 | return u"\n".join(debug_lines) 40 | 41 | class XPathDateTimeException(XPathException): 42 | pass 43 | 44 | 45 | class ExtensionException(XmlModelException): 46 | pass 47 | 48 | 49 | class ExtensionNamespaceException(XmlModelException): 50 | pass 51 | -------------------------------------------------------------------------------- /djxml/xmlmodels/fields.py: 
-------------------------------------------------------------------------------- 1 | import re 2 | import copy 3 | 4 | from lxml import etree, isoschematron 5 | 6 | from django.core.exceptions import ValidationError 7 | try: 8 | from django.utils.encoding import force_str as force_unicode 9 | except ImportError: 10 | from django.utils.encoding import force_text as force_unicode 11 | 12 | from .descriptors import ImmutableFieldBase, XPathFieldBase, XsltFieldBase 13 | from .exceptions import XmlSchemaValidationError 14 | from .utils import parse_datetime 15 | 16 | 17 | class NOT_PROVIDED: 18 | pass 19 | 20 | 21 | class XmlField(object): 22 | 23 | # These track each time a Field instance is created. Used to retain order. 24 | creation_counter = 0 25 | 26 | #: If true, the field is the primary xml element 27 | is_root_field = False 28 | 29 | #: an instance of lxml.etree.XMLParser, to override the default 30 | parser = None 31 | 32 | #: Used by immutable descriptors to record that a value has been set 33 | value_initialized = False 34 | 35 | def __init__(self, name=None, required=False, default=NOT_PROVIDED, parser=None): 36 | self.name = name 37 | self.required = required 38 | self.default = default 39 | self.parser = parser 40 | 41 | # Adjust the appropriate creation counter, and save our local copy. 42 | self.creation_counter = XmlField.creation_counter 43 | XmlField.creation_counter += 1 44 | 45 | def __cmp__(self, other): 46 | # This is needed because bisect does not take a comparison function. 47 | return (self.creation_counter > other.creation_counter) - (self.creation_counter < other.creation_counter) 48 | 49 | def __lt__(self, other): 50 | return self.creation_counter < other.creation_counter 51 | 52 | def __deepcopy__(self, memodict): 53 | # We don't have to deepcopy very much here, since most things are not 54 | # intended to be altered after initial creation. 
55 | obj = copy.copy(self) 56 | memodict[id(self)] = obj 57 | return obj 58 | 59 | def to_python(self, value): 60 | """ 61 | Converts the input value into the expected Python data type, raising 62 | django.core.exceptions.ValidationError if the data can't be converted. 63 | Returns the converted value. Subclasses should override this. 64 | """ 65 | return value 66 | 67 | def run_validators(self, value): 68 | pass 69 | 70 | def validate(self, value, model_instance): 71 | pass 72 | 73 | def clean(self, value, model_instance): 74 | """ 75 | Convert the value's type and run validation. Validation errors from 76 | to_python and validate are propagated. The correct value is returned 77 | if no error is raised. 78 | """ 79 | value = self.to_python(value) 80 | self.validate(value, model_instance) 81 | self.run_validators(value) 82 | return value 83 | 84 | def set_attributes_from_name(self, name): 85 | self.name = name 86 | self.attname = self.get_attname() 87 | 88 | def contribute_to_class(self, cls, name): 89 | self.set_attributes_from_name(name) 90 | self.model = cls 91 | cls._meta.add_field(self) 92 | 93 | def get_attname(self): 94 | return self.name 95 | 96 | def get_cache_name(self): 97 | return '_%s_cache' % self.name 98 | 99 | def has_default(self): 100 | """Returns a boolean of whether this field has a default value.""" 101 | return self.default is not NOT_PROVIDED 102 | 103 | def get_default(self): 104 | """Returns the default value for this field.""" 105 | if self.has_default(): 106 | if callable(self.default): 107 | return self.default() 108 | return force_unicode(self.default, strings_only=True) 109 | return None 110 | 111 | 112 | class XmlElementField(XmlField, metaclass=ImmutableFieldBase): 113 | 114 | def validate(self, value, model_instance): 115 | if value is None: 116 | if not self.value_initialized or not self.required: 117 | return 118 | 119 | if not isinstance(value, etree._Element): 120 | 121 | if hasattr(value, 'getroot'): 122 | try: 123 | value = 
value.getroot() 124 | except: 125 | pass 126 | else: 127 | if isinstance(value, etree._Element): 128 | return 129 | 130 | opts = model_instance._meta 131 | raise ValidationError(("Field %(field_name)r on xml model " 132 | "%(app_label)s.%(object_name)s is not an" 133 | " instance of lxml.etree._Element") % { 134 | "field_name": self.name, 135 | "app_label": opts.app_label, 136 | "object_name": opts.object_name,}) 137 | 138 | 139 | class XmlPrimaryElementField(XmlElementField): 140 | 141 | is_root_field = True 142 | 143 | def validate(self, value, model_instance): 144 | if model_instance._meta.xsd_schema is not None: 145 | try: 146 | model_instance._meta.xsd_schema.assertValid(value) 147 | except Exception as e: 148 | raise XmlSchemaValidationError(str(e)) 149 | 150 | def contribute_to_class(self, cls, name): 151 | assert not cls._meta.has_root_field, \ 152 | "An xml model can't have more than one XmlPrimaryElementField" 153 | super(XmlPrimaryElementField, self).contribute_to_class(cls, name) 154 | cls._meta.has_root_field = True 155 | cls._meta.root_field = self 156 | 157 | 158 | class XPathField(XmlField, metaclass=XPathFieldBase): 159 | """ 160 | Base field for abstracting the retrieval of node results from the xpath 161 | evaluation of an xml etree. 
162 | """ 163 | 164 | #: XPath query string 165 | xpath_query = None 166 | 167 | #: Dict of extra prefix/uri namespace pairs to pass to xpath() 168 | extra_namespaces = {} 169 | 170 | #: Extra extensions to pass on to lxml.etree.XPathEvaluator() 171 | extensions = {} 172 | 173 | required = True 174 | 175 | def __init__(self, xpath_query, extra_namespaces=None, extensions=None, 176 | **kwargs): 177 | if type(self) is XPathField: 178 | raise RuntimeError("%s is an abstract field type." % type(self).__name__) 179 | 180 | self.xpath_query = xpath_query 181 | if extra_namespaces is not None: 182 | self.extra_namespaces = extra_namespaces 183 | if extensions is not None: 184 | self.extensions = extensions 185 | 186 | super(XPathField, self).__init__(**kwargs) 187 | 188 | def validate(self, nodes, model_instance): 189 | super(XPathField, self).validate(nodes, model_instance) 190 | if nodes is None: 191 | if not self.value_initialized or not self.required: 192 | return nodes 193 | try: 194 | node_count = len(nodes) 195 | except TypeError: 196 | node_count = 1 197 | if self.required and node_count == 0: 198 | msg = u"XPath query %r did not match any nodes" % self.xpath_query 199 | raise model_instance.DoesNotExist(msg) 200 | 201 | def clean(self, value, model_instance): 202 | """ 203 | Run validators on raw value, not the value returned from 204 | self.to_python(value) (as it is in the parent clean() method) 205 | """ 206 | self.validate(value, model_instance) 207 | self.run_validators(value) 208 | value = self.to_python(value) 209 | return value 210 | 211 | def get_default(self): 212 | value = super(XPathField, self).get_default() 213 | if value is None: 214 | return value 215 | else: 216 | return [value] 217 | 218 | def __unicode__(self): 219 | return (u"%(field_name)s[%(xpath_query)r]" % { 220 | "field_name": self.name, 221 | "xpath_query": self.xpath_query,}) 222 | 223 | def __repr__(self): 224 | return ("<%(cls)s: %(field)s>" % { 225 | "cls": self.__class__.__name__, 226 | 
"field": self.__unicode__(),}) 227 | 228 | 229 | class XPathListField(XPathField): 230 | """ 231 | Field which abstracts retrieving a list of nodes from the xpath evaluation 232 | of an xml etree. 233 | """ 234 | 235 | def to_python(self, value): 236 | if value is None: 237 | return value 238 | if isinstance(value, list): 239 | return value 240 | else: 241 | return list(value) 242 | 243 | 244 | class XPathSingleNodeField(XPathField): 245 | """ 246 | Field which abstracts retrieving the first node result from the xpath 247 | evaluation of an xml etree. 248 | """ 249 | 250 | #: Whether to ignore extra nodes and return the first node if the xpath 251 | #: evaluates to more than one node. 252 | #: 253 | #: To return the full list of nodes, use XPathListField 254 | ignore_extra_nodes = False 255 | 256 | def __init__(self, xpath_query, ignore_extra_nodes=False, **kwargs): 257 | self.ignore_extra_nodes = ignore_extra_nodes 258 | super(XPathSingleNodeField, self).__init__(xpath_query, **kwargs) 259 | 260 | def validate(self, nodes, model_instance): 261 | super(XPathSingleNodeField, self).validate(nodes, model_instance) 262 | if nodes is None: 263 | if not self.value_initialized or not self.required: 264 | return nodes 265 | if isinstance(nodes, str): 266 | node_count = 1 267 | else: 268 | try: 269 | node_count = len(nodes) 270 | except TypeError: 271 | node_count = 1 272 | if not self.ignore_extra_nodes and node_count > 1: 273 | msg = u"XPath query %r matched more than one node" \ 274 | % self.xpath_query 275 | raise model_instance.MultipleObjectsReturned(msg) 276 | 277 | def to_python(self, value): 278 | if value is None: 279 | return value 280 | if isinstance(value, list): 281 | if len(value) == 0: 282 | return None 283 | else: 284 | return value[0] 285 | elif isinstance(value, str): 286 | return value 287 | else: 288 | # Possibly raise an exception here 289 | return value 290 | 291 | 292 | class XPathTextField(XPathSingleNodeField): 293 | 294 | 
#: A tuple of strings which should be interpreted as None. 295 | none_vals = () 296 | 297 | def __init__(self, *args, **kwargs): 298 | none_vals = kwargs.pop('none_vals', None) 299 | if none_vals is not None: 300 | self.none_vals = [force_unicode(v) for v in none_vals] 301 | super(XPathTextField, self).__init__(*args, **kwargs) 302 | 303 | def validate(self, value, model_instance): 304 | super(XPathTextField, self).validate(value, model_instance) 305 | if len(self.none_vals): 306 | value = self.to_python(value) 307 | if self.required and value in self.none_vals: 308 | error_msg = ("%(field)s is required, but value %(value)r is " 309 | "mapped to None") % { 310 | "field": str(self), 311 | "value": value,} 312 | raise model_instance.DoesNotExist(error_msg) 313 | 314 | def to_python(self, value): 315 | value = super(XPathTextField, self).to_python(value) 316 | if value is None: 317 | return value 318 | if isinstance(value, etree._Element): 319 | return force_unicode(value.text) 320 | else: 321 | return force_unicode(value) 322 | 323 | 324 | class XPathIntegerField(XPathTextField): 325 | 326 | def to_python(self, value): 327 | value = super(XPathIntegerField, self).to_python(value) 328 | if value is None: 329 | return value 330 | else: 331 | try: 332 | return int(value) 333 | except ValueError: 334 | value = float(value) 335 | if not value.is_integer(): 336 | raise 337 | else: 338 | return int(value) 339 | 340 | 341 | class XPathFloatField(XPathTextField): 342 | 343 | def to_python(self, value): 344 | value = super(XPathFloatField, self).to_python(value) 345 | if value is None: 346 | return value 347 | else: 348 | return float(value) 349 | 350 | 351 | class XPathDateTimeField(XPathTextField): 352 | 353 | def to_python(self, value): 354 | value = super(XPathDateTimeField, self).to_python(value) 355 | if value is None: 356 | return value 357 | else: 358 | return parse_datetime(value) 359 | 360 | 361 | class XPathBooleanField(XPathTextField): 362 | 363 | true_vals = 
('true',) 364 | false_vals = ('false',) 365 | 366 | def __init__(self, *args, **kwargs): 367 | true_vals = kwargs.pop('true_vals', None) 368 | if true_vals is not None: 369 | self.true_vals = true_vals 370 | false_vals = kwargs.pop('false_vals', None) 371 | if false_vals is not None: 372 | self.false_vals = false_vals 373 | super(XPathBooleanField, self).__init__(*args, **kwargs) 374 | 375 | def validate(self, value, model_instance): 376 | if value is True or value is False: 377 | return 378 | super(XPathBooleanField, self).validate(value, model_instance) 379 | if value is None: 380 | return 381 | value = XPathTextField.to_python(self, value) 382 | if value is None: 383 | return 384 | if value not in self.true_vals and value not in self.false_vals: 385 | opts = model_instance._meta 386 | exc_msg = ("%(field)s on xmlmodel %(app_label)s.%(object_name)s " 387 | "has value %(val)r not in true_vals or false_vals" % { 388 | "field": repr(self), 389 | "app_label": opts.app_label, 390 | "object_name": opts.object_name, 391 | "val": value,}) 392 | raise ValidationError(exc_msg) 393 | 394 | def to_python(self, value): 395 | if value is None or value is True or value is False: 396 | return value 397 | value = super(XPathBooleanField, self).to_python(value) 398 | if value in self.true_vals: 399 | return True 400 | elif value in self.false_vals: 401 | return False 402 | else: 403 | return value 404 | 405 | 406 | class XPathTextListField(XPathListField): 407 | 408 | def to_python(self, value): 409 | value = super(XPathTextListField, self).to_python(value) 410 | if value is None: 411 | return value 412 | else: 413 | return [force_unicode(getattr(v, "text", v)) for v in value] 414 | 415 | 416 | class XPathIntegerListField(XPathTextListField): 417 | 418 | def to_python(self, value): 419 | value = super(XPathIntegerListField, self).to_python(value) 420 | if value is None: 421 | return value 422 | else: 423 | return [int(v) for v in value] 424 | 425 | 426 
| class XPathFloatListField(XPathTextListField): 427 | 428 | def to_python(self, value): 429 | value = super(XPathFloatListField, self).to_python(value) 430 | if value is None: 431 | return value 432 | else: 433 | return [float(v) for v in value] 434 | 435 | 436 | class XPathDateTimeListField(XPathTextListField): 437 | 438 | def to_python(self, value): 439 | value = super(XPathDateTimeListField, self).to_python(value) 440 | if value is None: 441 | return value 442 | else: 443 | return [parse_datetime(v) for v in value] 444 | 445 | 446 | class XPathBooleanListField(XPathTextListField): 447 | 448 | true_vals = ('true',) 449 | false_vals = ('false',) 450 | 451 | def __init__(self, *args, **kwargs): 452 | true_vals = kwargs.pop('true_vals', None) 453 | if true_vals is not None: 454 | self.true_vals = true_vals 455 | false_vals = kwargs.pop('false_vals', None) 456 | if false_vals is not None: 457 | self.false_vals = false_vals 458 | super(XPathBooleanListField, self).__init__(*args, **kwargs) 459 | 460 | def validate(self, value, model_instance): 461 | super(XPathBooleanListField, self).validate(value, model_instance) 462 | values = super(XPathBooleanListField, self).to_python(value) 463 | if values is None: 464 | return 465 | for value in values: 466 | if value not in self.true_vals and value not in self.false_vals: 467 | opts = model_instance._meta 468 | raise ValidationError(("XPathBooleanListField %(field)r on " 469 | "xml model %(app_label)s.%(object_name)s" 470 | " has value %(value)r not in 'true_vals'" 471 | " or 'false_vals'") % { 472 | "field": self.name, 473 | "app_label": opts.app_label, 474 | "object_name": opts.object_name, 475 | "value": value,}) 476 | 477 | def to_python(self, value): 478 | values = super(XPathBooleanListField, self).to_python(value) 479 | if values is None: 480 | return values 481 | # Convert element by element; unrecognized values pass through 482 | # unchanged, mirroring XPathBooleanField.to_python(). 483 | return [True if v in self.true_vals 484 | else False if v in self.false_vals 485 | else v 486 | for v in values] 487 | 488 | 489 | class 
XPathHtmlField(XPathSingleNodeField): 490 | """ 491 | Differs from XPathTextField in that it serializes mixed content to a 492 | unicode string, rather than simply returning the first text node. 493 | """ 494 | #: Whether to strip the 'xmlns="http://www.w3.org/1999/xhtml"' from 495 | #: the serialized html strings 496 | strip_xhtml_ns = True 497 | 498 | def __init__(self, xpath_query, strip_xhtml_ns=True, **kwargs): 499 | self.strip_xhtml_ns = strip_xhtml_ns 500 | super(XPathHtmlField, self).__init__(xpath_query, **kwargs) 501 | 502 | def format_value(self, value): 503 | formatted = etree.tostring(value, encoding='unicode', method='html') 504 | if self.strip_xhtml_ns: 505 | formatted = formatted.replace(u' xmlns="http://www.w3.org/1999/xhtml"', '') 506 | return formatted 507 | 508 | def to_python(self, value): 509 | value = super(XPathHtmlField, self).to_python(value) 510 | if value is None: 511 | return value 512 | if isinstance(value, etree._Element): 513 | return self.format_value(value) 514 | 515 | 516 | class XPathHtmlListField(XPathListField): 517 | """ 518 | Differs from XPathTextListField in that it serializes mixed content to 519 | a unicode string, rather than simply returning the first text node for 520 | each node in the result. 
521 | """ 522 | #: Whether to strip the 'xmlns="http://www.w3.org/1999/xhtml"' from 523 | #: the serialized html strings 524 | strip_xhtml_ns = True 525 | 526 | def __init__(self, xpath_query, strip_xhtml_ns=True, **kwargs): 527 | self.strip_xhtml_ns = strip_xhtml_ns 528 | super(XPathHtmlListField, self).__init__(xpath_query, **kwargs) 529 | 530 | def format_value(self, value): 531 | formatted = etree.tostring(value, encoding='unicode', method='html') 532 | if self.strip_xhtml_ns: 533 | formatted = formatted.replace(u' xmlns="http://www.w3.org/1999/xhtml"', '') 534 | return formatted 535 | 536 | def to_python(self, value): 537 | value = super(XPathHtmlListField, self).to_python(value) 538 | if value is None: 539 | return value 540 | else: 541 | return [self.format_value(v) for v in value] 542 | 543 | 544 | class XPathInnerHtmlMixin(object): 545 | 546 | self_closing_re = re.compile( 547 | r'<(area|base(?:font)?|frame|col|br|hr|input|img|link|meta|param)' 548 | r'([^/>]*?)></\1>') 549 | 550 | def get_inner_html(self, value): 551 | if not isinstance(value, str): 552 | return value 553 | # Strip surrounding tag 554 | value = re.sub(r"(?s)^<([^>\s]*)(?:[^>]*>|>)(.*)</\1>$", r'\2', value) 555 | # Replace open-close tags with self-closing tags where appropriate 556 | # e.g. "<br></br>" => "<br/>
" 557 | value = self.self_closing_re.sub(r'<\1\2/>', value) 558 | # Remove leading and trailing whitespace 559 | value = value.strip() 560 | return value 561 | 562 | 563 | class XPathInnerHtmlField(XPathInnerHtmlMixin, XPathHtmlField): 564 | 565 | def to_python(self, value): 566 | if value is None: 567 | return value 568 | value = super(XPathInnerHtmlField, self).to_python(value) 569 | return self.get_inner_html(value) 570 | 571 | 572 | class XPathInnerHtmlListField(XPathInnerHtmlMixin, XPathHtmlListField): 573 | 574 | def to_python(self, value): 575 | if value is None: 576 | return value 577 | value = super(XPathInnerHtmlListField, self).to_python(value) 578 | return [self.get_inner_html(v) for v in value] 579 | 580 | 581 | class XsltField(XmlField, metaclass=XsltFieldBase): 582 | 583 | #: Instance of lxml.etree.XMLParser 584 | parser = None 585 | 586 | #: Extra extensions to pass on to lxml.etree.XSLT() 587 | extensions = {} 588 | 589 | xslt_file = None 590 | xslt_string = None 591 | 592 | _xslt_tree = None 593 | 594 | def __init__(self, xslt_file=None, xslt_string=None, parser=None, 595 | extensions=None, **kwargs): 596 | super(XsltField, self).__init__(**kwargs) 597 | 598 | if xslt_file is None and xslt_string is None: 599 | raise ValidationError("XsltField requires either xslt_file or " 600 | "xslt_string") 601 | elif xslt_file is not None and xslt_string is not None: 602 | raise ValidationError("XsltField.__init__() accepts either " 603 | "xslt_file or xslt_string as keyword " 604 | "arguments, not both") 605 | 606 | self.xslt_file = xslt_file 607 | self.xslt_string = xslt_string 608 | self.parser = parser 609 | if extensions is not None: 610 | self.extensions = extensions 611 | 612 | def get_xslt_tree(self, model_instance): 613 | if self._xslt_tree is None: 614 | parser = self.parser 615 | if parser is None: 616 | parser = model_instance._meta.get_parser() 617 | if self.xslt_file is not None: 618 | self._xslt_tree = etree.parse(self.xslt_file, parser) 619 | 
elif self.xslt_string is not None: 620 | self._xslt_tree = etree.XML(self.xslt_string, parser) 621 | return self._xslt_tree 622 | 623 | 624 | class SchematronField(XmlField, metaclass=XsltFieldBase): 625 | 626 | #: Instance of lxml.etree.XMLParser 627 | parser = None 628 | 629 | #: Extra extensions to pass on to lxml.etree.XSLT() 630 | extensions = {} 631 | 632 | schematron_file = None 633 | schematron_string = None 634 | 635 | _schematron = None 636 | _schematron_tree = None 637 | _schematron_xslt = None 638 | 639 | def __init__(self, schematron_file=None, schematron_string=None, parser=None, 640 | extensions=None, **kwargs): 641 | self.schematron_kwargs = { 642 | 'compile_params': kwargs.pop('compile_params', None), 643 | 'include_params': kwargs.pop('include_params', None), 644 | 'expand_params': kwargs.pop('expand_params', None), 645 | 'phase': kwargs.pop('phase', None), 646 | 'store_xslt': True, 647 | 'store_report': True, 648 | 'store_schematron': True, 649 | } 650 | self.schematron_kwargs = { 651 | k: v for k, v in self.schematron_kwargs.items() 652 | if v is not None 653 | } 654 | 655 | super(SchematronField, self).__init__(**kwargs) 656 | 657 | if schematron_file is None and schematron_string is None: 658 | raise ValidationError("SchematronField requires either " 659 | "schematron_file or schematron_string") 660 | elif schematron_file is not None and schematron_string is not None: 661 | raise ValidationError("SchematronField.__init__() accepts either " 662 | "schematron_file or schematron_string as " 663 | "keyword arguments, not both") 664 | 665 | self.schematron_file = schematron_file 666 | self.schematron_string = schematron_string 667 | self.parser = parser 668 | if extensions is not None: 669 | self.extensions = extensions 670 | 671 | def get_xslt_tree(self, model_instance): 672 | if self._schematron_xslt is None: 673 | schematron_tree = self.get_schematron_tree(model_instance) 674 | self._schematron = isoschematron.Schematron(schematron_tree, 
**self.schematron_kwargs) 675 | self._schematron_xslt = self._schematron.validator_xslt.getroot() 676 | return self._schematron_xslt 677 | 678 | def get_schematron_tree(self, model_instance): 679 | if self._schematron_tree is None: 680 | parser = self.parser 681 | if parser is None: 682 | parser = model_instance._meta.get_parser() 683 | if self.schematron_file is not None: 684 | self._schematron_tree = etree.parse(self.schematron_file, parser) 685 | elif self.schematron_string is not None: 686 | self._schematron_tree = etree.XML(self.schematron_string, parser) 687 | return self._schematron_tree 688 | 689 | 690 | # Extra imports so that these can be used via xmlmodels.fields 691 | from .related import (EmbeddedXPathField, EmbeddedXPathListField, 692 | EmbeddedXsltField, EmbeddedSchematronField,) 693 | -------------------------------------------------------------------------------- /djxml/xmlmodels/loading.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utilities for loading xml_models and the modules that contain them. 3 | 4 | More or less identical to django.db.models.loading, with a few db 5 | specific things removed. 6 | """ 7 | 8 | from importlib import import_module 9 | from collections import OrderedDict 10 | 11 | from django.conf import settings 12 | from django.core.exceptions import ImproperlyConfigured 13 | from django.utils.module_loading import module_has_submodule 14 | 15 | import sys 16 | import os 17 | import threading 18 | 19 | __all__ = ('get_apps', 'get_app', 'get_xml_models', 'get_xml_model', 20 | 'register_xml_models', 'load_app', 'app_cache_ready') 21 | 22 | class AppCache(object): 23 | """ 24 | A cache that stores installed applications and their xml_models. Used to 25 | provide reverse-relations and for app introspection (e.g. admin). 26 | """ 27 | # Use the Borg pattern to share state between all instances. Details at 28 | # http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66531. 
29 | __shared_state = dict( 30 | # Keys of app_store are the xml_model modules for each application. 31 | app_store = OrderedDict(), 32 | 33 | # Mapping of app_labels to a dictionary of xml_model names to model 34 | # code. 35 | app_xml_models = OrderedDict(), 36 | 37 | # Mapping of app_labels to errors raised when trying to import the app 38 | app_errors = {}, 39 | 40 | # -- Everything below here is only used when populating the cache -- 41 | loaded = False, 42 | handled = {}, 43 | postponed = [], 44 | nesting_level = 0, 45 | write_lock = threading.RLock(), 46 | _get_xml_models_cache = {}, 47 | ) 48 | 49 | def __init__(self): 50 | self.__dict__ = self.__shared_state 51 | 52 | def _populate(self): 53 | """ 54 | Fill in all the cache information. This method is threadsafe, in the 55 | sense that every caller will see the same state upon return, and if 56 | the cache is already initialised, it does no work. 57 | """ 58 | if self.loaded: 59 | return 60 | self.write_lock.acquire() 61 | try: 62 | if self.loaded: 63 | return 64 | for app_name in settings.INSTALLED_APPS: 65 | if app_name in self.handled: 66 | continue 67 | self.load_app(app_name, True) 68 | if not self.nesting_level: 69 | for app_name in self.postponed: 70 | self.load_app(app_name) 71 | self.loaded = True 72 | finally: 73 | self.write_lock.release() 74 | 75 | def load_app(self, app_name, can_postpone=False): 76 | """ 77 | Loads the app with the provided fully qualified name, and returns the 78 | xml model module. 79 | """ 80 | self.handled[app_name] = None 81 | self.nesting_level += 1 82 | app_module = import_module(app_name) 83 | try: 84 | xml_models = import_module('.xml_models', app_name) 85 | except ImportError: 86 | self.nesting_level -= 1 87 | # If the app doesn't have an xml_models module, we can just ignore 88 | # the ImportError and return no xml_models for it. 
89 | if not module_has_submodule(app_module, 'xml_models'): 90 | return None 91 | # But if the app does have an xml_models module, we need to figure 92 | # out whether to suppress or propagate the error. If can_postpone 93 | # is True then it may be that the package is still being imported 94 | # by Python and the xml_models module isn't available yet. So we 95 | # add the app to the postponed list and we'll try it again after 96 | # all the recursion has finished (in populate). If can_postpone is 97 | # False then it's time to raise the ImportError. 98 | else: 99 | if can_postpone: 100 | self.postponed.append(app_name) 101 | return None 102 | else: 103 | raise 104 | 105 | self.nesting_level -= 1 106 | if xml_models not in self.app_store: 107 | self.app_store[xml_models] = len(self.app_store) 108 | return xml_models 109 | 110 | def app_cache_ready(self): 111 | """ 112 | Returns true if the xml model cache is fully populated. 113 | 114 | Useful for code that wants to cache the results of get_xml_models() 115 | for themselves once it is safe to do so. 116 | """ 117 | return self.loaded 118 | 119 | def get_apps(self): 120 | "Returns a list of all installed modules that contain xml models." 121 | self._populate() 122 | 123 | # Ensure the returned list is always in the same order (with new apps 124 | # added at the end). This avoids unstable ordering on the admin app 125 | # list page, for example. 126 | apps = [(v, k) for k, v in self.app_store.items()] 127 | apps.sort() 128 | return [elt[1] for elt in apps] 129 | 130 | def get_app(self, app_label, emptyOK=False): 131 | """ 132 | Returns the module containing the xml models for the given app_label. 
133 | If the app has no xml models in it and 'emptyOK' is True, returns None 134 | """ 135 | self._populate() 136 | self.write_lock.acquire() 137 | try: 138 | for app_name in settings.INSTALLED_APPS: 139 | if app_label == app_name.split('.')[-1]: 140 | mod = self.load_app(app_name, False) 141 | if mod is None: 142 | if emptyOK: 143 | return None 144 | else: 145 | return mod 146 | raise ImproperlyConfigured("App with label %s could not be found"\ 147 | % app_label) 148 | finally: 149 | self.write_lock.release() 150 | 151 | def get_app_errors(self): 152 | "Returns the map of known problems with the INSTALLED_APPS." 153 | self._populate() 154 | return self.app_errors 155 | 156 | def get_xml_models(self, app_mod=None, include_deferred=False): 157 | """ 158 | Given a module containing xml models, returns a list of the xml_models 159 | Otherwise returns a list of all installed xml_models. 160 | 161 | By default, xml models created to satisfy deferred attribute 162 | queries are *not* included in the list of xml models. However, if 163 | you specify include_deferred, they will be. 164 | """ 165 | cache_key = (app_mod, False, include_deferred) 166 | try: 167 | return self._get_xml_models_cache[cache_key] 168 | except KeyError: 169 | pass 170 | self._populate() 171 | if app_mod: 172 | app_list = [self.app_xml_models.get(app_mod.__name__.split('.')[-2], OrderedDict())] 173 | else: 174 | app_list = self.app_xml_models.values() 175 | xml_model_list = [] 176 | for app in app_list: 177 | xml_model_list.extend( 178 | model for model in app.values() 179 | if ((not model._deferred or include_deferred)) 180 | ) 181 | self._get_xml_models_cache[cache_key] = xml_model_list 182 | return xml_model_list 183 | 184 | def get_xml_model(self, app_label, model_name, seed_cache=True): 185 | """ 186 | Returns the xml model matching the given app_label and 187 | case-insensitive model_name. 188 | 189 | Returns None if no xml model is found. 
190 | """ 191 | if seed_cache: 192 | self._populate() 193 | return self.app_xml_models.get(app_label, OrderedDict()).get( 194 | model_name.lower()) 195 | 196 | def register_xml_models(self, app_label, *xml_models): 197 | """ 198 | Register a set of xml models as belonging to an app. 199 | """ 200 | for model in xml_models: 201 | # Store as 'name: model' pair in a dictionary 202 | # in the app_models dictionary 203 | model_name = model._meta.object_name.lower() 204 | model_dict = self.app_xml_models.setdefault(app_label, OrderedDict()) 205 | if model_name in model_dict: 206 | # The same model may be imported via different paths (e.g. 207 | # appname.xml_models and project.appname.xml_models). We use the 208 | # source filename as a means to detect identity. 209 | fname1 = os.path.abspath(sys.modules[model.__module__].__file__) 210 | fname2 = os.path.abspath(sys.modules[model_dict[model_name].__module__].__file__) 211 | # Since the filename extension could be .py the first time and 212 | # .pyc or .pyo the second time, ignore the extension when 213 | # comparing. 214 | if os.path.splitext(fname1)[0] == os.path.splitext(fname2)[0]: 215 | continue 216 | model_dict[model_name] = model 217 | self._get_xml_models_cache.clear() 218 | 219 | cache = AppCache() 220 | 221 | # These methods were always module level, so are kept that way for backwards 222 | # compatibility. 
223 | get_apps = cache.get_apps 224 | get_app = cache.get_app 225 | get_app_errors = cache.get_app_errors 226 | get_xml_models = cache.get_xml_models 227 | get_xml_model = cache.get_xml_model 228 | register_xml_models = cache.register_xml_models 229 | load_app = cache.load_app 230 | app_cache_ready = cache.app_cache_ready 231 | -------------------------------------------------------------------------------- /djxml/xmlmodels/options.py: -------------------------------------------------------------------------------- 1 | from bisect import bisect 2 | from collections import OrderedDict 3 | 4 | from lxml import etree 5 | 6 | from django.core.exceptions import FieldDoesNotExist 7 | from django.utils.encoding import smart_bytes, smart_str 8 | 9 | from .exceptions import ExtensionNamespaceException 10 | from .fields import XmlPrimaryElementField 11 | 12 | DEFAULT_NAMES = ('app_label', 'namespaces', 'parser_opts', 'extension_ns_uri', 13 | 'xsd_schema', 'xsd_schema_file',) 14 | 15 | 16 | class Options(object): 17 | 18 | def __init__(self, meta, app_label=None, namespaces=None, parser_opts=None, 19 | extension_ns_uri=None, xsd_schema=None, xsd_schema_file=None): 20 | self.local_fields = [] 21 | self.module_name = None 22 | self.object_name, self.app_label = None, app_label 23 | self.meta = meta 24 | 25 | self.root = None 26 | self.has_root_field, self.root_field = False, None 27 | 28 | # Dict mapping ns prefixes to ns URIs 29 | self.namespaces = namespaces or {} 30 | 31 | # Default namespace uri for functions passed as extensions to 32 | # XSLT/XPath 33 | self.extension_ns_uri = extension_ns_uri 34 | 35 | # Extensions generated by XmlModelBase.add_to_class() 36 | self.extensions = {} 37 | 38 | # An instance of lxml.etree.XMLSchema, can be set in Meta 39 | self.xsd_schema = xsd_schema 40 | # The path to an xml schema file, can be set in Meta 41 | self.xsd_schema_file = xsd_schema_file 42 | 43 | # Dict passed as kwargs to create lxml.etree.XMLParser instance 44 | 
self.parser_opts = parser_opts or {} 45 | self.parser = None 46 | self.parents = OrderedDict() 47 | 48 | def contribute_to_class(self, cls, name): 49 | cls._meta = self 50 | # First, construct the default values for these options. 51 | self.object_name = cls.__name__ 52 | self.module_name = self.object_name.lower() 53 | 54 | # Next, apply any overridden values from 'class Meta'. 55 | if self.meta: 56 | meta_attrs = self.meta.__dict__.copy() 57 | for name in self.meta.__dict__: 58 | # Ignore any private attributes that Django doesn't care about 59 | # NOTE: We can't modify a dictionary's contents while looping 60 | # over it, so we loop over the *original* dictionary instead. 61 | if name.startswith('_'): 62 | del meta_attrs[name] 63 | for attr_name in DEFAULT_NAMES: 64 | if attr_name in meta_attrs: 65 | setattr(self, attr_name, meta_attrs.pop(attr_name)) 66 | elif hasattr(self.meta, attr_name): 67 | setattr(self, attr_name, getattr(self.meta, attr_name)) 68 | 69 | # Any leftover attributes must be invalid. 
70 | if meta_attrs != {}: 71 | raise TypeError("'class Meta' got invalid attribute(s): %s" \ 72 | % ','.join(list(meta_attrs.keys()))) 73 | if self.xsd_schema is not None and self.xsd_schema_file is not None: 74 | raise TypeError("'class Meta' got attribute 'xsd_schema' " 75 | "and 'xsd_schema_file'; only one may be " 76 | "specified.") 77 | if self.xsd_schema is not None and not isinstance(self.xsd_schema, etree.XMLSchema): 78 | raise TypeError("'class Meta' got attribute 'xsd_schema' " 79 | "of type %r, expected lxml.etree.XMLSchema" \ 80 | % self.xsd_schema.__class__.__name__) 81 | 82 | del self.meta 83 | 84 | def _prepare(self, model): 85 | if not self.has_root_field: 86 | root_field = XmlPrimaryElementField() 87 | model.add_to_class('root', root_field) 88 | if self.xsd_schema_file is not None: 89 | schema_root = etree.parse(self.xsd_schema_file) 90 | self.xsd_schema = etree.XMLSchema(schema_root) 91 | 92 | def get_parser(self): 93 | if self.parser is None: 94 | self.parser = etree.XMLParser(**self.parser_opts) 95 | return self.parser 96 | 97 | def add_field(self, field): 98 | # Insert the given field in the order in which it was created, using 99 | # the "creation_counter" attribute of the field. 
100 | self.local_fields.insert(bisect(self.local_fields, field), field) 101 | self.setup_root(field) 102 | if hasattr(self, '_field_cache'): 103 | del self._field_cache 104 | del self._field_name_cache 105 | 106 | if hasattr(self, '_name_map'): 107 | del self._name_map 108 | 109 | def add_extension(self, method, extension_name=None): 110 | if method.lxml_extension_name is not None: 111 | extension_name = method.lxml_extension_name 112 | 113 | ns_uri = method.lxml_ns_uri 114 | if ns_uri is None: 115 | if self.extension_ns_uri is not None: 116 | ns_uri = self.extension_ns_uri 117 | else: 118 | msg = ("Extension %r has no extension_ns_uri defined and %r " 119 | "does not define a default extension namespace uri") \ 120 | % (extension_name, self.app_label) 121 | raise ExtensionNamespaceException(msg) 122 | 123 | self.extensions[(ns_uri, extension_name,)] = method 124 | 125 | def setup_root(self, field): 126 | if not self.root and field.is_root_field: 127 | self.etree = field 128 | 129 | def __repr__(self): 130 | return '<Options for %s>' % self.object_name 131 | 132 | def __str__(self): 133 | return "%s.%s" % (smart_str(self.app_label), smart_str(self.module_name)) 134 | 135 | def _fields(self): 136 | """ 137 | The getter for self.fields. This returns the list of field objects 138 | available to this model. 139 | 140 | Callers are not permitted to modify this list, since it's a reference 141 | to this instance (not a copy). 142 | """ 143 | try: 144 | self._field_name_cache 145 | except AttributeError: 146 | self._fill_fields_cache() 147 | return self._field_name_cache 148 | fields = property(_fields) 149 | 150 | def _fill_fields_cache(self): 151 | cache = [] 152 | for parent in self.parents: 153 | for field in parent._meta.fields: 154 | cache.append(field) 155 | cache.extend([f for f in self.local_fields]) 156 | self._field_name_cache = tuple(cache) 157 | 158 | def get_field(self, name): 159 | """ 160 | Returns the requested field by name. Raises FieldDoesNotExist on error. 
161 | """ 162 | for f in self.fields: 163 | if f.name == name: 164 | return f 165 | raise FieldDoesNotExist('%s has no field named %r' \ 166 | % (self.object_name, name)) 167 | -------------------------------------------------------------------------------- /djxml/xmlmodels/related.py: -------------------------------------------------------------------------------- 1 | from .signals import xmlclass_prepared 2 | from .loading import get_xml_model 3 | from .fields import (SchematronField, XmlField, XPathSingleNodeField, 4 | XPathListField, XsltField,) 5 | 6 | 7 | RECURSIVE_RELATIONSHIP_CONSTANT = 'self' 8 | 9 | pending_lookups = {} 10 | 11 | 12 | def add_lazy_relation(cls, field, relation, operation): 13 | """ 14 | Adds a lookup on ``cls`` when a related field is defined using a string, 15 | i.e.:: 16 | 17 | class MyModel(Model): 18 | children = EmbeddedXPathField("AnotherModel", "child::*") 19 | 20 | This string can be: 21 | 22 | * RECURSIVE_RELATIONSHIP_CONSTANT (i.e. "self") to indicate a recursive 23 | relation. 24 | 25 | * The name of a model (e.g. "AnotherModel") to indicate another model in 26 | the same app. 27 | 28 | * An app-label and model name (e.g. "someapp.AnotherModel") to indicate 29 | another model in a different app. 30 | 31 | If the other model hasn't yet been loaded -- almost a given if you're using 32 | lazy relationships -- then the relation won't be set up until the 33 | xmlclass_prepared signal fires at the end of model initialization. 34 | 35 | operation is the work that must be performed once the relation can be 36 | resolved. 
37 | """ 38 | # Check for recursive relations 39 | if relation == RECURSIVE_RELATIONSHIP_CONSTANT: 40 | app_label = cls._meta.app_label 41 | model_name = cls.__name__ 42 | 43 | else: 44 | # Look for an "app.Model" relation 45 | try: 46 | app_label, model_name = relation.split(".") 47 | except ValueError: 48 | # If we can't split, assume a model in current app 49 | app_label = cls._meta.app_label 50 | model_name = relation 51 | except AttributeError: 52 | # If it doesn't have a split it's actually a model class 53 | app_label = relation._meta.app_label 54 | model_name = relation._meta.object_name 55 | 56 | # Try to look up the related model, and if it's already loaded resolve the 57 | # string right away. If get_xml_model returns None, it means that the 58 | # related model isn't loaded yet, so we need to pend the relation until 59 | # the class is prepared. 60 | model = get_xml_model(app_label, model_name, False) 61 | if model: 62 | operation(field, model, cls) 63 | else: 64 | key = (app_label, model_name) 65 | value = (cls, field, operation) 66 | pending_lookups.setdefault(key, []).append(value) 67 | 68 | 69 | def do_pending_lookups(sender, **kwargs): 70 | """ 71 | Handle any pending relations to the sending model. 72 | Sent from xmlclass_prepared. 
73 | """ 74 | key = (sender._meta.app_label, sender.__name__) 75 | for cls, field, operation in pending_lookups.pop(key, []): 76 | operation(field, sender, cls) 77 | 78 | 79 | xmlclass_prepared.connect(do_pending_lookups) 80 | 81 | 82 | class EmbeddedField(XmlField): 83 | 84 | embedded_model = None 85 | 86 | def contribute_to_class(self, cls, name): 87 | self.set_attributes_from_name(name) 88 | self.model = cls 89 | 90 | # Set up for lazy initialized embedded models 91 | if isinstance(self.embedded_model, str): 92 | def _resolve_lookup(field, resolved_model, cls): 93 | field.embedded_model = resolved_model 94 | add_lazy_relation(cls, self, self.embedded_model, _resolve_lookup) 95 | 96 | cls._meta.add_field(self) 97 | 98 | 99 | class EmbeddedXPathField(XPathSingleNodeField, EmbeddedField): 100 | 101 | def __init__(self, xml_model, *args, **kwargs): 102 | self.embedded_model = xml_model 103 | super(EmbeddedXPathField, self).__init__(*args, **kwargs) 104 | 105 | def to_python(self, value): 106 | value = super(EmbeddedXPathField, self).to_python(value) 107 | if value is None: 108 | return value 109 | else: 110 | return self.embedded_model(value) 111 | 112 | def contribute_to_class(self, cls, name): 113 | EmbeddedField.contribute_to_class(self, cls, name) 114 | 115 | 116 | class EmbeddedXPathListField(XPathListField, EmbeddedField): 117 | 118 | def __init__(self, xml_model, *args, **kwargs): 119 | self.embedded_model = xml_model 120 | super(EmbeddedXPathListField, self).__init__(*args, **kwargs) 121 | 122 | def to_python(self, value): 123 | value = super(EmbeddedXPathListField, self).to_python(value) 124 | if value is None: 125 | return value 126 | else: 127 | return [self.embedded_model(v) for v in value] 128 | 129 | def contribute_to_class(self, cls, name): 130 | EmbeddedField.contribute_to_class(self, cls, name) 131 | 132 | 133 | class EmbeddedXsltField(XsltField, EmbeddedField): 134 | 135 | def __init__(self, xml_model, *args, **kwargs): 136 | self.embedded_model = 
xml_model 137 | super(EmbeddedXsltField, self).__init__(*args, **kwargs) 138 | 139 | def to_python(self, value): 140 | value = super(EmbeddedXsltField, self).to_python(value) 141 | if value is None: 142 | return value 143 | else: 144 | return self.embedded_model(value) 145 | 146 | def contribute_to_class(self, cls, name): 147 | EmbeddedField.contribute_to_class(self, cls, name) 148 | 149 | 150 | class EmbeddedSchematronField(SchematronField, EmbeddedField): 151 | 152 | def __init__(self, xml_model, *args, **kwargs): 153 | self.embedded_model = xml_model 154 | super(EmbeddedSchematronField, self).__init__(*args, **kwargs) 155 | 156 | def to_python(self, value): 157 | value = super(EmbeddedSchematronField, self).to_python(value) 158 | if value is None: 159 | return value 160 | else: 161 | return self.embedded_model(value) 162 | 163 | def contribute_to_class(self, cls, name): 164 | EmbeddedField.contribute_to_class(self, cls, name) 165 | -------------------------------------------------------------------------------- /djxml/xmlmodels/signals.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from django.dispatch import Signal 3 | 4 | xmlclass_prepared = Signal(providing_args=["class"]) 5 | -------------------------------------------------------------------------------- /djxml/xmlmodels/utils.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | import pytz 3 | import dateutil.parser 4 | 5 | from .exceptions import XPathDateTimeException 6 | 7 | 8 | def parse_datetime(dt_str): 9 | eastern_tz = pytz.timezone("America/New_York") 10 | try: 11 | dt = dateutil.parser.parse(dt_str) 12 | except ValueError: 13 | raise XPathDateTimeException("Could not parse datetime %s" % dt_str) 14 | else: 15 | if dt.tzinfo is None: 16 | return dt 17 | else: 18 | eastern_dt = eastern_tz.normalize(dt.astimezone(eastern_tz)) 19 | iso_fmt 
= "%Y-%m-%dT%H:%M:%S" 20 | naive_dt = dateutil.parser.parse(eastern_dt.strftime(iso_fmt)) 21 | return naive_dt 22 | -------------------------------------------------------------------------------- /docs/advanced_example.md: -------------------------------------------------------------------------------- 1 | # Advanced Example 2 | 3 | ## myapp/xmlmodels.py 4 | 5 | ```python 6 | import re, time 7 | from datetime import datetime 8 | from os.path import dirname, join 9 | from lxml import etree 10 | from djxml import xmlmodels 11 | 12 | strip_namespaces = etree.XSLT(etree.XML(""" 13 | 15 | 16 | 17 | 18 | """)) 19 | 20 | 21 | class AtomFeed(xmlmodels.XmlModel): 22 | 23 | class Meta: 24 | extension_ns_uri = "urn:local:atom-feed-functions" 25 | namespaces = { 26 | "fn": extension_ns_uri, 27 | "atom": "http://www.w3.org/2005/Atom", 28 | } 29 | 30 | feed_title = xmlmodels.XPathTextField("/atom:feed/atom:title") 31 | 32 | updated = xmlmodels.XPathDateTimeField("/atom:feed/atom:*[%s]" \ 33 | % "local-name()='updated' or (local-name()='published' and not(../atom:updated))") 34 | 35 | entries = xmlmodels.XPathListField("/atom:feed/atom:entry", required=False) 36 | 37 | titles = xmlmodels.XPathTextListField("/atom:feed/atom:entry/atom:title", 38 | required=False) 39 | 40 | transform_to_rss = xmlmodels.XsltField(join(dirname(__file__), "atom2rss.xsl")) 41 | 42 | @xmlmodels.lxml_extension 43 | def escape_xhtml(self, context, nodes): 44 | return u"".join([etree.tounicode(strip_namespaces(n)) for n in nodes]) 45 | 46 | @xmlmodels.lxml_extension 47 | def convert_atom_date_to_rss(self, context, rfc3339_str): 48 | try: 49 | m = re.match(r"([\d:T-]+)(?:\.\d+)?(Z|[+-][\d:]{5})", rfc3339_str) 50 | except TypeError: 51 | return "" 52 | dt_str, tz_str = m.groups() 53 | dt = datetime(*[t for t in time.strptime(dt_str, "%Y-%m-%dT%H:%M:%S")][0:6]) 54 | tz_str = 'Z' if tz_str == 'Z' else tz_str[:3] + tz_str[4:] 55 | return dt.strftime("%a, %d %b %Y %H:%M:%S") + tz_str 56 | 57 | 58 | def test(): 
59 | atom_xml_file = join(dirname(__file__), 'atom_feed.xml') 60 | atom_feed = AtomFeed.create_from_file(atom_xml_file) 61 | rss_feed = atom_feed.transform_to_rss() 62 | print(u"\n".join([ 63 | u"feed_title = %r" % atom_feed.feed_title, 64 | u"updated = %r" % atom_feed.updated, 65 | u"num_entries = %d" % len(atom_feed.entries), 66 | u"titles = %r" % (u", ".join(atom_feed.titles)), u"",])) 67 | print(u"rss = %s" % etree.tounicode(rss_feed)) 68 | ``` 69 | 70 | ## myapp/atom_feed.xml 71 | 72 | ```xml 73 | 74 | 75 | Example Feed 76 | 77 | 2012-07-05T18:30:02Z 78 | urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6 79 | 80 | An example entry 81 | 82 | urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a 83 | 2012-07-05T18:30:02Z 84 | Some text. 85 | 86 | 87 | ``` 88 | 89 | ## myapp/atom2rss.xsl 90 | 91 | ```xml 92 | 93 | 96 | 97 | 98 | 99 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 | 121 | 122 | 123 | 124 | 125 | 126 | 127 | 128 | 129 | 130 | ``` -------------------------------------------------------------------------------- /runtests.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | import os 3 | import sys 4 | 5 | 6 | os.environ['DJANGO_SETTINGS_MODULE'] = 'djxml.tests.settings' 7 | 8 | 9 | import django 10 | from django.core.management import execute_from_command_line 11 | 12 | 13 | # Give feedback on used versions 14 | sys.stderr.write('Using Python version %s from %s\n' % (sys.version[:5], sys.executable)) 15 | sys.stderr.write('Using Django version %s from %s\n' % ( 16 | django.get_version(), 17 | os.path.dirname(os.path.abspath(django.__file__)))) 18 | 19 | def runtests(): 20 | argv = sys.argv[:1] + ['test', 'djxml', '--traceback', '--verbosity=1'] + sys.argv[1:] 21 | execute_from_command_line(argv) 22 | 23 | if __name__ == '__main__': 24 | runtests() 25 | 
-------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [bdist_wheel] 2 | universal = 1 3 | 4 | [flake8] 5 | exclude = tmp 6 | ignore = 7 | max-line-length = 100 8 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | import re 3 | import os.path 4 | from setuptools import setup, find_packages 5 | 6 | 7 | setup_kwargs = {} 8 | 9 | try: 10 | setup_kwargs['long_description'] = open('README.rst').read() 11 | except IOError: 12 | # Use the create_readme_rst command to convert README to reStructuredText 13 | pass 14 | 15 | with open(os.path.join(os.path.dirname(__file__), "djxml", "__init__.py")) as f: 16 | for line in f: 17 | if m := re.search(r"""^__version__ = (['"])(.+?)\1$""", line): 18 | version = m.group(2) 19 | break 20 | else: 21 | raise LookupError("Unable to find __version__ in djxml/__init__.py") 22 | 23 | setup( 24 | name='django-xml', 25 | version=version, 26 | install_requires=[ 27 | 'lxml', 28 | 'pytz', 29 | 'python-dateutil', 30 | 'Django>=2.2', 31 | ], 32 | description="Provides an abstraction to lxml's XPath and XSLT " + \ 33 | "functionality in a manner resembling django database models", 34 | author='The Atlantic', 35 | author_email='atmoprogrammers@theatlantic.com', 36 | url='https://github.com/theatlantic/django-xml', 37 | packages=find_packages(), 38 | classifiers=[ 39 | 'Environment :: Web Environment', 40 | 'Intended Audience :: Developers', 41 | 'Operating System :: OS Independent', 42 | 'Programming Language :: Python', 43 | 'Programming Language :: Python :: 3.7', 44 | 'Programming Language :: Python :: 3.8', 45 | 'Framework :: Django', 46 | 'Framework :: Django :: 2.2', 47 | 'Framework :: Django :: 3.0', 48 | 'Framework :: Django :: 3.1', 49 | 'Framework :: Django :: 3.2', 
50 | ], 51 | include_package_data=True, 52 | zip_safe=False, 53 | entry_points={ 54 | 'distutils.commands': [ 55 | 'create_readme_rst = djxml.build:create_readme_rst', 56 | ], 57 | }, 58 | **setup_kwargs) 59 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | [tox] 2 | envlist = 3 | py{37,38}-django{22,30,31,32} 4 | 5 | [testenv] 6 | commands = 7 | {posargs:python runtests.py} 8 | deps = 9 | django22: Django>=2.2.19,<3.0 10 | django30: Django>=3.0.13,<3.1 11 | django31: Django>=3.1.7,<3.2 12 | django32: Django>=3.2rc1,<4.0 13 | 14 | [testenv:pep8] 15 | description = Run PEP8 pycodestyle (flake8) against the djxml/ package directory 16 | skipsdist = true 17 | skip_install = true 18 | basepython = python3.7 19 | deps = flake8 20 | commands = flake8 djxml 21 | 22 | [testenv:clean] 23 | description = Clean all build and test artifacts 24 | skipsdist = true 25 | skip_install = true 26 | deps = 27 | whitelist_externals = 28 | find 29 | rm 30 | commands = 31 | find {toxinidir} -type f -name "*.pyc" -delete 32 | find {toxinidir} -type d -name "__pycache__" -delete 33 | rm -rf {toxworkdir} {toxinidir}/build django_xml.egg-info 34 | 35 | [gh-actions] 36 | python = 37 | 3.7: py37 38 | 3.8: py38 39 | --------------------------------------------------------------------------------