├── .gitignore ├── .mailmap ├── .travis.yml ├── CONTRIBUTING.md ├── LICENSE ├── MANIFEST.in ├── README.md ├── README_pt_Br.md ├── apply_headers.py ├── docs └── source │ ├── conf.py │ └── index.rst ├── pymarc ├── __init__.py ├── constants.py ├── exceptions.py ├── field.py ├── leader.py ├── marc8.py ├── marc8_mapping.py ├── marcjson.py ├── marcxml.py ├── reader.py ├── record.py └── writer.py ├── requirements.dev.txt ├── setup.cfg ├── setup.py └── test ├── 1251.dat ├── __init__.py ├── alphatag.dat ├── bad_eacc_encoding.dat ├── bad_indicator.dat ├── bad_marc8_escape.dat ├── bad_records.mrc ├── bad_subfield_code.dat ├── bad_tag.xml ├── batch.json ├── batch.xml ├── diacritic.dat ├── marc.dat ├── marc8.dat ├── multi_isbn.dat ├── one.dat ├── one.json ├── regression45.dat ├── test.dat ├── test.json ├── test_encode.py ├── test_field.py ├── test_json.py ├── test_leader.py ├── test_marc8.py ├── test_marc8.txt ├── test_ordered_fields.py ├── test_reader.py ├── test_record.py ├── test_utf8.py ├── test_utf8.txt ├── test_writer.py ├── test_xml.py ├── testunimarc.dat ├── utf8.xml ├── utf8_errors.dat ├── utf8_invalid.mrc ├── utf8_with_leader_flag.dat └── utf8_without_leader_flag.dat /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | pymarc.egg-info 3 | build 4 | dist 5 | *$py.class 6 | .eggs 7 | *.egg 8 | Pipfile 9 | Pipfile.lock 10 | .vscode 11 | -------------------------------------------------------------------------------- /.mailmap: -------------------------------------------------------------------------------- 1 | Aaron S. 
Lav 2 | Dan Scott 3 | Eric Hellman 4 | Edward Betts 5 | James Tayson 6 | Nick Ruest 7 | Geoffrey Spear 8 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | 3 | script: 4 | - python setup.py test 5 | 6 | stages: 7 | - lint 8 | - test 9 | 10 | jobs: 11 | include: 12 | # Linting 13 | - stage: lint 14 | name: "Check linting with flake8" 15 | install: pip install -r requirements.dev.txt 16 | script: flake8 . 17 | - stage: lint 18 | name: "Check format with black" 19 | install: pip install -r requirements.dev.txt 20 | script: black --check --diff . 21 | # Unit tests 22 | - stage: test 23 | name: "Unit tests pypy3" 24 | python: "pypy3" 25 | - stage: test 26 | name: "Unit tests python 3.6" 27 | python: "3.6" 28 | - stage: test 29 | name: "Unit tests python 3.7" 30 | python: "3.7" 31 | - stage: test 32 | name: "Unit tests python 3.8" 33 | python: "3.8" 34 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | Feel free to [report bugs](https://github.com/edsu/pymarc/issues) you have encountered 4 | and [suggest new features](https://github.com/edsu/pymarc/pulls). 5 | 6 | For any new development, please respect the standards in place on the project (enforced 7 | by the CI): 8 | 9 | * formatting with [black](https://github.com/psf/black) 10 | * validated by [flake8](http://flake8.pycqa.org/en/latest) 11 | and (coming soon) [mypy](http://mypy-lang.org) 12 | * tested with [unittest](https://docs.python.org/fr/3/library/unittest.html) 13 | * compatible from python 3.3 to python 3.8 14 | 15 | To install development dependencies: `pip install -r requirements.dev.txt`. 
16 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Redistribution and use in source and binary forms, with or without 2 | modification, are permitted provided that the following conditions are met: 3 | 4 | 1. Redistributions of source code must retain the above copyright notice, this 5 | list of conditions and the following disclaimer. 6 | 7 | 2. Redistributions in binary form must reproduce the above copyright notice, 8 | this list of conditions and the following disclaimer in the documentation 9 | and/or other materials provided with the distribution. 10 | 11 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 12 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 13 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 14 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 15 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 16 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 17 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 18 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 19 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 20 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 21 | 22 | Copyright for this project is held by its many contributors, including: 23 | 24 | Adam Constabaris 25 | André Nesse 26 | Chris Adams 27 | Dan Chudnov 28 | Dan Michael O. Heggø 29 | Dan Scott 30 | David Chouinard 31 | Ed Hill 32 | Ed Summers 33 | Edward Betts 34 | Eric Hellman 35 | Gabriel Farrell 36 | Geoffrey Spear 37 | Godmar Back 38 | Helga 39 | James Tayson 40 | Jay Luker 41 | Jim Nicholls 42 | Karol Sikora 43 | Lucas Souza 44 | Mark A. Matienzo 45 | Martin Czygan 46 | Michael B. 
Klein 47 | Michael J. Giarlo 48 | Mikhail Terekhov 49 | Nick Ruest 50 | Pierre Verkest 51 | Radim Řehůřek 52 | Renaud Boyer 53 | Robert Marchman 54 | Sean Chen 55 | Simon Hohl 56 | Ted Lawless 57 | Victor Seva 58 | Will Earp 59 | cclauss 60 | cyperus-papyrus 61 | gitgovdoc 62 | mmh 63 | nemobis 64 | wrCisco 65 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | recursive-include test *.py *.dat *.xml *.json *.mrc *.txt *.xml 2 | recursive-include docs *.py 3 | recursive-include docs *.rst 4 | 5 | include *.md 6 | include *.txt 7 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # The pymarc repository has moved to [GitLab](https://gitlab.com/pymarc/pymarc). 2 | -------------------------------------------------------------------------------- /README_pt_Br.md: -------------------------------------------------------------------------------- 1 | # O repositório pymarc foi movido para [GitLab](https://gitlab.com/pymarc/pymarc). 2 | -------------------------------------------------------------------------------- /apply_headers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | """Apply standard license headers and generate contributor list. 4 | 5 | Rather than trying to maintain file-level lists of contributors and copyright 6 | dates, apply a standard license header that points to the LICENSE file for the 7 | licensing details. And then generate the LICENSE file with a complete list of 8 | contributors based on the git commit history. 9 | 10 | To adjust contributor names or combine email addresses, see .mailmap. 11 | 12 | See https://github.com/edsu/pymarc/issues/147 for context. 13 | """ 14 | 15 | # This file is part of pymarc. 
It is subject to the license terms in the 16 | # LICENSE file found in the top-level directory of this distribution and at 17 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 18 | # propagated, or distributed according to the terms contained in the LICENSE 19 | # file. 20 | 21 | import pathlib 22 | import shlex 23 | import subprocess 24 | 25 | 26 | def get_contributors(): 27 | """Get a complete list of contributors from `git log`.""" 28 | # dictionary = add each name only once 29 | contribs = {} 30 | 31 | gitargs = shlex.split("git log --use-mailmap --format=short") 32 | log = subprocess.run(gitargs, capture_output=True, encoding="utf-8") 33 | for line in log.stdout.split("\n"): 34 | if line[0 : len("Author: ")] == "Author: ": 35 | contribs[line[len("Author: ") :]] = 1 36 | 37 | # Return a list of the contributors 38 | return sorted(contribs) 39 | 40 | 41 | def generate_license(contribs): 42 | """Generate a BSD-2 license file that lists contributors.""" 43 | bsd2 = """Redistribution and use in source and binary forms, with or without 44 | modification, are permitted provided that the following conditions are met: 45 | 46 | 1. Redistributions of source code must retain the above copyright notice, this 47 | list of conditions and the following disclaimer. 48 | 49 | 2. Redistributions in binary form must reproduce the above copyright notice, 50 | this list of conditions and the following disclaimer in the documentation 51 | and/or other materials provided with the distribution. 52 | 53 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 54 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 55 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 56 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 57 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 58 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 59 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 60 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 61 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 62 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 63 | 64 | """ 65 | 66 | with open("LICENSE", "w") as licensef: 67 | licensef.write(bsd2) 68 | licensef.write( 69 | "Copyright for this project is held by its many contributors, including:\n\n" 70 | ) 71 | for contrib in contribs: 72 | licensef.write("{}\n".format(contrib)) 73 | 74 | 75 | def apply_headers(): 76 | """Ensure standard license header is in each Python file.""" 77 | header = """# This file is part of pymarc. It is subject to the license terms in the 78 | # LICENSE file found in the top-level directory of this distribution and at 79 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 80 | # propagated, or distributed according to the terms contained in the LICENSE 81 | # file. 
82 | """ 83 | 84 | path = pathlib.Path(".") 85 | for pyfile in list(path.glob("**/*.py")): 86 | if str(pyfile) == "docs/source/conf.py" or str(pyfile) == "test/__init__.py": 87 | continue 88 | with open(pyfile, "r") as reader: 89 | contents = reader.read() 90 | if contents.find(header) == -1: 91 | if str(pyfile) == "test/__init__.py": 92 | # Avoid angering black with a blank line at the end 93 | write_header(pyfile, reader, contents, header) 94 | else: 95 | write_header(pyfile, reader, contents, header + "\n") 96 | 97 | 98 | def write_header(pyfile, reader, contents, header): 99 | """Rewrite Python source file with the license header.""" 100 | reader.close() 101 | utf8_decl = "# -*- coding: utf-8 -*-\n" 102 | with open(pyfile, "w") as writer: 103 | if contents.startswith("# __init__.py"): 104 | sections = contents.split("\n\n", 1) 105 | writer.write(sections[0]) 106 | writer.write("\n\n") 107 | writer.write(header) 108 | writer.write(sections[1]) 109 | elif contents.startswith(utf8_decl): 110 | sections = contents.split(utf8_decl, 1) 111 | writer.write(utf8_decl) 112 | writer.write("\n") 113 | writer.write(header) 114 | writer.write(sections[1]) 115 | else: 116 | writer.write(header) 117 | writer.write(contents) 118 | 119 | 120 | if __name__ == "__main__": 121 | generate_license(get_contributors()) 122 | apply_headers() 123 | -------------------------------------------------------------------------------- /docs/source/conf.py: -------------------------------------------------------------------------------- 1 | """Pymarc's documentation build configuration file. 2 | 3 | Created by sphinx-quickstart on Thu Jul 21 10:24:11 2016. 4 | 5 | This file is execfile()d with the current directory set to its 6 | containing dir. 7 | 8 | Note that not all possible configuration values are present in this 9 | autogenerated file. 10 | 11 | All configuration values have a default; values that are commented out 12 | serve to show the default. 
13 | 14 | If extensions (or modules to document with autodoc) are in another directory, 15 | add these directories to sys.path here. If the directory is relative to the 16 | documentation root, use os.path.abspath to make it absolute, like shown here. 17 | """ 18 | 19 | import os 20 | import sys 21 | 22 | sys.path.append(os.path.join(os.path.dirname(__file__), "../..")) 23 | print(sys.path) 24 | # sys.path.insert(0, os.path.abspath('.')) 25 | 26 | # -- General configuration ------------------------------------------------ 27 | 28 | # If your documentation needs a minimal Sphinx version, state it here. 29 | # 30 | # needs_sphinx = '1.0' 31 | 32 | # Add any Sphinx extension module names here, as strings. They can be 33 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom 34 | # ones. 35 | 36 | extensions = [ 37 | "sphinx.ext.autodoc", 38 | "sphinx.ext.autosummary", 39 | "sphinx.ext.intersphinx", 40 | "sphinx.ext.ifconfig", 41 | "sphinx.ext.viewcode", 42 | ] 43 | 44 | # Add any paths that contain templates here, relative to this directory. 45 | templates_path = ["_templates"] 46 | 47 | # The suffix(es) of source filenames. 48 | # You can specify multiple suffix as a list of string: 49 | # 50 | # source_suffix = ['.rst', '.md'] 51 | source_suffix = ".rst" 52 | 53 | # The encoding of source files. 54 | # 55 | # source_encoding = 'utf-8-sig' 56 | 57 | # The master toctree document. 58 | master_doc = "index" 59 | 60 | # General information about the project. 61 | project = "pymarc" 62 | copyright = "Contributors listed in the accompanying LICENSE file" 63 | author = "Ed Summers" 64 | 65 | # The version info for the project you're documenting, acts as replacement for 66 | # |version| and |release|, also used in various other places throughout the 67 | # built documents. 68 | # 69 | # The short X.Y version. 
70 | version = "4.0.0" 71 | with open(os.path.join(os.path.dirname(__file__), "../../setup.py")) as setup_file: 72 | for line in setup_file: 73 | if line.startswith("version"): 74 | __, version = line.split(" = ") 75 | version = version.replace('"', "") 76 | break 77 | release = version 78 | 79 | # The language for content autogenerated by Sphinx. Refer to documentation 80 | # for a list of supported languages. 81 | # 82 | # This is also used if you do content translation via gettext catalogs. 83 | # Usually you set "language" from the command line for these cases. 84 | language = None 85 | 86 | # There are two options for replacing |today|: either, you set today to some 87 | # non-false value, then it is used: 88 | # 89 | # today = '' 90 | # 91 | # Else, today_fmt is used as the format for a strftime call. 92 | # 93 | # today_fmt = '%B %d, %Y' 94 | 95 | # List of patterns, relative to source directory, that match files and 96 | # directories to ignore when looking for source files. 97 | # This patterns also effect to html_static_path and html_extra_path 98 | exclude_patterns = [] 99 | 100 | # The reST default role (used for this markup: `text`) to use for all 101 | # documents. 102 | # 103 | # default_role = None 104 | 105 | # If true, '()' will be appended to :func: etc. cross-reference text. 106 | # 107 | # add_function_parentheses = True 108 | 109 | # If true, the current module name will be prepended to all description 110 | # unit titles (such as .. function::). 111 | # 112 | # add_module_names = True 113 | 114 | # If true, sectionauthor and moduleauthor directives will be shown in the 115 | # output. They are ignored by default. 116 | # 117 | # show_authors = False 118 | 119 | # The name of the Pygments (syntax highlighting) style to use. 120 | pygments_style = "sphinx" 121 | 122 | # A list of ignored prefixes for module index sorting. 
123 | # modindex_common_prefix = [] 124 | 125 | # If true, keep warnings as "system message" paragraphs in the built documents. 126 | # keep_warnings = False 127 | 128 | # If true, `todo` and `todoList` produce output, else they produce nothing. 129 | todo_include_todos = False 130 | 131 | 132 | # -- Options for HTML output ---------------------------------------------- 133 | 134 | # The theme to use for HTML and HTML Help pages. See the documentation for 135 | # a list of builtin themes. 136 | # 137 | html_theme = "sphinx_rtd_theme" 138 | 139 | # Theme options are theme-specific and customize the look and feel of a theme 140 | # further. For a list of options available for each theme, see the 141 | # documentation. 142 | # 143 | # html_theme_options = {} 144 | 145 | # Add any paths that contain custom themes here, relative to this directory. 146 | # html_theme_path = [] 147 | 148 | # The name for this set of Sphinx documents. 149 | # "<project> v<release> documentation" by default. 150 | # 151 | # html_title = 'pymarc v3.1.5' 152 | 153 | # A shorter title for the navigation bar. Default is the same as html_title. 154 | # 155 | # html_short_title = None 156 | 157 | # The name of an image file (relative to this directory) to place at the top 158 | # of the sidebar. 159 | # 160 | # html_logo = None 161 | 162 | # The name of an image file (relative to this directory) to use as a favicon of 163 | # the docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 164 | # pixels large. 165 | # 166 | # html_favicon = None 167 | 168 | # Add any paths that contain custom static files (such as style sheets) here, 169 | # relative to this directory. They are copied after the builtin static files, 170 | # so a file named "default.css" will overwrite the builtin "default.css". 171 | html_static_path = ["_static"] 172 | 173 | # Add any extra paths that contain custom files (such as robots.txt or 174 | # .htaccess) here, relative to this directory.
These files are copied 175 | # directly to the root of the documentation. 176 | # 177 | # html_extra_path = [] 178 | 179 | # If not None, a 'Last updated on:' timestamp is inserted at every page 180 | # bottom, using the given strftime format. 181 | # The empty string is equivalent to '%b %d, %Y'. 182 | # 183 | # html_last_updated_fmt = None 184 | 185 | # If true, SmartyPants will be used to convert quotes and dashes to 186 | # typographically correct entities. 187 | # 188 | # html_use_smartypants = True 189 | 190 | # Custom sidebar templates, maps document names to template names. 191 | # 192 | # html_sidebars = {} 193 | 194 | # Additional templates that should be rendered to pages, maps page names to 195 | # template names. 196 | # 197 | # html_additional_pages = {} 198 | 199 | # If false, no module index is generated. 200 | # 201 | # html_domain_indices = True 202 | 203 | # If false, no index is generated. 204 | # 205 | # html_use_index = True 206 | 207 | # If true, the index is split into individual pages for each letter. 208 | # 209 | # html_split_index = False 210 | 211 | # If true, links to the reST sources are added to the pages. 212 | # 213 | # html_show_sourcelink = True 214 | 215 | # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. 216 | # 217 | # html_show_sphinx = True 218 | 219 | # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. 220 | # 221 | # html_show_copyright = True 222 | 223 | # If true, an OpenSearch description file will be output, and all pages will 224 | # contain a <link> tag referring to it. The value of this option must be the 225 | # base URL from which the finished HTML is served. 226 | # 227 | # html_use_opensearch = '' 228 | 229 | # This is the file name suffix for HTML files (e.g. ".xhtml"). 230 | # html_file_suffix = None 231 | 232 | # Language to be used for generating the HTML full-text search index.
233 | # Sphinx supports the following languages: 234 | # 'da', 'de', 'en', 'es', 'fi', 'fr', 'hu', 'it', 'ja' 235 | # 'nl', 'no', 'pt', 'ro', 'ru', 'sv', 'tr', 'zh' 236 | # 237 | # html_search_language = 'en' 238 | 239 | # A dictionary with options for the search language support, empty by default. 240 | # 'ja' uses this config value. 241 | # 'zh' users can customize the `jieba` dictionary path. 242 | # 243 | # html_search_options = {'type': 'default'} 244 | 245 | # The name of a javascript file (relative to the configuration directory) that 246 | # implements a search results scorer. If empty, the default will be used. 247 | # 248 | # html_search_scorer = 'scorer.js' 249 | 250 | # Output file base name for HTML help builder. 251 | htmlhelp_basename = "pymarcdoc" 252 | 253 | # -- Options for LaTeX output --------------------------------------------- 254 | 255 | latex_elements = { 256 | # The paper size ('letterpaper' or 'a4paper'). 257 | # 258 | # 'papersize': 'letterpaper', 259 | # The font size ('10pt', '11pt' or '12pt'). 260 | # 261 | # 'pointsize': '10pt', 262 | # Additional stuff for the LaTeX preamble. 263 | # 264 | # 'preamble': '', 265 | # Latex figure (float) alignment 266 | # 267 | # 'figure_align': 'htbp', 268 | } 269 | 270 | # Grouping the document tree into LaTeX files. List of tuples 271 | # (source start file, target name, title, 272 | # author, documentclass [howto, manual, or own class]). 273 | latex_documents = [ 274 | (master_doc, "pymarc.tex", "pymarc Documentation", "Ed Summers", "manual"), 275 | ] 276 | 277 | # The name of an image file (relative to this directory) to place at the top of 278 | # the title page. 279 | # 280 | # latex_logo = None 281 | 282 | # For "manual" documents, if this is true, then toplevel headings are parts, 283 | # not chapters. 284 | # 285 | # latex_use_parts = False 286 | 287 | # If true, show page references after internal links.
288 | # 289 | # latex_show_pagerefs = False 290 | 291 | # If true, show URL addresses after external links. 292 | # 293 | # latex_show_urls = False 294 | 295 | # Documents to append as an appendix to all manuals. 296 | # 297 | # latex_appendices = [] 298 | 299 | # If false, will not define \strong, \code, \titleref, \crossref ... but only 300 | # \sphinxstrong, ..., \sphinxtitleref, ... To help avoid clash with user added 301 | # packages. 302 | # 303 | # latex_keep_old_macro_names = True 304 | 305 | # If false, no module index is generated. 306 | # 307 | # latex_domain_indices = True 308 | 309 | 310 | # -- Options for manual page output --------------------------------------- 311 | 312 | # One entry per manual page. List of tuples 313 | # (source start file, name, description, authors, manual section). 314 | man_pages = [(master_doc, "pymarc", "pymarc Documentation", [author], 1)] 315 | 316 | # If true, show URL addresses after external links. 317 | # 318 | # man_show_urls = False 319 | 320 | 321 | # -- Options for Texinfo output ------------------------------------------- 322 | 323 | # Grouping the document tree into Texinfo files. List of tuples 324 | # (source start file, target name, title, author, 325 | # dir menu entry, description, category) 326 | texinfo_documents = [ 327 | ( 328 | master_doc, 329 | "pymarc", 330 | "pymarc Documentation", 331 | author, 332 | "pymarc", 333 | "One line description of project.", 334 | "Miscellaneous", 335 | ), 336 | ] 337 | 338 | # Documents to append as an appendix to all manuals. 339 | # 340 | # texinfo_appendices = [] 341 | 342 | # If false, no module index is generated. 343 | # 344 | # texinfo_domain_indices = True 345 | 346 | # How to display URL addresses: 'footnote', 'no', or 'inline'. 347 | # 348 | # texinfo_show_urls = 'footnote' 349 | 350 | # If true, do not generate a @detailmenu in the "Top" node's menu.
351 | # 352 | # texinfo_no_detailmenu = False 353 | 354 | 355 | # Example configuration for intersphinx: refer to the Python standard library. 356 | intersphinx_mapping = {"https://docs.python.org/": None} 357 | -------------------------------------------------------------------------------- /docs/source/index.rst: -------------------------------------------------------------------------------- 1 | .. pymarc documentation master file, created by 2 | sphinx-quickstart on Thu Jul 21 10:24:11 2016. 3 | You can adapt this file completely to your liking, but it should at least 4 | contain the root `toctree` directive. 5 | 6 | Pymarc 7 | ====== 8 | 9 | Release v\ |version| 10 | 11 | Pymarc is a python library for working with bibliographic data encoded in MARC21. 12 | 13 | It should work under python 2.x and 3.x. It provides an API for reading, writing 14 | and modifying MARC records. It was mostly designed to be an emergency 15 | eject seat, for getting your data assets out of MARC and into some kind 16 | of saner representation. However over the years it has been used to 17 | create and modify MARC records, since despite `repeated calls`_ for it to die as a 18 | format, MARC seems to be living quite happily as a zombie. 19 | 20 | Below are some common examples of how you might want to use pymarc. If 21 | you run across an example that you think should be here please send a 22 | pull request. 23 | 24 | .. _repeated calls: https://web.archive.org/web/20170731163019/http://www.marc-must-die.info/index.php/Main_Page 25 | 26 | Reading 27 | ~~~~~~~ 28 | 29 | Most often you will have some MARC data and will want to extract data 30 | from it. Here's an example of reading a batch of records and printing 31 | out the title. If you are curious this example uses the batch file 32 | available here in pymarc repository: 33 | 34 | .. 
code-block:: python 35 | 36 | from pymarc import MARCReader 37 | 38 | with open('test/marc.dat', 'rb') as fh: 39 | reader = MARCReader(fh) 40 | for record in reader: 41 | print(record.title()) 42 | 43 | The pragmatic programmer : from journeyman to master / 44 | Programming Python / 45 | Learning Python / 46 | Python cookbook / 47 | Python programming for the absolute beginner / 48 | Web programming : techniques for integrating Python, Linux, Apache, and MySQL / 49 | Python programming on Win32 / 50 | Python programming : an introduction to computer science / 51 | Python Web programming / 52 | Core python programming / 53 | Python and Tkinter programming / 54 | Game programming with Python, Lua, and Ruby / 55 | Python programming patterns / 56 | Python programming with the Java class libraries : a tutorial for building Web 57 | and Enterprise applications / 58 | Learn to program using Python : a tutorial for hobbyists, self-starters, and all 59 | who want to learn the art of computer programming / 60 | Programming with Python / 61 | BSD Sockets programming from a multi-language perspective / 62 | Design patterns : elements of reusable object-oriented software / 63 | Introduction to algorithms / 64 | ANSI Common Lisp / 65 | 66 | A pymarc.Record object has a few handy methods like title for 67 | getting at bits of a bibliographic record, others include: author, 68 | isbn, subjects, location, notes, 69 | physicaldescription, publisher, pubyear. But really, to work 70 | with MARC data you need to understand the numeric field tags and 71 | subfield codes that are used to designate various bits of information. 72 | There is a lot more hiding in a MARC record than these methods provide 73 | access to. For example the title method extracts the information 74 | from the 245 field, subfields a and b. You can access 75 | 245a like so: 76 | 77 | .. code-block:: python 78 | 79 | print(record['245']['a']) 80 | 81 | Some fields like subjects can repeat. 
In cases like that you will want 82 | to use get_fields to get all of them as pymarc.Field objects, 83 | which you can then interact with further: 84 | 85 | .. code-block:: python 86 | 87 | for f in record.get_fields('650'): 88 | print(f) 89 | 90 | If you are new to MARC fields, Understanding 91 | MARC (http://www.loc.gov/marc/umb/) is a pretty good primer, and the 92 | MARC 21 Formats (http://www.loc.gov/marc/marcdocz.html) page at the 93 | Library of Congress is a good reference once you understand the basics. 94 | 95 | Writing 96 | ~~~~~~~ 97 | 98 | Here's an example of creating a record and writing it out to a file. 99 | 100 | .. code-block:: python 101 | 102 | from pymarc import Record, Field 103 | 104 | record = Record() 105 | record.add_field( 106 | Field( 107 | tag = '245', 108 | indicators = ['0','1'], 109 | subfields = [ 110 | 'a', 'The pragmatic programmer : ', 111 | 'b', 'from journeyman to master /', 112 | 'c', 'Andrew Hunt, David Thomas.' 113 | ])) 114 | with open('file.dat', 'wb') as out: 115 | out.write(record.as_marc()) 116 | 117 | Updating 118 | ~~~~~~~~ 119 | 120 | Updating works the same way: you read a record in, modify it, and then write 121 | it out again: 122 | 123 | .. code-block:: python 124 | 125 | from pymarc import MARCReader 126 | 127 | with open('test/marc.dat', 'rb') as fh: 128 | reader = MARCReader(fh) 129 | record = next(reader) 130 | record['245']['a'] = 'The Zombie Programmer' 131 | with open('file.dat', 'wb') as out: 132 | out.write(record.as_marc()) 133 | 134 | JSON and XML 135 | ~~~~~~~~~~~~ 136 | 137 | If you find yourself using MARC data a fair bit, and distributing it, 138 | you may make other developers a bit happier by using the JSON or XML 139 | serializations. pymarc has support for both. The main benefit here is 140 | that the UTF-8 character encoding is used, rather than the frustratingly 141 | archaic MARC-8 encoding.
Also they will be able to use JSON and XML tools 142 | to get at the data they want instead of some crazy MARC processing 143 | library like, ahem, pymarc. 144 | 145 | 146 | API Docs 147 | ======== 148 | 149 | Reader 150 | ~~~~~~ 151 | 152 | .. automodule:: pymarc.reader 153 | :members: 154 | :undoc-members: 155 | :show-inheritance: 156 | 157 | Record 158 | ~~~~~~ 159 | 160 | .. automodule:: pymarc.record 161 | :members: 162 | :undoc-members: 163 | :show-inheritance: 164 | 165 | Writer 166 | ~~~~~~ 167 | 168 | .. automodule:: pymarc.writer 169 | :members: 170 | :undoc-members: 171 | :show-inheritance: 172 | 173 | Field 174 | ~~~~~ 175 | 176 | .. automodule:: pymarc.field 177 | :members: 178 | :undoc-members: 179 | :show-inheritance: 180 | 181 | Exceptions 182 | ~~~~~~~~~~ 183 | 184 | .. automodule:: pymarc.exceptions 185 | :members: 186 | :undoc-members: 187 | :show-inheritance: 188 | 189 | MarcXML 190 | ~~~~~~~ 191 | 192 | .. automodule:: pymarc.marcxml 193 | :members: 194 | :undoc-members: 195 | :show-inheritance: 196 | 197 | Constants 198 | ~~~~~~~~~ 199 | 200 | .. automodule:: pymarc.constants 201 | :members: 202 | :undoc-members: 203 | :show-inheritance: 204 | 205 | MARC-8 206 | ~~~~~~ 207 | 208 | .. automodule:: pymarc.marc8 209 | :members: 210 | :undoc-members: 211 | :show-inheritance: 212 | 213 | MARC-8 mapping 214 | ~~~~~~~~~~~~~~ 215 | 216 | .. automodule:: pymarc.marc8_mapping 217 | :members: 218 | :undoc-members: 219 | :show-inheritance: 220 | 221 | Leader 222 | ~~~~~~ 223 | 224 | .. automodule:: pymarc.leader 225 | :members: 226 | :undoc-members: 227 | :show-inheritance: 228 | 229 | 230 | Indices and tables 231 | ================== 232 | 233 | * :ref:`genindex` 234 | * :ref:`modindex` 235 | * :ref:`search` 236 | -------------------------------------------------------------------------------- /pymarc/__init__.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc.
It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | from .record import * 8 | from .field import * 9 | from .exceptions import * 10 | from .reader import * 11 | from .writer import * 12 | from .constants import * 13 | from .marc8 import marc8_to_unicode, MARC8ToUnicode 14 | from .marcxml import * 15 | from .marcjson import * 16 | -------------------------------------------------------------------------------- /pymarc/constants.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | """Constants for pymarc.""" 8 | 9 | LEADER_LEN = 24 10 | DIRECTORY_ENTRY_LEN = 12 11 | SUBFIELD_INDICATOR = chr(0x1F) 12 | END_OF_FIELD = chr(0x1E) 13 | END_OF_RECORD = chr(0x1D) 14 | -------------------------------------------------------------------------------- /pymarc/exceptions.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 
6 | 7 | """Exceptions for pymarc.""" 8 | 9 | 10 | class PymarcException(Exception): 11 | """Base pymarc Exception.""" 12 | 13 | pass 14 | 15 | 16 | class RecordLengthInvalid(PymarcException): 17 | """Invalid record length.""" 18 | 19 | def __str__(self): 20 | return "Invalid record length in first 5 bytes of record" 21 | 22 | 23 | class RecordLeaderInvalid(PymarcException): 24 | """Unable to extract record leader.""" 25 | 26 | def __str__(self): 27 | return "Unable to extract record leader" 28 | 29 | 30 | class RecordDirectoryInvalid(PymarcException): 31 | """Invalid directory.""" 32 | 33 | def __str__(self): 34 | return "Invalid directory" 35 | 36 | 37 | class NoFieldsFound(PymarcException): 38 | """Unable to locate fields in record data.""" 39 | 40 | def __str__(self): 41 | return "Unable to locate fields in record data" 42 | 43 | 44 | class BaseAddressInvalid(PymarcException): 45 | """Base address exceeds size of record.""" 46 | 47 | def __str__(self): 48 | return "Base address exceeds size of record" 49 | 50 | 51 | class BaseAddressNotFound(PymarcException): 52 | """Unable to locate base address of record.""" 53 | 54 | def __str__(self): 55 | return "Unable to locate base address of record" 56 | 57 | 58 | class WriteNeedsRecord(PymarcException): 59 | """Write requires a pymarc.Record object as an argument.""" 60 | 61 | def __str__(self): 62 | return "Write requires a pymarc.Record object as an argument" 63 | 64 | 65 | class NoActiveFile(PymarcException): 66 | """There is no active file to write to in call to write.""" 67 | 68 | def __str__(self): 69 | return "There is no active file to write to in call to write" 70 | 71 | 72 | class FieldNotFound(PymarcException): 73 | """Record does not contain the specified field.""" 74 | 75 | def __str__(self): 76 | return "Record does not contain the specified field" 77 | 78 | 79 | class BadSubfieldCodeWarning(Warning): 80 | """Warning about a non-ASCII subfield code.""" 81 | 82 | pass 83 | 84 | 85 | class 
BadLeaderValue(PymarcException): 86 | """Error when setting a leader value.""" 87 | 88 | pass 89 | -------------------------------------------------------------------------------- /pymarc/field.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | """The pymarc.field file.""" 8 | 9 | import logging 10 | 11 | from pymarc.constants import SUBFIELD_INDICATOR, END_OF_FIELD 12 | from pymarc.marc8 import marc8_to_unicode 13 | 14 | 15 | class Field: 16 | """Field() pass in the field tag, indicators and subfields for the tag. 17 | 18 | .. code-block:: python 19 | 20 | field = Field( 21 | tag = '245', 22 | indicators = ['0','1'], 23 | subfields = [ 24 | 'a', 'The pragmatic programmer : ', 25 | 'b', 'from journeyman to master /', 26 | 'c', 'Andrew Hunt, David Thomas.', 27 | ]) 28 | 29 | If you want to create a control field, don't pass in the indicators 30 | and use a data parameter rather than a subfields parameter: 31 | 32 | .. 
code-block:: python 33 | 34 | field = Field(tag='001', data='fol05731351') 35 | """ 36 | 37 | def __init__(self, tag, indicators=None, subfields=None, data=u""): 38 | """Initialize a field `tag`.""" 39 | if indicators is None: 40 | indicators = [] 41 | if subfields is None: 42 | subfields = [] 43 | indicators = [str(x) for x in indicators] 44 | 45 | # attempt to normalize integer tags if necessary 46 | try: 47 | self.tag = "%03i" % int(tag) 48 | except ValueError: 49 | self.tag = "%03s" % tag 50 | 51 | # assume controlfields are numeric only; replicates ruby-marc behavior 52 | if self.tag < "010" and self.tag.isdigit(): 53 | self.data = data 54 | else: 55 | self.indicators = indicators 56 | self.subfields = subfields 57 | 58 | def __iter__(self): 59 | self.__pos = 0 60 | return self 61 | 62 | def __str__(self): 63 | """String representation of the field. 64 | 65 | A Field object in a string context will return the tag, indicators 66 | and subfield as a string. This follows MARCMaker format; see [1] 67 | and [2] for further reference. Special character mnemonic strings 68 | have yet to be implemented (see [3]), so be forewarned. Note also 69 | for complete MARCMaker compatibility, you will need to change your 70 | newlines to DOS format ('CRLF'). 71 | 72 | [1] http://www.loc.gov/marc/makrbrkr.html#mechanics 73 | [2] http://search.cpan.org/~eijabb/MARC-File-MARCMaker/ 74 | [3] http://www.loc.gov/marc/mnemonics.html 75 | """ 76 | if self.is_control_field(): 77 | text = "=%s %s" % (self.tag, self.data.replace(" ", "\\")) 78 | else: 79 | text = "=%s " % (self.tag) 80 | for indicator in self.indicators: 81 | if indicator in (" ", "\\"): 82 | text += "\\" 83 | else: 84 | text += "%s" % indicator 85 | for subfield in self: 86 | text += "$%s%s" % subfield 87 | return text 88 | 89 | def __getitem__(self, subfield): 90 | """Retrieve the first subfield with a given subfield code in a field. 91 | 92 | .. 
code-block:: python 93 | 94 | field['a'] 95 | 96 | Handy for quick lookups. 97 | """ 98 | subfields = self.get_subfields(subfield) 99 | if len(subfields) > 0: 100 | return subfields[0] 101 | return None 102 | 103 | def __contains__(self, subfield): 104 | """Allows a shorthand test of field membership. 105 | 106 | .. code-block:: python 107 | 108 | 'a' in field 109 | 110 | """ 111 | subfields = self.get_subfields(subfield) 112 | return len(subfields) > 0 113 | 114 | def __setitem__(self, code, value): 115 | """Set the value of the subfield with the given code in a field. 116 | 117 | .. code-block:: python 118 | 119 | field['a'] = 'value' 120 | 121 | Raises KeyError if the code is absent or occurs more than once. 122 | """ 123 | subfields = self.get_subfields(code) 124 | if len(subfields) > 1: 125 | raise KeyError("more than one code '%s'" % code) 126 | elif len(subfields) == 0: 127 | raise KeyError("no code '%s'" % code) 128 | num_code = len(self.subfields) // 2 129 | while num_code >= 0: 130 | if self.subfields[(num_code * 2) - 2] == code: 131 | self.subfields[(num_code * 2) - 1] = value 132 | break 133 | num_code -= 1 134 | 135 | def __next__(self): 136 | if not hasattr(self, "subfields"): 137 | raise StopIteration 138 | while self.__pos < len(self.subfields): 139 | subfield = (self.subfields[self.__pos], self.subfields[self.__pos + 1]) 140 | self.__pos += 2 141 | return subfield 142 | raise StopIteration 143 | 144 | def value(self): 145 | """Returns the field data, or the subfield values joined by spaces.""" 146 | if self.is_control_field(): 147 | return self.data 148 | value_list = [] 149 | for subfield in self: 150 | value_list.append(subfield[1].strip()) 151 | return " ".join(value_list) 152 | 153 | def get_subfields(self, *codes): 154 | """Get subfields matching `codes`. 155 | 156 | get_subfields() accepts one or more subfield codes and returns 157 | a list of subfield values.
The order of the subfield values 158 | in the list will be the order that they appear in the field. 159 | 160 | .. code-block:: python 161 | 162 | print(field.get_subfields('a')) 163 | print(field.get_subfields('a', 'b', 'z')) 164 | """ 165 | values = [] 166 | for subfield in self: 167 | if subfield[0] in codes: 168 | values.append(subfield[1]) 169 | return values 170 | 171 | def add_subfield(self, code, value, pos=None): 172 | """Adds a subfield code/value to the end of a field or at a position (pos). 173 | 174 | .. code-block:: python 175 | 176 | field.add_subfield('u', 'http://www.loc.gov') 177 | field.add_subfield('u', 'http://www.loc.gov', 0) 178 | 179 | If pos is not supplied or out of range, the subfield will be added at the end. 180 | """ 181 | append = pos is None or (pos + 1) * 2 > len(self.subfields) 182 | 183 | if append: 184 | self.subfields.append(code) 185 | self.subfields.append(value) 186 | else: 187 | i = pos * 2 188 | self.subfields.insert(i, code) 189 | self.subfields.insert(i + 1, value) 190 | 191 | def delete_subfield(self, code): 192 | """Deletes the first subfield with the specified 'code' and returns its value. 193 | 194 | .. code-block:: python 195 | 196 | value = field.delete_subfield('a') 197 | 198 | If no subfield is found with the specified code None is returned. 199 | """ 200 | try: 201 | index = self.subfields.index(code) 202 | if index % 2 == 0: 203 | value = self.subfields.pop(index + 1) 204 | self.subfields.pop(index) 205 | return value 206 | else: 207 | return None 208 | except ValueError: 209 | return None 210 | 211 | def is_control_field(self): 212 | """Returns true or false if the field is considered a control field. 213 | 214 | Control fields lack indicators and subfields. 
215 | """ 216 | if self.tag < "010" and self.tag.isdigit(): 217 | return True 218 | return False 219 | 220 | def as_marc(self, encoding): 221 | """Used during conversion of a field to raw marc.""" 222 | if self.is_control_field(): 223 | return (self.data + END_OF_FIELD).encode(encoding) 224 | marc = self.indicator1 + self.indicator2 225 | for subfield in self: 226 | marc += SUBFIELD_INDICATOR + subfield[0] + subfield[1] 227 | 228 | return (marc + END_OF_FIELD).encode(encoding) 229 | 230 | # alias for backwards compatibility 231 | as_marc21 = as_marc 232 | 233 | def format_field(self): 234 | """Returns the field data or subfield values as a display string. 235 | 236 | Like :func:`Field.value() `, but prettier 237 | (adds spaces, formats subject headings). 238 | """ 239 | if self.is_control_field(): 240 | return self.data 241 | fielddata = "" 242 | for subfield in self: 243 | if subfield[0] == "6": 244 | continue 245 | if not self.is_subject_field(): 246 | fielddata += " %s" % subfield[1] 247 | else: 248 | if subfield[0] not in ("v", "x", "y", "z"): 249 | fielddata += " %s" % subfield[1] 250 | else: 251 | fielddata += " -- %s" % subfield[1] 252 | return fielddata.strip() 253 | 254 | def is_subject_field(self): 255 | """Returns True if the field is considered a subject field, else False. 256 | 257 | Used by :func:`format_field() ` .
258 | """ 259 | if self.tag.startswith("6"): 260 | return True 261 | return False 262 | 263 | @property 264 | def indicator1(self): 265 | """Indicator 1.""" 266 | return self.indicators[0] 267 | 268 | @indicator1.setter 269 | def indicator1(self, value): 270 | """Indicator 1 (setter).""" 271 | self.indicators[0] = value 272 | 273 | @property 274 | def indicator2(self): 275 | """Indicator 2.""" 276 | return self.indicators[1] 277 | 278 | @indicator2.setter 279 | def indicator2(self, value): 280 | """Indicator 2 (setter).""" 281 | self.indicators[1] = value 282 | 283 | 284 | class RawField(Field): 285 | """MARC field that keeps data in raw, undecoded byte strings. 286 | 287 | Should only be used when input records are wrongly encoded. 288 | """ 289 | 290 | def as_marc(self, encoding=None): 291 | """Used during conversion of a field to raw marc.""" 292 | if encoding is not None: 293 | logging.warning("Attempt to force a RawField into encoding %s", encoding) 294 | if self.is_control_field(): 295 | return self.data + END_OF_FIELD.encode("ascii") 296 | marc = self.indicator1.encode("ascii") + self.indicator2.encode("ascii") 297 | for subfield in self: 298 | marc += SUBFIELD_INDICATOR.encode("ascii") + subfield[0] + subfield[1] 299 | return marc + END_OF_FIELD.encode("ascii") 300 | 301 | 302 | def map_marc8_field(f): 303 | """Map MARC8 field.""" 304 | if f.is_control_field(): 305 | f.data = marc8_to_unicode(f.data) 306 | else: 307 | f.subfields = list(map(marc8_to_unicode, f.subfields)) 308 | return f 309 | -------------------------------------------------------------------------------- /pymarc/leader.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file.
6 | 7 | """The pymarc.leader file.""" 8 | from pymarc.constants import LEADER_LEN 9 | from pymarc.exceptions import BadLeaderValue, RecordLeaderInvalid 10 | 11 | 12 | class Leader(object): 13 | """Mutable leader. 14 | 15 | A class to manipulate a `Record`'s leader. 16 | 17 | You can use the properties (`record_status`, `bibliographic_level`, etc.) or their 18 | slices/index equivalent (`leader[5]`, `leader[7]`, etc.) to read and write 19 | values. 20 | 21 | See `LoC's documentation 22 | `_ 23 | for more information about those fields. 24 | 25 | .. code-block:: python 26 | 27 | leader = Leader("00475cas a2200169 i 4500") 28 | leader[0:5] # returns "00475" 29 | leader.record_status # returns "c" 30 | leader.record_status = "a" # sets the status to "a" 31 | leader[5] # returns the status "a" 32 | leader[5] = "b" # sets the status to "b" 33 | str(leader) # "00475bas a2200169 i 4500" 34 | 35 | Usually the leader is accessed through the `leader` property of a record. 36 | 37 | .. code-block:: python 38 | 39 | from pymarc import MARCReader 40 | with open('test/marc.dat', 'rb') as fh: 41 | reader = MARCReader(fh) 42 | for record in reader: 43 | print(record.leader) 44 | 45 | When creating/updating a `Record` please note that `record_length` and 46 | `base_address` will only be generated in the marc21 output of 47 | :func:`record.as_marc() ` 48 | """ 49 | 50 | def __init__(self, leader): 51 | # type: (str) -> None 52 | """Leader is initialized with a string.""" 53 | if len(leader) != LEADER_LEN: 54 | raise RecordLeaderInvalid 55 | self.leader = leader 56 | 57 | def __getitem__(self, item): 58 | # type: (str) -> str 59 | """Get values using position, slice or properties. 60 | 61 | leader[:5] == leader.record_length 62 | """ 63 | if isinstance(item, slice) or isinstance(item, int): 64 | return self.leader[item] 65 | return getattr(self, item) 66 | 67 | def __setitem__(self, item, value): 68 | # type: (str, str) -> str 69 | """Set values using position, slice or properties.
70 | 71 | leader[5] = "a" 72 | leader[0:4] = "0000" 73 | leader.record_status = "a" 74 | """ 75 | if isinstance(item, slice): 76 | self._replace_values(position=item.start, value=value) 77 | elif isinstance(item, int): 78 | self._replace_values(position=item, value=value) 79 | else: 80 | setattr(self, item, value) 81 | 82 | def __str__(self): 83 | # type: () -> str 84 | """A string representation of the leader.""" 85 | return self.leader 86 | 87 | def _replace_values(self, position, value): 88 | # type: (int, str) -> str 89 | """Replaces the values in the leader at `position` by `value`.""" 90 | if position < 0: 91 | raise IndexError("Position must be positive") 92 | after = position + len(value) 93 | if after > LEADER_LEN: 94 | raise BadLeaderValue( 95 | "%s is too long to be inserted at %d" % (value, position) 96 | ) 97 | self.leader = self.leader[:position] + value + self.leader[after:] 98 | 99 | @property 100 | def record_length(self): 101 | # type: () -> str 102 | """Record length (00-04).""" 103 | return self.leader[:5] 104 | 105 | @record_length.setter 106 | def record_length(self, value): 107 | # type: (str) -> str 108 | """Record length (00-04).""" 109 | if len(value) != 5: 110 | raise BadLeaderValue("Record length is 5 chars field, got %s" % value) 111 | self._replace_values(position=0, value=value) 112 | 113 | @property 114 | def record_status(self): 115 | # type: () -> str 116 | """Record status (05).""" 117 | return self.leader[5] 118 | 119 | @record_status.setter 120 | def record_status(self, value): 121 | # type: (str) -> str 122 | """Record status (05).""" 123 | if len(value) != 1: 124 | raise BadLeaderValue("Record status is 1 char field, got %s" % value) 125 | self._replace_values(position=5, value=value) 126 | 127 | @property 128 | def type_of_record(self): 129 | # type: () -> str 130 | """Type of record (06).""" 131 | return self.leader[6] 132 | 133 | @type_of_record.setter 134 | def type_of_record(self, value): 135 | # type: (str) -> str 136 | """Type
of record (06).""" 137 | if len(value) != 1: 138 | raise BadLeaderValue("Type of record is 1 char field, got %s" % value) 139 | self._replace_values(position=6, value=value) 140 | 141 | @property 142 | def bibliographic_level(self): 143 | # type: () -> str 144 | """Bibliographic level (07).""" 145 | return self.leader[7] 146 | 147 | @bibliographic_level.setter 148 | def bibliographic_level(self, value): 149 | # type: (str) -> str 150 | """Bibliographic level (07).""" 151 | if len(value) != 1: 152 | raise BadLeaderValue("Bibliographic level is 1 char field, got %s" % value) 153 | self._replace_values(position=7, value=value) 154 | 155 | @property 156 | def type_of_control(self): 157 | # type: () -> str 158 | """Type of control (08).""" 159 | return self.leader[8] 160 | 161 | @type_of_control.setter 162 | def type_of_control(self, value): 163 | # type: (str) -> str 164 | """Type of control (08).""" 165 | if len(value) != 1: 166 | raise BadLeaderValue("Type of control is 1 char field, got %s" % value) 167 | self._replace_values(position=8, value=value) 168 | 169 | @property 170 | def coding_scheme(self): 171 | # type: () -> str 172 | """Character coding scheme (09).""" 173 | return self.leader[9] 174 | 175 | @coding_scheme.setter 176 | def coding_scheme(self, value): 177 | # type: (str) -> str 178 | """Character coding scheme (09).""" 179 | if len(value) != 1: 180 | raise BadLeaderValue( 181 | "Character coding scheme is 1 char field, got %s" % value 182 | ) 183 | self._replace_values(position=9, value=value) 184 | 185 | @property 186 | def indicator_count(self): 187 | # type: () -> str 188 | """Indicator count (10).""" 189 | return self.leader[10] 190 | 191 | @indicator_count.setter 192 | def indicator_count(self, value): 193 | # type: (str) -> str 194 | """Indicator count (10).""" 195 | if len(value) != 1: 196 | raise BadLeaderValue("Indicator count is 1 char field, got %s" % value) 197 | self._replace_values(position=10, value=value) 198 | 199 | @property 200 | def 
subfield_code_count(self): 201 | # type: () -> str 202 | """Subfield code count (11).""" 203 | return self.leader[11] 204 | 205 | @subfield_code_count.setter 206 | def subfield_code_count(self, value): 207 | # type: (str) -> str 208 | """Subfield code count (11).""" 209 | if len(value) != 1: 210 | raise BadLeaderValue("Subfield code count is 1 char field, got %s" % value) 211 | self._replace_values(position=11, value=value) 212 | 213 | @property 214 | def base_address(self): 215 | # type: () -> str 216 | """Base address of data (12-16).""" 217 | return self.leader[12:17] 218 | 219 | @base_address.setter 220 | def base_address(self, value): 221 | # type: (str) -> str 222 | """Base address of data (12-16).""" 223 | if len(value) != 5: 224 | raise BadLeaderValue( 225 | "Base address of data is 5 chars field, got %s" % value 226 | ) 227 | self._replace_values(position=12, value=value) 228 | 229 | @property 230 | def encoding_level(self): 231 | # type: () -> str 232 | """Encoding level (17).""" 233 | return self.leader[17] 234 | 235 | @encoding_level.setter 236 | def encoding_level(self, value): 237 | # type: (str) -> str 238 | """Encoding level (17).""" 239 | if len(value) != 1: 240 | raise BadLeaderValue("Encoding level is 1 char field, got %s" % value) 241 | self._replace_values(position=17, value=value) 242 | 243 | @property 244 | def cataloging_form(self): 245 | # type: () -> str 246 | """Descriptive cataloging form (18).""" 247 | return self.leader[18] 248 | 249 | @cataloging_form.setter 250 | def cataloging_form(self, value): 251 | # type: (str) -> str 252 | """Descriptive cataloging form (18).""" 253 | if len(value) != 1: 254 | raise BadLeaderValue( 255 | "Descriptive cataloging form is 1 char field, got %s" % value 256 | ) 257 | self._replace_values(position=18, value=value) 258 | 259 | @property 260 | def multipart_ressource(self): 261 | # type: () -> str 262 | """Multipart resource record level (19).""" 263 | return self.leader[19] 264 | 265 |
@multipart_ressource.setter 266 | def multipart_ressource(self, value): 267 | # type: (str) -> str 268 | """Multipart resource record level (19).""" 269 | if len(value) != 1: 270 | raise BadLeaderValue( 271 | "Multipart resource record level is 1 char field, got %s" % value 272 | ) 273 | self._replace_values(position=19, value=value) 274 | 275 | @property 276 | def length_of_field_length(self): 277 | # type: () -> str 278 | """Length of the length-of-field portion (20).""" 279 | return self.leader[20] 280 | 281 | @length_of_field_length.setter 282 | def length_of_field_length(self, value): 283 | # type: (str) -> str 284 | """Length of the length-of-field portion (20).""" 285 | if len(value) != 1: 286 | raise BadLeaderValue( 287 | "Length of the length-of-field portion is 1 char field, got %s" % value 288 | ) 289 | self._replace_values(position=20, value=value) 290 | 291 | @property 292 | def starting_character_position_length(self): 293 | # type: () -> str 294 | """Length of the starting-character-position portion (21).""" 295 | return self.leader[21] 296 | 297 | @starting_character_position_length.setter 298 | def starting_character_position_length(self, value): 299 | # type: (str) -> str 300 | """Length of the starting-character-position portion (21).""" 301 | if len(value) != 1: 302 | raise BadLeaderValue( 303 | "Length of the starting-character-position portion is 1 char field, got %s" 304 | % value 305 | ) 306 | self._replace_values(position=21, value=value) 307 | 308 | @property 309 | def implementation_defined_length(self): 310 | # type: () -> str 311 | """Length of the implementation-defined portion (22).""" 312 | return self.leader[22] 313 | 314 | @implementation_defined_length.setter 315 | def implementation_defined_length(self, value): 316 | # type: (str) -> str 317 | """Length of the implementation-defined portion (22).""" 318 | if len(value) != 1: 319 | raise BadLeaderValue( 320 | "Length of the implementation-defined portion is 1 char field, got
%s" 321 | % value 322 | ) 323 | self._replace_values(position=22, value=value) 324 | -------------------------------------------------------------------------------- /pymarc/marc8.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | """Handle MARC-8 files. 8 | 9 | see http://www.loc.gov/marc/specifications/speccharmarc8.html 10 | """ 11 | 12 | import sys 13 | import unicodedata 14 | 15 | from pymarc import marc8_mapping 16 | 17 | 18 | def marc8_to_unicode(marc8, hide_utf8_warnings=False): 19 | """Pass in a string, and get back a Unicode object. 20 | 21 | .. code-block:: python 22 | 23 | print(marc8_to_unicode(record.title())) 24 | """ 25 | # XXX: might be good to stash away a converter somehow 26 | # instead of always re-creating it 27 | converter = MARC8ToUnicode(quiet=hide_utf8_warnings) 28 | try: 29 | return converter.translate(marc8) 30 | except IndexError: 31 | # convert IndexError into UnicodeDecodeErrors 32 | raise UnicodeDecodeError( 33 | "marc8_to_unicode", 34 | marc8, 35 | 0, 36 | len(marc8), 37 | "invalid multibyte character encoding", 38 | ) 39 | except TypeError: 40 | # convert TypeError into UnicodeDecodeErrors 41 | raise UnicodeDecodeError( 42 | "marc8_to_unicode", 43 | marc8, 44 | 0, 45 | len(marc8), 46 | "invalid multibyte character encoding", 47 | ) 48 | 49 | 50 | class MARC8ToUnicode: 51 | """Converts MARC-8 to Unicode. 52 | 53 | Note that currently, unicode strings aren't normalized, and some codecs (e.g. 54 | iso8859-1) will fail on such strings. When I can require python 2.3, this will go 55 | away.
56 | 57 | Warning: MARC-8 EACC (East Asian characters) makes some 58 | distinctions which aren't captured in Unicode. The LC tables give 59 | the option of mapping such characters either to a Unicode private 60 | use area, or a substitute character which (usually) gives the 61 | sense. I've picked the second, so this means that the MARC data 62 | should be treated as primary and the Unicode data used for display 63 | purposes only. (If you know of either of fonts designed for use 64 | with LC's private-use Unicode assignments, or of attempts to 65 | standardize Unicode characters to allow round-trips from EACC, 66 | or if you need the private-use Unicode character translations, 67 | please inform me, asl2@pobox.com. 68 | """ 69 | 70 | basic_latin = 0x42 71 | ansel = 0x45 72 | 73 | def __init__(self, G0=basic_latin, G1=ansel, quiet=False): 74 | """Init.""" 75 | self.g0 = G0 76 | self.g0_set = set([b"(", b",", b"$"]) 77 | self.g1 = G1 78 | self.g1_set = set([b")", b"-", b"$"]) 79 | self.quiet = quiet 80 | 81 | def translate(self, marc8_string): 82 | """Translate.""" 83 | # don't choke on empty marc8_string 84 | if not marc8_string: 85 | return u"" 86 | uni_list = [] 87 | combinings = [] 88 | pos = 0 89 | while pos < len(marc8_string): 90 | # http://www.loc.gov/marc/specifications/speccharmarc8.html 91 | if marc8_string[pos : pos + 1] == b"\x1b": 92 | next_byte = marc8_string[pos + 1 : pos + 2] 93 | if next_byte in self.g0_set: 94 | if len(marc8_string) >= pos + 3: 95 | if ( 96 | marc8_string[pos + 2 : pos + 3] == b"," 97 | and next_byte == b"$" 98 | ): 99 | pos += 1 100 | self.g0 = ord(marc8_string[pos + 2 : pos + 3]) 101 | pos = pos + 3 102 | continue 103 | else: 104 | # if there aren't enough remaining characters, readd 105 | # the escape character so it doesn't get lost; may 106 | # help users diagnose problem records 107 | uni_list.append(marc8_string[pos : pos + 1].decode("ascii")) 108 | pos += 1 109 | continue 110 | 111 | elif next_byte in self.g1_set: 112 | if 
marc8_string[pos + 2 : pos + 3] == b"-" and next_byte == b"$": 113 | pos += 1 114 | self.g1 = ord(marc8_string[pos + 2 : pos + 3]) 115 | pos = pos + 3 116 | continue 117 | else: 118 | charset = ord(next_byte) 119 | if charset in marc8_mapping.CODESETS: 120 | self.g0 = charset 121 | pos += 2 122 | elif charset == 0x73: 123 | self.g0 = self.basic_latin 124 | pos += 2 125 | if pos == len(marc8_string): 126 | break 127 | 128 | def is_multibyte(charset): 129 | return charset == 0x31 130 | 131 | mb_flag = is_multibyte(self.g0) 132 | 133 | if mb_flag: 134 | code_point = ( 135 | ord(marc8_string[pos : pos + 1]) * 65536 136 | + ord(marc8_string[pos + 1 : pos + 2]) * 256 137 | + ord(marc8_string[pos + 2 : pos + 3]) 138 | ) 139 | pos += 3 140 | else: 141 | code_point = ord(marc8_string[pos : pos + 1]) 142 | pos += 1 143 | 144 | if code_point < 0x20 or (code_point > 0x80 and code_point < 0xA0): 145 | uni = chr(code_point) 146 | continue 147 | 148 | try: 149 | if code_point > 0x80 and not mb_flag: 150 | (uni, cflag) = marc8_mapping.CODESETS[self.g1][code_point] 151 | else: 152 | (uni, cflag) = marc8_mapping.CODESETS[self.g0][code_point] 153 | except KeyError: 154 | try: 155 | uni = marc8_mapping.ODD_MAP[code_point] 156 | uni_list.append(chr(uni)) 157 | # we can short circuit because we know these mappings 158 | # won't be involved in combinings. (i hope?) 159 | continue 160 | except KeyError: 161 | pass 162 | if not self.quiet: 163 | sys.stderr.write( 164 | "Unable to parse character 0x%x in g0=%s g1=%s\n" 165 | % (code_point, self.g0, self.g1) 166 | ) 167 | uni = ord(" ") 168 | cflag = False 169 | 170 | if cflag: 171 | combinings.append(chr(uni)) 172 | else: 173 | uni_list.append(chr(uni)) 174 | if len(combinings) > 0: 175 | uni_list.extend(combinings) 176 | combinings = [] 177 | 178 | # what to do if combining chars left over? 
179 | uni_str = u"".join(uni_list) 180 | 181 | # unicodedata.normalize not available until Python 2.3 182 | if hasattr(unicodedata, "normalize"): 183 | uni_str = unicodedata.normalize("NFC", uni_str) 184 | 185 | return uni_str 186 | -------------------------------------------------------------------------------- /pymarc/marcjson.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | """From JSON to MARC21.""" 8 | 9 | from pymarc import Field, Record, JSONReader 10 | 11 | 12 | class JsonHandler: 13 | """Handle JSON.""" 14 | 15 | def __init__(self): 16 | """Init.""" 17 | self.records = [] 18 | self._record = None 19 | self._field = None 20 | self._text = [] 21 | 22 | def element(self, element_dict, name=None): 23 | """Converts a JSON `element_dict` to pymarc fields.""" 24 | if not name: 25 | self._record = Record() 26 | self.element(element_dict, "leader") 27 | elif name == "leader": 28 | self._record.leader = element_dict[name] 29 | self.element(element_dict, "fields") 30 | elif name == "fields": 31 | fields = iter(element_dict[name]) 32 | for field in fields: 33 | tag, remaining = field.popitem() 34 | self._field = Field(tag) 35 | if self._field.is_control_field(): 36 | self._field.data = remaining 37 | else: 38 | self.element(remaining, "subfields") 39 | self._field.indicators.extend( 40 | [remaining["ind1"], remaining["ind2"]] 41 | ) 42 | self._record.add_field(self._field) 43 | self.process_record(self._record) 44 | elif name == "subfields": 45 | subfields = iter(element_dict[name]) 46 | for subfield in subfields: 47 | code, text = subfield.popitem() 48 | self._field.add_subfield(code, text) 49 | 
50 | def elements(self, dict_list): 51 | """Sends `dict_list` to `element`.""" 52 | if type(dict_list) is not list: 53 | dict_list = [dict_list] 54 | for rec in dict_list: 55 | self.element(rec) 56 | return self.records 57 | 58 | def process_record(self, record): 59 | """Append `record` to `self.records`.""" 60 | self.records.append(record) 61 | 62 | 63 | def parse_json_to_array(json_file): 64 | """JSON to elements.""" 65 | json_reader = JSONReader(json_file) 66 | handler = JsonHandler() 67 | return handler.elements(json_reader.records) 68 | -------------------------------------------------------------------------------- /pymarc/marcxml.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | """From XML to MARC21 and back again.""" 8 | 9 | import unicodedata 10 | from xml.sax import make_parser 11 | from xml.sax.handler import ContentHandler, feature_namespaces 12 | import xml.etree.ElementTree as ET 13 | 14 | from pymarc import Field, MARC8ToUnicode, Record 15 | 16 | 17 | XSI_NS = "http://www.w3.org/2001/XMLSchema-instance" 18 | MARC_XML_NS = "http://www.loc.gov/MARC21/slim" 19 | MARC_XML_SCHEMA = "http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd" 20 | 21 | 22 | class XmlHandler(ContentHandler): 23 | """XML Handler. 24 | 25 | You can subclass XmlHandler and add your own process_record 26 | method that'll be passed a pymarc.Record as it becomes 27 | available. This could be useful if you want to stream the 28 | records elsewhere (like to a rdbms) without having to store 29 | them all in memory. 
    """

    def __init__(self, strict=False, normalize_form=None):
        """Initialize XmlHandler.

        * `strict=True` ignores elements whose namespace is not MARC_XML_NS.
        * see unicodedata.normalize for valid `normalize_form` values
        """
        self.records = []
        self._record = None
        self._field = None
        self._subfield_code = None
        self._text = []
        self._strict = strict
        self.normalize_form = normalize_form

    def startElementNS(self, name, qname, attrs):
        """Start element NS."""
        if self._strict and name[0] != MARC_XML_NS:
            return

        element = name[1]
        self._text = []

        if element == "record":
            self._record = Record()
        elif element == "controlfield":
            tag = attrs.getValue((None, u"tag"))
            self._field = Field(tag)
        elif element == "datafield":
            tag = attrs.getValue((None, u"tag"))
            ind1 = attrs.get((None, u"ind1"), u" ")
            ind2 = attrs.get((None, u"ind2"), u" ")
            self._field = Field(tag, [ind1, ind2])
        elif element == "subfield":
            self._subfield_code = attrs[(None, "code")]

    def endElementNS(self, name, qname):
        """End element NS."""
        if self._strict and name[0] != MARC_XML_NS:
            return

        element = name[1]
        if self.normalize_form is not None:
            text = unicodedata.normalize(self.normalize_form, u"".join(self._text))
        else:
            text = u"".join(self._text)

        if element == "record":
            self.process_record(self._record)
            self._record = None
        elif element == "leader":
            self._record.leader = text
        elif element == "controlfield":
            self._field.data = text
            self._record.add_field(self._field)
            self._field = None
        elif element == "datafield":
            self._record.add_field(self._field)
            self._field = None
        elif element == "subfield":
            self._field.subfields.append(self._subfield_code)
            self._field.subfields.append(text)
            self._subfield_code = None

        self._text = []

    def characters(self, chars):
        """Append `chars` to `_text`."""
        self._text.append(chars)

    def process_record(self, record):
        """Append `record` to `records`."""
        self.records.append(record)


def parse_xml(xml_file, handler):
    """Parse a file with a given subclass of xml.sax.handler.ContentHandler."""
    parser = make_parser()
    parser.setContentHandler(handler)
    parser.setFeature(feature_namespaces, 1)
    parser.parse(xml_file)


def map_xml(function, *files):
    """Map a function onto one or more files.

    The function is called with each record as soon as it has been parsed.

    .. code-block:: python

        def do_it(r):
            print(r)

        map_xml(do_it, 'marc.xml')
    """
    handler = XmlHandler()
    handler.process_record = function
    for xml_file in files:
        parse_xml(xml_file, handler)


def parse_xml_to_array(xml_file, strict=False, normalize_form=None):
    """Parse an XML file and return the records as an array.

    Instead of passing in a file path you can also pass in an open file handle, or a
    file-like object such as StringIO. If you would like the parser to explicitly check
    the namespaces for the MARCSlim namespace use the strict=True option. Valid values
    for normalize_form are 'NFC', 'NFKC', 'NFD', and 'NFKD'. See
    unicodedata.normalize for more info on these.
    """
    handler = XmlHandler(strict, normalize_form)
    parse_xml(xml_file, handler)
    return handler.records


def record_to_xml(record, quiet=False, namespace=False):
    """From MARC to XML."""
    node = record_to_xml_node(record, quiet, namespace)
    return ET.tostring(node)


def record_to_xml_node(record, quiet=False, namespace=False):
    """Converts a record object to a chunk of XML.

    If you would like to include the marcxml namespace in the root tag set namespace to
    True.
    """
    # helper for converting non-unicode data to unicode
    # TODO: maybe should set g0 and g1 appropriately using 066 $a and $b?
    marc8 = MARC8ToUnicode(quiet=quiet)

    def translate(data):
        # str data is already unicode; anything else is treated as MARC-8
        if isinstance(data, str):
            return data
        else:
            return marc8.translate(data)

    root = ET.Element("record")
    if namespace:
        root.set("xmlns", MARC_XML_NS)
        root.set("xmlns:xsi", XSI_NS)
        root.set("xsi:schemaLocation", MARC_XML_SCHEMA)
    leader = ET.SubElement(root, "leader")
    leader.text = str(record.leader)
    for field in record:
        if field.is_control_field():
            control_field = ET.SubElement(root, "controlfield")
            control_field.set("tag", field.tag)
            control_field.text = translate(field.data)
        else:
            data_field = ET.SubElement(root, "datafield")
            data_field.set("ind1", field.indicators[0])
            data_field.set("ind2", field.indicators[1])
            data_field.set("tag", field.tag)
            for subfield in field:
                data_subfield = ET.SubElement(data_field, "subfield")
                data_subfield.set("code", subfield[0])
                data_subfield.text = translate(subfield[1])

    return root

--------------------------------------------------------------------------------
/pymarc/reader.py:
--------------------------------------------------------------------------------
# This file is part of pymarc. It is subject to the license terms in the
# LICENSE file found in the top-level directory of this distribution and at
# https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified,
# propagated, or distributed according to the terms contained in the LICENSE
# file.

"""Pymarc Reader."""
import os
import sys
import json

from io import BytesIO, StringIO

from pymarc import Record, Field
from pymarc.exceptions import PymarcException, RecordLengthInvalid


class Reader:
    """A base class for all iterating readers in the pymarc package."""

    def __iter__(self):
        return self


class MARCReader(Reader):
    """An iterator class for reading a file of MARC21 records.

    Simple usage:

    .. code-block:: python

        from pymarc import MARCReader

        ## pass in a file object opened in binary mode
        reader = MARCReader(open('file.dat', 'rb'))
        for record in reader:
            ...

        ## pass in marc in transmission format
        reader = MARCReader(rawmarc)
        for record in reader:
            ...

    If you would like to have your Record object contain unicode strings
    use the to_unicode parameter:

    .. code-block:: python

        reader = MARCReader(open('file.dat', 'rb'), to_unicode=True)

    This will decode from MARC-8 or UTF-8 depending on the value in the
    MARC leader at position 9.

    If you find yourself in the unfortunate position of having data that
    is utf-8 encoded without the leader set appropriately you can use
    the force_utf8 parameter:

    .. code-block:: python

        reader = MARCReader(open('file.dat', 'rb'), to_unicode=True,
            force_utf8=True)

    If you find yourself in the unfortunate position of having data that is
    mostly utf-8 encoded but with a few non-utf-8 characters, you can also use
    the utf8_handling parameter, which takes the same values ('strict',
    'replace', and 'ignore') as the Python Unicode codecs (see
    http://docs.python.org/library/codecs.html for more info).

    Although it is not legal in MARC-21 to use anything but MARC-8 or UTF-8,
    if you have a file in some other encoding and you know what it is, you can
    pass that encoding in the file_encoding parameter.

    You may want to parse data permissively, so that reading does not stop at
    the first bad record and as many records as possible are read:

    .. code-block:: python

        reader = MARCReader(open('file.dat', 'rb'), permissive=True)

    In that case the iterator returns ``None`` for a record that cannot be
    parsed. This gives you full control over how to handle the failure: the
    exception is available as ``reader.current_exception`` and the raw data
    that caused it as ``reader.current_chunk``:

    .. code-block:: python

        reader = MARCReader(open('file.dat', 'rb'), permissive=True)
        for record in reader:
            if record is None:
                print(
                    "Current chunk: ",
                    reader.current_chunk,
                    " was ignored because the following exception raised: ",
                    reader.current_exception
                )
            else:
                # do something with record
                pass
    """

    _current_chunk = None
    _current_exception = None

    @property
    def current_chunk(self):
        """Current chunk."""
        return self._current_chunk

    @property
    def current_exception(self):
        """Current exception."""
        return self._current_exception

    def __init__(
        self,
        marc_target,
        to_unicode=True,
        force_utf8=False,
        hide_utf8_warnings=False,
        utf8_handling="strict",
        file_encoding="iso8859-1",
        permissive=False,
    ):
        """The constructor to which you can pass either raw marc or a file-like object.

        Basically the argument you pass in should be raw MARC in transmission format or
        an object that responds to read().
        """
        super(MARCReader, self).__init__()
        self.to_unicode = to_unicode
        self.force_utf8 = force_utf8
        self.hide_utf8_warnings = hide_utf8_warnings
        self.utf8_handling = utf8_handling
        self.file_encoding = file_encoding
        self.permissive = permissive
        if hasattr(marc_target, "read") and callable(marc_target.read):
            self.file_handle = marc_target
        else:
            self.file_handle = BytesIO(marc_target)

    def close(self):
        """Close the handle."""
        if self.file_handle:
            self.file_handle.close()
            self.file_handle = None

    def __next__(self):
        # the first 5 bytes of a record hold its total length in ASCII digits
        first5 = self.file_handle.read(5)
        if not first5:
            raise StopIteration
        if len(first5) < 5:
            raise RecordLengthInvalid

        try:
            length = int(first5)
        except ValueError:
            raise RecordLengthInvalid

        chunk = self.file_handle.read(length - 5)
        chunk = first5 + chunk
        self._current_chunk = chunk
        self._current_exception = None
        try:
            record = Record(
                chunk,
                to_unicode=self.to_unicode,
                force_utf8=self.force_utf8,
                hide_utf8_warnings=self.hide_utf8_warnings,
                utf8_handling=self.utf8_handling,
                file_encoding=self.file_encoding,
            )
        except (PymarcException, UnicodeDecodeError, ValueError) as ex:
            if self.permissive:
                self._current_exception = ex
                record = None
            else:
                raise ex
        return record


def map_records(f, *files):
    """Applies a given function to each record in a batch.

    You can pass in multiple batches.

    .. code-block:: python

        def print_title(r):
            print(r['245'])
        map_records(print_title, open('marc.dat', 'rb'))
    """
    for file in files:
        list(map(f, MARCReader(file)))


class JSONReader(Reader):
    """JSON Reader."""

    def __init__(self, marc_target, encoding="utf-8", stream=False):
        """The constructor to which you can pass raw JSON, a path or a file-like object.

        The argument you pass in should be a string of MARC-in-JSON, the path of an
        existing JSON file, or an object that responds to read().
        """
        self.encoding = encoding
        if hasattr(marc_target, "read") and callable(marc_target.read):
            self.file_handle = marc_target
        else:
            if os.path.exists(marc_target):
                self.file_handle = open(marc_target, "r")
            else:
                self.file_handle = StringIO(marc_target)
        if stream:
            sys.stderr.write(
                "Streaming not yet implemented, your data will be loaded into memory\n"
            )
        self.records = json.load(self.file_handle, strict=False)

    def __iter__(self):
        if hasattr(self.records, "__iter__") and not isinstance(self.records, dict):
            self.iter = iter(self.records)
        else:
            self.iter = iter([self.records])
        return self

    def __next__(self):
        jobj = next(self.iter)
        rec = Record()
        rec.leader = jobj["leader"]
        for field in jobj["fields"]:
            k, v = list(field.items())[0]
            if "subfields" in v and hasattr(v, "update"):
                # flatten m-i-j dict to list in pymarc
                subfields = []
                for sub in v["subfields"]:
                    for code, value in sub.items():
                        subfields.extend((code, value))
                fld = Field(
                    tag=k, subfields=subfields, indicators=[v["ind1"], v["ind2"]]
                )
            else:
                fld = Field(tag=k, data=v)
            rec.add_field(fld)
        return rec

--------------------------------------------------------------------------------
/pymarc/writer.py:
--------------------------------------------------------------------------------
# This file is part of pymarc. It is subject to the license terms in the
# LICENSE file found in the top-level directory of this distribution and at
# https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified,
# propagated, or distributed according to the terms contained in the LICENSE
# file.

"""Pymarc Writer."""
import json
import xml.etree.ElementTree as ET

import pymarc
from pymarc import Record, WriteNeedsRecord


class Writer(object):
    """Base Writer object."""

    def __init__(self, file_handle):
        """Init."""
        self.file_handle = file_handle

    def write(self, record):
        """Write."""
        if not isinstance(record, Record):
            raise WriteNeedsRecord

    def close(self, close_fh=True):
        """Closes the writer.

        If close_fh is True (the default), close will also close the underlying
        file handle that was passed in to the constructor.
        """
        if close_fh:
            self.file_handle.close()
        self.file_handle = None


class JSONWriter(Writer):
    """A class for writing records as an array of MARC-in-JSON objects.

    IMPORTANT: You must close a JSONWriter,
    otherwise you will not get valid JSON.

    Simple usage:

    .. code-block:: python

        from io import StringIO
        from pymarc import JSONWriter

        # writing to a file
        writer = JSONWriter(open('file.json','wt'))
        writer.write(record)
        writer.close()  # Important!

        # writing to a string
        string = StringIO()
        writer = JSONWriter(string)
        writer.write(record)
        writer.close(close_fh=False)  # Important!
        print(string.getvalue())
    """

    def __init__(self, file_handle):
        """You need to pass in a text file like object."""
        super(JSONWriter, self).__init__(file_handle)
        self.write_count = 0
        self.file_handle.write("[")

    def write(self, record):
        """Writes a record."""
        Writer.write(self, record)
        if self.write_count > 0:
            self.file_handle.write(",")
        json.dump(record.as_dict(), self.file_handle, separators=(",", ":"))
        self.write_count += 1

    def close(self, close_fh=True):
        """Closes the writer.

        If close_fh is True (the default), close will also close the underlying
        file handle that was passed in to the constructor.
        """
        self.file_handle.write("]")
        Writer.close(self, close_fh)


class MARCWriter(Writer):
    """A class for writing MARC21 records in transmission format.

    Simple usage:

    .. code-block:: python

        from io import BytesIO
        from pymarc import MARCWriter

        # writing to a file
        writer = MARCWriter(open('file.dat','wb'))
        writer.write(record)
        writer.close()

        # writing to memory
        memory = BytesIO()
        writer = MARCWriter(memory)
        writer.write(record)
        writer.close(close_fh=False)
    """

    def __init__(self, file_handle):
        """You need to pass in a byte file like object."""
        super(MARCWriter, self).__init__(file_handle)

    def write(self, record):
        """Writes a record."""
        Writer.write(self, record)
        self.file_handle.write(record.as_marc())


class TextWriter(Writer):
    """A class for writing records in prettified text MARCMaker format.

    A blank line separates each record.

    Simple usage:

    .. code-block:: python

        from io import StringIO
        from pymarc import TextWriter

        # writing to a file
        writer = TextWriter(open('file.txt','wt'))
        writer.write(record)
        writer.close()

        # writing to a string
        string = StringIO()
        writer = TextWriter(string)
        writer.write(record)
        writer.close(close_fh=False)
        print(string.getvalue())
    """

    def __init__(self, file_handle):
        """You need to pass in a text file like object."""
        super(TextWriter, self).__init__(file_handle)
        self.write_count = 0

    def write(self, record):
        """Writes a record."""
        Writer.write(self, record)
        if self.write_count > 0:
            self.file_handle.write("\n")
        self.file_handle.write(str(record))
        self.write_count += 1


class XMLWriter(Writer):
    """A class for writing records as a MARCXML collection.

    IMPORTANT: You must close an XMLWriter, otherwise you will not get
    a valid XML document.

    Simple usage:

    .. code-block:: python

        from io import BytesIO
        from pymarc import XMLWriter

        # writing to a file
        writer = XMLWriter(open('file.xml','wb'))
        writer.write(record)
        writer.close()  # Important!

        # writing to memory
        memory = BytesIO()
        writer = XMLWriter(memory)
        writer.write(record)
        writer.close(close_fh=False)  # Important!
    """

    def __init__(self, file_handle):
        """You need to pass in a binary file like object."""
        super(XMLWriter, self).__init__(file_handle)
        self.file_handle.write(b'<?xml version="1.0" encoding="UTF-8"?>')
        self.file_handle.write(b'<collection xmlns="http://www.loc.gov/MARC21/slim">')

    def write(self, record):
        """Writes a record."""
        Writer.write(self, record)
        node = pymarc.record_to_xml_node(record)
        self.file_handle.write(ET.tostring(node, encoding="utf-8"))

    def close(self, close_fh=True):
        """Closes the writer.

        If close_fh is True (the default), close will also close the underlying
        file handle that was passed in to the constructor.
        """
        self.file_handle.write(b"</collection>")
        Writer.close(self, close_fh)

--------------------------------------------------------------------------------
/requirements.dev.txt:
--------------------------------------------------------------------------------
# linting
black
flake8
flake8-docstrings

--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
[flake8]
ignore =
    # line length is handled by black
    E501
    # Missing docstring in public nested class (ex. Meta)
    D106
    # Missing docstring in magic method (__str__ ...)
    D105
    # First line should be in imperative mood
    D401
    # E203 & W503 are not PEP 8 compliant
    # @see https://github.com/psf/black
    W503
    E203
per-file-ignores =
    # no need for docstrings in tests
    test_*: D100, D101, D102, D103
    # ignore unused imports and wildcard imports in __init__ and missing docstrings
    __init__.py: D104, F401, F403
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
# This file is part of pymarc. It is subject to the license terms in the
# LICENSE file found in the top-level directory of this distribution and at
# https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified,
# propagated, or distributed according to the terms contained in the LICENSE
# file.

"""Pymarc setup."""

from setuptools import setup

version = "3.2.0"

classifiers = """
Intended Audience :: Education
Intended Audience :: Developers
Intended Audience :: Information Technology
License :: OSI Approved :: BSD License
Programming Language :: Python :: 3
Programming Language :: Python :: 3.6
Programming Language :: Python :: 3.7
Programming Language :: Python :: 3.8
Topic :: Text Processing :: General
"""


with open("README.md") as f:
    long_description = f.read()

setup(
    name="pymarc",
    version=version,
    url="http://github.com/edsu/pymarc",
    author="Ed Summers",
    author_email="ehs@pobox.com",
    license="http://www.opensource.org/licenses/bsd-license.php",
    packages=["pymarc"],
    description="Read, write and modify MARC bibliographic data",
    long_description=long_description,
    long_description_content_type="text/markdown",
    classifiers=list(filter(None, classifiers.split("\n"))),
    test_suite="test",
    python_requires=">=3.6",
)

--------------------------------------------------------------------------------
/test/1251.dat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/edsu/pymarc/cf33051421ac74389c1bc6d54921fb9612083d1b/test/1251.dat
--------------------------------------------------------------------------------
/test/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/edsu/pymarc/cf33051421ac74389c1bc6d54921fb9612083d1b/test/__init__.py
-------------------------------------------------------------------------------- /test/alphatag.dat: -------------------------------------------------------------------------------- 1 | 01339cam 2200349 4500001001000000005001700010008004100027010002100068020000900089035001000098035002100108040001300129049001500142092001400157100002400171240002700195245009300222260004900315300002600364500013400390520015500524650002800679650004900707700002500756949001900781995002100800CAT003600821CAT004100857CAT004100898CAT00260093999900240096500000000119940309194144.0690411r19681961nyua j 001 0 eng  a68009306 /AC/r83 c1.25 a269239 a(OCoLC)00000697^ aDLCcDLC aWN8D[Juv.] a542bM9171 aMullin, Virginia L.10aChemistry for children10aChemistry experiments for children,cby Virginia L. Mullin. Illustrated by Bernard Case. aNew York,bDover Publicationsc[1968, c1962] a96 p.billus.c24 cm. a"An unabridged and unaltered republication of the work originally published ... in 1961 under the title: Chemistry for children." aGives directions for many simple chemistry experiments, including descriptions of necessary equipment, principles, techniques, and safety precautions. 1aChemistryxExperiments. 0aChemistryxExperimentsvJuvenile literature.1 aCase, Bernard,eill. a542bM917ljuv a(CPomAG)00000001 aCONVb00c20051122lWN801h2158 aBATCH-UPDb00c20051122lWN801h2238 aBATCH-UPDb00c20051123lWN801h0235 c20071018lWN801h1952 lWFISaJuv. 
542 M917 -------------------------------------------------------------------------------- /test/bad_eacc_encoding.dat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/edsu/pymarc/cf33051421ac74389c1bc6d54921fb9612083d1b/test/bad_eacc_encoding.dat -------------------------------------------------------------------------------- /test/bad_indicator.dat: -------------------------------------------------------------------------------- 1 | 01159cam a22003258a 45000 2002900000001000700029005001700036008004100053010001300094020002800107035002000135040001800155043001200173049000900185050002300194082002100217100002500238245007900263260005300342300005300395504006400448600003800512600002200550650004600572650004600618650004800664650005500712650005500767951001100822a9780253325525 (alk. paper)42670520101218164208.0890831s1990 inu b s00110deng  a89046000 a0253325528 (alk. paper) a(OCoLC)20854397 aDLCcDLCdVGM an-us--- aEAUU00aE185.86b.G38 199000a973/.049607322010aGatewood, Willard B.10aAristocrats of color :bthe Black elite, 1880-1920 /cWillard B. Gatewood.0 aBloomington :bIndiana University Press,cc1990. axii, 450 p., [16] p. of plates :bill. ;c24 cm. aIncludes bibliographical references (p. 411-436) and index.10aBruce, Blanche Kelso,d1841-1898.10aBruce, Josephine. 0aAfrican AmericansxHistoryy19th century. 0aAfrican AmericansxHistoryy20th century. 0aAfrican AmericansxSocial life and customs. 0aUpper classzUnited StatesxHistoryy19th century. 0aUpper classzUnited StatesxHistoryy20th century. 
a426705 -------------------------------------------------------------------------------- /test/bad_marc8_escape.dat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/edsu/pymarc/cf33051421ac74389c1bc6d54921fb9612083d1b/test/bad_marc8_escape.dat -------------------------------------------------------------------------------- /test/bad_records.mrc: -------------------------------------------------------------------------------- 1 | 00127 2200037 450024500890000001aThe pragmatic programmer : bfrom journeyman to master /cAndrew Hunt, David Thomas.00127 2299937 450024500890000001aThe pragmatic programmer : bfrom journeyman to master /cAndrew Hunt, David Thomas.00127 2200000 450024500890000001aThe pragmatic programmer : bfrom journeyman to master /cAndrew Hunt, David Thomas.00128 2200038 4500245008900000101aThe pragmatic programmer : bfrom journeyman to master /cAndrew Hunt, David Thomas.00128 2200038 4500245ù0890000101aThe pragmatic programmer : bfrom journeyman to master /cAndrew Hunt, David Thomas.00127 22f0037 450024500890000001aThe pragmatic programmer : bfrom journeyman to master /cAndrew Hunt, David Thomas.00026 2200025 450000127 2200037 450024500890000001aThe pragmatic programmer : bfrom journeyman to master /cAndrew Hunt, David Thomas. 2 | -------------------------------------------------------------------------------- /test/bad_subfield_code.dat: -------------------------------------------------------------------------------- 1 | 00755cam 22002414a 4500001001300000003000600013005001700019008004100036010001700077020004300094040001800137042000800155050002600163082001700189100003100206245005400237260004200291300007200333500003300405650003700438630002500475630001300500fol05731351 IMchF20000613133448.0000107s2000 nyua 001 0 eng  a 00020737  a0471383147 (paper/cd-rom : alk. 
paper) aDLCcDLCdDLC apcc00aQA76.73.P22bM33 200000a005.13/32211 aMartinsson, Tobias,d1976-10áActivePerl with ASP and ADO /cTobias Martinsson. aNew York :bJohn Wiley & Sons,c2000. axxi, 289 p. :bill. ;c23 cm. +e1 computer laser disc (4 3/4 in.) a"Wiley Computer Publishing." 0aPerl (Computer program language)00aActive server pages.00aActiveX. -------------------------------------------------------------------------------- /test/bad_tag.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | Z 22 4500 4 | 5 | The Untamed 6 | 7 | by Max Brand 8 | 9 | 10 | Brand, Max, 11 | 12 | 13 | dc 14 | 15 | 16 | 1892-1944 17 | 18 | 19 | [electronic resource] / 20 | 21 | 22 | Project Gutenberg, 23 | 24 | 25 | 2004 26 | 27 | 28 | Project Gutenberg 29 | 30 | 31 | Freely available. 32 | 33 | 34 | Electronic text 35 | 36 | 37 | Project Gutenberg 38 | 39 | 40 | 10886 41 | 42 | 43 | http://www.gutenberg.org/etext/10886 44 | 45 | 46 | http://www.gutenberg.org/license 47 | 48 | 49 | Rights 50 | 51 | 52 | -------------------------------------------------------------------------------- /test/batch.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "leader": "00925njm 22002777a 4500", 4 | "fields": [ 5 | { 6 | "001": "5637241" 7 | }, 8 | { 9 | "003": "DLC" 10 | }, 11 | { 12 | "005": "19920826084036.0" 13 | }, 14 | { 15 | "007": "sdubumennmplu" 16 | }, 17 | { 18 | "008": "910926s1957 nyuuun eng " 19 | }, 20 | { 21 | "010": { 22 | "subfields": [ 23 | { 24 | "a": " 91758335 " 25 | } 26 | ], 27 | "ind1": " ", 28 | "ind2": " " 29 | } 30 | }, 31 | { 32 | "028": { 33 | "subfields": [ 34 | { 35 | "a": "1259" 36 | }, 37 | { 38 | "b": "Atlantic" 39 | } 40 | ], 41 | "ind1": "0", 42 | "ind2": "0" 43 | } 44 | }, 45 | { 46 | "040": { 47 | "subfields": [ 48 | { 49 | "a": "DLC" 50 | }, 51 | { 52 | "c": "DLC" 53 | } 54 | ], 55 | "ind1": " ", 56 | "ind2": " " 57 | } 58 | }, 59 | { 60 | "050": { 61 | "subfields": [ 62 | 
{ 63 | "a": "Atlantic 1259" 64 | } 65 | ], 66 | "ind1": "0", 67 | "ind2": "0" 68 | } 69 | }, 70 | { 71 | "245": { 72 | "subfields": [ 73 | { 74 | "a": "The Great Ray Charles" 75 | }, 76 | { 77 | "h": "[sound recording]." 78 | } 79 | ], 80 | "ind1": "0", 81 | "ind2": "4" 82 | } 83 | }, 84 | { 85 | "260": { 86 | "subfields": [ 87 | { 88 | "a": "New York, N.Y. :" 89 | }, 90 | { 91 | "b": "Atlantic," 92 | }, 93 | { 94 | "c": "[1957?]" 95 | } 96 | ], 97 | "ind1": " ", 98 | "ind2": " " 99 | } 100 | }, 101 | { 102 | "300": { 103 | "subfields": [ 104 | { 105 | "a": "1 sound disc :" 106 | }, 107 | { 108 | "b": "analog, 33 1/3 rpm ;" 109 | }, 110 | { 111 | "c": "12 in." 112 | } 113 | ], 114 | "ind1": " ", 115 | "ind2": " " 116 | } 117 | }, 118 | { 119 | "511": { 120 | "subfields": [ 121 | { 122 | "a": "Ray Charles, piano & celeste." 123 | } 124 | ], 125 | "ind1": "0", 126 | "ind2": " " 127 | } 128 | }, 129 | { 130 | "505": { 131 | "subfields": [ 132 | { 133 | "a": "The Ray -- My melancholy baby -- Black coffee -- There's no you -- Doodlin' -- Sweet sixteen bars -- I surrender dear -- Undecided." 134 | } 135 | ], 136 | "ind1": "0", 137 | "ind2": " " 138 | } 139 | }, 140 | { 141 | "500": { 142 | "subfields": [ 143 | { 144 | "a": "Brief record." 145 | } 146 | ], 147 | "ind1": " ", 148 | "ind2": " " 149 | } 150 | }, 151 | { 152 | "650": { 153 | "subfields": [ 154 | { 155 | "a": "Jazz" 156 | }, 157 | { 158 | "y": "1951-1960." 159 | } 160 | ], 161 | "ind1": " ", 162 | "ind2": "0" 163 | } 164 | }, 165 | { 166 | "650": { 167 | "subfields": [ 168 | { 169 | "a": "Piano with jazz ensemble." 
170 | } 171 | ], 172 | "ind1": " ", 173 | "ind2": "0" 174 | } 175 | }, 176 | { 177 | "700": { 178 | "subfields": [ 179 | { 180 | "a": "Charles, Ray," 181 | }, 182 | { 183 | "d": "1930-" 184 | }, 185 | { 186 | "4": "prf" 187 | } 188 | ], 189 | "ind1": "1", 190 | "ind2": " " 191 | } 192 | } 193 | ] 194 | },{ 195 | "leader": "01832cmma 2200349 a 4500", 196 | "fields": [ 197 | { 198 | "001": "12149120" 199 | }, 200 | { 201 | "005": "20001005175443.0" 202 | }, 203 | { 204 | "007": "cr |||" 205 | }, 206 | { 207 | "008": "000407m19949999dcu g m eng d" 208 | }, 209 | { 210 | "906": { 211 | "subfields": [ 212 | { 213 | "a": "0" 214 | }, 215 | { 216 | "b": "ibc" 217 | }, 218 | { 219 | "c": "copycat" 220 | }, 221 | { 222 | "d": "1" 223 | }, 224 | { 225 | "e": "ncip" 226 | }, 227 | { 228 | "f": "20" 229 | }, 230 | { 231 | "g": "y-gencompf" 232 | } 233 | ], 234 | "ind1": " ", 235 | "ind2": " " 236 | } 237 | }, 238 | { 239 | "925": { 240 | "subfields": [ 241 | { 242 | "a": "undetermined" 243 | }, 244 | { 245 | "x": "web preservation project (wpp)" 246 | } 247 | ], 248 | "ind1": "0", 249 | "ind2": " " 250 | } 251 | }, 252 | { 253 | "955": { 254 | "subfields": [ 255 | { 256 | "a": "vb07 (stars done) 08-19-00 to HLCD lk00; AA3s lk29 received for subject Aug 25, 2000; to DEWEY 08-25-00; aa11 08-28-00" 257 | } 258 | ], 259 | "ind1": " ", 260 | "ind2": " " 261 | } 262 | }, 263 | { 264 | "010": { 265 | "subfields": [ 266 | { 267 | "a": " 00530046 " 268 | } 269 | ], 270 | "ind1": " ", 271 | "ind2": " " 272 | } 273 | }, 274 | { 275 | "035": { 276 | "subfields": [ 277 | { 278 | "a": "(OCoLC)ocm44279786" 279 | } 280 | ], 281 | "ind1": " ", 282 | "ind2": " " 283 | } 284 | }, 285 | { 286 | "040": { 287 | "subfields": [ 288 | { 289 | "a": "IEU" 290 | }, 291 | { 292 | "c": "IEU" 293 | }, 294 | { 295 | "d": "N@F" 296 | }, 297 | { 298 | "d": "DLC" 299 | } 300 | ], 301 | "ind1": " ", 302 | "ind2": " " 303 | } 304 | }, 305 | { 306 | "042": { 307 | "subfields": [ 308 | { 309 | "a": "lccopycat" 310 
| } 311 | ], 312 | "ind1": " ", 313 | "ind2": " " 314 | } 315 | }, 316 | { 317 | "043": { 318 | "subfields": [ 319 | { 320 | "a": "n-us-dc" 321 | }, 322 | { 323 | "a": "n-us---" 324 | } 325 | ], 326 | "ind1": " ", 327 | "ind2": " " 328 | } 329 | }, 330 | { 331 | "050": { 332 | "subfields": [ 333 | { 334 | "a": "F204.W5" 335 | } 336 | ], 337 | "ind1": "0", 338 | "ind2": "0" 339 | } 340 | }, 341 | { 342 | "082": { 343 | "subfields": [ 344 | { 345 | "a": "975.3" 346 | }, 347 | { 348 | "2": "13" 349 | } 350 | ], 351 | "ind1": "1", 352 | "ind2": "0" 353 | } 354 | }, 355 | { 356 | "245": { 357 | "subfields": [ 358 | { 359 | "a": "The White House" 360 | }, 361 | { 362 | "h": "[computer file]." 363 | } 364 | ], 365 | "ind1": "0", 366 | "ind2": "4" 367 | } 368 | }, 369 | { 370 | "256": { 371 | "subfields": [ 372 | { 373 | "a": "Computer data." 374 | } 375 | ], 376 | "ind1": " ", 377 | "ind2": " " 378 | } 379 | }, 380 | { 381 | "260": { 382 | "subfields": [ 383 | { 384 | "a": "Washington, D.C. :" 385 | }, 386 | { 387 | "b": "White House Web Team," 388 | }, 389 | { 390 | "c": "1994-" 391 | } 392 | ], 393 | "ind1": " ", 394 | "ind2": " " 395 | } 396 | }, 397 | { 398 | "538": { 399 | "subfields": [ 400 | { 401 | "a": "Mode of access: Internet." 402 | } 403 | ], 404 | "ind1": " ", 405 | "ind2": " " 406 | } 407 | }, 408 | { 409 | "500": { 410 | "subfields": [ 411 | { 412 | "a": "Title from home page as viewed on Aug. 19, 2000." 413 | } 414 | ], 415 | "ind1": " ", 416 | "ind2": " " 417 | } 418 | }, 419 | { 420 | "520": { 421 | "subfields": [ 422 | { 423 | "a": "Features the White House. Highlights the Executive Office of the President, which includes senior policy advisors and offices responsible for the President's correspondence and communications, the Office of the Vice President, and the Office of the First Lady. Posts contact information via mailing address, telephone and fax numbers, and e-mail. 
Contains the Interactive Citizens' Handbook with information on health, travel and tourism, education and training, and housing. Provides a tour and the history of the White House. Links to White House for Kids." 424 | } 425 | ], 426 | "ind1": "8", 427 | "ind2": " " 428 | } 429 | }, 430 | { 431 | "610": { 432 | "subfields": [ 433 | { 434 | "a": "White House (Washington, D.C.)" 435 | } 436 | ], 437 | "ind1": "2", 438 | "ind2": "0" 439 | } 440 | }, 441 | { 442 | "610": { 443 | "subfields": [ 444 | { 445 | "a": "United States." 446 | }, 447 | { 448 | "b": "Executive Office of the President." 449 | } 450 | ], 451 | "ind1": "1", 452 | "ind2": "0" 453 | } 454 | }, 455 | { 456 | "610": { 457 | "subfields": [ 458 | { 459 | "a": "United States." 460 | }, 461 | { 462 | "b": "Office of the Vice President." 463 | } 464 | ], 465 | "ind1": "1", 466 | "ind2": "0" 467 | } 468 | }, 469 | { 470 | "610": { 471 | "subfields": [ 472 | { 473 | "a": "United States." 474 | }, 475 | { 476 | "b": "Office of the First Lady." 477 | } 478 | ], 479 | "ind1": "1", 480 | "ind2": "0" 481 | } 482 | }, 483 | { 484 | "710": { 485 | "subfields": [ 486 | { 487 | "a": "White House Web Team." 
488 | } 489 | ], 490 | "ind1": "2", 491 | "ind2": " " 492 | } 493 | }, 494 | { 495 | "856": { 496 | "subfields": [ 497 | { 498 | "u": "http://www.whitehouse.gov" 499 | } 500 | ], 501 | "ind1": "4", 502 | "ind2": "0" 503 | } 504 | }, 505 | { 506 | "856": { 507 | "subfields": [ 508 | { 509 | "u": "http://lcweb.loc.gov/staff/wpp/whitehouse.html" 510 | }, 511 | { 512 | "z": "Web site archive" 513 | } 514 | ], 515 | "ind1": "4", 516 | "ind2": "0" 517 | } 518 | } 519 | ] 520 | } 521 | ] 522 | -------------------------------------------------------------------------------- /test/batch.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 6 | 00925njm 22002777a 4500 7 | 5637241 8 | DLC 9 | 19920826084036.0 10 | sdubumennmplu 11 | 910926s1957 nyuuun eng 12 | 13 | 91758335 14 | 15 | 16 | 1259 17 | Atlantic 18 | 19 | 20 | DLC 21 | DLC 22 | 23 | 24 | Atlantic 1259 25 | 26 | 27 | The Great Ray Charles 28 | [sound recording]. 29 | 30 | 31 | New York, N.Y. : 32 | Atlantic, 33 | [1957?] 34 | 35 | 36 | 1 sound disc : 37 | analog, 33 1/3 rpm ; 38 | 12 in. 39 | 40 | 41 | Ray Charles, piano & celeste. 42 | 43 | 44 | The Ray -- My melancholy baby -- Black coffee -- There's no you -- Doodlin' -- Sweet sixteen bars -- I surrender dear -- Undecided. 45 | 46 | 47 | Brief record. 48 | 49 | 50 | Jazz 51 | 1951-1960. 52 | 53 | 54 | Piano with jazz ensemble. 
55 | 56 | 57 | Charles, Ray, 58 | 1930- 59 | prf 60 | 61 | 62 | 63 | 01832cmma 2200349 a 4500 64 | 12149120 65 | 20001005175443.0 66 | cr ||| 67 | 000407m19949999dcu g m eng d 68 | 69 | 0 70 | ibc 71 | copycat 72 | 1 73 | ncip 74 | 20 75 | y-gencompf 76 | 77 | 78 | undetermined 79 | web preservation project (wpp) 80 | 81 | 82 | vb07 (stars done) 08-19-00 to HLCD lk00; AA3s lk29 received for subject Aug 25, 2000; to DEWEY 08-25-00; aa11 08-28-00 83 | 84 | 85 | 00530046 86 | 87 | 88 | (OCoLC)ocm44279786 89 | 90 | 91 | IEU 92 | IEU 93 | N@F 94 | DLC 95 | 96 | 97 | lccopycat 98 | 99 | 100 | n-us-dc 101 | n-us--- 102 | 103 | 104 | F204.W5 105 | 106 | 107 | 975.3 108 | 13 109 | 110 | 111 | The White House 112 | [computer file]. 113 | 114 | 115 | Computer data. 116 | 117 | 118 | Washington, D.C. : 119 | White House Web Team, 120 | 1994- 121 | 122 | 123 | Mode of access: Internet. 124 | 125 | 126 | Title from home page as viewed on Aug. 19, 2000. 127 | 128 | 129 | Features the White House. Highlights the Executive Office of the President, which includes senior policy advisors and offices responsible for the President's correspondence and communications, the Office of the Vice President, and the Office of the First Lady. Posts contact information via mailing address, telephone and fax numbers, and e-mail. Contains the Interactive Citizens' Handbook with information on health, travel and tourism, education and training, and housing. Provides a tour and the history of the White House. Links to White House for Kids. 130 | 131 | 132 | White House (Washington, D.C.) 133 | 134 | 135 | United States. 136 | Executive Office of the President. 137 | 138 | 139 | United States. 140 | Office of the Vice President. 141 | 142 | 143 | United States. 144 | Office of the First Lady. 145 | 146 | 147 | White House Web Team. 
148 | 149 | 150 | http://www.whitehouse.gov 151 | 152 | 153 | http://lcweb.loc.gov/staff/wpp/whitehouse.html 154 | Web site archive 155 | 156 | 157 | 158 | -------------------------------------------------------------------------------- /test/diacritic.dat: -------------------------------------------------------------------------------- 1 | 02710cam a22003018a 45000010009000000050017000090080041000269060045000679250104001129550080002160100017002960200029003130200026003420400013003680500023003810820017004042450099004212600052005202630009005723000011005815040041005925051531006336500039021646500028022036500035022317000024022669630118022901709126920131029120147.0111220s2012 enk b 000 0 eng  a7bcbccorignewd1eecipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy defaulteclaim1 2012-10-24eclaim2 2013-06-13eto CAD 2013-10-29 bxg08 2011-12-20ixg08 2011-12-20axg09 2011-12-21 to Deweywrd05 2011-12-23 a 2011052495 a9780415782654 (hardback) a9780203112021 (ebook) aDLCcDLC00aK564.C6bA835 201200a302.23/122300aAmateur media :bsocial, cultural and legal perspectives /cedited by Dan Hunter ... [et al.]. aAbingdon, Oxon ;aN.Y., NY :bRoutledge,c2012. a1206 ap. cm. 
aIncludes bibliographical references.0 aHistories of user-generated content: between formal and informal media economies / Ramon Lobato, Julian Thomas and Dan Hunter -- Competing myths of informal economies / Megan Richardson and Jake Goldenfein -- Start with the household / John Quiggin -- Amateur digital content and proportional commerce / Steven Hetcher -- YouTube and the formalisation of amateur media / Jean Burgess -- The relationship between user-generated content and commerce / Kimberlee Weatherall -- The manufacture of 'authentic' buzz and the legal relations of MasterChef / Kathy Bowrey -- Harry Potter and the transformation wand : fair use, canonicity and fan activity / David Tan -- The simulation of 'authentic' buzz : T-mobile and the flash mob dance / Marc Trabsky -- Prestige and professionalisation at the margins of the journalistic field : the case of music writers / Ramon Lobato and Lawson Fletcher -- Swedish subtitling strike called off! : fan-to-fan piracy, translation, and the primacy of authorisation / Eva Hemmungs Wirtén -- Have amateur media enhanced the possibilities for good media work? / David Hesmondhalgh -- Minecraft as Web 2.0 : amateur creativity and digital games / Greg Lastowka -- Cosplay, creativity and immaterial labours of love / Melissa de Zwart -- Web Zero: the amateur and the indie game developer / Christian McCrea -- Anonymous speech on the internet / Brian Murchison -- The privacy interest in anonymous blogging / Lisa Austin -- 'Privacy' of social networking texts / Megan Richardson and Julian Thomas. 0aSocial mediaxLaw and legislation. 0aUser-generated content. 
0aInternetxLaw and legislation.1 aHunter, Dan,d1966- aStephen Gutierrez; phone: +44-2070176003; email: stephen.gutierrez@informa.com; bc: stephen.gutierrez@informa.com -------------------------------------------------------------------------------- /test/marc.dat: -------------------------------------------------------------------------------- 1 | 01060cam 22002894a 45000010009000000050017000090080041000260350021000679060045000889250044001339550160001779550053003370100017003900200015004070400018004220420008004400500023004480820014004711000025004852450088005102600044005983000027006425040041006696500026007107000026007369850008007621177850420040816084925.0990802s2000 mau b 001 0 eng  a(DLC) 99043581 a0bvipcorignewd1eocipf19gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc05 to ja00 08-02-99; jf05 to subj. 08/02/99; jf11 to sl 08-03-99; jf25 08-05-99 to ddc; bk rec'd, to CIP ver. ps07 01-07-00; CIP ver jf05 to sl 04/05/00 aADDED COPIES: another copy to ASCD ps15 01-12-00 a 99043581  a020161622X aDLCcDLCdDLC apcc00aQA76.6b.H857 200000a005.12211 aHunt, Andrew,d1964-14aThe pragmatic programmer :bfrom journeyman to master /cAndrew Hunt, David Thomas. aReading, Mass :bAddison-Wesley,c2000. axxiv, 321 p. ;c24 cm. aIncludes bibliographical references. 
0aComputer programming.1 aThomas, David,d1956- eGAP00979cam 2200241 a 45000010009000000050017000090080041000269060045000679250042001129550206001540100017003600200015003770400018003920500026004100820017004361000016004532450037004692500012005062600051005183000078005695040051006476500039006981251588220020923085341.0010827s2001 cc a b 001 0 eng  a7bcbccorigcopd2encipf20gy-gencatlg0 aacquireb2 shelf copyxpolicy default apb07 2001-08-27 to ASCDajf00 2001-08-31ajf00 2001-09-05;cjf03 2001-10-16 to Subj.djf01 2001-10-25 to slejf12 2001-11-23; jf12 to Dewey 11-23-01aaa20 2001-12-07; copy 2 added jf16 to BCCD 09-23-02 a 2001276084 a0596000855 aDLCcDLCdDLC00aQA76.73.P98bL88 200100a005.13/32211 aLutz, Mark.10aProgramming Python /cMark Lutz. a2nd ed. aBeijing :aSebastopol, CA :bO'Reilly,cc2001. axxxvii, 1255 p. :bill. ;c24 cm. +e1 computer optical disc (4 3/4 in.). aIncludes bibliographical references and index. 0aPython (Computer program language)00887cam 2200253 a 45000010009000000050017000090080041000269060045000679250044001129550151001560100017003070200015003240400018003390500027003570820017003841000016004012450051004172500012004682600040004803000035005205040020005556500039005757000019006141361051220040714135238.0040601s2004 caua 001 0 eng  a7bcbccorigcopd2encipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apv17 2004-06-01 Preprocessor to ASCDajf00 2004-06-03;cjf03 2004-06-24 to Subj.djf09 2004-06-28 to slejf12 2004-07-01 to Deweyaaa25 2004-07-14 a 2004273129 a0596002815 aDLCcDLCdDLC00aQA76.73.P98bL877 200400a005.13/32221 aLutz, Mark.10aLearning Python /cMark Lutz and David Ascher. a2nd ed. aSebastopol, CA :bO'Reilly,cc2004. axxvi, 591 p. :bill. ;c24 cm. aIncludes index. 
0aPython (Computer program language)1 aAscher, David.01038cam 2200289 a 45000010009000000050017000090080041000269060045000679250042001129550167001540100017003210150015003380200015003530350023003680400023003910420014004140500026004280820017004542450065004712600039005363000027005755000048006025000020006506500039006707000020007097000019007291306994220030606071827.0030127s2002 cau 001 0 eng  a7bcbcccopycatd2encipf20gy-gencatlg0 aacquireb2 shelf copyxpolicy default aps04 2003-01-27 to ASCDajf00 2003-01-30cjf05 2003-01-30 to subj.djf09 2003-01-30 to slejf12 2003-02-03 to Deweyaaa20 2003-03-10ajg07 2003-06-06 copy 2 added a 2003268354 aGBA2-Y6761 a0596001673 a(OCoLC)ocm49044543 aUKMcUKMdCUSdDLC alccopycat00aQA76.73.P98bP95 200200a005.13/322100aPython cookbook /cedited by Alex Martelli and David Ascher. aSebastopol, CA :bO'Reilly,c2002. axxix, 574 p. ;c24 cm. a"Recipes from the Python community"--Cover. aIncludes index. 0aPython (Computer program language)1 aMartelli, Alex.1 aAscher, David.00759nam 22002295a 45000010009000000050017000090080041000260350022000679060045000899250044001349550020001780100017001980200015002150400013002300420008002431000021002512450068002722600084003402630009004243000012004339630084004451312796220030318153335.0030318s2003 inu 000 0 eng  a(DLC) 2003104024 a0bibccorignewd2eepcnf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc16 2003-03-18 a 2003104024 a1592000738 aDLCcDLC apcc1 aDawson, Michael.10aPython programming for the absolute beginner /cMichael Dawson. aIndianapolis, IN :bPremier Press Inc., a division of Course Technology,c2003. a0306 ap.ccm. 
aMargaret Bauer ; phone (812) 273-0561 ; fax null email margaret_bauer@yahoo.com01304cam 22002894a 45000010009000000050017000090080041000269060045000679250044001129550280001560100017004360200015004530400018004680420008004860500024004940820017005181000049005352450150005842460030007342600053007643000036008175040064008536500026009176500023009437000021009667000027009871256551420020718085037.0011016s2002 njua b 001 0 eng  a7bcbccorignewd1eocipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc16 2001-10-16 to ASCD;cjf03 2001-10-17 to Subj.djf01 2001-10-25 to slejf25 2001-10-25 to Deweyaaa20 2001-10-26aps16 2002-01-11 bk rec'd, to CIP ver.ajf00 2002-01-16fjf04 2002-01-18 to S.L.gjf12 2002-01-18 to bccdajf00 2002-03-26; copy 2 added jf16 to BCCD 07-18-02 a 2001055410 a0130410659 aDLCcDLCdDLC apcc00aQA76.625b.T48 200200a005.2/762211 aThiruvathukal, George K.q(George Kuriakose)10aWeb programming :btechniques for integrating Python, Linux, Apache, and MySQL /cGeorge K. Thiruvathukal, John P. Shafaee, Thomas W. Christoper.14aWeb programming in Python aUpper Saddle River, NJ :bPrentice Hall,cc2002. axviii, 745 p. :bill. ;c24 cm. aIncludes bibliographical references (p. 723-725) and index. 0aInternet programming. 0aWeb sitesxDesign.1 aShafaee, John P.1 aChristopher, Thomas W.01023cam 22002774a 45000010009000000050017000090080041000269060045000679250044001129550168001560100017003240200035003410400018003760420008003940500026004020820016004281000029004442450067004732500012005402600051005523000035006035000020006386500039006586300021006977000027007181187737320010105091546.0991228s2000 cc a 001 0 eng  a7bcbccorignewd1eocipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default ato ASCD pc01 12-28-99; jf03 01-04-00 ; jf11 to sl 1-4-00; jf12 to Dewey 01-06-00; aa05 01-10-00; CIP ver. pv08 to BCCD 05-01-00; copy 2 added jf16 to BCCD 01-05-01 a 99085714  a1565926218 (pbk. : alk. 
paper) aDLCcDLCdDLC apcc00aQA76.73.P98bH36 200000a005.2652211 aHammond, Markq(Mark J.)10aPython programming on Win32 /cMark Hammond and Andy Robinson. a1st ed. aBeijing ;aSebastopol, CA :bO'Reilly,cc2000. axvii, 652 p. :bill. ;c24 cm. aIncludes index. 0aPython (Computer program language)00aMicrosoft Win32.1 aRobinson, Andy,d1967-00867cam 22002538a 45000010009000000050017000090080041000269060045000679250044001129550127001560100017002830200015003000400013003150420008003280500026003360820017003621000019003792450080003982600047004782630009005253000011005346500039005459630029005841343237720031222144424.0031211s2003 oru 001 0 eng  a7bcbccorignewd1eocipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc27 2003-12-11 RUSH to ASCDcjf07 2003-12-17 to subjectdjf09 2003-12-17 to slejp05 2003-12-18 to Deweyaaa20 2003-12-22 a 2003064366 a1887902996 aDLCcDLC apcc00aQA76.73.P98bZ45 200300a005.13/32221 aZelle, John M.10aPython programming :ban introduction to computer science /cJohn M. Zelle. aWilsonville, OR :bFranklin, Beedlec2003. a0312 ap. cm. 0aPython (Computer program language) aTom Sumner, 503-682-766801008cam 22002774a 45000010009000000050017000090080041000269060045000679250044001129550171001560100017003270200015003440400018003590420008003770500026003850820018004111000026004292450065004552600046005203000034005665000020006006500039006206500026006596500023006857000022007081222727720030509151148.0001109s2002 inua 001 0 eng  a7bcbccorignewd2eepcnf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc16 11-09-00apv11 2002-04-12 2 copies to ASCDajf00 2002-04-18;cjf03 2002-08-20 to Subj.djf09 2002-08-21 to slejf25 2002-09-10 2 copies to Deweyaaa05 2002-10-04 a 00110884  a0735710902 aDLCcDLCdDLC apcc00aQA76.73.P98bH65 200200a005.2/7622211 aHolden, Steve,d1950-10aPython Web programming /cSteve Holden [with David Beazley]. aIndianapolis, Ind. :bNew Riders,cc2002. axxi, 691 p. :bill. ;c23 cm. aIncludes index. 0aPython (Computer program language) 0aInternet programming. 
0aWeb sitesxDesign.1 aBeazley, David M.01049cam 22002534a 45000010009000000050017000090080041000269060045000679250044001129550148001560100017003040200015003210400018003360420008003540500026003620820017003881000018004052450047004232600052004703000066005225040051005885380117006396500039007561216916820010522141328.0000911s2000 nju b 001 0 eng  a7bcbccorignewd1eocipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default ato ASCD pc16 09-11-00; jf02 09-12-00 ; jf11 to sl 9-12-00; jf12 to Dewey 09-14-00; aa05 09-14-00; CIP Ver. jf02 05-11-01; jf12 to BCCD 05-22-01 a 00047856  a0130260363 aDLCcDLCdDLC apcc00aQA76.73.P98bC48 200100a005.13/32211 aChun, Wesley.10aCore python programming /cWesley J. Chun. aUpper Saddle River, NJ :bPrentice Hall,c2001. axxix, 771 p. ;c24 cm. +e1 computer optical disc (4 3/4 in.) aIncludes bibliographical references and index. aSystem requirements for accompanying computer disc: Windows 9x/Me/NT/2000; a Web brouser; Macintosh; UNIX/Linux. 0aPython (Computer program language)00948cam 22002654a 45000010009000000050017000090080041000269060045000679250044001129550133001560100017002890200015003060400018003210420008003390500026003470820017003731000021003902450055004112600038004663000036005045040051005406500039005916500036006306300016006661213218820010817152505.0000804s2000 ctua b 001 0 eng  a7bcbccorignewd2encipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default ato ASCD pb05 08-04-00;jfoo 08-08-00;cjf03 2001-07-18 to Subj.djf02 2001-07-19 to slejf25 2001-07-31 to Deweyaaa20 2001-08-17 a 00697831  a1884777813 aDLCcDLCdDLC apcc00aQA76.73.P98bG73 200000a005.13/32211 aGrayson, John E.10aPython and Tkinter programming /cJohn E. Grayson. aGreenwich, CT :bManning,cc2000. axxiii, 658 p. :bill. ;c24 cm. aIncludes bibliographical references and index. 
0aPython (Computer program language) 0aTcl (Computer program language)00aTk toolkit.00767nam 22002295a 45000010009000000050017000090080041000260350022000679060045000899250044001349550020001780100017001980200015002150400013002300420008002432450081002512600079003322630009004113000012004204400021004329630084004531337832520031020153106.0031020s2003 inu 000 0 eng  a(DLC) 2003114351 a0bibccorignewd2eepcnf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc10 2003-10-20 a 2003114351 a1592000770 aDLCcDLC apcc00aGame programming with Python, Lua, and Ruby /c[edited by] Estelle Manticas. aIndianapolis, IN :bPremier Press, a Division of Course Technology,c2003. a0311 ap.ccm. aGame development aMargaret Bauer ; phone (812) 273-0561 ; fax null email margaret_bauer@yahoo.com01121cam 22002414a 45000010009000000050017000090080041000269060045000679250044001129550362001560100017005180200015005350400018005500420008005680500026005760820017006021000027006192450058006462600052007043000033007565040051007896500039008401256552920030227150222.0011016s2002 njua b 001 0 eng  a7bcbccorignewd1eocipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc16 2001-10-16 to ASCD;cjf05 2001-10-17 to subj.djf04 2001-10-17 to S.L.ejf25 2001-10-25 to Deweyaaa20 2001-10-26aps11 2002-01-15 bk rec'd, to CIP ver.fjp07 2002-02-13ajp00 2002-03-08gjp85 2002-03-08 to BCCD; copy 2 added jf16 to BCCD 07-18-02ajf00 2003-01-29ajf07 2003-02-27 somehow copy 1 got back into the CIP ver. stream (handed to acting TL) a 2001055411 a0130409561 aDLCcDLCdDLC apcc00aQA76.73.P98bC47 200200a005.13/32211 aChristopher, Thomas W.10aPython programming patterns /cThomas W. Christopher. aUpper Saddle River, NJ :bPrentice Hall,c2002. axix 538 p. :bill. ;c24 cm. aIncludes bibliographical references and index. 
0aPython (Computer program language)01062cam 22002778a 45000010009000000050017000090080041000269060045000679250044001129550120001560100017002760200028002930400013003210420008003340500026003420820018003681000024003862450134004102600041005442630009005853000011005946500039006056500037006446500039006819630064007201275256420020426115101.0020424s2002 mau 001 0 eng  a7bcbccorignewd1eocipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc21 2002-04-24 to ASCDcjf05 2002-04-25 to subj.djf09 2002-04-25 to slejf25 2002-04-26 to Deweyaaa20 2002-04-26 a 2002066565 a0201616165 (alk. paper) aDLCcDLC apcc00aQA76.73.P98bH54 200200a005.2/7622211 aHightower, Richard.10aPython programming with the Java class libraries :ba tutorial for building Web and Enterprise applications /cRichard Hightower. aBoston, MA :bAddison-Wesley,c2002. a0207 ap. cm. 0aPython (Computer program language) 0aJava (Computer program language) 0aApplication softwarexDevelopment. aMarilyn Rash, 617-848-6509; email: timothy.nicholls@awl.com01012cam 22002294a 45000010009000000050017000090080041000269060045000679250044001129550192001560100017003480200028003650400018003930420008004110500026004190820017004451000017004622450151004792600042006303000071006726500039007431216723920010608101527.0000908s2001 mau b 001 0 eng  a7bcbccorignewd1eocipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default ato ASCD pc21 09-08-00;jf05 (desc) 09/08/00 ; jf11 to sl 9-11-00; jf12 to Dewey 09-12-00;aa03 9-12-00;CIP ver jf05 to sl 01/11/01; jf12 to BCCD 02-01-01; copy 2 added jf16 to BCCD 04-24-01 a 00046921  a0201709384 (alk. paper) aDLCcDLCdDLC apcc00aQA76.73.P48bG38 200100a005.13/32211 aGauld, Alan.10aLearn to program using Python :ba tutorial for hobbyists, self-starters, and all who want to learn the art of computer programming /cAlan Gauld. aReading, MA :bAddison-Wesley,c2001. axii, 270 p. ;c24 cm.e+ 1 computer laser optical disc (4 3/4 in.) 
0aPython (Computer program language)00935cam 22002534a 450000100070000000500170000700800410002403500210006590600450008692500440013195501480017501000170032302000220034004000180036204200080038005000260038808200170041410000160043124500610044726000390050830000750054765000390062270000200066120525620000830103214.0990629s2000 caua 001 0 eng  9(DLC) 99065006 a7bcbccorignewd2eopcnf19gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apn08/e-pcn 06-29-99; to ASCD pb02 06-10-00; jf00 06-13-00; jf03 08-17-00 ; jf11 to sl 8-22-00; jf25 2 copies to Dewey 08-24-00; aa19 08-30-2000 a 99065006  a0761523340 (pbk.) aDLCcDLCdDLC apcc00aQA76.73.P98bA48 199900a005.13/32211 aAltom, Tim.10aProgramming with Python /cTim Altom with Mitch Chapman. aRocklin, CA :bPrima Tech,cc1999. axxxiv, 372 p. :bill. ;c24 cm. +e1 computer optical disc (4 3/4 in.) 0aPython (Computer program language)1 aChapman, Mitch.01214cam 22002774a 45000010009000000050017000090080041000269060045000679250044001129550202001560100017003580200047003750400018004220420008004400500024004480820017004721000019004892450079005082600051005873000058006386500026006966500048007226500040007706500049008108560077008591328439520040226131230.0030722s2004 maua 001 0 eng  a7bcbccorignewd1eecipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default ajf05 2003-07-22cjf05 2003-07-22 to subj.djf09 2003-07-22 to slejf12 2003-07-23 to Deweyaaa20 2003-07-29ajf00 2004-01-29fjf07 2004-02-02ejf12 2004-02-03 to BCCDajf16 2004-02-26 copy2 to BCCD a 2003016400 a1584502681 (Pbk. with CD-ROM : alk. paper) aDLCcDLCdDLC apcc00aQA76.625b.J66 200400a005.2/762221 aJones, M. Tim.10aBSD Sockets programming from a multi-language perspective /cM. Tim Jones. aHingham, Mass. :bCharles River Media,cc2004. axix, 444 p. :bill. ;c24 cm. +e1 CD-ROM (4 3/4 in.) 0aInternet programming. 0aComputer networksxDesign and construction. 
0aInternetworking (Telecommunication) 0aProgramming languages (Electronic computers)413Table of contentsuhttp://www.loc.gov/catdir/toc/ecip047/2003016400.html01113cam 2200277 a 4500001000800000005001700008008004100025035002100066906004500087955011600132010001700248020003300265040001800298050002300316082001600339245009800355260004600453300003300499440004900532504006400581650005100645650003600696650002300732700001800755991006200773159816719981112152315.2940902s1995 maua b 001 0 eng  9(DLC) 94034264 a7bcbccorignewd1eocipf19gy-gencatlg apc18 to ja00 09-02-94; jf06 to subj 09-06-94; jf11 to sl 09-06-94; jf12 09-06-94 to ddc; CIP ver. jc03 11-23-94 a 94034264  a0201633612 (acid-free paper) aDLCcDLCdDLC00aQA76.64b.D47 199500a005.1/222000aDesign patterns :belements of reusable object-oriented software /cErich Gamma ... [et al.]. aReading, Mass. :bAddison-Wesley,cc1995. axv, 395 p. :bill. ;c25 cm. 0aAddison-Wesley professional computing series aIncludes bibliographical references (p. 375-381) and index. 0aObject-oriented programming (Computer science) 0aComputer softwarexReusability. 0aSoftware patterns.1 aGamma, Erich. bc-GenCollhQA76.64i.D47 1995p00011185514tCopy 1wBOOKS01233cam 2200289 a 45000010009000000050017000090080041000269060045000679250044001129550226001560100017003820200034003990400013004330500024004460820014004702450065004842460015005492500012005642600043005763000021006195000111006405040068007516500026008196500025008457000022008707000051008921237004420020812080859.0010405s2001 mau b 001 0 eng  a7bcbccorignewd1eocipf20gy-gencatlg0 aacquireb2 shelf copiesxpolicy default apc20 to ja00 04-05-01; jp07 04-11-01 sent to sl;jp85 to Dewey 04-19-01; aa20 04-20-01aps13 2001-08-16 bk rec'd, to CIP ver.fjf07 2001-08-20ajf00 2001-08-20gjf12 2001-08-23 to bccdajf01 2001-09-13 copy 2 added to BCCD a 2001031277 a0262032937 (hc. : alk. paper) aDLCcDLC00aQA76.6b.I5858 200100a005.122100aIntroduction to algorithms /cThomas H. Cormen ... [et al.].30aAlgorithms a2nd ed. 
aCambridge, Mass. :bMIT Press,cc2001. axxi, 1180 p. cm. aRev. ed. of: Introduction to algorithms / Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest. c1990. aIncludes bibliographical references (p. [1127]-1130) and index. 0aComputer programming. 0aComputer algorithms.1 aCormen, Thomas H.1 aCormen, Thomas H.tIntroduction to algorithms.01009pam 2200265 a 4500001000800000005001700008008004100025035002100066906004500087955011600132010001700248020002200265040001800287050002600305082001700331100001800348245003700366260005300403300003500456440005200491500002700543504006400570650004400634991006500678303540919960425075058.2951006s1996 njua b 001 0 eng  9(DLC) 95045017 a7bcbccorignewd1eocipf19gy-gencatlg apc17 RUSH to ja00 10-06-95;jf05 to subj. 10/06/95; jf04 to S.L. 10-06-95; jf14 10-10-95; CIP ver. jk14 04-22-96 a 95045017  a0133708756 (pbk.) aDLCcDLCdDLC00aQA76.73.C28bG69 199600a005.13/32201 aGraham, Paul.10aANSI Common Lisp /cPaul Graham. aEnglewood Cliffs, N.J. :bPrentice Hall,cc1996. axiii, 432 p. :bill. ;c23 cm. 0aPrentice Hall series in artificial intelligence a"An Alan R. Apt book." aIncludes bibliographical references (p. 401-414) and index. 
0aCOMMON LISP (Computer program language) bc-GenCollhQA76.73.C28iG69 1996p00034751468tCopy 1wBOOKS -------------------------------------------------------------------------------- /test/marc8.dat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/edsu/pymarc/cf33051421ac74389c1bc6d54921fb9612083d1b/test/marc8.dat -------------------------------------------------------------------------------- /test/multi_isbn.dat: -------------------------------------------------------------------------------- 1 | 00794pam a2200241 i 4500001000800000005001700008008004100025035002100066906004500087010001700132020002200149020003300171020002900204040001800233050002500251082002500276100002700301245007400328260004900402300002800451490002200479991005100501461219520050208133419.0771025s1977 vtua 000 0 eng  9(DLC) 77017192 a7bcbccorignewd1eocipf19gy-gencatlg a 77017192  a0914378287 (v. 1) a0914378295 (lim. ed.) (v. 1) a0914378260 (pbk.) (v. 1) aDLCcDLCdDLC00aPS3569.H44bW3 pt. 100a811/.5/4 sa811/.5/41 aJohnson, Judith Emlyn.14aThe town scold /cby Judith Johnson Sherwin ; ill. by Margaret Lampe. aTaftsville, Vt. :bCountryman Press,cc1977. a39 p. :bill. ;c23 cm.0 aHer Waste ; pt. 1 bc-GenCollhPS3569.H44iW3 pt. 1tCopy 1wBOOKS -------------------------------------------------------------------------------- /test/one.dat: -------------------------------------------------------------------------------- 1 | 00755cam 22002414a 4500001001300000003000600013005001700019008004100036010001700077020004300094040001800137042000800155050002600163082001700189100003100206245005400237260004200291300007200333500003300405650003700438630002500475630001300500fol05731351 IMchF20000613133448.0000107s2000 nyua 001 0 eng  a 00020737  a0471383147 (paper/cd-rom : alk. paper) aDLCcDLCdDLC apcc00aQA76.73.P22bM33 200000a005.13/32211 aMartinsson, Tobias,d1976-10aActivePerl with ASP and ADO /cTobias Martinsson. aNew York :bJohn Wiley & Sons,c2000. axxi, 289 p. 
:bill. ;c23 cm. +e1 computer laser disc (4 3/4 in.) a"Wiley Computer Publishing." 0aPerl (Computer program language)00aActive server pages.00aActiveX. -------------------------------------------------------------------------------- /test/one.json: -------------------------------------------------------------------------------- 1 | { 2 | "leader": "00755cam 22002414a 4500", 3 | "fields": [ 4 | { 5 | "001": "fol05731351 " 6 | }, 7 | { 8 | "003": "IMchF" 9 | }, 10 | { 11 | "005": "20000613133448.0" 12 | }, 13 | { 14 | "008": "000107s2000 nyua 001 0 eng " 15 | }, 16 | { 17 | "010": { 18 | "subfields": [ 19 | { 20 | "a": " 00020737 " 21 | } 22 | ], 23 | "ind1": " ", 24 | "ind2": " " 25 | } 26 | }, 27 | { 28 | "020": { 29 | "subfields": [ 30 | { 31 | "a": "0471383147 (paper/cd-rom : alk. paper)" 32 | } 33 | ], 34 | "ind1": " ", 35 | "ind2": " " 36 | } 37 | }, 38 | { 39 | "040": { 40 | "subfields": [ 41 | { 42 | "a": "DLC" 43 | }, 44 | { 45 | "c": "DLC" 46 | }, 47 | { 48 | "d": "DLC" 49 | } 50 | ], 51 | "ind1": " ", 52 | "ind2": " " 53 | } 54 | }, 55 | { 56 | "042": { 57 | "subfields": [ 58 | { 59 | "a": "pcc" 60 | } 61 | ], 62 | "ind1": " ", 63 | "ind2": " " 64 | } 65 | }, 66 | { 67 | "050": { 68 | "subfields": [ 69 | { 70 | "a": "QA76.73.P22" 71 | }, 72 | { 73 | "b": "M33 2000" 74 | } 75 | ], 76 | "ind1": "0", 77 | "ind2": "0" 78 | } 79 | }, 80 | { 81 | "082": { 82 | "subfields": [ 83 | { 84 | "a": "005.13/3" 85 | }, 86 | { 87 | "2": "21" 88 | } 89 | ], 90 | "ind1": "0", 91 | "ind2": "0" 92 | } 93 | }, 94 | { 95 | "100": { 96 | "subfields": [ 97 | { 98 | "a": "Martinsson, Tobias," 99 | }, 100 | { 101 | "d": "1976-" 102 | } 103 | ], 104 | "ind1": "1", 105 | "ind2": " " 106 | } 107 | }, 108 | { 109 | "245": { 110 | "subfields": [ 111 | { 112 | "a": "ActivePerl with ASP and ADO /" 113 | }, 114 | { 115 | "c": "Tobias Martinsson." 
116 | } 117 | ], 118 | "ind1": "1", 119 | "ind2": "0" 120 | } 121 | }, 122 | { 123 | "260": { 124 | "subfields": [ 125 | { 126 | "a": "New York :" 127 | }, 128 | { 129 | "b": "John Wiley & Sons," 130 | }, 131 | { 132 | "c": "2000." 133 | } 134 | ], 135 | "ind1": " ", 136 | "ind2": " " 137 | } 138 | }, 139 | { 140 | "300": { 141 | "subfields": [ 142 | { 143 | "a": "xxi, 289 p. :" 144 | }, 145 | { 146 | "b": "ill. ;" 147 | }, 148 | { 149 | "c": "23 cm. +" 150 | }, 151 | { 152 | "e": "1 computer laser disc (4 3/4 in.)" 153 | } 154 | ], 155 | "ind1": " ", 156 | "ind2": " " 157 | } 158 | }, 159 | { 160 | "500": { 161 | "subfields": [ 162 | { 163 | "a": "\"Wiley Computer Publishing.\"" 164 | } 165 | ], 166 | "ind1": " ", 167 | "ind2": " " 168 | } 169 | }, 170 | { 171 | "650": { 172 | "subfields": [ 173 | { 174 | "a": "Perl (Computer program language)" 175 | } 176 | ], 177 | "ind1": " ", 178 | "ind2": "0" 179 | } 180 | }, 181 | { 182 | "630": { 183 | "subfields": [ 184 | { 185 | "a": "Active server pages." 186 | } 187 | ], 188 | "ind1": "0", 189 | "ind2": "0" 190 | } 191 | }, 192 | { 193 | "630": { 194 | "subfields": [ 195 | { 196 | "a": "ActiveX." 197 | } 198 | ], 199 | "ind1": "0", 200 | "ind2": "0" 201 | } 202 | } 203 | ] 204 | } -------------------------------------------------------------------------------- /test/test.dat: -------------------------------------------------------------------------------- 1 | 00755cam 22002414a 4500001001300000003000600013005001700019008004100036010001700077020004300094040001800137042000800155050002600163082001700189100003100206245005400237260004200291300007200333500003300405650003700438630002500475630001300500fol05731351 IMchF20000613133448.0000107s2000 nyua 001 0 eng  a 00020737  a0471383147 (paper/cd-rom : alk. paper) aDLCcDLCdDLC apcc00aQA76.73.P22bM33 200000a005.13/32211 aMartinsson, Tobias,d1976-10aActivePerl with ASP and ADO /cTobias Martinsson. aNew York :bJohn Wiley & Sons,c2000. axxi, 289 p. :bill. ;c23 cm. 
+e1 computer laser disc (4 3/4 in.) a"Wiley Computer Publishing." 0aPerl (Computer program language)00aActive server pages.00aActiveX.00647pam 2200241 a 4500001001300000003000600013005001700019008004100036010001700077020001500094040001800109042000800127050002600135082001500161100002600176245006700202260003800269263000900307300001100316650003700327650002500364700001600389fol05754809 IMchF20000601115601.0000203s2000 mau 001 0 eng  a 00022023  a1565926994 aDLCcDLCdDLC apcc00aQA76.73.P22bD47 200000a005.742211 aDescartes, Alligator.10aProgramming the Perl DBI /cAlligator Descartes and Tim Bunce. aCmabridge, MA :bO'Reilly,c2000. a1111 ap. cm. 0aPerl (Computer program language) 0aDatabase management.1 aBunce, Tim.00605cam 22002054a 4500001001300000003000600013005001700019008004100036010001700077040001800094042000800112050002700120082001700147100002100164245005500185260004500240300002600285504005100311650003700362fol05843555 IMchF20000525142739.0000318s1999 cau b 001 0 eng  a 00501349  aDLCcDLCdDLC apcc00aQA76.73.P22bB763 199900a005.13/32211 aBrown, Martin C.10aPerl :bprogrammer's reference /cMartin C. Brown. aBerkeley :bOsborne/McGraw-Hill,cc1999. axix, 380 p. ;c22 cm. aIncludes bibliographical references and index. 0aPerl (Computer program language)00579cam 22002054a 4500001001300000003000600013005001700019008004100036010001700077020001500094040001800109042000800127050002700135082001700162100002100179245005500200260004500255300003600300650003700336fol05843579 IMchF20000525142716.0000318s1999 caua 001 0 eng  a 00502116  a0072120002 aDLCcDLCdDLC apcc00aQA76.73.P22bB762 199900a005.13/32211 aBrown, Martin C.10aPerl :bthe complete reference /cMartin C. Brown. aBerkeley :bOsborne/McGraw-Hill,cc1999. axxxv, 1179 p. :bill. ;c24 cm. 
0aPerl (Computer program language)00801nam 22002778a 4500001001300000003000600013005001700019008004100036010001700077020001500094040001300109042000800122050002600130082001800156100002000174245008800194250003200282260004100314263000900355300001100364650003700375650003600412650002600448700002500474700002400499fol05848297 IMchF20000524125727.0000518s2000 mau 001 0 eng  a 00041664  a1565924193 aDLCcDLC apcc00aQA76.73.P22bG84 200000a005.2/7622211 aGuelich, Scott.10aCGI programming with Perl /cScott Guelich, Shishir Gundavaram & Gunther Birznieks. a2nd ed., expanded & updated aCambridge, Mass. :bO'Reilly,c2000. a0006 ap. cm. 0aPerl (Computer program language) 0aCGI (Computer network protocol) 0aInternet programming.1 aGundavaram, Shishir.1 aBirznieks, Gunther.00665nam 22002298a 4500001001300000003000600013005001700019008004100036010001700077020001500094040001300109042000800122050002700130082001700157111005200174245008600226250001200312260004100324263000900365300001100374650005000385fol05865950 IMchF20000615103017.0000612s2000 mau 100 0 eng  a 00055759  a0596000138 aDLCcDLC apcc00aQA76.73.P22bP475 200000a005.13/32212 aPerl Conference 4.0d(2000 :cMonterey, Calif.)10aProceedings of the Perl Conference 4.0 :bJuly 17-20, 2000, Monterey, California. a1st ed. aCambridge, Mass. :bO'Reilly,c2000. a0006 ap. cm. 0aPerl (Computer program language)vCongresses.00579nam 22002178a 4500001001300000003000600013005001700019008004100036010001700077020001500094040001300109042000800122050002600130082001700156100002800173245006200201260004100263263000900304300001100313650003700324fol05865956 IMchF20000615102948.0000612s2000 mau 000 0 eng  a 00055770  a1565926099 aDLCcDLC apcc00aQA76.73.P22bB43 200000a005.13/32211 aBlank-Edelman, David N.10aPerl for system administration /cDavid N. Blank-Edelman. aCambridge, Mass. :bO'Reilly,c2000. a0006 ap. cm. 
0aPerl (Computer program language)00661nam 22002538a 4500001001300000003000600013005001700019008004100036010001700077020001500094040001300109042000800122050002600130082001700156100001700173245006700190250001200257260004100269263000900310300001100319650003700330700002300367700001700390fol05865967 IMchF20000615102611.0000614s2000 mau 000 0 eng  a 00055799  a0596000278 aDLCcDLC apcc00aQA76.73.P22bW35 200000a005.13/32211 aWall, Larry.10aProgramming Perl /cLarry Wall, Tom Christiansen & Jon Orwant. a3rd ed. aCambridge, Mass. :bO'Reilly,c2000. a0007 ap. cm. 0aPerl (Computer program language)1 aChristiansen, Tom.1 aOrwant, Jon.00603cam 22002054a 4500001001300000003000600013005001700019008004100036010001700077020001500094040001800109042000800127050002600135082001700161100003200178245006000210260005700270300003300327650003700360fol05872355 IMchF20000706095105.0000315s1999 njua 001 0 eng  a 00500678  a013020868X aDLCcDLCdDLC apcc00aQA76.73.P22bL69 199900a005.13/32211 aLowe, Vincentq(Vincent D.)10aPerl programmer's interactive workbook /cVincent Lowe. aUpper Saddle River, NJ :bPrentice Hall PTP,cc1999. axx, 633 p. :bill. ;c23 cm. 0aPerl (Computer program language)00696nam 22002538a 4500001001300000003000600013005001700019008004100036010001700077020002800094040001300122042000800135050002600143082001700169100002600186245004400212260005100256263000900307300001100316500002000327650003700347650001700384650004100401fol05882032 IMchF20000707091904.0000630s2000 cau 001 0 eng  a 00058174  a0764547291 (alk. paper) aDLCcDLC apcc00aQA76.73.P22bF64 200000a005.13/32212 aFoster-Johnson, Eric.10aCross-platform Perl /cEric F. Johnson. aFoster City, CA :bIDG Books Worldwide,c2000. a0009 ap. cm. aIncludes index. 0aPerl (Computer program language) 0aWeb servers. 0aCross-platform software development. 
-------------------------------------------------------------------------------- /test/test.json: -------------------------------------------------------------------------------- 1 | [{ 2 | "leader":"01471cjm a2200349 a 4500", 3 | "fields": 4 | [ 5 | { 6 | "001":"5674874" 7 | }, 8 | { 9 | "005":"20030305110405.0" 10 | }, 11 | { 12 | "007":"sdubsmennmplu" 13 | }, 14 | { 15 | "008":"930331s1963 nyuppn eng d" 16 | }, 17 | { 18 | "035": 19 | { 20 | "subfields": 21 | [ 22 | { 23 | "9":"(DLC) 93707283" 24 | } 25 | ], 26 | "ind1":" ", 27 | "ind2":" " 28 | } 29 | }, 30 | { 31 | "906": 32 | { 33 | "subfields": 34 | [ 35 | { 36 | "a":"7" 37 | }, 38 | { 39 | "b":"cbc" 40 | }, 41 | { 42 | "c":"copycat" 43 | }, 44 | { 45 | "d":"4" 46 | }, 47 | { 48 | "e":"ncip" 49 | }, 50 | { 51 | "f":"19" 52 | }, 53 | { 54 | "g":"y-soundrec" 55 | } 56 | ], 57 | "ind1":" ", 58 | "ind2":" " 59 | } 60 | }, 61 | { 62 | "010": 63 | { 64 | "subfields": 65 | [ 66 | { 67 | "a":" 93707283 " 68 | } 69 | ], 70 | "ind1":" ", 71 | "ind2":" " 72 | } 73 | }, 74 | { 75 | "028": 76 | { 77 | "subfields": 78 | [ 79 | { 80 | "a":"CS 8786" 81 | }, 82 | { 83 | "b":"Columbia" 84 | } 85 | ], 86 | "ind1":"0", 87 | "ind2":"2" 88 | } 89 | }, 90 | { 91 | "035": 92 | { 93 | "subfields": 94 | [ 95 | { 96 | "a":"(OCoLC)13083787" 97 | } 98 | ], 99 | "ind1":" ", 100 | "ind2":" " 101 | } 102 | }, 103 | { 104 | "040": 105 | { 106 | "subfields": 107 | [ 108 | { 109 | "a":"OClU" 110 | }, 111 | { 112 | "c":"DLC" 113 | }, 114 | { 115 | "d":"DLC" 116 | } 117 | ], 118 | "ind1":" ", 119 | "ind2":" " 120 | } 121 | }, 122 | { 123 | "041": 124 | { 125 | "subfields": 126 | [ 127 | { 128 | "d":"eng" 129 | }, 130 | { 131 | "g":"eng" 132 | } 133 | ], 134 | "ind1":"0", 135 | "ind2":" " 136 | } 137 | }, 138 | { 139 | "042": 140 | { 141 | "subfields": 142 | [ 143 | { 144 | "a":"lccopycat" 145 | } 146 | ], 147 | "ind1":" ", 148 | "ind2":" " 149 | } 150 | }, 151 | { 152 | "050": 153 | { 154 | "subfields": 155 | [ 156 | { 157 | "a":"Columbia CS 
8786" 158 | } 159 | ], 160 | "ind1":"0", 161 | "ind2":"0" 162 | } 163 | }, 164 | { 165 | "100": 166 | { 167 | "subfields": 168 | [ 169 | { 170 | "a":"Dylan, 171 | Bob, 172 | " 173 | }, 174 | { 175 | "d":"1941-" 176 | } 177 | ], 178 | "ind1":"1", 179 | "ind2":" " 180 | } 181 | }, 182 | { 183 | "245": 184 | { 185 | "subfields": 186 | [ 187 | { 188 | "a":"The freewheelin' Bob Dylan" 189 | }, 190 | { 191 | "h":" 192 | [ 193 | sound recording 194 | ] 195 | ." 196 | } 197 | ], 198 | "ind1":"1", 199 | "ind2":"4" 200 | } 201 | }, 202 | { 203 | "260": 204 | { 205 | "subfields": 206 | [ 207 | { 208 | "a":" 209 | [ 210 | New York, 211 | N.Y. 212 | ] 213 | :" 214 | }, 215 | { 216 | "b":"Columbia, 217 | " 218 | }, 219 | { 220 | "c":" 221 | [ 222 | 1963 223 | ] 224 | " 225 | } 226 | ], 227 | "ind1":" ", 228 | "ind2":" " 229 | } 230 | }, 231 | { 232 | "300": 233 | { 234 | "subfields": 235 | [ 236 | { 237 | "a":"1 sound disc :" 238 | }, 239 | { 240 | "b":"analog, 241 | 33 1/3 rpm, 242 | stereo. ;" 243 | }, 244 | { 245 | "c":"12 in." 246 | } 247 | ], 248 | "ind1":" ", 249 | "ind2":" " 250 | } 251 | }, 252 | { 253 | "500": 254 | { 255 | "subfields": 256 | [ 257 | { 258 | "a":"Songs." 259 | } 260 | ], 261 | "ind1":" ", 262 | "ind2":" " 263 | } 264 | }, 265 | { 266 | "511": 267 | { 268 | "subfields": 269 | [ 270 | { 271 | "a":"The composer accompanying himself on the guitar ; in part with instrumental ensemble." 272 | } 273 | ], 274 | "ind1":"0", 275 | "ind2":" " 276 | } 277 | }, 278 | { 279 | "500": 280 | { 281 | "subfields": 282 | [ 283 | { 284 | "a":"Program notes by Nat Hentoff on container." 
285 | } 286 | ], 287 | "ind1":" ", 288 | "ind2":" " 289 | } 290 | }, 291 | { 292 | "505": 293 | { 294 | "subfields": 295 | [ 296 | { 297 | "a":"Blowin' in the wind -- Girl from the north country -- Masters of war -- Down the highway -- Bob Dylan's blues -- A hard rain's a-gonna fall -- Don't think twice, 298 | it's all right -- Bob Dylan's dream -- Oxford town -- Talking World War III blues -- Corrina, 299 | Corrina -- Honey, 300 | just allow me one more chance -- I shall be free." 301 | } 302 | ], 303 | "ind1":"0", 304 | "ind2":" " 305 | } 306 | }, 307 | { 308 | "650": 309 | { 310 | "subfields": 311 | [ 312 | { 313 | "a":"Popular music" 314 | }, 315 | { 316 | "y":"1961-1970." 317 | } 318 | ], 319 | "ind1":" ", 320 | "ind2":"0" 321 | } 322 | }, 323 | { 324 | "650": 325 | { 326 | "subfields": 327 | [ 328 | { 329 | "a":"Blues (Music)" 330 | }, 331 | { 332 | "y":"1961-1970." 333 | } 334 | ], 335 | "ind1":" ", 336 | "ind2":"0" 337 | } 338 | }, 339 | { 340 | "856": 341 | { 342 | "subfields": 343 | [ 344 | { 345 | "3":"Preservation copy (limited access)" 346 | }, 347 | { 348 | "u":"http://hdl.loc.gov/loc.mbrsrs/lp0001.dyln" 349 | } 350 | ], 351 | "ind1":"4", 352 | "ind2":"1" 353 | } 354 | }, 355 | { 356 | "952": 357 | { 358 | "subfields": 359 | [ 360 | { 361 | "a":"New" 362 | } 363 | ], 364 | "ind1":" ", 365 | "ind2":" " 366 | } 367 | }, 368 | { 369 | "953": 370 | { 371 | "subfields": 372 | [ 373 | { 374 | "a":"TA28" 375 | } 376 | ], 377 | "ind1":" ", 378 | "ind2":" " 379 | } 380 | }, 381 | { 382 | "991": 383 | { 384 | "subfields": 385 | [ 386 | { 387 | "b":"c-RecSound" 388 | }, 389 | { 390 | "h":"Columbia CS 8786" 391 | }, 392 | { 393 | "w":"MUSIC" 394 | } 395 | ], 396 | "ind1":" ", 397 | "ind2":" " 398 | } 399 | } 400 | ] 401 | } 402 | ] 403 | -------------------------------------------------------------------------------- /test/test_encode.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. 
It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | import unittest 8 | 9 | from pymarc import MARCReader 10 | 11 | 12 | class Encode(unittest.TestCase): 13 | def test_encode_decode(self): 14 | # get raw data from file 15 | with open("test/one.dat", "rb") as fh: 16 | original = fh.read() 17 | 18 | # create a record object for the file 19 | with open("test/one.dat", "rb") as fh: 20 | reader = MARCReader(fh) 21 | record = next(reader) 22 | # make sure original data is the same as 23 | # the record encoded as MARC 24 | raw = record.as_marc() 25 | self.assertEqual(original, raw) 26 | 27 | def test_encode_decode_alphatag(self): 28 | # get raw data from file containing non-numeric tags 29 | with open("test/alphatag.dat", "rb") as fh: 30 | original = fh.read() 31 | 32 | # create a record object for the file 33 | with open("test/alphatag.dat", "rb") as fh: 34 | reader = MARCReader(fh) 35 | record = next(reader) 36 | # make sure original data is the same as 37 | # the record encoded as MARC 38 | raw = record.as_marc() 39 | self.assertEqual(original, raw) 40 | 41 | 42 | def suite(): 43 | test_suite = unittest.makeSuite(Encode, "test") 44 | return test_suite 45 | 46 | 47 | if __name__ == "__main__": 48 | unittest.main() 49 | -------------------------------------------------------------------------------- /test/test_field.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. 
pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | import unittest 8 | import sys 9 | 10 | from pymarc.field import Field 11 | 12 | 13 | class FieldTest(unittest.TestCase): 14 | def setUp(self): 15 | self.field = Field( 16 | tag="245", 17 | indicators=[0, 1], 18 | subfields=["a", "Huckleberry Finn: ", "b", "An American Odyssey"], 19 | ) 20 | 21 | self.controlfield = Field( 22 | tag="008", data="831227m19799999nyu ||| | ger " 23 | ) 24 | 25 | self.subjectfield = Field( 26 | tag="650", 27 | indicators=[" ", "0"], 28 | subfields=["a", "Python (Computer program language)", "v", "Poetry."], 29 | ) 30 | 31 | def test_string(self): 32 | self.assertEqual( 33 | str(self.field), "=245 01$aHuckleberry Finn: $bAn American Odyssey" 34 | ) 35 | 36 | def test_controlfield_string(self): 37 | self.assertEqual( 38 | str(self.controlfield), r"=008 831227m19799999nyu\\\\\\\\\\\|||\|\ger\\" 39 | ) 40 | 41 | def test_indicators(self): 42 | self.assertEqual(self.field.indicator1, "0") 43 | self.assertEqual(self.field.indicator2, "1") 44 | 45 | def test_subfields_created(self): 46 | subfields = self.field.subfields 47 | self.assertEqual(len(subfields), 4) 48 | 49 | def test_subfield_short(self): 50 | self.assertEqual(self.field["a"], "Huckleberry Finn: ") 51 | self.assertEqual(self.field["z"], None) 52 | 53 | def test_subfields(self): 54 | self.assertEqual(self.field.get_subfields("a"), ["Huckleberry Finn: "]) 55 | self.assertEqual( 56 | self.subjectfield.get_subfields("a"), ["Python (Computer program language)"] 57 | ) 58 | 59 | def test_subfields_multi(self): 60 | self.assertEqual( 61 | self.field.get_subfields("a", "b"), 62 | ["Huckleberry Finn: ", "An American Odyssey"], 63 | ) 64 | self.assertEqual( 65 | self.subjectfield.get_subfields("a", "v"), 66 | ["Python (Computer program language)", "Poetry."], 67 | ) 68 | 69 | def test_encode(self): 70 | self.field.as_marc(encoding="utf-8") 71 | 72 | def 
test_membership(self): 73 | self.assertTrue("a" in self.field) 74 | self.assertFalse("zzz" in self.field) 75 | 76 | def test_iterator(self): 77 | string = "" 78 | for subfield in self.field: 79 | string += subfield[0] 80 | string += subfield[1] 81 | self.assertEqual(string, "aHuckleberry Finn: bAn American Odyssey") 82 | 83 | def test_value(self): 84 | self.assertEqual(self.field.value(), "Huckleberry Finn: An American Odyssey") 85 | self.assertEqual( 86 | self.controlfield.value(), "831227m19799999nyu ||| | ger " 87 | ) 88 | 89 | def test_non_integer_tag(self): 90 | # make sure this doesn't throw an exception 91 | Field(tag="3 0", indicators=[0, 1], subfields=["a", "foo"]) 92 | 93 | def test_add_subfield(self): 94 | field = Field(tag="245", indicators=[0, 1], subfields=["a", "foo"]) 95 | field.add_subfield("a", "bar") 96 | self.assertEqual(field.__str__(), "=245 01$afoo$abar") 97 | field.add_subfield("b", "baz", 0) 98 | self.assertEqual(field.__str__(), "=245 01$bbaz$afoo$abar") 99 | field.add_subfield("c", "qux", 2) 100 | self.assertEqual(field.__str__(), "=245 01$bbaz$afoo$cqux$abar") 101 | field.add_subfield("z", "wat", 8) 102 | self.assertEqual(field.__str__(), "=245 01$bbaz$afoo$cqux$abar$zwat") 103 | 104 | def test_delete_subfield(self): 105 | field = Field( 106 | tag="200", 107 | indicators=[0, 1], 108 | subfields=["a", "My Title", "a", "Kinda Bogus Anyhow"], 109 | ) 110 | self.assertEqual(field.delete_subfield("z"), None) 111 | self.assertEqual(field.delete_subfield("a"), "My Title") 112 | self.assertEqual(field.delete_subfield("a"), "Kinda Bogus Anyhow") 113 | self.assertTrue(len(field.subfields) == 0) 114 | 115 | def test_is_subject_field(self): 116 | self.assertEqual(self.subjectfield.is_subject_field(), True) 117 | self.assertEqual(self.field.is_subject_field(), False) 118 | 119 | def test_format_field(self): 120 | self.subjectfield.add_subfield("6", "880-4") 121 | self.assertEqual( 122 | self.subjectfield.format_field(), 123 | "Python (Computer 
program language) -- Poetry.", 124 | ) 125 | self.field.add_subfield("6", "880-1") 126 | self.assertEqual( 127 | self.field.format_field(), "Huckleberry Finn: An American Odyssey" 128 | ) 129 | 130 | def test_tag_normalize(self): 131 | f = Field(tag="42", indicators=["", ""]) 132 | self.assertEqual(f.tag, "042") 133 | 134 | def test_alphatag(self): 135 | f = Field(tag="CAT", indicators=[0, 1], subfields=["a", "foo"]) 136 | self.assertEqual(f.tag, "CAT") 137 | self.assertEqual(f["a"], "foo") 138 | self.assertEqual(f.is_control_field(), False) 139 | 140 | def test_setitem_no_key(self): 141 | try: 142 | self.field["h"] = "error" 143 | except KeyError: 144 | pass 145 | except Exception: 146 | e = sys.exc_info()[1] 147 | self.fail("Unexpected exception thrown: %s" % e) 148 | else: 149 | self.fail("KeyError not thrown") 150 | 151 | def test_setitem_repeated_key(self): 152 | try: 153 | self.field.add_subfield("a", "bar") 154 | self.field["a"] = "error" 155 | except KeyError: 156 | pass 157 | except Exception: 158 | e = sys.exc_info()[1] 159 | self.fail("Unexpected exception thrown: %s" % e) 160 | else: 161 | self.fail("KeyError not thrown") 162 | 163 | def test_iter_over_controlfield(self): 164 | try: 165 | [subfield for subfield in self.controlfield] 166 | except AttributeError as e: 167 | self.fail("Error during iteration: %s" % e) 168 | 169 | def test_setitem(self): 170 | self.field["a"] = "changed" 171 | self.assertEqual(self.field["a"], "changed") 172 | 173 | def test_delete_subfield_only_by_code(self): 174 | self.field.delete_subfield("An American Odyssey") 175 | self.assertEqual(self.field["b"], "An American Odyssey") 176 | self.field.delete_subfield("b") 177 | self.assertTrue(self.field["b"] is None) 178 | 179 | def test_set_indicators_affects_str(self): 180 | self.field.indicators[0] = "9" 181 | self.field.indicator2 = "9" 182 | self.assertEqual( 183 | str(self.field), "=245 99$aHuckleberry Finn: $bAn American Odyssey" 184 | ) 185 | 186 | def
test_set_indicators_affects_marc(self): 187 | self.field.indicators[0] = "9" 188 | self.field.indicator2 = "9" 189 | self.assertEqual( 190 | self.field.as_marc("utf-8"), 191 | b"99\x1faHuckleberry Finn: \x1fbAn American Odyssey\x1e", 192 | ) 193 | 194 | 195 | def suite(): 196 | test_suite = unittest.makeSuite(FieldTest, "test") 197 | return test_suite 198 | 199 | 200 | if __name__ == "__main__": 201 | unittest.main() 202 | -------------------------------------------------------------------------------- /test/test_json.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | import json 8 | import unittest 9 | 10 | import pymarc 11 | 12 | 13 | class JsonReaderTest(unittest.TestCase): 14 | def setUp(self): 15 | with open("test/test.json") as fh: 16 | self.in_json = json.load(fh, strict=False) 17 | 18 | with open("test/test.json") as fh: 19 | self.reader = pymarc.JSONReader(fh) 20 | 21 | def testRoundtrip(self): 22 | """Test from and to json.
23 | 24 | Tests that result of loading records from the test file 25 | produces objects deeply equal to the result of loading 26 | marc-in-json files directly 27 | """ 28 | recs = list(self.reader) 29 | self.assertEqual( 30 | len(self.in_json), len(recs), "Incorrect number of records found" 31 | ) 32 | for i, rec in enumerate(recs): 33 | deserialized = json.loads(rec.as_json(), strict=False) 34 | comp = self.in_json[i] 35 | self.assertEqual(comp, deserialized) 36 | 37 | def testOneRecord(self): 38 | """Tests case when in source json there is only 1 record not wrapped in list.""" 39 | data = json.dumps(self.in_json[0]) 40 | reader = pymarc.JSONReader(data) 41 | self.assertEqual([rec.as_dict() for rec in reader][0], self.in_json[0]) 42 | 43 | 44 | class JsonTest(unittest.TestCase): 45 | def setUp(self): 46 | self.reader = pymarc.MARCReader(open("test/test.dat", "rb")) 47 | self._record = pymarc.Record() 48 | field = pymarc.Field( 49 | tag="245", indicators=["1", "0"], subfields=["a", "Python", "c", "Guido"] 50 | ) 51 | self._record.add_field(field) 52 | 53 | def test_as_dict_single(self): 54 | _expected = { 55 | "fields": [ 56 | { 57 | "245": { 58 | "ind1": "1", 59 | "ind2": "0", 60 | "subfields": [{"a": "Python"}, {"c": "Guido"}], 61 | } 62 | } 63 | ], 64 | "leader": " 22 4500", 65 | } 66 | self.assertEqual(_expected, self._record.as_dict()) 67 | 68 | def test_as_json_types(self): 69 | rd = self._record.as_dict() 70 | self.assertTrue(isinstance(rd, dict)) 71 | self.assertTrue(isinstance(rd["leader"], str)) 72 | self.assertTrue(isinstance(rd["fields"], list)) 73 | self.assertTrue(isinstance(rd["fields"][0], dict)) 74 | self.assertTrue(isinstance(rd["fields"][0], dict)) 75 | self.assertTrue(isinstance(rd["fields"][0]["245"]["ind1"], str)) 76 | self.assertTrue(isinstance(rd["fields"][0]["245"]["ind2"], str)) 77 | self.assertTrue(isinstance(rd["fields"][0]["245"]["subfields"], list)) 78 | self.assertTrue(isinstance(rd["fields"][0]["245"]["subfields"][0], dict)) 79 | 
self.assertTrue(isinstance(rd["fields"][0]["245"]["subfields"][0]["a"], str)) 80 | self.assertTrue(isinstance(rd["fields"][0]["245"]["subfields"][1]["c"], str)) 81 | 82 | def test_as_json_simple(self): 83 | record = json.loads(self._record.as_json()) 84 | 85 | self.assertTrue("leader" in record) 86 | self.assertEqual(record["leader"], " 22 4500") 87 | 88 | self.assertTrue("fields" in record) 89 | self.assertTrue("245" in record["fields"][0]) 90 | self.assertEqual( 91 | record["fields"][0]["245"], 92 | { 93 | u"subfields": [{u"a": u"Python"}, {u"c": u"Guido"}], 94 | u"ind2": u"0", 95 | u"ind1": u"1", 96 | }, 97 | ) 98 | 99 | def test_as_json_multiple(self): 100 | for record in self.reader: 101 | self.assertEqual(dict, json.loads(record.as_json()).__class__) 102 | 103 | 104 | class JsonParse(unittest.TestCase): 105 | def setUp(self): 106 | self.reader_dat = pymarc.MARCReader(open("test/one.dat", "rb")) 107 | self.parse_json = pymarc.parse_json_to_array(open("test/one.json")) 108 | 109 | self.batch_xml = pymarc.parse_xml_to_array(open("test/batch.xml")) 110 | self.batch_json = pymarc.parse_json_to_array(open("test/batch.json")) 111 | 112 | def testRoundtrip(self): 113 | recs = list(self.reader_dat) 114 | self.assertEqual( 115 | len(self.parse_json), len(recs), "Incorrect number of records found" 116 | ) 117 | for from_dat, from_json in zip(recs, self.parse_json): 118 | self.assertEqual(from_dat.as_marc(), from_json.as_marc(), "Incorrect Record") 119 | 120 | def testParseJsonXml(self): 121 | self.assertEqual( 122 | len(self.batch_json), 123 | len(self.batch_xml), 124 | "Incorrect number of parse records found", 125 | ) 126 | for from_dat, from_json in zip(self.batch_json, self.batch_xml): 127 | self.assertEqual(from_dat.as_marc(), from_json.as_marc(), "Incorrect Record") 128 | 129 | 130 | def suite(): 131 | test_suite = unittest.makeSuite(JsonTest, "test") 132 | return test_suite 133 | 134 | 135 | if __name__ == "__main__": 136 | unittest.main() 137 |
-------------------------------------------------------------------------------- /test/test_leader.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | """Leader tests.""" 8 | import random 9 | import string 10 | import unittest 11 | 12 | from pymarc.exceptions import BadLeaderValue, RecordLeaderInvalid 13 | from pymarc.leader import Leader 14 | 15 | LEADER = "00475casaa2200169 ib4500" 16 | 17 | FIELDS = [ 18 | ("record_length", slice(0, 5), "00475"), 19 | ("record_status", 5, "c"), 20 | ("type_of_record", 6, "a"), 21 | ("bibliographic_level", 7, "s"), 22 | ("type_of_control", 8, "a"), 23 | ("coding_scheme", 9, "a"), 24 | ("indicator_count", 10, "2"), 25 | ("subfield_code_count", 11, "2"), 26 | ("base_address", slice(12, 17), "00169"), 27 | ("encoding_level", 17, " "), 28 | ("cataloging_form", 18, "i"), 29 | ("multipart_ressource", 19, "b"), 30 | ("length_of_field_length", 20, "4"), 31 | ("starting_character_position_length", 21, "5"), 32 | ("implementation_defined_length", 22, "0"), 33 | ] 34 | 35 | 36 | def random_string(length): 37 | """Random string to fill a field.""" 38 | letters = string.ascii_lowercase 39 | return "".join(random.choice(letters) for i in range(length)) 40 | 41 | 42 | class LeaderTest(unittest.TestCase): 43 | """LeaderTest.""" 44 | 45 | def test_leader_invalid_length(self): 46 | self.assertRaises(RecordLeaderInvalid, Leader, LEADER[:-1]) 47 | 48 | def test_leader_value(self): 49 | leader = Leader(LEADER) 50 | self.assertEqual(leader.leader, LEADER) 51 | 52 | def test_str(self): 53 | leader = Leader(LEADER) 54 | self.assertEqual(str(leader), LEADER) 55 | 56 | def 
test_add(self): 57 | leader = Leader(LEADER) 58 | new_leader = leader[0:9] + "b" + leader[10:] 59 | self.assertEqual(new_leader, "00475casab2200169 ib4500") 60 | 61 | def test_getters(self): 62 | leader = Leader(LEADER) 63 | for field, index, expected in FIELDS: 64 | self.assertEqual(getattr(leader, field), leader[index]) 65 | self.assertEqual(expected, leader[index]) 66 | 67 | def test_setters(self): 68 | leader = Leader(LEADER) 69 | for field, index, expected in FIELDS: 70 | value = random_string(len(expected)) 71 | leader[index] = value 72 | self.assertEqual(getattr(leader, field), value) 73 | value = random_string(len(expected)) 74 | setattr(leader, field, value) 75 | self.assertEqual(leader[index], value) 76 | 77 | def test_setters_errors(self): 78 | leader = Leader(LEADER) 79 | for field, index, expected in FIELDS: 80 | value = random_string(len(expected) + 1) 81 | with self.assertRaises(BadLeaderValue): 82 | setattr(leader, field, value) 83 | 84 | 85 | def suite(): 86 | test_suite = unittest.makeSuite(LeaderTest, "test") 87 | return test_suite 88 | 89 | 90 | if __name__ == "__main__": 91 | unittest.main() 92 | -------------------------------------------------------------------------------- /test/test_marc8.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # This file is part of pymarc. It is subject to the license terms in the 4 | # LICENSE file found in the top-level directory of this distribution and at 5 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 6 | # propagated, or distributed according to the terms contained in the LICENSE 7 | # file. 
8 | 9 | import os 10 | from unittest import TestCase, makeSuite 11 | 12 | 13 | from pymarc import Field, MARCReader, MARCWriter, Record, marc8_to_unicode 14 | 15 | 16 | class MARC8Test(TestCase): 17 | def test_marc8_reader(self): 18 | with open("test/marc8.dat", "rb") as fh: 19 | reader = MARCReader(fh, to_unicode=False) 20 | r = next(reader) 21 | self.assertEqual(type(r), Record) 22 | utitle = r["240"]["a"] 23 | self.assertEqual(type(utitle), bytes) 24 | self.assertEqual(utitle, b"De la solitude \xe1a la communaut\xe2e.") 25 | 26 | def test_marc8_reader_to_unicode(self): 27 | with open("test/marc8.dat", "rb") as fh: 28 | reader = MARCReader(fh, to_unicode=True) 29 | r = next(reader) 30 | self.assertEqual(type(r), Record) 31 | utitle = r["240"]["a"] 32 | self.assertEqual(type(utitle), str) 33 | self.assertEqual(utitle, u"De la solitude \xe0 la communaut\xe9.") 34 | 35 | def test_marc8_reader_to_1251(self): 36 | with open("test/1251.dat", "rb") as fh: 37 | reader = MARCReader(fh, file_encoding="cp1251") 38 | r = next(reader) 39 | self.assertEqual(type(r), Record) 40 | utitle = r["245"]["a"] 41 | self.assertEqual(type(utitle), str) 42 | self.assertEqual(utitle, u"Основы гидравлического расчета инженерных сетей") 43 | 44 | def test_marc8_reader_to_1251_without_1251(self): 45 | with open("test/1251.dat", "rb") as fh: 46 | reader = MARCReader(fh,) 47 | try: 48 | r = next(reader) 49 | r = next(reader) 50 | self.assertEqual(type(r), Record) 51 | utitle = r["245"]["a"] 52 | self.assertEqual(type(utitle), str) 53 | self.assertEqual(utitle, u"Психологический тренинг с подростками") 54 | except AssertionError: 55 | self.assertTrue("Was unable to decode invalid MARC") 56 | 57 | def test_marc8_reader_to_unicode_bad_eacc_sequence(self): 58 | with open("test/bad_eacc_encoding.dat", "rb") as fh: 59 | reader = MARCReader(fh, to_unicode=True, hide_utf8_warnings=True) 60 | try: 61 | next(reader) 62 | self.assertFalse("Was able to decode invalid MARC8") 63 | except UnicodeDecodeError:
64 | self.assertTrue("Caught UnicodeDecodeError as expected") 65 | 66 | def test_marc8_reader_to_unicode_bad_escape(self): 67 | with open("test/bad_marc8_escape.dat", "rb") as fh: 68 | reader = MARCReader(fh, to_unicode=True) 69 | r = next(reader) 70 | self.assertEqual(type(r), Record) 71 | upublisher = r["260"]["b"] 72 | self.assertEqual(type(upublisher), str) 73 | self.assertEqual(upublisher, u"La Soci\xe9t\x1b,") 74 | 75 | def test_marc8_to_unicode(self): 76 | marc8_file = open("test/test_marc8.txt", "rb") 77 | utf8_file = open("test/test_utf8.txt", "rb") 78 | count = 0 79 | 80 | while True: 81 | marc8 = marc8_file.readline().strip(b"\n") 82 | utf8 = utf8_file.readline().strip(b"\n") 83 | if marc8 == b"" or utf8 == b"": 84 | break 85 | count += 1 86 | self.assertEqual(marc8_to_unicode(marc8).encode("utf8"), utf8) 87 | 88 | self.assertEqual(count, 1515) 89 | marc8_file.close() 90 | utf8_file.close() 91 | 92 | def test_writing_unicode(self): 93 | record = Record() 94 | record.add_field(Field(245, ["1", "0"], ["a", chr(0x1234)])) 95 | record.leader = " a " 96 | writer = MARCWriter(open("test/foo", "wb")) 97 | writer.write(record) 98 | writer.close() 99 | 100 | reader = MARCReader(open("test/foo", "rb"), to_unicode=True) 101 | record = next(reader) 102 | self.assertEqual(record["245"]["a"], chr(0x1234)) 103 | reader.close() 104 | 105 | os.remove("test/foo") 106 | 107 | def test_reading_utf8_with_flag(self): 108 | with open("test/utf8_with_leader_flag.dat", "rb") as fh: 109 | reader = MARCReader(fh, to_unicode=False) 110 | record = next(reader) 111 | self.assertEqual(type(record), Record) 112 | utitle = record["240"]["a"] 113 | self.assertEqual(type(utitle), bytes) 114 | self.assertEqual(utitle, b"De la solitude a\xcc\x80 la communaute\xcc\x81.") 115 | 116 | with open("test/utf8_with_leader_flag.dat", "rb") as fh: 117 | reader = MARCReader(fh, to_unicode=True) 118 | record = next(reader) 119 | self.assertEqual(type(record), Record) 120 | utitle = record["240"]["a"] 
121 | self.assertEqual(type(utitle), str) 122 | self.assertEqual( 123 | utitle, 124 | u"De la solitude a" 125 | + chr(0x0300) 126 | + " la communaute" 127 | + chr(0x0301) 128 | + ".", 129 | ) 130 | 131 | def test_reading_utf8_without_flag(self): 132 | with open("test/utf8_without_leader_flag.dat", "rb") as fh: 133 | reader = MARCReader(fh, to_unicode=False) 134 | record = next(reader) 135 | self.assertEqual(type(record), Record) 136 | utitle = record["240"]["a"] 137 | self.assertEqual(type(utitle), bytes) 138 | self.assertEqual(utitle, b"De la solitude a\xcc\x80 la communaute\xcc\x81.") 139 | 140 | with open("test/utf8_without_leader_flag.dat", "rb") as fh: 141 | reader = MARCReader(fh, to_unicode=True, hide_utf8_warnings=True) 142 | record = next(reader) 143 | self.assertEqual(type(record), Record) 144 | utitle = record["240"]["a"] 145 | self.assertEqual(type(utitle), str) 146 | # unless you force utf-8 characters will get lost and 147 | # warnings will appear in the terminal 148 | self.assertEqual(utitle, "De la solitude a la communaute .") 149 | 150 | # force reading as utf-8 151 | with open("test/utf8_without_leader_flag.dat", "rb") as fh: 152 | reader = MARCReader( 153 | fh, to_unicode=True, force_utf8=True, hide_utf8_warnings=True 154 | ) 155 | record = next(reader) 156 | self.assertEqual(type(record), Record) 157 | utitle = record["240"]["a"] 158 | self.assertEqual(type(utitle), str) 159 | self.assertEqual( 160 | utitle, 161 | u"De la solitude a" 162 | + chr(0x0300) 163 | + " la communaute" 164 | + chr(0x0301) 165 | + ".", 166 | ) 167 | 168 | def test_record_create_force_utf8(self, force_utf8=True): 169 | r = Record(force_utf8=True) 170 | self.assertEqual(r.leader[9], "a") 171 | 172 | def test_subscript_2(self): 173 | self.assertEqual( 174 | marc8_to_unicode(b"CO\x1bb2\x1bs is a gas"), u"CO\u2082 is a gas" 175 | ) 176 | self.assertEqual(marc8_to_unicode(b"CO\x1bb2\x1bs"), u"CO\u2082") 177 | 178 | def test_eszett_euro(self): 179 | # MARC-8 mapping: Revised 
June 2004 to add the Eszett (M+C7) and the 180 | # Euro Sign (M+C8) to the MARC-8 set. 181 | self.assertEqual( 182 | marc8_to_unicode(b"ESZETT SYMBOL: \xc7 is U+00DF"), 183 | u"ESZETT SYMBOL: \u00df is U+00DF", 184 | ) 185 | self.assertEqual( 186 | marc8_to_unicode(b"EURO SIGN: \xc8 is U+20AC"), 187 | u"EURO SIGN: \u20ac is U+20AC", 188 | ) 189 | 190 | def test_alif(self): 191 | # MARC-8 mapping: Revised March 2005 to change the mapping from MARC-8 192 | # to Unicode for the Alif (M+2E) from U+02BE to U+02BC. 193 | self.assertEqual( 194 | marc8_to_unicode(b"ALIF: \xae is U+02BC"), u"ALIF: \u02bc is U+02BC" 195 | ) 196 | 197 | 198 | def suite(): 199 | test_suite = makeSuite(MARC8Test, "test") 200 | return test_suite 201 | -------------------------------------------------------------------------------- /test/test_ordered_fields.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 
6 | 7 | import unittest 8 | 9 | import pymarc 10 | 11 | 12 | class OrderedFieldsTest(unittest.TestCase): 13 | def test_add_ordered_fields(self): 14 | 15 | record = pymarc.Record() 16 | for tag in ("999", "888", "111", "abc", "666", "988", "998"): 17 | field = pymarc.Field(tag, ["0", "0"], ["a", "foo"]) 18 | record.add_ordered_field(field) 19 | 20 | # ensure all numeric fields are in strict order 21 | ordered = True 22 | last_tag = 0 23 | for field in record: 24 | if not field.tag.isdigit(): 25 | continue 26 | curr_tag = int(field.tag) 27 | if last_tag > curr_tag: 28 | ordered = False 29 | last_tag = curr_tag 30 | 31 | self.assertTrue(ordered, "Fields are not strictly ordered numerically") 32 | 33 | def test_add_grouped_fields(self): 34 | record = pymarc.Record() 35 | for tag in ("999", "888", "111", "abc", "666", "988", "998"): 36 | field = pymarc.Field(tag, ["0", "0"], ["a", "foo"]) 37 | record.add_grouped_field(field) 38 | 39 | # ensure all numeric fields are in grouped order 40 | grouped = list() 41 | for field in record: 42 | if not field.tag.isdigit(): 43 | continue 44 | grouped.append(field.tag) 45 | 46 | exp = ["111", "666", "888", "999", "988", "998"] 47 | 48 | self.assertEqual(grouped, exp, "Fields are not grouped numerically") 49 | 50 | 51 | def suite(): 52 | test_suite = unittest.makeSuite(OrderedFieldsTest, "test") 53 | return test_suite 54 | 55 | 56 | if __name__ == "__main__": 57 | unittest.main() 58 | -------------------------------------------------------------------------------- /test/test_reader.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # This file is part of pymarc. It is subject to the license terms in the 4 | # LICENSE file found in the top-level directory of this distribution and at 5 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 6 | # propagated, or distributed according to the terms contained in the LICENSE 7 | # file. 
8 | 9 | import re 10 | import unittest 11 | 12 | import pymarc 13 | 14 | 15 | class MARCReaderBaseTest(object): 16 | def test_iterator(self): 17 | count = 0 18 | for record in self.reader: 19 | count += 1 20 | self.assertEqual(count, 10, "found expected number of MARC21 records") 21 | 22 | def test_string(self): 23 | # basic test of stringification 24 | starts_with_leader = re.compile("^=LDR") 25 | has_numeric_tag = re.compile(r"\n=\d\d\d ") 26 | for record in self.reader: 27 | text = str(record) 28 | self.assertTrue(starts_with_leader.search(text), "got leader") 29 | self.assertTrue(has_numeric_tag.search(text), "got a tag") 30 | 31 | 32 | class MARCReaderFileTest(unittest.TestCase, MARCReaderBaseTest): 33 | """Tests MARCReader which provides iterator based access to a MARC file.""" 34 | 35 | def setUp(self): 36 | self.reader = pymarc.MARCReader(open("test/test.dat", "rb")) 37 | 38 | def tearDown(self): 39 | if self.reader: 40 | self.reader.close() 41 | 42 | def test_map_records(self): 43 | self.count = 0 44 | 45 | def f(r): 46 | self.count += 1 47 | 48 | with open("test/test.dat", "rb") as fh: 49 | pymarc.map_records(f, fh) 50 | self.assertEqual(self.count, 10, "map_records appears to work") 51 | 52 | def test_multi_map_records(self): 53 | self.count = 0 54 | 55 | def f(r): 56 | self.count += 1 57 | 58 | fh1 = open("test/test.dat", "rb") 59 | fh2 = open("test/test.dat", "rb") 60 | pymarc.map_records(f, fh1, fh2) 61 | self.assertEqual(self.count, 20, "map_records appears to work") 62 | fh1.close() 63 | fh2.close() 64 | 65 | def disabled_test_codecs(self): 66 | import codecs 67 | 68 | with codecs.open("test/test.dat", encoding="utf-8") as fh: 69 | reader = pymarc.MARCReader(fh) 70 | record = next(reader) 71 | self.assertEqual(record["245"]["a"], u"ActivePerl with ASP and ADO /") 72 | 73 | def test_bad_subfield(self): 74 | with open("test/bad_subfield_code.dat", "rb") as fh: 75 | reader = pymarc.MARCReader(fh) 76 | record = next(reader) 77 | 
self.assertEqual(record["245"]["a"], u"ActivePerl with ASP and ADO /") 78 | 79 | def test_bad_indicator(self): 80 | with open("test/bad_indicator.dat", "rb") as fh: 81 | reader = pymarc.MARCReader(fh) 82 | record = next(reader) 83 | self.assertEqual(record["245"]["a"], "Aristocrats of color :") 84 | 85 | def test_regression_45(self): 86 | # https://github.com/edsu/pymarc/issues/45 87 | with open("test/regression45.dat", "rb") as fh: 88 | reader = pymarc.MARCReader(fh) 89 | record = next(reader) 90 | self.assertEqual(record["752"]["a"], "Russian Federation") 91 | self.assertEqual(record["752"]["b"], "Kostroma Oblast") 92 | self.assertEqual(record["752"]["d"], "Kostroma") 93 | 94 | def test_strict_mode(self): 95 | with self.assertRaises(pymarc.exceptions.BaseAddressInvalid), open( 96 | "test/bad_records.mrc", "rb" 97 | ) as fh: 98 | reader = pymarc.MARCReader(fh) 99 | for record in reader: 100 | self.assertIsNotNone(reader.current_chunk) 101 | 102 | # inherit same tests from MARCReaderBaseTest 103 | 104 | 105 | class MARCReaderStringTest(unittest.TestCase, MARCReaderBaseTest): 106 | def setUp(self): 107 | fh = open("test/test.dat", "rb") 108 | raw = fh.read() 109 | fh.close() 110 | 111 | self.reader = pymarc.reader.MARCReader(raw) 112 | 113 | # inherit same tests from MARCReaderBaseTest 114 | 115 | 116 | class MARCReaderFilePermissiveTest(unittest.TestCase): 117 | """Tests MARCReader which provides iterator based access in a permissive way.""" 118 | 119 | def setUp(self): 120 | self.reader = pymarc.MARCReader( 121 | open("test/bad_records.mrc", "rb"), permissive=True 122 | ) 123 | 124 | def tearDown(self): 125 | if self.reader: 126 | self.reader.close() 127 | 128 | def test_permissive_mode(self): 129 | """Test permissive mode. 
130 | 131 | In bad_records.mrc we expect the following records, in the given order: 132 | 133 | * working record 134 | * BaseAddressInvalid (base_address (99937) >= len(marc)) 135 | * BaseAddressNotFound (base_address (00000) <= 0) 136 | * RecordDirectoryInvalid (len(directory) % DIRECTORY_ENTRY_LEN != 0) 137 | * UnicodeDecodeError (directory with non-ASCII code (245ù0890000)) 138 | * ValueError (base_address with literal (f0037)) 139 | * NoFieldsFound, followed by a last record that should be ok 140 | """ 141 | expected_exceptions = [ 142 | None, 143 | pymarc.exceptions.BaseAddressInvalid, 144 | pymarc.exceptions.BaseAddressNotFound, 145 | pymarc.exceptions.RecordDirectoryInvalid, 146 | UnicodeDecodeError, 147 | ValueError, 148 | pymarc.exceptions.NoFieldsFound, 149 | None, 150 | ] 151 | for exception_type in expected_exceptions: 152 | record = next(self.reader) 153 | self.assertIsNotNone(self.reader.current_chunk) 154 | if exception_type is None: 155 | self.assertIsNotNone(record) 156 | self.assertIsNone(self.reader.current_exception) 157 | self.assertEqual(record["245"]["a"], "The pragmatic programmer : ") 158 | self.assertEqual(record["245"]["b"], "from journeyman to master /") 159 | self.assertEqual(record["245"]["c"], "Andrew Hunt, David Thomas.") 160 | else: 161 | self.assertIsNone( 162 | record, 163 | "expected parsing error with the following " 164 | "exception %r" % exception_type, 165 | ) 166 | self.assertTrue( 167 | isinstance(self.reader.current_exception, exception_type), 168 | "expected %r exception, " 169 | "received: %r" % (exception_type, self.reader.current_exception), 170 | ) 171 | 172 | 173 | def suite(): 174 | file_suite = unittest.makeSuite(MARCReaderFileTest, "test") 175 | string_suite = unittest.makeSuite(MARCReaderStringTest, "test") 176 | permissive_file_suite = unittest.makeSuite(MARCReaderFilePermissiveTest, "test") 177 | test_suite = unittest.TestSuite((file_suite, string_suite, permissive_file_suite)) 178 | return test_suite 179 | 180 | 181 | if __name__ == "__main__":
182 | unittest.main() 183 | -------------------------------------------------------------------------------- /test/test_utf8.py: -------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | import os 8 | import unittest 9 | 10 | import pymarc 11 | 12 | 13 | class MARCUnicodeTest(unittest.TestCase): 14 | def test_read_utf8(self): 15 | self.field_count = 0 16 | 17 | def process_xml(record): 18 | for field in record.get_fields(): 19 | self.field_count += 1 20 | 21 | pymarc.map_xml(process_xml, "test/utf8.xml") 22 | self.assertEqual(self.field_count, 8) 23 | 24 | def test_copy_utf8(self): 25 | writer = pymarc.MARCWriter(open("test/write-utf8-test.dat", "wb")) 26 | new_record = pymarc.Record(to_unicode=True, force_utf8=True) 27 | 28 | def process_xml(record): 29 | new_record.leader = record.leader 30 | 31 | for field in record.get_fields(): 32 | new_record.add_field(field) 33 | 34 | pymarc.map_xml(process_xml, "test/utf8.xml") 35 | 36 | try: 37 | writer.write(new_record) 38 | writer.close() 39 | 40 | finally: 41 | # remove it 42 | os.remove("test/write-utf8-test.dat") 43 | 44 | def test_combining_diacritic(self): 45 | """Issue 74: raises UnicodeEncodeError on Python 2.""" 46 | reader = pymarc.MARCReader(open("test/diacritic.dat", "rb")) 47 | record = next(reader) 48 | str(record) 49 | 50 | 51 | def suite(): 52 | test_suite = unittest.makeSuite(MARCUnicodeTest, "test") 53 | return test_suite 54 | 55 | 56 | if __name__ == "__main__": 57 | unittest.main() 58 | -------------------------------------------------------------------------------- /test/test_xml.py: 
-------------------------------------------------------------------------------- 1 | # This file is part of pymarc. It is subject to the license terms in the 2 | # LICENSE file found in the top-level directory of this distribution and at 3 | # https://opensource.org/licenses/BSD-2-Clause. pymarc may be copied, modified, 4 | # propagated, or distributed according to the terms contained in the LICENSE 5 | # file. 6 | 7 | import pymarc 8 | import unittest 9 | 10 | from io import BytesIO 11 | 12 | 13 | class XmlTest(unittest.TestCase): 14 | def test_map_xml(self): 15 | self.seen = 0 16 | 17 | def count(record): 18 | self.seen += 1 19 | 20 | pymarc.map_xml(count, "test/batch.xml") 21 | self.assertEqual(2, self.seen) 22 | 23 | def test_multi_map_xml(self): 24 | self.seen = 0 25 | 26 | def count(record): 27 | self.seen += 1 28 | 29 | pymarc.map_xml(count, "test/batch.xml", "test/batch.xml") 30 | self.assertEqual(4, self.seen) 31 | 32 | def test_parse_to_array(self): 33 | records = pymarc.parse_xml_to_array("test/batch.xml") 34 | self.assertEqual(len(records), 2) 35 | 36 | # should've got two records 37 | self.assertEqual(type(records[0]), pymarc.Record) 38 | self.assertEqual(type(records[1]), pymarc.Record) 39 | 40 | # first record should have 18 fields 41 | record = records[0] 42 | self.assertEqual(len(record.get_fields()), 18) 43 | 44 | # check the content of a control field 45 | self.assertEqual( 46 | record["008"].data, u"910926s1957 nyuuun eng " 47 | ) 48 | 49 | # check a data field with subfields 50 | field = record["245"] 51 | self.assertEqual(field.indicator1, "0") 52 | self.assertEqual(field.indicator2, "4") 53 | self.assertEqual(field["a"], u"The Great Ray Charles") 54 | self.assertEqual(field["h"], u"[sound recording].") 55 | 56 | def test_xml(self): 57 | # read in xml to a record 58 | record1 = pymarc.parse_xml_to_array("test/batch.xml")[0] 59 | # generate xml 60 | xml = pymarc.record_to_xml(record1) 61 | # parse generated xml 62 | record2 = 
pymarc.parse_xml_to_array(BytesIO(xml))[0] 63 | 64 | # compare original and resulting record 65 | self.assertEqual(record1.leader, record2.leader) 66 | 67 | field1 = record1.get_fields() 68 | field2 = record2.get_fields() 69 | self.assertEqual(len(field1), len(field2)) 70 | 71 | pos = 0 72 | while pos < len(field1): 73 | self.assertEqual(field1[pos].tag, field2[pos].tag) 74 | if field1[pos].is_control_field(): 75 | self.assertEqual(field1[pos].data, field2[pos].data) 76 | else: 77 | self.assertEqual( 78 | field1[pos].get_subfields(), field2[pos].get_subfields() 79 | ) 80 | self.assertEqual(field1[pos].indicators, field2[pos].indicators) 81 | pos += 1 82 | 83 | def test_strict(self): 84 | a = pymarc.parse_xml_to_array(open("test/batch.xml"), strict=True) 85 | self.assertEqual(len(a), 2) 86 | 87 | def test_xml_namespaces(self): 88 | """Tests the 'namespace' parameter of the record_to_xml() method.""" 89 | # get a test record 90 | fh = open("test/test.dat", "rb") 91 | record = next(pymarc.reader.MARCReader(fh)) 92 | # record_to_xml() with namespace set to False should omit the 93 | # MARC21 slim xmlns declaration from the generated xml 94 | xml = pymarc.record_to_xml(record, namespace=False) 95 | # the xmlns declaration should be absent from the written xml 96 | self.assertFalse(b'xmlns="http://www.loc.gov/MARC21/slim"' in xml) 97 | 98 | # record_to_xml() with namespace set to True should declare the xmlns 99 | xml = pymarc.record_to_xml(record, namespace=True) 100 | # the xmlns declaration should be present in the written xml 101 | self.assertTrue(b'xmlns="http://www.loc.gov/MARC21/slim"' in xml) 102 | 103 | fh.close() 104 | 105 | def test_bad_tag(self): 106 | a = pymarc.parse_xml_to_array(open("test/bad_tag.xml")) 107 | self.assertEqual(len(a), 1) 108 | 109 | 110 | def suite(): 111 | test_suite = unittest.makeSuite(XmlTest, "test") 112 | return test_suite 113 | 114 | 115 | if __name__ == "__main__": 116 | unittest.main() 117 |
-------------------------------------------------------------------------------- /test/testunimarc.dat: -------------------------------------------------------------------------------- 1 | 02498nam0 22007213i 4500001002000000005001700020010001800037100004100055101000800096102000700104200011900111210003100230215002200261410005200283410008200335454010400417700004600521702004300567702004500610702004300655790005600698801002300754899002400777899002700801899002400828899002400852899002400876899002400900899002700924899002700951899002700978899002701005899002401032899002401056899002701080899002701107899002701134899002401161899002401185899002701209899002701236899002401263899002701287899002401314899002701338899002401365899002401389899002401413899002401437899002401461899002401485899002401509899002401533899002401557899002401581899002401605899002401629899002401653899002401677899002401701899002401725899002701749IT\ICCU\ANA\001937020091021165606.1 a88-04-40682-8 a19961119d1996 ||||0itac50 ba aita aIT1 aˆL'‰altra faccia della spiralefIsaac Asimovgtraduzione di Cesare Scagliagintroduzione di Fruttero & Lucentini aMilanocA. Mondadorid1996 aV, 201 p.d20 cm. 
01001IT\ICCU\CFI\001275112001 aBestsellersv641 01001IT\ICCU\RMS\188104412001 aˆIl ‰ciclo delle fondazionifIsaac Asimovv4 01001IT\ICCU\RAV\000506112001 aSecond foundation.1700 1aAsimovb, Isaac3IT\ICCU\CFIV\0073274070 1aAsimovb, Isaac3IT\ICCU\CFIV\0073274070 1aFrutterob, Carlo3IT\ICCU\CFIV\007373 1aLucentinib, Franco3IT\ICCU\CFIV\007375 1aScagliab, Cesare3IT\ICCU\RAVV\003503 1aAzimovb, Ajzek3IT\ICCU\RAVV\501922zAsimov, Isaac 3aITbICCUc20140902 1AL00732TO0 Q9fP/G 1AN00012ANA BAfP/GeN 1AT00292TO0 S6fP/G 1BL00962VIA PNfP/G 1CB00072MO1 BPfP/G 1CN00252TO0 H4fP/G 1CN00492TO0 52fP/GeN 1CN00632TO0 F4fP/GeN 1CN00852TO0 FFfP/GeN 1CN01782TO0 FLfP/GeN 1CN02352TO0 FRfP/G 1FG00892FOG 31fP/G 1FR00342RMS I9fP/GeN 1FR00422RMS H2fP/GeN 1FR01482RMS I4fP/GeN 1LI00022LIA CCfP/G 1LI00182LIA PIfP/G 1LI00742LIA LAfP/GeN 1LT00462RMS A5fP/GeN 1LT00792RMS N7fP/G 1MI11552LO1 23fP/GeN 1MO01352MOD ADfP/G 1NA00792NAP BNfP/GeN 1NA06142NAP 16fP/G 1NU01062CAG S2fP/G 1RI00882RMS P5fP/G 1RM00612RMS KIfP/G 1RM02912RMB L1fP/G 1RM06242RMB F2fP/G 1RM14202RMS K1fP/G 1RM14302RMB Z3fP/G 1RM14432RMS LPfP/G 1RM14532RMB O1fP/G 1RM16002RMS EYfP/G 1SP00222LIG 06fP/G 1SV00702LIG 69fP/G 1TV00622VIA MBfP/G 1TV00912VIA RIfP/G 1TV01062VIA SNfP/G 1VI01722VIA SBfP/GeN 2 | -------------------------------------------------------------------------------- /test/utf8.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 00640njm a2200205uu 4500 4 | ASP925318/clmu 5 | 100813s9999 xx ||nn s|||||||||||||| d 6 | 7 | Zen Classics 8 | 9 | 10 | Wilfried Hiller: Book of Stars 11 | 12 | 13 | Virgin Classics 14 | 03 Nov 2008 15 | 16 | 17 | 18 minutes 18 | 19 | 20 | Préludes 21 | 22 | 23 | http://www.aspresolver.com/aspresolver.asp?CLMU;925318 24 | 25 | 26 | -------------------------------------------------------------------------------- /test/utf8_errors.dat: -------------------------------------------------------------------------------- 1 | 01474ckm a22003857a 
45000010011000000030004000110050017000150070007000320070007000390080041000460350018000870350022001050350018001279060045001459550053001900100017002430370046002600400019003060500029003252450064003542600028004183000037004465000041004835000045005245000012005695000091005815800068006726500063007406500052008036550038008558520090008938560077009839850012010609910016010722001700030DLC20030825111738.0kh||||cr||||010917q18801893|||nnn | knota  a(DLC)12536505 a(DLC) 2001700030 a(DLC)12536505 a7bcbccorignewduencipf20gy-printpho aqw07 in P&P BAR 09-17-01; to P&P refile 09-24-01 a 2001700030 aLC-USZ62-81330bDLCc(b&w film copy neg.) aDLCcDLCegihc00aLOT 9527, no. 12b[item]00a[Imperial military middle school, Ṣanʻāʾ]h[graphic]. c[between 1880 and 1893] a1 photographic print :balbumen. aTitle translated from album caption. aCaptioned in Ottoman Turkish and French. aNo. 12. aIn album: Imperial schools, architectural plans and student portraits, Ottoman Empire. aForms part of: Abdul-Hamid II Collection (Library of Congress). 7aMilitary educationzYemenzṢanʻāʾy1880-1900.2lctgm 7aSchoolszYemenzṢanʻāʾy1880-1900.2lctgm 7aAlbumen printsy1880-1900.2gmgpc aLibrary of CongressbPrints and Photographs DivisioneWashington, D.C. 20540 USAndcu413b&w film copy neg.dcphf3b28212uhttp://hdl.loc.gov/loc.pnp/cph.3b28212 app/ahii bc-P&Pvobj. 
-------------------------------------------------------------------------------- /test/utf8_invalid.mrc: -------------------------------------------------------------------------------- 1 | 01004cam a2200277 450000100070000000500170000700800410002403500190006509000320008424500810011626000680019730000180026544000480028350000190033153300780035070000200042871000600044890900150050896000140052396400160053796500530055396600240060696700670063096800110069796900180070814491720021009092157.0021009n 000 0 eng u a(Sirsi) a12702 b2020 (Series 19, Box 02-10)04aThe coming revolution in teacher licensure :bredefining teacher preparation a[Washington, D.C.:�bAssociation of Teacher Educators]c1994. ap. [1]-11, 13 0aAction in teacher education ;vv. 16, no. 2 a"Summer 1994." aCopy.b[S.l.] :cNational Council for Accreditation of Teacher Education.1 aWise, Arthur E.2 aNational Council for Accreditation of Teacher Education a1996003454 aEducation aK-12 Reform aTeacher preparation and professional development aProgram development aEducation, Regulations, Administration, Accreditation Services aAdults aUnited States -------------------------------------------------------------------------------- /test/utf8_with_leader_flag.dat: -------------------------------------------------------------------------------- 1 | 01123cam a2200349 a 4500001000200000005001700002008004100019010003200060035002000092035001300112040001000125041001100135049004700146049004000193050001700233090002300250100002000273240005000293245008000343260004700423300002100470500005500491504005400546650001600600650001000616650002300626650001600649650002200665730005000687951001100737957002500748219900316000000.0840112s1962 pau b 00010 eng  a 61014599/L/r83o00204096 aocmDCLC6114599B 9AAA-0425 dPPT-M1 aengund98aTMYMbBF575.L7 T68 1962c1z39074500724638 aPPCMc1lFred B. 
Rogers, M.D.oGift0 aBF697b.T623 aBF575.L7bT68 196210aTournier, Paul.10aDe la solitude à la communauté.lEnglish.10aEscape from loneliness /cby Paul Tournier ; translated by John S. Gilmour.0 aPhiladelphia :bWestminster Press,cc1962. a192 p. ;c21 cm. aTranslation of De la solitude à la communauté. aIncludes bibliographical references (p. 190-192). 0aLoneliness. 0aSelf. 0aSocial psychology. 2aLoneliness. 2aSocial Isolation.01aDe la solitude à la communauté.lEnglish. x070101 aBF 575.L7 T728d 1962 -------------------------------------------------------------------------------- /test/utf8_without_leader_flag.dat: -------------------------------------------------------------------------------- 1 | 01123cam 2200349 a 4500001000200000005001700002008004100019010003200060035002000092035001300112040001000125041001100135049004700146049004000193050001700233090002300250100002000273240005000293245008000343260004700423300002100470500005500491504005400546650001600600650001000616650002300626650001600649650002200665730005000687951001100737957002500748219900316000000.0840112s1962 pau b 00010 eng  a 61014599/L/r83o00204096 aocmDCLC6114599B 9AAA-0425 dPPT-M1 aengund98aTMYMbBF575.L7 T68 1962c1z39074500724638 aPPCMc1lFred B. Rogers, M.D.oGift0 aBF697b.T623 aBF575.L7bT68 196210aTournier, Paul.10aDe la solitude à la communauté.lEnglish.10aEscape from loneliness /cby Paul Tournier ; translated by John S. Gilmour.0 aPhiladelphia :bWestminster Press,cc1962. a192 p. ;c21 cm. aTranslation of De la solitude à la communauté. aIncludes bibliographical references (p. 190-192). 0aLoneliness. 0aSelf. 0aSocial psychology. 2aLoneliness. 2aSocial Isolation.01aDe la solitude à la communauté.lEnglish. x070101 aBF 575.L7 T728d 1962 --------------------------------------------------------------------------------