├── sampleECCOtxt.xml ├── sampleECCOtextTravel.xml ├── XmlParsingExperiments.ipynb └── LoCdata-Gale-LXML.v0.2.ipynb /sampleECCOtxt.xml: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sgsinclair/experiments/master/sampleECCOtxt.xml -------------------------------------------------------------------------------- /sampleECCOtextTravel.xml: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sgsinclair/experiments/master/sampleECCOtextTravel.xml -------------------------------------------------------------------------------- /XmlParsingExperiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 50, 6 | "metadata": { 7 | "collapsed": true 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "from lxml import etree\n", 12 | "\n", 13 | "testfile = \"sampleECCOtxt.xml\"\n", 14 | "taglist = ['documentID', 'ESTCID', 'pubDate','ESTCID',\n", 15 | " 'language','module','locSubject','notes',\n", 16 | " 'fullTitle','displayTitle','currentVolume', \n", 17 | " 'totalVolumes', 'imprintPublisher','imprintFull',\n", 18 | " 'imprintCity', 'publicationPlace']" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 51, 24 | "metadata": { 25 | "collapsed": true 26 | }, 27 | "outputs": [], 28 | "source": [ 29 | "# here we iterparse multiple times looking for a specific tag\n", 30 | "def iterparse_multiple(file, tags):\n", 31 | " elementlist = []\n", 32 | " for xmltag in tags:\n", 33 | " context = etree.iterparse(file, events=('end',), tag=xmltag)\n", 34 | " for event, elem in context:\n", 35 | " elementlist.append(elem.text)\n", 36 | " return elementlist" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": 52, 42 | "metadata": {}, 43 | "outputs": [ 44 | { 45 | "name": "stdout", 46 | "output_type": "stream", 47 | 
"text": [ 48 | "CPU times: user 5.56 s, sys: 863 ms, total: 6.43 s\n", 49 | "Wall time: 6.43 s\n" 50 | ] 51 | }, 52 | { 53 | "data": { 54 | "text/plain": [ 55 | "21" 56 | ] 57 | }, 58 | "execution_count": 52, 59 | "metadata": {}, 60 | "output_type": "execute_result" 61 | } 62 | ], 63 | "source": [ 64 | "%%time \n", 65 | "\n", 66 | "len(iterparse_multiple(testfile, taglist))" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 55, 72 | "metadata": { 73 | "collapsed": true 74 | }, 75 | "outputs": [], 76 | "source": [ 77 | "# here we iterparse once and look at each element\n", 78 | "def iterparse_single(file, tags):\n", 79 | " tagset = set(tags)\n", 80 | " elementlist = []\n", 81 | " context = etree.iterparse(file)\n", 82 | " for event, elem in context:\n", 83 | " if elem.tag in tagset:\n", 84 | " elementlist.append(elem.text)\n", 85 | " return elementlist" 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": 56, 91 | "metadata": {}, 92 | "outputs": [ 93 | { 94 | "name": "stdout", 95 | "output_type": "stream", 96 | "text": [ 97 | "CPU times: user 602 ms, sys: 61 ms, total: 663 ms\n", 98 | "Wall time: 661 ms\n" 99 | ] 100 | }, 101 | { 102 | "data": { 103 | "text/plain": [ 104 | "20" 105 | ] 106 | }, 107 | "execution_count": 56, 108 | "metadata": {}, 109 | "output_type": "execute_result" 110 | } 111 | ], 112 | "source": [ 113 | "%%time \n", 114 | "\n", 115 | "len(iterparse_single(testfile, taglist))" 116 | ] 117 | } 118 | ], 119 | "metadata": { 120 | "kernelspec": { 121 | "display_name": "Python 3", 122 | "language": "python", 123 | "name": "python3" 124 | }, 125 | "language_info": { 126 | "codemirror_mode": { 127 | "name": "ipython", 128 | "version": 3 129 | }, 130 | "file_extension": ".py", 131 | "mimetype": "text/x-python", 132 | "name": "python", 133 | "nbconvert_exporter": "python", 134 | "pygments_lexer": "ipython3", 135 | "version": "3.6.3" 136 | } 137 | }, 138 | "nbformat": 4, 139 | "nbformat_minor": 2 140 | } 141 | 
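A note on the two counts recorded above (21 for the multi-pass function, 20 for the single-pass one): `taglist` contains `'ESTCID'` twice, so the multi-pass version presumably collects that element twice, while the set-based single pass collects it once. The sketch below reproduces that effect on a tiny inline document. It is only an illustration under stated assumptions: the sample XML, the `TAGS` list and the function names are hypothetical, and it uses the standard library's `xml.etree.ElementTree` instead of lxml so that it runs without extra installs (the notebook itself uses `lxml.etree.iterparse`, which follows the same `(event, elem)` loop).

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature record; the real ECCO files are far larger.
SAMPLE = b"""<book>
  <documentID>1299400102</documentID>
  <ESTCID>T228085</ESTCID>
  <locSubject>History</locSubject>
  <locSubject>Early works to 1800</locSubject>
</book>"""

# Hypothetical tag list with a deliberate duplicate, mirroring 'ESTCID' in taglist.
TAGS = ['documentID', 'ESTCID', 'ESTCID', 'locSubject']


def collect_multi_pass(data, tags):
    """Re-parse the document once per entry in tags (duplicates included)."""
    found = []
    for tag in tags:
        for _event, elem in ET.iterparse(io.BytesIO(data), events=('end',)):
            if elem.tag == tag:
                found.append(elem.text)
    return found


def collect_single_pass(data, tags):
    """Parse the document once and keep any element whose tag is wanted."""
    wanted = set(tags)  # the set silently drops the duplicate tag
    found = []
    for _event, elem in ET.iterparse(io.BytesIO(data), events=('end',)):
        if elem.tag in wanted:
            found.append(elem.text)
    return found


multi = collect_multi_pass(SAMPLE, TAGS)
single = collect_single_pass(SAMPLE, TAGS)
print(len(multi), len(single))
```

If the same thing is happening in the notebook, removing the duplicate from `taglist` should make the two functions agree, and the single pass remains the cheaper option because the file is read only once.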
-------------------------------------------------------------------------------- /LoCdata-Gale-LXML.v0.2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Skimming LCSH data from Gale files\n", 8 | "\n", 9 | "The goal of this notebook is to find a quick way to skim the metadata of the thousands of ECCO and NCCO files from Gale-Cengage, check whether each file has a relevant Library of Congress Subject Heading (LCSH), and, if it does, add the file to a list.\n", 10 | "\n", 11 | "The terms I am looking for include `'travel', 'discov', 'explor', 'voyage', 'guide', 'antiquit'`. These are the same terms that I used to search the HTRC.\n", 12 | "\n", 13 | "In a past notebook, I struggled with XML, in part because it was very slow to open an XML file, read the whole thing into BeautifulSoup, and then check whether what I wanted was there. This [lxml method from IBM](https://www.ibm.com/developerworks/xml/library/x-hiperfparse/), however, was helpful in developing something quicker." 
14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | "execution_count": 1, 19 | "metadata": { 20 | "collapsed": true 21 | }, 22 | "outputs": [], 23 | "source": [ 24 | "from lxml import etree\n", 25 | "import pandas as pd\n", 26 | "import glob\n", 27 | "import lxml" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": 4, 33 | "metadata": {}, 34 | "outputs": [ 35 | { 36 | "name": "stdout", 37 | "output_type": "stream", 38 | "text": [ 39 | "Wall time: 866 ms\n" 40 | ] 41 | } 42 | ], 43 | "source": [ 44 | "%%time \n", 45 | "\n", 46 | "context = etree.iterparse('files/sampleECCOtxt.xml', events=('end',), tag='locSubject')\n", 47 | "\n", 48 | "loclist = []\n", 49 | "for event, elem in context:\n", 50 | " loclist.append(elem.text)\n", 51 | "# Let's just try printing these in a few different ways" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": 4, 57 | "metadata": {}, 58 | "outputs": [ 59 | { 60 | "data": { 61 | "text/plain": [ 62 | "'Westminster (London, England); History; Early works to 1800; London (England); History; Early works to 1800'" 63 | ] 64 | }, 65 | "execution_count": 4, 66 | "metadata": {}, 67 | "output_type": "execute_result" 68 | } 69 | ], 70 | "source": [ 71 | "string = '; '.join(loclist)\n", 72 | "string" 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": 5, 78 | "metadata": { 79 | "collapsed": true 80 | }, 81 | "outputs": [], 82 | "source": [ 83 | "for event, elem in context:\n", 84 | " print(elem.text)\n", 85 | " # Nothing prints here, though it worked above: iterparse is a one-shot iterator, and the earlier loop already consumed it" 
86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": 6, 91 | "metadata": {}, 92 | "outputs": [ 93 | { 94 | "data": { 95 | "text/plain": [ 96 | "['Westminster (London, England)',\n", 97 | " 'History',\n", 98 | " 'Early works to 1800',\n", 99 | " 'London (England)',\n", 100 | " 'History',\n", 101 | " 'Early works to 1800']" 102 | ] 103 | }, 104 | "execution_count": 6, 105 | "metadata": {}, 106 | "output_type": "execute_result" 107 | } 108 | ], 109 | "source": [ 110 | "loclist" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "I should also not have to worry if the tag does not exist; as can be seen below, it won't break or throw an error - the result will just be empty. So, I should still be able to apply my general method of something like `if word is in termlist`." 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": 7, 123 | "metadata": { 124 | "collapsed": true 125 | }, 126 | "outputs": [], 127 | "source": [ 128 | "contextnone = etree.iterparse('files/sampleECCOtxt.xml', events=('end',), tag='holdings2')" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": 8, 134 | "metadata": {}, 135 | "outputs": [ 136 | { 137 | "data": { 138 | "text/plain": [ 139 | "[]" 140 | ] 141 | }, 142 | "execution_count": 8, 143 | "metadata": {}, 144 | "output_type": "execute_result" 145 | } 146 | ], 147 | "source": [ 148 | "emptylist = []\n", 149 | "for event, elem in contextnone:\n", 150 | "emptylist.append(elem.text)\n", 151 | "emptylist" 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "metadata": {}, 157 | "source": [ 158 | "### Running over multiple files\n", 159 | " \n", 160 | "Let's try running this over a few files to see what happens. 
The main chunk of code below was originally running BeautifulSoup, which took way too long; hopefully this will move a bit more quickly!\n", 161 | "\n", 162 | "I will also need a list of metadata tags to iterate through, in order to grab the metadata for the files that I want. Let's do that first.\n", 163 | "\n", 164 | "I also have to remember that although this chunk of code should work for ECCO files, I may have to adjust it slightly to work with NCCO files, especially since lxml tags are sensitive to capital letters (or at least, they are as they are used above)." 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": 11, 170 | "metadata": { 171 | "collapsed": true 172 | }, 173 | "outputs": [], 174 | "source": [ 175 | "taglist = ['documentID', 'ESTCID', 'pubDate','ESTCID',\n", 176 | " 'language','module','locSubject','notes',\n", 177 | " 'fullTitle','displayTitle','currentVolume', \n", 178 | " 'totalVolumes', 'imprintPublisher','imprintFull',\n", 179 | " 'imprintCity', 'publicationPlace']" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 11, 185 | "metadata": { 186 | "collapsed": true 187 | }, 188 | "outputs": [], 189 | "source": [ 190 | "# set up a dict to hold all the strings of all these elements\n", 191 | "elementdict = {}\n", 192 | "\n", 193 | "for xmltag in taglist:\n", 194 | " elementlist = []\n", 195 | " context = etree.iterparse('files/sampleECCOtxt.xml', events=('end',), tag=xmltag)\n", 196 | " for event, elem in context:\n", 197 | " elementlist.append(elem.text)\n", 198 | " elementdict[xmltag] = ', '.join(elementlist)\n", 199 | "\n", 200 | " " 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": 12, 206 | "metadata": {}, 207 | "outputs": [ 208 | { 209 | "data": { 210 | "text/plain": [ 211 | "{'ESTCID': 'T228085',\n", 212 | " 'currentVolume': 'Volume 2',\n", 213 | " 'displayTitle': 'A new and compleat survey of London. In ten parts. I. 
All the publick transactions and memorable events, that have happened to the citizens, from ...',\n", 214 | " 'documentID': '1299400102',\n", 215 | " 'fullTitle': 'A new and compleat survey of London. In ten parts. I. All the publick transactions and memorable events, that have happened to the citizens, from its foundation, to the year 1742. II. A particular description of the thirteen wards on the East of Walbrook. III. Of the twelve wards on the West of Walbrook. IV. A political account of London; parallels between this and the most celebrated cities of antiquity, as well as the modern great cities of Europe, Asia and Africa. V. An historical account of the city governments, ecclesiastical, civil and military. VI. A full account of the great and extensive commerce of the city; and of the several incorporations of the arts and mysteries of the citizens. VII. Of the present state of learning in this city. VIII. History and antiquities of Westminster; its government, ecclesiastical and civil. IX. Of the several parishes and liberties in the county of Middlesex, within the bill of mortality. X. Of the borough of Southwark, and places contiguous in the county of Surry. In two volumes. By a citizen, and native of London.',\n", 216 | " 'imprintCity': 'London',\n", 217 | " 'imprintFull': 'London : printed for S. Lyne, at the Globe in Newgate-street; and J. Ilive, in Aldersgate-street, MDCCXLII. [1742]',\n", 218 | " 'imprintPublisher': 'printed for S. Lyne, at the Globe in Newgate-street; and J. Ilive, in Aldersgate-street',\n", 219 | " 'language': 'English',\n", 220 | " 'locSubject': 'Westminster (London, England), History, Early works to 1800, London (England), History, Early works to 1800',\n", 221 | " 'module': 'History and Geography',\n", 222 | " 'notes': 'In fact in two volumes; volume 2 has a separate register, but continuous pagination and the imprint reads: \"London: printed for S. Lyne; and J. Ilive, 1742\". 
With an index.',\n", 223 | " 'pubDate': '17420101',\n", 224 | " 'publicationPlace': 'London',\n", 225 | " 'totalVolumes': '2'}" 226 | ] 227 | }, 228 | "execution_count": 12, 229 | "metadata": {}, 230 | "output_type": "execute_result" 231 | } 232 | ], 233 | "source": [ 234 | "#aaaand, let's see if it worked\n", 235 | "elementdict" 236 | ] 237 | }, 238 | { 239 | "cell_type": "markdown", 240 | "metadata": {}, 241 | "source": [ 242 | "Great! And if we set up a file list and try to do two files, one with travel, one without, let's see if we can get that working..." 243 | ] 244 | }, 245 | { 246 | "cell_type": "code", 247 | "execution_count": 7, 248 | "metadata": { 249 | "collapsed": true 250 | }, 251 | "outputs": [], 252 | "source": [ 253 | "twofiles = ['files/sampleECCOtextTravel.xml', 'files/sampleECCOtxt.xml']" 254 | ] 255 | }, 256 | { 257 | "cell_type": "code", 258 | "execution_count": 9, 259 | "metadata": { 260 | "collapsed": true 261 | }, 262 | "outputs": [], 263 | "source": [ 264 | "# first, we only want the files that have certain LCSH\n", 265 | "# so let's set up a list of search terms\n", 266 | "termlist = ['travel', 'discov', 'explor', 'voyage', 'guide', 'antiquit']\n", 267 | "\n", 268 | "teststring = 'Italy, Description and travel, Early works to 1800'\n" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": 15, 274 | "metadata": {}, 275 | "outputs": [ 276 | { 277 | "name": "stdout", 278 | "output_type": "stream", 279 | "text": [ 280 | "yes\n" 281 | ] 282 | } 283 | ], 284 | "source": [ 285 | "# just to make sure that my if statement will work:\n", 286 | "if any(x in teststring for x in termlist):\n", 287 | " print('yes')\n", 288 | "else:\n", 289 | " print('no')" 290 | ] 291 | }, 292 | { 293 | "cell_type": "markdown", 294 | "metadata": {}, 295 | "source": [ 296 | "## _note_ Run the next cell before the below ones\n", 297 | "Make sure to establish the `taglist` and `termlist`" 298 | ] 299 | }, 300 | { 301 | "cell_type": "code", 302 | 
"execution_count": 2, 303 | "metadata": {}, 304 | "outputs": [], 305 | "source": [ 306 | "import pandas as pd\n", 307 | "import glob\n", 308 | "import lxml\n", 309 | "\n", 310 | "taglist = ['documentID', 'ESTCID', 'pubDate','ESTCID',\n", 311 | " 'language','module','locSubject','notes',\n", 312 | " 'fullTitle','displayTitle','currentVolume', \n", 313 | " 'totalVolumes', 'imprintPublisher','imprintFull',\n", 314 | " 'imprintCity', 'publicationPlace']\n", 315 | "termlist = ['travel', 'discov', 'explor', 'voyage', 'guide', 'antiquit']" 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "metadata": {}, 321 | "source": [ 322 | "The next three cells have variations of:\n", 323 | "1. two files, one with travel and one without: `8.82 s ± 2.84 s per loop (mean ± std. dev. of 10 runs, 1 loop each)`\n", 324 | "2. one file, with travel: `7.18 s ± 751 ms per loop (mean ± std. dev. of 10 runs, 1 loop each)`\n", 325 | "3. one file, without travel: `2.78 s ± 334 ms per loop (mean ± std. dev. of 10 runs, 1 loop each)`\n", 326 | "\n", 327 | "So, in general, if I have 3000 files and 300 of them are travel related (as was roughly the case below), it would take about 9,720 s (162 minutes / 2.7 hours), rounding the timings to 7.2 s per travel file and 2.8 s per non-travel file.\n", 328 | "\n", 329 | "Is there a way to do this more quickly? At this rate, since there are about 200,000 files on ECCO I and ECCO II, it will take 648,000 seconds, or 180 hours. Ouch. \n", 330 | "\n", 331 | "Of course, some of these sections - like medicine or Lit&Lang - will have far fewer travel texts, so that number of hours could drop. For example, if only 2% (rather than 10%) of all 200,000 files have a travel tag, the time then becomes 577,600 seconds, or 160ish hours. \n", 332 | "\n", 333 | "These files are on my local hard drive, and not on the Gale hard drive, so I am guessing these loops might go a bit faster than when I'm using the Gale hard drive? Google did not provide an easy answer to this." 
334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": 6, 339 | "metadata": {}, 340 | "outputs": [ 341 | { 342 | "name": "stdout", 343 | "output_type": "stream", 344 | "text": [ 345 | "8.82 s ± 2.84 s per loop (mean ± std. dev. of 10 runs, 1 loop each)\n" 346 | ] 347 | } 348 | ], 349 | "source": [ 350 | "%%timeit -r 10\n", 351 | "\n", 352 | "# this list will hold one dict per relevant file;\n", 353 | "# each dict holds that file's metadata\n", 354 | "listofdicts = []\n", 355 | "\n", 356 | "twofiles = ['files/sampleECCOtextTravel.xml', 'files/sampleECCOtxt.xml']\n", 357 | "\n", 358 | "for file in twofiles:\n", 359 | " testparse = etree.iterparse(file, events=('end',), tag = 'locSubject')\n", 360 | " testlist = []\n", 361 | " for event, elem in testparse:\n", 362 | " testlist.append(elem.text)\n", 363 | " # reclaim the memory at the end of each loop -\n", 364 | " # clears unneeded node references\n", 365 | " elem.clear()\n", 366 | " while elem.getprevious() is not None:\n", 367 | " del elem.getparent()[0]\n", 368 | " \n", 369 | " teststring = ', '.join(testlist) \n", 370 | " # print(teststring)\n", 371 | " \n", 372 | " if any(lcsh in teststring for lcsh in termlist):\n", 373 | " # print('yes')\n", 374 | " \n", 375 | " # if that is true, we want the metadata\n", 376 | " # so let's make a dict to hold it\n", 377 | " # this dict will be reset with every file loop\n", 378 | " filedict = {}\n", 379 | " \n", 380 | " for xmltag in taglist:\n", 381 | " # make an empty list to hold what is in each tag, \n", 382 | " # which will be written to our dict in a few steps\n", 383 | " elementlist = []\n", 384 | " \n", 385 | " context = etree.iterparse(file, events=('end',), tag=xmltag)\n", 386 | " for event, elem in context:\n", 387 | " elementlist.append(elem.text)\n", 388 | " \n", 389 | " # the below should make things faster - I think? 
\n", 390 | " # reclaim the memory at the end of each loop -\n", 391 | " # clears unneeded node references\n", 392 | " elem.clear()\n", 393 | " while elem.getprevious() is not None:\n", 394 | " del elem.getparent()[0]\n", 395 | " # assign to the dictionary\n", 396 | " filedict[xmltag] = ', '.join(elementlist)\n", 397 | " \n", 398 | " # after it has looped through all the xmltags, \n", 399 | " # add the filedict to list of dicts \n", 400 | " listofdicts.append(filedict)\n", 401 | " #else:\n", 402 | " # print('no')" 403 | ] 404 | }, 405 | { 406 | "cell_type": "code", 407 | "execution_count": 7, 408 | "metadata": {}, 409 | "outputs": [ 410 | { 411 | "name": "stdout", 412 | "output_type": "stream", 413 | "text": [ 414 | "7.18 s ± 751 ms per loop (mean ± std. dev. of 10 runs, 1 loop each)\n" 415 | ] 416 | } 417 | ], 418 | "source": [ 419 | "%%timeit -r 10\n", 420 | "\n", 421 | "# this list will hold one dict per relevant file;\n", 422 | "# each dict holds that file's metadata\n", 423 | "listofdicts = []\n", 424 | "\n", 425 | "travellist = ['files/sampleECCOtextTravel.xml']\n", 426 | "\n", 427 | "for file in travellist:\n", 428 | " testparse = etree.iterparse(file, events=('end',), tag = 'locSubject')\n", 429 | " testlist = []\n", 430 | " for event, elem in testparse:\n", 431 | " testlist.append(elem.text)\n", 432 | " # reclaim the memory at the end of each loop -\n", 433 | " # clears unneeded node references\n", 434 | " elem.clear()\n", 435 | " while elem.getprevious() is not None:\n", 436 | " del elem.getparent()[0]\n", 437 | " \n", 438 | " teststring = ', '.join(testlist) \n", 439 | " # print(teststring)\n", 440 | " \n", 441 | " if any(lcsh in teststring for lcsh in termlist):\n", 442 | " # print('yes')\n", 443 | " \n", 444 | " # if that is true, we want the metadata\n", 445 | " # so let's make a dict to hold it\n", 446 | " # this dict will be reset with every file loop\n", 447 | " filedict = {}\n", 448 | " \n", 449 | " for xmltag in taglist:\n", 450 | " # make an 
empty list to hold what is in each tag, \n", 451 | " # which will be written to our dict in a few steps\n", 452 | " elementlist = []\n", 453 | " \n", 454 | " context = etree.iterparse(file, events=('end',), tag=xmltag)\n", 455 | " for event, elem in context:\n", 456 | " elementlist.append(elem.text)\n", 457 | " \n", 458 | " # the below should make things faster - I think? \n", 459 | " # reclaim the memory at the end of each loop -\n", 460 | " # clears unneeded node references\n", 461 | " elem.clear()\n", 462 | " while elem.getprevious() is not None:\n", 463 | " del elem.getparent()[0]\n", 464 | " # assign to the dictionary\n", 465 | " filedict[xmltag] = ', '.join(elementlist)\n", 466 | " \n", 467 | " # after it has looped through all the xmltags, \n", 468 | " # add the filedict to list of dicts \n", 469 | " listofdicts.append(filedict)\n", 470 | " #else:\n", 471 | " # print('no')\n", 472 | " " 473 | ] 474 | }, 475 | { 476 | "cell_type": "code", 477 | "execution_count": 8, 478 | "metadata": {}, 479 | "outputs": [ 480 | { 481 | "name": "stdout", 482 | "output_type": "stream", 483 | "text": [ 484 | "2.78 s ± 334 ms per loop (mean ± std. dev. 
of 10 runs, 1 loop each)\n" 485 | ] 486 | } 487 | ], 488 | "source": [ 489 | "%%timeit -r 10\n", 490 | "\n", 491 | "# this list will hold one dict per relevant file;\n", 492 | "# each dict holds that file's metadata\n", 493 | "listofdicts = []\n", 494 | "\n", 495 | "nontravellist = ['files/sampleECCOtxt.xml']\n", 496 | "\n", 497 | "for file in nontravellist:\n", 498 | " testparse = etree.iterparse(file, events=('end',), tag = 'locSubject')\n", 499 | " testlist = []\n", 500 | " for event, elem in testparse:\n", 501 | " testlist.append(elem.text)\n", 502 | " # reclaim the memory at the end of each loop -\n", 503 | " # clears unneeded node references\n", 504 | " elem.clear()\n", 505 | " while elem.getprevious() is not None:\n", 506 | " del elem.getparent()[0]\n", 507 | " \n", 508 | " teststring = ', '.join(testlist) \n", 509 | " # print(teststring)\n", 510 | " \n", 511 | " if any(lcsh in teststring for lcsh in termlist):\n", 512 | " #print('yes')\n", 513 | " \n", 514 | " # if that is true, we want the metadata\n", 515 | " # so let's make a dict to hold it\n", 516 | " # this dict will be reset with every file loop\n", 517 | " filedict = {}\n", 518 | " \n", 519 | " for xmltag in taglist:\n", 520 | " # make an empty list to hold what is in each tag, \n", 521 | " # which will be written to our dict in a few steps\n", 522 | " elementlist = []\n", 523 | " \n", 524 | " context = etree.iterparse(file, events=('end',), tag=xmltag)\n", 525 | " for event, elem in context:\n", 526 | " elementlist.append(elem.text)\n", 527 | " \n", 528 | " # the below should make things faster - I think? 
\n", 529 | " # reclaim the memory at the end of each loop -\n", 530 | " # clears unneeded node references\n", 531 | " elem.clear()\n", 532 | " while elem.getprevious() is not None:\n", 533 | " del elem.getparent()[0]\n", 534 | " # assign to the dictionary\n", 535 | " filedict[xmltag] = ', '.join(elementlist)\n", 536 | " \n", 537 | " # after it has looped through all the xmltags, \n", 538 | " # add the filedict to list of dicts \n", 539 | " listofdicts.append(filedict)\n", 540 | " #else:\n", 541 | " #print('no')\n", 542 | " " 543 | ] 544 | }, 545 | { 546 | "cell_type": "code", 547 | "execution_count": 18, 548 | "metadata": {}, 549 | "outputs": [ 550 | { 551 | "data": { 552 | "text/plain": [ 553 | "[]" 554 | ] 555 | }, 556 | "execution_count": 18, 557 | "metadata": {}, 558 | "output_type": "execute_result" 559 | } 560 | ], 561 | "source": [ 562 | "listofdicts" 563 | ] 564 | }, 565 | { 566 | "cell_type": "code", 567 | "execution_count": 19, 568 | "metadata": { 569 | "collapsed": true 570 | }, 571 | "outputs": [], 572 | "source": [ 573 | "dictsdf = pd.DataFrame(listofdicts)" 574 | ] 575 | }, 576 | { 577 | "cell_type": "code", 578 | "execution_count": 20, 579 | "metadata": {}, 580 | "outputs": [ 581 | { 582 | "data": { 583 | "text/html": [ 584 | "
" 641 | ], 642 | "text/plain": [ 643 | " ESTCID currentVolume displayTitle \\\n", 644 | "0 T110070 0 The travels of the learned Father Montfaucon f... \n", 645 | "\n", 646 | " documentID fullTitle imprintCity \\\n", 647 | "0 0084800600 The travels of the learned Father Montfaucon f... London \n", 648 | "\n", 649 | " imprintFull \\\n", 650 | "0 London : printed by D. L. for E. Curll at the ... \n", 651 | "\n", 652 | " imprintPublisher language \\\n", 653 | "0 printed by D. L. for E. Curll at the Dial and ... English \n", 654 | "\n", 655 | " locSubject module \\\n", 656 | "0 Italy, Description and travel, Early works to ... History and Geography \n", 657 | "\n", 658 | " notes pubDate \\\n", 659 | "0 With an index. In this issue the imprint date... 17120101 \n", 660 | "\n", 661 | " publicationPlace totalVolumes \n", 662 | "0 London 0 " 663 | ] 664 | }, 665 | "execution_count": 20, 666 | "metadata": {}, 667 | "output_type": "execute_result" 668 | } 669 | ], 670 | "source": [ 671 | "dictsdf" 672 | ] 673 | }, 674 | { 675 | "cell_type": "markdown", 676 | "metadata": {}, 677 | "source": [ 678 | "## Applying it to all the files\n", 679 | "\n", 680 | "Now that I have this code working, I should be able to apply it to all the other files.\n", 681 | "\n", 682 | "I've asked an astute colleague (thanks Jonathan!) to look over the above, in case there is some error that would cause something to cascade and run forever (or, just run really slow). That's most likely to happen because of my unfamiliarity with lxml. Pandas, I think, should be able to handle what I'm asking of it.\n", 683 | "\n", 684 | "So now, step one of this new task: get a file list. \n", 685 | "\n", 686 | "We'll start with the HistAndGeo section of ECCOII, since it has only 3385 files as opposed to the ~14000 files of ECCOI (ECCO was released in two segments). 
I'm also fairly hopeful that there should be at least one or two pieces of travel writing in there, since it is the ECCO genre most closely related to travel writing." 687 | ] 688 | }, 689 | { 690 | "cell_type": "code", 691 | "execution_count": 21, 692 | "metadata": { 693 | "collapsed": true 694 | }, 695 | "outputs": [], 696 | "source": [ 697 | "filelist = glob.glob('D:/ECCOII 2001/HistAndGeo/XML/*.xml')" 698 | ] 699 | }, 700 | { 701 | "cell_type": "code", 702 | "execution_count": 22, 703 | "metadata": {}, 704 | "outputs": [ 705 | { 706 | "data": { 707 | "text/plain": [ 708 | "3384" 709 | ] 710 | }, 711 | "execution_count": 22, 712 | "metadata": {}, 713 | "output_type": "execute_result" 714 | } 715 | ], 716 | "source": [ 717 | "len(filelist)" 718 | ] 719 | }, 720 | { 721 | "cell_type": "code", 722 | "execution_count": 26, 723 | "metadata": {}, 724 | "outputs": [ 725 | { 726 | "data": { 727 | "text/plain": [ 728 | "['D:/ECCOII 2001/HistAndGeo/XML\\\\1299100101.xml',\n", 729 | " 'D:/ECCOII 2001/HistAndGeo/XML\\\\1299100102.xml',\n", 730 | " 'D:/ECCOII 2001/HistAndGeo/XML\\\\1299100103.xml',\n", 731 | " 'D:/ECCOII 2001/HistAndGeo/XML\\\\1299100200.xml',\n", 732 | " 'D:/ECCOII 2001/HistAndGeo/XML\\\\1299100301.xml']" 733 | ] 734 | }, 735 | "execution_count": 26, 736 | "metadata": {}, 737 | "output_type": "execute_result" 738 | } 739 | ], 740 | "source": [ 741 | "shortfilelist = filelist[:100]\n", 742 | "shortfilelist[:5]" 743 | ] 744 | }, 745 | { 746 | "cell_type": "markdown", 747 | "metadata": {}, 748 | "source": [ 749 | "Okay, now I need to replicate the code up above so that it will work on this larger batch.\n", 750 | "\n", 751 | "I'm thinking, as well, of things that I will have to watch for: where will these files overlap with the ones in mybib already? I will have to be careful when integrating and comparing my various points of data, but the volume information and the file id numbering system (ie, having 0s vs 1/2/etc on the end of the filename) should help!" 
752 | ] 753 | }, 754 | { 755 | "cell_type": "markdown", 756 | "metadata": {}, 757 | "source": [ 758 | "So, let's modify the code that I used earlier.\n", 759 | "\n", 760 | "In particular, I want to add a line that will note which ECCO segment it came from - ECCOI or ECCOII." 761 | ] 762 | }, 763 | { 764 | "cell_type": "code", 765 | "execution_count": 27, 766 | "metadata": { 767 | "collapsed": true 768 | }, 769 | "outputs": [], 770 | "source": [ 771 | "# this list will hold one dict per relevant file;\n", 772 | "# each dict holds that file's metadata\n", 773 | "listofdicts = []\n", 774 | "\n", 775 | "for file in shortfilelist:\n", 776 | " \n", 777 | " # the first iterparse will test to see if it has the desired lcsh.\n", 778 | " testparse = etree.iterparse(file, events=('end',), tag = 'locSubject')\n", 779 | " testlist = []\n", 780 | " for event, elem in testparse:\n", 781 | " testlist.append(elem.text)\n", 782 | " \n", 783 | " # the below will reclaim the memory at the end of each loop -\n", 784 | " # clears unneeded node references\n", 785 | " elem.clear()\n", 786 | " while elem.getprevious() is not None:\n", 787 | " del elem.getparent()[0]\n", 788 | " \n", 789 | " # and back to the purpose of our code - \n", 790 | " # note that putting it in a string makes it easier to search\n", 791 | " # comparing list items required an exact match,\n", 792 | " # and I wanted fuzzier searching, \n", 793 | " # just in case there were any errors in the controlled vocabulary\n", 794 | " # of the lcsh.\n", 795 | " teststring = ', '.join(testlist) \n", 796 | " if any(lcsh in teststring for lcsh in termlist):\n", 797 | " \n", 798 | " # if that is true, we want the metadata for that file\n", 799 | " # so let's make a dict to hold it\n", 800 | " # this dict will be reset with every file loop\n", 801 | " filedict = {}\n", 802 | " \n", 803 | " # a dict entry to indicate which ECCO release it came from\n", 804 | " filedict['eccorelease'] = '2'\n", 805 | " \n", 806 | " for xmltag in 
taglist:\n", 807 | " # make an empty list to hold what is in each tag, \n", 808 | " # which will be written to our dict in a few steps\n", 809 | " elementlist = []\n", 810 | " \n", 811 | " context = etree.iterparse(file, events=('end',), tag=xmltag)\n", 812 | " for event, elem in context:\n", 813 | " elementlist.append(elem.text)\n", 814 | " \n", 815 | " # the below should make things faster - I think? \n", 816 | " # reclaim the memory at the end of each loop -\n", 817 | " # clears unneeded node references\n", 818 | " elem.clear()\n", 819 | " while elem.getprevious() is not None:\n", 820 | " del elem.getparent()[0]\n", 821 | " # assign to the dictionary\n", 822 | " filedict[xmltag] = ', '.join(elementlist)\n", 823 | " \n", 824 | " # after it has looped through all the xmltags, \n", 825 | " # add the filedict to list of dicts \n", 826 | " listofdicts.append(filedict) " 827 | ] 828 | }, 829 | { 830 | "cell_type": "code", 831 | "execution_count": 28, 832 | "metadata": { 833 | "collapsed": true 834 | }, 835 | "outputs": [], 836 | "source": [ 837 | "dictsdf = pd.DataFrame(listofdicts)" 838 | ] 839 | }, 840 | { 841 | "cell_type": "code", 842 | "execution_count": 29, 843 | "metadata": {}, 844 | "outputs": [ 845 | { 846 | "data": { 847 | "text/html": [ 848 | "
| | ESTCID | currentVolume | displayTitle | documentID | fullTitle | imprintCity | imprintFull | imprintPublisher | language | locSubject | module | notes | pubDate | publicationPlace | totalVolumes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | N025387 | Volume 16 | The World displayed; Or, A curious collection ... | 1309700116 | The World displayed; Or, A curious collection ... | London | London : Printed for J. Newbery, at the Bible ... | Printed for J. Newbery, at the Bible and Sun, ... | English | Voyages and travels | History and Geography | | 17670101 | London | 1 |
| 1 | N031356 | 0 | An essay towards a description of the city of ... | 1309700600 | An essay towards a description of the city of ... | [Bath] | [Bath] : Printed for W. Frederick, bookseller,... | Printed for W. Frederick, bookseller, in Bath | English | Bath (England), Description and travel, Early ... | History and Geography | With an additional titlepage for pt. 1: 'An es... | 17420101 | Bath | 0 |
| 2 | N029227 | 0 | Interesting account of the early voyages, made... | 1309900200 | Interesting account of the early voyages, made... | London | London : Printed for the proprietors, and sold... | Printed for the proprietors, and sold at Stalk... | English | Explorers, Portugal, Early works to 1800, Expl... | History and Geography | | 17900101 | London | 0 |
| 3 | N031489 | Volume 1 | An entertaining journey to the Netherlands; co... | 1309900301 | An entertaining journey to the Netherlands; co... | London | London : printed for W. Smith, M DCC LXXXII. [... | printed for W. Smith | English | Netherlands, Description and travel, Early wor... | History and Geography | Coriat Junior = Samuel Paterson?. | 17820101 | London | 3 |
| 4 | N031489 | Volume 2 | An entertaining journey to the Netherlands; co... | 1309900302 | An entertaining journey to the Netherlands; co... | London | London : printed for W. Smith, M DCC LXXXII. [... | printed for W. Smith | English | Netherlands, Description and travel, Early wor... | History and Geography | Coriat Junior = Samuel Paterson?. | 17820101 | London | 3 |
| 5 | N031489 | Volume 3 | An entertaining journey to the Netherlands; co... | 1309900303 | An entertaining journey to the Netherlands; co... | London | London : printed for W. Smith, M DCC LXXXII. [... | printed for W. Smith | English | Netherlands, Description and travel, Early wor... | History and Geography | Coriat Junior = Samuel Paterson?. | 17820101 | London | 3 |
| 6 | T188570 | 0 | A New collection of voyages and travels, dedic... | 1309901100 | A New collection of voyages and travels, dedic... | London | London : Printed for E. Newbery, the corner of... | Printed for E. Newbery, the corner of St. Paul... | English | Voyages and travels, Early works to 1800 | History and Geography | William Mavor is the editor of the 'Historical... | 17960101 | London | 0 |
| 7 | T172346 | 0 | Miscellaneous remarks made on the spot, in a l... | 1310300400 | Miscellaneous remarks made on the spot, in a l... | London | London : Printed for S. Hooper, at Gay's Head,... | Printed for S. Hooper, at Gay's Head, near Bea... | English | Italy, Description and travel, Early works to ... | History and Geography | | 17560101 | London | 0 |
\n", 1030 | "
" 1031 | ], 1032 | "text/plain": [ 1033 | " ESTCID currentVolume displayTitle \\\n", 1034 | "0 N025387 Volume 16 The World displayed; Or, A curious collection ... \n", 1035 | "1 N031356 0 An essay towards a description of the city of ... \n", 1036 | "2 N029227 0 Interesting account of the early voyages, made... \n", 1037 | "3 N031489 Volume 1 An entertaining journey to the Netherlands; co... \n", 1038 | "4 N031489 Volume 2 An entertaining journey to the Netherlands; co... \n", 1039 | "5 N031489 Volume 3 An entertaining journey to the Netherlands; co... \n", 1040 | "6 T188570 0 A New collection of voyages and travels, dedic... \n", 1041 | "7 T172346 0 Miscellaneous remarks made on the spot, in a l... \n", 1042 | "\n", 1043 | " documentID fullTitle imprintCity \\\n", 1044 | "0 1309700116 The World displayed; Or, A curious collection ... London \n", 1045 | "1 1309700600 An essay towards a description of the city of ... [Bath] \n", 1046 | "2 1309900200 Interesting account of the early voyages, made... London \n", 1047 | "3 1309900301 An entertaining journey to the Netherlands; co... London \n", 1048 | "4 1309900302 An entertaining journey to the Netherlands; co... London \n", 1049 | "5 1309900303 An entertaining journey to the Netherlands; co... London \n", 1050 | "6 1309901100 A New collection of voyages and travels, dedic... London \n", 1051 | "7 1310300400 Miscellaneous remarks made on the spot, in a l... London \n", 1052 | "\n", 1053 | " imprintFull \\\n", 1054 | "0 London : Printed for J. Newbery, at the Bible ... \n", 1055 | "1 [Bath] : Printed for W. Frederick, bookseller,... \n", 1056 | "2 London : Printed for the proprietors, and sold... \n", 1057 | "3 London : printed for W. Smith, M DCC LXXXII. [... \n", 1058 | "4 London : printed for W. Smith, M DCC LXXXII. [... \n", 1059 | "5 London : printed for W. Smith, M DCC LXXXII. [... \n", 1060 | "6 London : Printed for E. Newbery, the corner of... \n", 1061 | "7 London : Printed for S. Hooper, at Gay's Head,... 
\n", 1062 | "\n", 1063 | " imprintPublisher language \\\n", 1064 | "0 Printed for J. Newbery, at the Bible and Sun, ... English \n", 1065 | "1 Printed for W. Frederick, bookseller, in Bath English \n", 1066 | "2 Printed for the proprietors, and sold at Stalk... English \n", 1067 | "3 printed for W. Smith English \n", 1068 | "4 printed for W. Smith English \n", 1069 | "5 printed for W. Smith English \n", 1070 | "6 Printed for E. Newbery, the corner of St. Paul... English \n", 1071 | "7 Printed for S. Hooper, at Gay's Head, near Bea... English \n", 1072 | "\n", 1073 | " locSubject module \\\n", 1074 | "0 Voyages and travels History and Geography \n", 1075 | "1 Bath (England), Description and travel, Early ... History and Geography \n", 1076 | "2 Explorers, Portugal, Early works to 1800, Expl... History and Geography \n", 1077 | "3 Netherlands, Description and travel, Early wor... History and Geography \n", 1078 | "4 Netherlands, Description and travel, Early wor... History and Geography \n", 1079 | "5 Netherlands, Description and travel, Early wor... History and Geography \n", 1080 | "6 Voyages and travels, Early works to 1800 History and Geography \n", 1081 | "7 Italy, Description and travel, Early works to ... History and Geography \n", 1082 | "\n", 1083 | " notes pubDate \\\n", 1084 | "0 17670101 \n", 1085 | "1 With an additional titlepage for pt. 1: 'An es... 17420101 \n", 1086 | "2 17900101 \n", 1087 | "3 Coriat Junior = Samuel Paterson?. 17820101 \n", 1088 | "4 Coriat Junior = Samuel Paterson?. 17820101 \n", 1089 | "5 Coriat Junior = Samuel Paterson?. 17820101 \n", 1090 | "6 William Mavor is the editor of the 'Historical... 
17960101 \n",
1091 | "7 17560101 \n",
1092 | "\n",
1093 | " publicationPlace totalVolumes \n",
1094 | "0 London 1 \n",
1095 | "1 Bath 0 \n",
1096 | "2 London 0 \n",
1097 | "3 London 3 \n",
1098 | "4 London 3 \n",
1099 | "5 London 3 \n",
1100 | "6 London 0 \n",
1101 | "7 London 0 "
1102 | ]
1103 | },
1104 | "execution_count": 29,
1105 | "metadata": {},
1106 | "output_type": "execute_result"
1107 | }
1108 | ],
1109 | "source": [
1110 | "dictsdf"
1111 | ]
1112 | },
1113 | {
1114 | "cell_type": "markdown",
1115 | "metadata": {},
1116 | "source": [
1117 | "Hurrah, it worked! Now to replicate the code and run it on the multiple sections of ECCO II."
1118 | ]
1119 | },
1120 | {
1121 | "cell_type": "markdown",
1122 | "metadata": {},
1123 | "source": [
1124 | "# ECCO Part 2\n",
1125 | "Let's run our analysis on ECCO pt 2 (there are fewer files here than in ECCO pt 1!)"
1126 | ]
1127 | },
1128 | {
1129 | "cell_type": "code",
1130 | "execution_count": 1,
1131 | "metadata": {
1132 | "collapsed": true
1133 | },
1134 | "outputs": [],
1135 | "source": [
1136 | "from lxml import etree\n",
1137 | "import pandas as pd\n",
1138 | "import glob"
1139 | ]
1140 | },
1141 | {
1142 | "cell_type": "markdown",
1143 | "metadata": {},
1144 | "source": [
1145 | "Because many of the files are "
1146 | ]
1147 | },
1148 | {
1149 | "cell_type": "code",
1150 | "execution_count": 52,
1151 | "metadata": {
1152 | "collapsed": true
1153 | },
1154 | "outputs": [],
1155 | "source": [
1156 | "import os\n",
1157 | "filelist = []\n",
1158 | "for root, dirs, files in os.walk(mypath):\n",
1159 | "    for file in files:\n",
1160 | "        if file.endswith(\".xml\"):\n",
1161 | "            filelist.append(os.path.join(root, file))"
1162 | ]
1163 | },
1164 | {
1165 | "cell_type": "code",
1166 | "execution_count": 53,
1167 | "metadata": {},
1168 | "outputs": [
1169 | {
1170 | "data": {
1171 | "text/plain": [
1172 | "52690"
1173 | ]
1174 | },
1175 | "execution_count": 53,
1176 | "metadata": {},
1177 | "output_type": "execute_result"
1178 | } 1179 | ], 1180 | "source": [ 1181 | "len(filelist)" 1182 | ] 1183 | }, 1184 | { 1185 | "cell_type": "code", 1186 | "execution_count": 65, 1187 | "metadata": { 1188 | "scrolled": true 1189 | }, 1190 | "outputs": [ 1191 | { 1192 | "data": { 1193 | "text/plain": [ 1194 | "['D:/ECCOII 2001/GenRef\\\\XML\\\\1336600100.xml',\n", 1195 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336600200.xml',\n", 1196 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336600300.xml',\n", 1197 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336600400.xml',\n", 1198 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336600500.xml',\n", 1199 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336600600.xml',\n", 1200 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336600700.xml',\n", 1201 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336600800.xml',\n", 1202 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336600900.xml',\n", 1203 | " 'D:/ECCOII 2001/GenRef\\\\XML\\\\1336601000.xml']" 1204 | ] 1205 | }, 1206 | "execution_count": 65, 1207 | "metadata": {}, 1208 | "output_type": "execute_result" 1209 | } 1210 | ], 1211 | "source": [ 1212 | "filelist[:10]" 1213 | ] 1214 | }, 1215 | { 1216 | "cell_type": "markdown", 1217 | "metadata": {}, 1218 | "source": [ 1219 | "Looks good - there shouldn't be any subfolders or word docs, for example. \n", 1220 | "\n", 1221 | "I am...a little hesitant to run something on such a large number of files. What if it breaks, partway through? What if my laptop panics? I think, however, this is a good opportunity to take a break, make some hot chocolate, watch the snow fall, and let my code run (and then, maybe, return to writing?!)" 1222 | ] 1223 | }, 1224 | { 1225 | "cell_type": "markdown", 1226 | "metadata": {}, 1227 | "source": [ 1228 | "### UH OH.\n", 1229 | "\n", 1230 | "*revised plan* Okay, so I started it running, and then let it be. And, of course, something didn't work and it threw a \"file not found\" error. 
:( The weird thing is that it didn't happen on the `testparse = etree` line like I would have expected; instead, it happened on the first `for event, elem in testparse: testlist.append(elem.text)`.\n",
1231 | "\n",
1232 | "So, I'm going to return to doing it in batches; at least then I will know whether it is happening in a certain folder. Grumble grumble.\n",
1233 | "\n",
1234 | "(side note: I am a little concerned about what happens if the capitalization is different somewhere. But I think I can just run a search for `locsubject` (without the capital S) to catch any differences...)\n",
1235 | "\n",
1236 | "In order to do a longer test, I'll use the `ECCOII HistAndGeo` subsection as an experiment - there were about 3300 files in there.\n",
1237 | "\n",
1238 | "_side note_: My laptop went to sleep just after the 2500 mark, so I had to start it back up again, whoops. There are no duplicates in the total of `312` relevant files below, though I will have to filter out anything that wasn't printed in Great Britain - there are some texts printed in New York, Dublin, etc., in there."
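One way to make a long run like this more forgiving is sketched below. It is only a sketch under stated assumptions, not the code used in this notebook: the function names `extract_metadata` and `run_batch` are made up for illustration, it uses the standard library's `xml.etree.ElementTree` rather than lxml so that it runs without extra installs (with lxml the exception types to catch may differ), it reads each file in a single pass, and it compares tag names case-insensitively so that `locsubject` and `locSubject` are treated alike. A file that is missing or malformed is recorded and skipped rather than stopping the whole run, and `start` lets a run pick up again partway through the list.

```python
import xml.etree.ElementTree as ET


def extract_metadata(path, tags, terms, subject_tag="locSubject"):
    """Read one file in a single pass.

    Returns a dict of tag -> joined text if any term appears in the
    subject headings, otherwise None.
    """
    subject_key = subject_tag.lower()
    wanted = {tag.lower() for tag in tags} | {subject_key}
    found = {}
    for _event, elem in ET.iterparse(path, events=("end",)):
        key = elem.tag.lower()  # so 'locsubject' and 'locSubject' both match
        if key in wanted and elem.text:
            found.setdefault(key, []).append(elem.text)
        elem.clear()  # drop the element's contents once they have been read
    subjects = ", ".join(found.get(subject_key, []))
    if not any(term in subjects for term in terms):
        return None
    return {tag: ", ".join(found.get(tag.lower(), [])) for tag in tags}


def run_batch(paths, tags, terms, start=0):
    """Process paths[start:], recording failures instead of stopping on them."""
    results, failures = [], []
    for index, path in enumerate(paths[start:], start=start):
        try:
            record = extract_metadata(path, tags, terms)
        except (OSError, ET.ParseError) as err:
            # missing/unreadable file or malformed XML: note it and move on
            failures.append((index, path, repr(err)))
            continue
        if record is not None:
            results.append(record)
    return results, failures
```

With the notebook's own names this would be called as `run_batch(filelistHistAndGeo, taglist, termlist)`, and after an interruption as `run_batch(filelistHistAndGeo, taglist, termlist, start=2500)`. The returned list of dicts can be passed to `pd.DataFrame` as before, and the `failures` list shows which files (and so which folders) caused trouble.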
1239 | ]
1240 | },
1241 | {
1242 | "cell_type": "code",
1243 | "execution_count": 2,
1244 | "metadata": {
1245 | "collapsed": true
1246 | },
1247 | "outputs": [],
1248 | "source": [
1249 | "filelistHistAndGeo = glob.glob('D:/ECCOII 2001/HistAndGeo/XML/*.xml')"
1250 | ]
1251 | },
1252 | {
1253 | "cell_type": "code",
1254 | "execution_count": 19,
1255 | "metadata": {},
1256 | "outputs": [
1257 | {
1258 | "name": "stdout",
1259 | "output_type": "stream",
1260 | "text": [
1261 | "3000\n"
1262 | ]
1263 | }
1264 | ],
1265 | "source": [
1266 | "termlist = ['travel', 'discov', 'explor', 'voyage', 'guide', 'antiquit']\n",
1267 | "taglist = ['documentID', 'ESTCID', 'pubDate',\n",
1268 | "           'language','module','locSubject','notes',\n",
1269 | "           'fullTitle','displayTitle','currentVolume',\n",
1270 | "           'totalVolumes', 'imprintPublisher','imprintFull',\n",
1271 | "           'imprintCity', 'publicationPlace']\n",
1272 | "\n",
1273 | "# this list will hold one dict per relevant file;\n",
1274 | "# each dict holds the metadata for that file\n",
1275 | "# (left commented out so that resuming at file 2500 keeps the earlier results)\n",
1276 | "# listofdicts = []\n",
1277 | "\n",
1278 | "# and, a count so that I can track my progress\n",
1279 | "count = 2500\n",
1280 | "\n",
1281 | "for file in filelistHistAndGeo[2500:]:\n",
1282 | "    count += 1\n",
1283 | "    if (count % 500) == 0:\n",
1284 | "        print(count)\n",
1285 | "\n",
1286 | "\n",
1287 | "    # the first iterparse will test to see if it has the desired lcsh.\n",
1288 | "    testparse = etree.iterparse(file, events=('end',), tag='locSubject')\n",
1289 | "    testlist = []\n",
1290 | "    for event, elem in testparse:\n",
1291 | "        testlist.append(elem.text)\n",
1292 | "\n",
1293 | "        # the below will reclaim the memory at the end of each loop -\n",
1294 | "        # clears unneeded node references\n",
1295 | "        elem.clear()\n",
1296 | "        while elem.getprevious() is not None:\n",
1297 | "            del elem.getparent()[0]\n",
1298 | "\n",
1299 | "    # and back to the purpose of our code -\n",
1300 | "    # note that putting it in a string makes it easier to search;\n",
1301 | "    # comparing list items required an exact match,\n",
1302 | "    # and I wanted fuzzier searching,\n",
1303 | "    # just in case there were any errors in the controlled vocabulary\n",
1304 | "    # of the lcsh.\n",
1305 | "    teststring = ', '.join(testlist)\n",
1306 | "    if any(lcsh in teststring for lcsh in termlist):\n",
1307 | "\n",
1308 | "        # if that is true, we want the metadata for that file\n",
1309 | "        # so let's make a dict to hold it\n",
1310 | "        # this dict will be reset with every file loop\n",
1311 | "        filedict = {}\n",
1312 | "\n",
1313 | "        # a dict entry to indicate which ECCO release it came from\n",
1314 | "        filedict['eccorelease'] = '2'\n",
1315 | "\n",
1316 | "        for xmltag in taglist:\n",
1317 | "            # make an empty list to hold what is in each tag,\n",
1318 | "            # which will be written to our dict in a few steps\n",
1319 | "            elementlist = []\n",
1320 | "\n",
1321 | "            context = etree.iterparse(file, events=('end',), tag=xmltag)\n",
1322 | "            for event, elem in context:\n",
1323 | "                elementlist.append(elem.text)\n",
1324 | "\n",
1325 | "                # the below should make things faster - I think?\n",
1326 | "                # reclaim the memory at the end of each loop -\n",
1327 | "                # clears unneeded node references\n",
1328 | "                elem.clear()\n",
1329 | "                while elem.getprevious() is not None:\n",
1330 | "                    del elem.getparent()[0]\n",
1331 | "            # assign to the dictionary\n",
1332 | "            filedict[xmltag] = ', '.join(elementlist)\n",
1333 | "\n",
1334 | "        # after it has looped through all the xmltags,\n",
1335 | "        # add the filedict to list of dicts\n",
1336 | "        listofdicts.append(filedict)\n",
1337 | "\n"
1338 | ]
1339 | },
1340 | {
1341 | "cell_type": "code",
1342 | "execution_count": 20,
1343 | "metadata": {},
1344 | "outputs": [
1345 | {
1346 | "data": {
1347 | "text/plain": [
1348 | "312"
1349 | ]
1350 | },
1351 | "execution_count": 20,
1352 | "metadata": {},
1353 | "output_type": "execute_result"
1354 | }
1355 | ],
1356 | "source": [
1357 | "len(listofdicts)"
1358 | ]
1359 | },
1360 | {
1361 | "cell_type": "code",
1362 | "execution_count": 21,
1363 | "metadata": {},
1364 | "outputs": [
1365 | {
1366 | "data": {
1367 | "text/html": [
1368 | "
| | ESTCID | currentVolume | displayTitle | documentID | eccorelease | fullTitle | imprintCity | imprintFull | imprintPublisher | language | locSubject | module | notes | pubDate | publicationPlace | totalVolumes |
| 0 | N025387 | Volume 16 | The World displayed; Or, A curious collection ... | 1309700116 | 2 | The World displayed; Or, A curious collection ... | London | London : Printed for J. Newbery, at the Bible ... | Printed for J. Newbery, at the Bible and Sun, ... | English | Voyages and travels | History and Geography | | 17670101 | London | 1 |
| 1 | N031356 | 0 | An essay towards a description of the city of ... | 1309700600 | 2 | An essay towards a description of the city of ... | [Bath] | [Bath] : Printed for W. Frederick, bookseller,... | Printed for W. Frederick, bookseller, in Bath | English | Bath (England), Description and travel, Early ... | History and Geography | With an additional titlepage for pt. 1: 'An es... | 17420101 | Bath | 0 |
| 2 | N029227 | 0 | Interesting account of the early voyages, made... | 1309900200 | 2 | Interesting account of the early voyages, made... | London | London : Printed for the proprietors, and sold... | Printed for the proprietors, and sold at Stalk... | English | Explorers, Portugal, Early works to 1800, Expl... | History and Geography | | 17900101 | London | 0 |
| 3 | N031489 | Volume 1 | An entertaining journey to the Netherlands; co... | 1309900301 | 2 | An entertaining journey to the Netherlands; co... | London | London : printed for W. Smith, M DCC LXXXII. [... | printed for W. Smith | English | Netherlands, Description and travel, Early wor... | History and Geography | Coriat Junior = Samuel Paterson?. | 17820101 | London | 3 |
| 4 | N031489 | Volume 2 | An entertaining journey to the Netherlands; co... | 1309900302 | 2 | An entertaining journey to the Netherlands; co... | London | London : printed for W. Smith, M DCC LXXXII. [... | printed for W. Smith | English | Netherlands, Description and travel, Early wor... | History and Geography | Coriat Junior = Samuel Paterson?. | 17820101 | London | 3 |
| 5 | N031489 | Volume 3 | An entertaining journey to the Netherlands; co... | 1309900303 | 2 | An entertaining journey to the Netherlands; co... | London | London : printed for W. Smith, M DCC LXXXII. [... | printed for W. Smith | English | Netherlands, Description and travel, Early wor... | History and Geography | Coriat Junior = Samuel Paterson?. | 17820101 | London | 3 |
| 6 | T188570 | 0 | A New collection of voyages and travels, dedic... | 1309901100 | 2 | A New collection of voyages and travels, dedic... | London | London : Printed for E. Newbery, the corner of... | Printed for E. Newbery, the corner of St. Paul... | English | Voyages and travels, Early works to 1800 | History and Geography | William Mavor is the editor of the 'Historical... | 17960101 | London | 0 |
| 7 | T172346 | 0 | Miscellaneous remarks made on the spot, in a l... | 1310300400 | 2 | Miscellaneous remarks made on the spot, in a l... | London | London : Printed for S. Hooper, at Gay's Head,... | Printed for S. Hooper, at Gay's Head, near Bea... | English | Italy, Description and travel, Early works to ... | History and Geography | | 17560101 | London | 0 |
| 8 | T220401 | Volume 1 | The memoirs of Charles-Lewis, Baron de Pollnit... | 1313200301 | 2 | The memoirs of Charles-Lewis, Baron de Pollnit... | London | London : Printed for Daniel Browne, at the Bla... | Printed for Daniel Browne, at the Black Swan, ... | English | Europe, Description and travel | History and Geography | Translated by Stephen Whatley. In this editio... | 17390101 | London | 2 |
| 9 | T220401 | Volume 2 | The memoirs of Charles-Lewis, Baron de Pollnit... | 1313200302 | 2 | The memoirs of Charles-Lewis, Baron de Pollnit... | London | London : Printed for Daniel Browne, at the Bla... | Printed for Daniel Browne, at the Black Swan, ... | English | Europe, Description and travel | History and Geography | Translated by Stephen Whatley. In this editio... | 17390101 | London | 2 |
| 10 | W012701 | 0 | A history of a voyage to the coast of Africa, ... | 1313200400 | 2 | A history of a voyage to the coast of Africa, ... | Philadelphia | Philadelphia : printed for the author, by S. C... | printed for the author, by S. C[.] Ustick, & Co | English | Hawkins, Joseph,, b. 1772, Portraits, Voyages ... | History and Geography | \"The author relating the history of his travel... | 17970101 | Philadelphia | 0 |
| 11 | T170016 | Volume 1 | Modern voyages: Containing a variety of useful... | 1316200401 | 2 | Modern voyages: Containing a variety of useful... | Dublin | Dublin : printed for Chamberlaine and Rice, P.... | printed for Chamberlaine and Rice, P. Wogan, P... | English | Voyages and travels, Early works to 1800 | History and Geography | The imprint in vol. 2 is enlarged by the addit... | 17900101 | Dublin | 2 |
| 12 | T170016 | Volume 2 | Modern voyages: Containing a variety of useful... | 1316200402 | 2 | Modern voyages: Containing a variety of useful... | Dublin | Dublin : printed for Chamberlaine and Rice, P.... | printed for Chamberlaine and Rice, P. Wogan, P... | English | Voyages and travels, Early works to 1800 | History and Geography | The imprint in vol. 2 is enlarged by the addit... | 17900101 | Dublin | 2 |
| 13 | T224686 | 0 | The foreign travels and dangerous voyages of t... | 1316400600 | 2 | The foreign travels and dangerous voyages of t... | London | London : printed for M. Hotham, [1710?] | printed for M. Hotham | English | Voyages and travels, Early works to 1800 | History and Geography | | 17100101 | London | 0 |
| 14 | T164122 | 0 | An accountof the shipwreck and captivity of Mr... | 1321200900 | 2 | An accountof the shipwreck and captivity of Mr... | London | London : Printed for C. Forster, in the Poultr... | Printed for C. Forster, in the Poultry | English | Voyages and travels, Early works to 1800 | History and Geography | Price in square brackets: (Price Two Shillings... | 17890101 | London | 0 |
| 15 | T224703 | 0 | Owen's new book of fairs, published by the Kin... | 1324400100 | 2 | Owen's new book of fairs, published by the Kin... | London | London : Printed (by assignment from W. Owen) ... | Printed (by assignment from W. Owen) for J. Jo... | English | Great Britain, Description and travel, Early w... | History and Geography | With an initial copyright leaf. Also publishe... | 17990101 | London | 0 |
| 16 | T223983 | 0 | Morse's Geography. This day is published, in o... | 1324400900 | 2 | Morse's Geography. This day is published, in o... | London | London : Printed for John Stockdale, Piccadill... | Printed for John Stockdale, Piccadilly and sol... | English | United States, Description and travel, Early w... | History and Geography | Advertisement for new books printed for John S... | 17920101 | London | 0 |
| 17 | N034619 | 0 | The Present state of Sicily and Malta, Extract... | 1331000300 | 2 | The Present state of Sicily and Malta, Extract... | London | London : Printed for G. Kearsley, at Johnson's... | Printed for G. Kearsley, at Johnson's Head, No... | English | Sicily (Italy), Description and travel, Early ... | History and Geography | With a final errata leaf. P. iv misnumbered v... | 17880101 | London | 0 |
| 18 | T165771 | 0 | Holland: a jaunt to the principal places in th... | 1331201100 | 2 | Holland: a jaunt to the principal places in th... | London | London : Printed, and sold by W.Hay, next to t... | Printed, and sold by W.Hay, next to the Societ... | English | Holland (Netherlands : Province), Description ... | History and Geography | Price on title page: (Price 2s.) | 17750101 | London | 0 |
| 19 | T231765 | 0 | A Description of the hermitage of Warkworth. | 1331400300 | 2 | A Description of the hermitage of Warkworth. | [London? ] | [London? : s.n., 1800?] | s.n. | English | Great Britain, Description and travel, Early w... | History and Geography | | 18000101 | London | 0 |
| 20 | N025348 | Volume 1 | The world displayed; or, A curious collection ... | 1334900101 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 21 | N025348 | Volume 2 | The world displayed; or, A curious collection ... | 1334900102 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 22 | N025348 | Volume 3 | The world displayed; or, A curious collection ... | 1334900103 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 23 | N025348 | Volume 4 | The world displayed; or, A curious collection ... | 1334900104 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 24 | N025348 | Volume 5 | The world displayed; or, A curious collection ... | 1334900105 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 25 | N025348 | Volume 6 | The world displayed; or, A curious collection ... | 1334900106 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 26 | N025348 | Volume 7 | The world displayed; or, A curious collection ... | 1334900107 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 27 | N025348 | Volume 8 | The world displayed; or, A curious collection ... | 1334900108 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 28 | N025348 | Volume 9 | The world displayed; or, A curious collection ... | 1334900109 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| 29 | N025348 | Volume 10 | The world displayed; or, A curious collection ... | 1335000110 | 2 | The world displayed; or, A curious collection ... | Dublin | Dublin : printed by James Williams, M,DCC,LXXI... | printed by James Williams | English | Voyages and travels | History and Geography | Compiled by Christopher Smart, Oliver Goldsmit... | 17790101 | Dublin | 20 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 282 | T206354 | 0 | A new and enlarged book of sailing directions ... | 1696000600 | 2 | A new and enlarged book of sailing directions ... | London | London : Printed for Robert Laurie and James W... | Printed for Robert Laurie and James Whittle, N... | English | Pilot guides, Florida, Straits of | History and Geography | At foot of titlepage: \"N.B. These directions a... | 17960101 | London | 0 |
| 283 | T215603 | 0 | The admirable travels of Thomas Jenkins. And D... | 1696001000 | 2 | The admirable travels of Thomas Jenkins. And D... | London | London : Printed for, and sold by W. Clements,... | Printed for, and sold by W. Clements, and J. S... | English | Voyages and travels, England, London, Early wo... | History and Geography | Written in the first person by David Lowellin.... | 17910101 | London | 0 |
| 284 | T187257 | 0 | East-Bourn, being an descriptive account of th... | 1696400300 | 2 | East-Bourn, being an descriptive account of th... | London | London : Printed at the Philanthropic reform, ... | Printed at the Philanthropic reform, for Hookh... | English | Sussex (England), Description and travel, Earl... | History and Geography | | 17990101 | London | 0 |
| 285 | T186693 | Volume 1 | A journey made in the summer of 1794, through ... | 1698000201 | 2 | A journey made in the summer of 1794, through ... | London | London : printed for G.G. and J. Robinson, Pat... | printed for G.G. and J. Robinson, Paternoster-Row | English | Europe, Description and travel | History and Geography | | 17960101 | London | 2 |
| 286 | T186693 | Volume 2 | A journey made in the summer of 1794, through ... | 1698000202 | 2 | A journey made in the summer of 1794, through ... | London | London : printed for G.G. and J. Robinson, Pat... | printed for G.G. and J. Robinson, Paternoster-Row | English | Europe, Description and travel | History and Geography | | 17960101 | London | 2 |
| 287 | T212407 | 0 | The world in miniature; or, The entertaining t... | 1698200100 | 2 | The world in miniature; or, The entertaining t... | Dublin | Dublin : Printed for William Williamson, books... | Printed for William Williamson, bookseller, at... | English | Voyages and travels, Early works to 1800 | History and Geography | With an index. | 17510101 | Dublin | 0 |
| 288 | T212500 | 0 | A tour through Ireland. In several entertainin... | 1698200400 | 2 | A tour through Ireland. In several entertainin... | Dublin | Dublin : Printed for Peter Wilson, bookseller,... | Printed for Peter Wilson, bookseller, in Dame-... | English | Ireland, Description and travel | History and Geography | Anonymous. By William Rufus Chetwood. A reiss... | 17480101 | Dublin | 0 |
| 289 | T188848 | 0 | An authentic narrative of some remarkable and ... | 1698500500 | 2 | An authentic narrative of some remarkable and ... | Dublin | Dublin : Printed by Robert Dapper, for B. Dugd... | Printed by Robert Dapper, for B. Dugdale, No. ... | English | Voyages and travels, Personal narratives | History and Geography | Written by Newton himself. First published as... | 17960101 | Dublin | 0 |
| 290 | T100821 | 0 | The voiage and travaile of Sir John Maundevile... | 1705600300 | 2 | The voiage and travaile of Sir John Maundevile... | London | London : Printed for Woodman, and Lyon, in Rus... | Printed for Woodman, and Lyon, in Russel-Stree... | English | Voyages and travels | History and Geography | Titlepage in red and black, preceding a black ... | 17270101 | London | 0 |
| 291 | T176199 | Volume 1 | Travels in Switzerland, and in the country of ... | 1726100301 | 2 | Travels in Switzerland, and in the country of ... | London | London : printed for T. Cadell, in the Strand,... | printed for T. Cadell, in the Strand | English | Switzerland, Description and travel | History and Geography | | 17910101 | London | 3 |
| 292 | T176199 | Volume 2 | Travels in Switzerland, and in the country of ... | 1726200102 | 2 | Travels in Switzerland, and in the country of ... | London | London : printed for T. Cadell, in the Strand,... | printed for T. Cadell, in the Strand | English | Switzerland, Description and travel | History and Geography | | 17910101 | London | 3 |
| 293 | T176199 | Volume 3 | Travels in Switzerland, and in the country of ... | 1726200103 | 2 | Travels in Switzerland, and in the country of ... | London | London : printed for T. Cadell, in the Strand,... | printed for T. Cadell, in the Strand | English | Switzerland, Description and travel | History and Geography | | 17910101 | London | 3 |
| 294 | T006285 | Volume 1 | An account of the European settlements in Amer... | 1726700501 | 2 | An account of the European settlements in Amer... | London | London : printed for J. Dodsley, in Pall-Mall,... | printed for J. Dodsley, in Pall-Mall | English | United States, Description and travel, Early w... | History and Geography | Anonymous. Probably a collaboration by William... | 17700101 | London | 2 |
| 295 | T006285 | Volume 2 | An account of the European settlements in Amer... | 1726800102 | 2 | An account of the European settlements in Amer... | London | London : printed for J. Dodsley, in Pall-Mall,... | printed for J. Dodsley, in Pall-Mall | English | United States, Description and travel, Early w... | History and Geography | Anonymous. Probably a collaboration by William... | 17700101 | London | 2 |
| 296 | T012218 | Volume 1 | Travels through Syria and Egypt, in the years ... | 1726800201 | 2 | Travels through Syria and Egypt, in the years ... | London | London : printed for G.G.J. and J. Robinson, P... | printed for G.G.J. and J. Robinson, Pater-Nost... | English | Syria, Description and travel, Early works to ... | History and Geography | Largely a reimpression of the 1787 London edit... | 17880101 | London | 2 |
| 297 | T041858 | 0 | Observations on the western parts of England, ... | 1726800400 | 2 | Observations on the western parts of England, ... | London | London : Printed for T. Cadell Jun. and W. Dav... | Printed for T. Cadell Jun. and W. Davies, Strand | English | England, Description and travel, Early works t... | History and Geography | The same setting of type was also printed with... | 17980101 | London | 0 |
| 298 | T059033 | 0 | Travels in several parts of Turkey, Egypt, and... | 1726900100 | 2 | Travels in several parts of Turkey, Egypt, and... | London | London : Printed for the author; and sold by J... | Printed for the author; and sold by J. Axtell,... | English | Middle East, Description and travel | History and Geography | Horizontal chain lines. | 17740101 | London | 0 |
| 299 | T085368 | Volume 1 | A year's journey through France, and part of S... | 1727800301 | 2 | A year's journey through France, and part of S... | Bath | Bath : Printed by R. Cruttwell, for the author... | Printed by R. Cruttwell, for the author; and s... | English | France, Description and travel, Early works to... | History and Geography | With a list of subscribers. | 17770101 | Bath | 2 |
| 300 | T085368 | Volume 2 | A year's journey through France, and part of S... | 1727800302 | 2 | A year's journey through France, and part of S... | Bath | Bath : Printed by R. Cruttwell, for the author... | Printed by R. Cruttwell, for the author; and s... | English | France, Description and travel, Early works to... | History and Geography | With a list of subscribers. | 17770101 | Bath | 2 |
| 301 | T089035 | Volume 1 | Letters from Italy, describing the manners, cu... | 1728000101 | 2 | Letters from Italy, describing the manners, cu... | Dublin | Dublin : printed for W. Watson, D. Chamberlain... | printed for W. Watson, D. Chamberlaine, J. Pot... | English | Italy, Description and travel | History and Geography | An English woman = Anna, wife of Sir John Rigg... | 17760101 | London | 3 |
| 302 | T089035 | Volume 2 | Letters from Italy, describing the manners, cu... | 1728000102 | 2 | Letters from Italy, describing the manners, cu... | Dublin | Dublin : printed for W. Watson, D. Chamberlain... | printed for W. Watson, D. Chamberlaine, J. Pot... | English | Italy, Description and travel | History and Geography | An English woman = Anna, wife of Sir John Rigg... | 17760101 | London | 3 |
| 303 | T089035 | Volume 3 | Letters from Italy, describing the manners, cu... | 1728000103 | 2 | Letters from Italy, describing the manners, cu... | Dublin | Dublin : printed for W. Watson, D. Chamberlain... | printed for W. Watson, D. Chamberlaine, J. Pot... | English | Italy, Description and travel | History and Geography | An English woman = Anna, wife of Sir John Rigg... | 17760101 | London | 3 |
| 304 | T099651 | Volume 1 | Travels from St. Petersburg in Russia, to dive... | 1728900101 | 2 | Travels from St. Petersburg in Russia, to dive... | Glasgow | Glasgow : printed for the author by Robert and... | printed for the author by Robert and Andrew Fo... | English | Asia, Description and travel | History and Geography | With a list of subscribers in vol.1. \"Volume ... | 17630101 | Glasgow | 2 |
| 305 | T099651 | Volume 2 | Travels from St. Petersburg in Russia, to dive... | 1728900102 | 2 | Travels from St. Petersburg in Russia, to dive... | Glasgow | Glasgow : printed for the author by Robert and... | printed for the author by Robert and Andrew Fo... | English | Asia, Description and travel | History and Geography | With a list of subscribers in vol.1. \"Volume ... | 17630101 | Glasgow | 2 |
| 306 | T097845 | Volume 1 | A Collection of voyages and travels, some now ... | 1736200201 | 2 | A Collection of voyages and travels, some now ... | London | London : Printed by assignment from Messrs. Ch... | Printed by assignment from Messrs. Churchill, ... | English | Voyages and travels | History and Geography | Compiled by Awnsham Churchill and John Churchi... | 17440101 | London | 6 |
| 307 | T097845 | Volume 2 | A Collection of voyages and travels, some now ... | 1736200202 | 2 | A Collection of voyages and travels, some now ... | London | London : Printed by assignment from Messrs. Ch... | Printed by assignment from Messrs. Churchill, ... | English | Voyages and travels | History and Geography | Compiled by Awnsham Churchill and John Churchi... | 17440101 | London | 6 |
| 308 | T097845 | Volume 3 | A Collection of voyages and travels, some now ... | 1736300103 | 2 | A Collection of voyages and travels, some now ... | London | London : Printed by assignment from Messrs. Ch... | Printed by assignment from Messrs. Churchill, ... | English | Voyages and travels | History and Geography | Compiled by Awnsham Churchill and John Churchi... | 17440101 | London | 6 |
| 309 | T097845 | Volume 4 | A Collection of voyages and travels, some now ... | 1736300104 | 2 | A Collection of voyages and travels, some now ... | London | London : Printed by assignment from Messrs. Ch... | Printed by assignment from Messrs. Churchill, ... | English | Voyages and travels | History and Geography | Compiled by Awnsham Churchill and John Churchi... | 17440101 | London | 6 |
| 310 | T097845 | Volume 5 | A Collection of voyages and travels, some now ... | 1736400105 | 2 | A Collection of voyages and travels, some now ... | London | London : Printed by assignment from Messrs. Ch... | Printed by assignment from Messrs. Churchill, ... | English | Voyages and travels | History and Geography | Compiled by Awnsham Churchill and John Churchi... | 17440101 | London | 6 |
| 311 | T097845 | Volume 6 | A Collection of voyages and travels, some now ... | 1736400106 | 2 | A Collection of voyages and travels, some now ... | London | London : Printed by assignment from Messrs. Ch... | Printed by assignment from Messrs. Churchill, ... | English | Voyages and travels | History and Geography | Compiled by Awnsham Churchill and John Churchi... | 17440101 | London | 6 |
\n", 2566 | "

312 rows × 16 columns

\n", 2567 | "
" 2568 | ], 2569 | "text/plain": [ 2570 | " ESTCID currentVolume displayTitle \\\n", 2571 | "0 N025387 Volume 16 The World displayed; Or, A curious collection ... \n", 2572 | "1 N031356 0 An essay towards a description of the city of ... \n", 2573 | "2 N029227 0 Interesting account of the early voyages, made... \n", 2574 | "3 N031489 Volume 1 An entertaining journey to the Netherlands; co... \n", 2575 | "4 N031489 Volume 2 An entertaining journey to the Netherlands; co... \n", 2576 | "5 N031489 Volume 3 An entertaining journey to the Netherlands; co... \n", 2577 | "6 T188570 0 A New collection of voyages and travels, dedic... \n", 2578 | "7 T172346 0 Miscellaneous remarks made on the spot, in a l... \n", 2579 | "8 T220401 Volume 1 The memoirs of Charles-Lewis, Baron de Pollnit... \n", 2580 | "9 T220401 Volume 2 The memoirs of Charles-Lewis, Baron de Pollnit... \n", 2581 | "10 W012701 0 A history of a voyage to the coast of Africa, ... \n", 2582 | "11 T170016 Volume 1 Modern voyages: Containing a variety of useful... \n", 2583 | "12 T170016 Volume 2 Modern voyages: Containing a variety of useful... \n", 2584 | "13 T224686 0 The foreign travels and dangerous voyages of t... \n", 2585 | "14 T164122 0 An accountof the shipwreck and captivity of Mr... \n", 2586 | "15 T224703 0 Owen's new book of fairs, published by the Kin... \n", 2587 | "16 T223983 0 Morse's Geography. This day is published, in o... \n", 2588 | "17 N034619 0 The Present state of Sicily and Malta, Extract... \n", 2589 | "18 T165771 0 Holland: a jaunt to the principal places in th... \n", 2590 | "19 T231765 0 A Description of the hermitage of Warkworth. \n", 2591 | "20 N025348 Volume 1 The world displayed; or, A curious collection ... \n", 2592 | "21 N025348 Volume 2 The world displayed; or, A curious collection ... \n", 2593 | "22 N025348 Volume 3 The world displayed; or, A curious collection ... \n", 2594 | "23 N025348 Volume 4 The world displayed; or, A curious collection ... 
\n", 2595 | "24 N025348 Volume 5 The world displayed; or, A curious collection ... \n", 2596 | "25 N025348 Volume 6 The world displayed; or, A curious collection ... \n", 2597 | "26 N025348 Volume 7 The world displayed; or, A curious collection ... \n", 2598 | "27 N025348 Volume 8 The world displayed; or, A curious collection ... \n", 2599 | "28 N025348 Volume 9 The world displayed; or, A curious collection ... \n", 2600 | "29 N025348 Volume 10 The world displayed; or, A curious collection ... \n", 2601 | ".. ... ... ... \n", 2602 | "282 T206354 0 A new and enlarged book of sailing directions ... \n", 2603 | "283 T215603 0 The admirable travels of Thomas Jenkins. And D... \n", 2604 | "284 T187257 0 East-Bourn, being an descriptive account of th... \n", 2605 | "285 T186693 Volume 1 A journey made in the summer of 1794, through ... \n", 2606 | "286 T186693 Volume 2 A journey made in the summer of 1794, through ... \n", 2607 | "287 T212407 0 The world in miniature; or, The entertaining t... \n", 2608 | "288 T212500 0 A tour through Ireland. In several entertainin... \n", 2609 | "289 T188848 0 An authentic narrative of some remarkable and ... \n", 2610 | "290 T100821 0 The voiage and travaile of Sir John Maundevile... \n", 2611 | "291 T176199 Volume 1 Travels in Switzerland, and in the country of ... \n", 2612 | "292 T176199 Volume 2 Travels in Switzerland, and in the country of ... \n", 2613 | "293 T176199 Volume 3 Travels in Switzerland, and in the country of ... \n", 2614 | "294 T006285 Volume 1 An account of the European settlements in Amer... \n", 2615 | "295 T006285 Volume 2 An account of the European settlements in Amer... \n", 2616 | "296 T012218 Volume 1 Travels through Syria and Egypt, in the years ... \n", 2617 | "297 T041858 0 Observations on the western parts of England, ... \n", 2618 | "298 T059033 0 Travels in several parts of Turkey, Egypt, and... \n", 2619 | "299 T085368 Volume 1 A year's journey through France, and part of S... 
\n", 2620 | "300 T085368 Volume 2 A year's journey through France, and part of S... \n", 2621 | "301 T089035 Volume 1 Letters from Italy, describing the manners, cu... \n", 2622 | "302 T089035 Volume 2 Letters from Italy, describing the manners, cu... \n", 2623 | "303 T089035 Volume 3 Letters from Italy, describing the manners, cu... \n", 2624 | "304 T099651 Volume 1 Travels from St. Petersburg in Russia, to dive... \n", 2625 | "305 T099651 Volume 2 Travels from St. Petersburg in Russia, to dive... \n", 2626 | "306 T097845 Volume 1 A Collection of voyages and travels, some now ... \n", 2627 | "307 T097845 Volume 2 A Collection of voyages and travels, some now ... \n", 2628 | "308 T097845 Volume 3 A Collection of voyages and travels, some now ... \n", 2629 | "309 T097845 Volume 4 A Collection of voyages and travels, some now ... \n", 2630 | "310 T097845 Volume 5 A Collection of voyages and travels, some now ... \n", 2631 | "311 T097845 Volume 6 A Collection of voyages and travels, some now ... \n", 2632 | "\n", 2633 | " documentID eccorelease \\\n", 2634 | "0 1309700116 2 \n", 2635 | "1 1309700600 2 \n", 2636 | "2 1309900200 2 \n", 2637 | "3 1309900301 2 \n", 2638 | "4 1309900302 2 \n", 2639 | "5 1309900303 2 \n", 2640 | "6 1309901100 2 \n", 2641 | "7 1310300400 2 \n", 2642 | "8 1313200301 2 \n", 2643 | "9 1313200302 2 \n", 2644 | "10 1313200400 2 \n", 2645 | "11 1316200401 2 \n", 2646 | "12 1316200402 2 \n", 2647 | "13 1316400600 2 \n", 2648 | "14 1321200900 2 \n", 2649 | "15 1324400100 2 \n", 2650 | "16 1324400900 2 \n", 2651 | "17 1331000300 2 \n", 2652 | "18 1331201100 2 \n", 2653 | "19 1331400300 2 \n", 2654 | "20 1334900101 2 \n", 2655 | "21 1334900102 2 \n", 2656 | "22 1334900103 2 \n", 2657 | "23 1334900104 2 \n", 2658 | "24 1334900105 2 \n", 2659 | "25 1334900106 2 \n", 2660 | "26 1334900107 2 \n", 2661 | "27 1334900108 2 \n", 2662 | "28 1334900109 2 \n", 2663 | "29 1335000110 2 \n", 2664 | ".. ... ... 
\n", 2665 | "282 1696000600 2 \n", 2666 | "283 1696001000 2 \n", 2667 | "284 1696400300 2 \n", 2668 | "285 1698000201 2 \n", 2669 | "286 1698000202 2 \n", 2670 | "287 1698200100 2 \n", 2671 | "288 1698200400 2 \n", 2672 | "289 1698500500 2 \n", 2673 | "290 1705600300 2 \n", 2674 | "291 1726100301 2 \n", 2675 | "292 1726200102 2 \n", 2676 | "293 1726200103 2 \n", 2677 | "294 1726700501 2 \n", 2678 | "295 1726800102 2 \n", 2679 | "296 1726800201 2 \n", 2680 | "297 1726800400 2 \n", 2681 | "298 1726900100 2 \n", 2682 | "299 1727800301 2 \n", 2683 | "300 1727800302 2 \n", 2684 | "301 1728000101 2 \n", 2685 | "302 1728000102 2 \n", 2686 | "303 1728000103 2 \n", 2687 | "304 1728900101 2 \n", 2688 | "305 1728900102 2 \n", 2689 | "306 1736200201 2 \n", 2690 | "307 1736200202 2 \n", 2691 | "308 1736300103 2 \n", 2692 | "309 1736300104 2 \n", 2693 | "310 1736400105 2 \n", 2694 | "311 1736400106 2 \n", 2695 | "\n", 2696 | " fullTitle imprintCity \\\n", 2697 | "0 The World displayed; Or, A curious collection ... London \n", 2698 | "1 An essay towards a description of the city of ... [Bath] \n", 2699 | "2 Interesting account of the early voyages, made... London \n", 2700 | "3 An entertaining journey to the Netherlands; co... London \n", 2701 | "4 An entertaining journey to the Netherlands; co... London \n", 2702 | "5 An entertaining journey to the Netherlands; co... London \n", 2703 | "6 A New collection of voyages and travels, dedic... London \n", 2704 | "7 Miscellaneous remarks made on the spot, in a l... London \n", 2705 | "8 The memoirs of Charles-Lewis, Baron de Pollnit... London \n", 2706 | "9 The memoirs of Charles-Lewis, Baron de Pollnit... London \n", 2707 | "10 A history of a voyage to the coast of Africa, ... Philadelphia \n", 2708 | "11 Modern voyages: Containing a variety of useful... Dublin \n", 2709 | "12 Modern voyages: Containing a variety of useful... Dublin \n", 2710 | "13 The foreign travels and dangerous voyages of t... 
London \n", 2711 | "14 An accountof the shipwreck and captivity of Mr... London \n", 2712 | "15 Owen's new book of fairs, published by the Kin... London \n", 2713 | "16 Morse's Geography. This day is published, in o... London \n", 2714 | "17 The Present state of Sicily and Malta, Extract... London \n", 2715 | "18 Holland: a jaunt to the principal places in th... London \n", 2716 | "19 A Description of the hermitage of Warkworth. [London? ] \n", 2717 | "20 The world displayed; or, A curious collection ... Dublin \n", 2718 | "21 The world displayed; or, A curious collection ... Dublin \n", 2719 | "22 The world displayed; or, A curious collection ... Dublin \n", 2720 | "23 The world displayed; or, A curious collection ... Dublin \n", 2721 | "24 The world displayed; or, A curious collection ... Dublin \n", 2722 | "25 The world displayed; or, A curious collection ... Dublin \n", 2723 | "26 The world displayed; or, A curious collection ... Dublin \n", 2724 | "27 The world displayed; or, A curious collection ... Dublin \n", 2725 | "28 The world displayed; or, A curious collection ... Dublin \n", 2726 | "29 The world displayed; or, A curious collection ... Dublin \n", 2727 | ".. ... ... \n", 2728 | "282 A new and enlarged book of sailing directions ... London \n", 2729 | "283 The admirable travels of Thomas Jenkins. And D... London \n", 2730 | "284 East-Bourn, being an descriptive account of th... London \n", 2731 | "285 A journey made in the summer of 1794, through ... London \n", 2732 | "286 A journey made in the summer of 1794, through ... London \n", 2733 | "287 The world in miniature; or, The entertaining t... Dublin \n", 2734 | "288 A tour through Ireland. In several entertainin... Dublin \n", 2735 | "289 An authentic narrative of some remarkable and ... Dublin \n", 2736 | "290 The voiage and travaile of Sir John Maundevile... London \n", 2737 | "291 Travels in Switzerland, and in the country of ... 
London \n", 2738 | "292 Travels in Switzerland, and in the country of ... London \n", 2739 | "293 Travels in Switzerland, and in the country of ... London \n", 2740 | "294 An account of the European settlements in Amer... London \n", 2741 | "295 An account of the European settlements in Amer... London \n", 2742 | "296 Travels through Syria and Egypt, in the years ... London \n", 2743 | "297 Observations on the western parts of England, ... London \n", 2744 | "298 Travels in several parts of Turkey, Egypt, and... London \n", 2745 | "299 A year's journey through France, and part of S... Bath \n", 2746 | "300 A year's journey through France, and part of S... Bath \n", 2747 | "301 Letters from Italy, describing the manners, cu... Dublin \n", 2748 | "302 Letters from Italy, describing the manners, cu... Dublin \n", 2749 | "303 Letters from Italy, describing the manners, cu... Dublin \n", 2750 | "304 Travels from St. Petersburg in Russia, to dive... Glasgow \n", 2751 | "305 Travels from St. Petersburg in Russia, to dive... Glasgow \n", 2752 | "306 A Collection of voyages and travels, some now ... London \n", 2753 | "307 A Collection of voyages and travels, some now ... London \n", 2754 | "308 A Collection of voyages and travels, some now ... London \n", 2755 | "309 A Collection of voyages and travels, some now ... London \n", 2756 | "310 A Collection of voyages and travels, some now ... London \n", 2757 | "311 A Collection of voyages and travels, some now ... London \n", 2758 | "\n", 2759 | " imprintFull \\\n", 2760 | "0 London : Printed for J. Newbery, at the Bible ... \n", 2761 | "1 [Bath] : Printed for W. Frederick, bookseller,... \n", 2762 | "2 London : Printed for the proprietors, and sold... \n", 2763 | "3 London : printed for W. Smith, M DCC LXXXII. [... \n", 2764 | "4 London : printed for W. Smith, M DCC LXXXII. [... \n", 2765 | "5 London : printed for W. Smith, M DCC LXXXII. [... \n", 2766 | "6 London : Printed for E. Newbery, the corner of... 
\n", 2767 | "7 London : Printed for S. Hooper, at Gay's Head,... \n", 2768 | "8 London : Printed for Daniel Browne, at the Bla... \n", 2769 | "9 London : Printed for Daniel Browne, at the Bla... \n", 2770 | "10 Philadelphia : printed for the author, by S. C... \n", 2771 | "11 Dublin : printed for Chamberlaine and Rice, P.... \n", 2772 | "12 Dublin : printed for Chamberlaine and Rice, P.... \n", 2773 | "13 London : printed for M. Hotham, [1710?] \n", 2774 | "14 London : Printed for C. Forster, in the Poultr... \n", 2775 | "15 London : Printed (by assignment from W. Owen) ... \n", 2776 | "16 London : Printed for John Stockdale, Piccadill... \n", 2777 | "17 London : Printed for G. Kearsley, at Johnson's... \n", 2778 | "18 London : Printed, and sold by W.Hay, next to t... \n", 2779 | "19 [London? : s.n., 1800?] \n", 2780 | "20 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2781 | "21 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2782 | "22 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2783 | "23 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2784 | "24 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2785 | "25 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2786 | "26 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2787 | "27 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2788 | "28 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2789 | "29 Dublin : printed by James Williams, M,DCC,LXXI... \n", 2790 | ".. ... \n", 2791 | "282 London : Printed for Robert Laurie and James W... \n", 2792 | "283 London : Printed for, and sold by W. Clements,... \n", 2793 | "284 London : Printed at the Philanthropic reform, ... \n", 2794 | "285 London : printed for G.G. and J. Robinson, Pat... \n", 2795 | "286 London : printed for G.G. and J. Robinson, Pat... \n", 2796 | "287 Dublin : Printed for William Williamson, books... \n", 2797 | "288 Dublin : Printed for Peter Wilson, bookseller,... 
\n", 2798 | "289 Dublin : Printed by Robert Dapper, for B. Dugd... \n", 2799 | "290 London : Printed for Woodman, and Lyon, in Rus... \n", 2800 | "291 London : printed for T. Cadell, in the Strand,... \n", 2801 | "292 London : printed for T. Cadell, in the Strand,... \n", 2802 | "293 London : printed for T. Cadell, in the Strand,... \n", 2803 | "294 London : printed for J. Dodsley, in Pall-Mall,... \n", 2804 | "295 London : printed for J. Dodsley, in Pall-Mall,... \n", 2805 | "296 London : printed for G.G.J. and J. Robinson, P... \n", 2806 | "297 London : Printed for T. Cadell Jun. and W. Dav... \n", 2807 | "298 London : Printed for the author; and sold by J... \n", 2808 | "299 Bath : Printed by R. Cruttwell, for the author... \n", 2809 | "300 Bath : Printed by R. Cruttwell, for the author... \n", 2810 | "301 Dublin : printed for W. Watson, D. Chamberlain... \n", 2811 | "302 Dublin : printed for W. Watson, D. Chamberlain... \n", 2812 | "303 Dublin : printed for W. Watson, D. Chamberlain... \n", 2813 | "304 Glasgow : printed for the author by Robert and... \n", 2814 | "305 Glasgow : printed for the author by Robert and... \n", 2815 | "306 London : Printed by assignment from Messrs. Ch... \n", 2816 | "307 London : Printed by assignment from Messrs. Ch... \n", 2817 | "308 London : Printed by assignment from Messrs. Ch... \n", 2818 | "309 London : Printed by assignment from Messrs. Ch... \n", 2819 | "310 London : Printed by assignment from Messrs. Ch... \n", 2820 | "311 London : Printed by assignment from Messrs. Ch... \n", 2821 | "\n", 2822 | " imprintPublisher language \\\n", 2823 | "0 Printed for J. Newbery, at the Bible and Sun, ... English \n", 2824 | "1 Printed for W. Frederick, bookseller, in Bath English \n", 2825 | "2 Printed for the proprietors, and sold at Stalk... English \n", 2826 | "3 printed for W. Smith English \n", 2827 | "4 printed for W. Smith English \n", 2828 | "5 printed for W. Smith English \n", 2829 | "6 Printed for E. Newbery, the corner of St. 
Paul... English \n", 2830 | "7 Printed for S. Hooper, at Gay's Head, near Bea... English \n", 2831 | "8 Printed for Daniel Browne, at the Black Swan, ... English \n", 2832 | "9 Printed for Daniel Browne, at the Black Swan, ... English \n", 2833 | "10 printed for the author, by S. C[.] Ustick, & Co English \n", 2834 | "11 printed for Chamberlaine and Rice, P. Wogan, P... English \n", 2835 | "12 printed for Chamberlaine and Rice, P. Wogan, P... English \n", 2836 | "13 printed for M. Hotham English \n", 2837 | "14 Printed for C. Forster, in the Poultry English \n", 2838 | "15 Printed (by assignment from W. Owen) for J. Jo... English \n", 2839 | "16 Printed for John Stockdale, Piccadilly and sol... English \n", 2840 | "17 Printed for G. Kearsley, at Johnson's Head, No... English \n", 2841 | "18 Printed, and sold by W.Hay, next to the Societ... English \n", 2842 | "19 s.n. English \n", 2843 | "20 printed by James Williams English \n", 2844 | "21 printed by James Williams English \n", 2845 | "22 printed by James Williams English \n", 2846 | "23 printed by James Williams English \n", 2847 | "24 printed by James Williams English \n", 2848 | "25 printed by James Williams English \n", 2849 | "26 printed by James Williams English \n", 2850 | "27 printed by James Williams English \n", 2851 | "28 printed by James Williams English \n", 2852 | "29 printed by James Williams English \n", 2853 | ".. ... ... \n", 2854 | "282 Printed for Robert Laurie and James Whittle, N... English \n", 2855 | "283 Printed for, and sold by W. Clements, and J. S... English \n", 2856 | "284 Printed at the Philanthropic reform, for Hookh... English \n", 2857 | "285 printed for G.G. and J. Robinson, Paternoster-Row English \n", 2858 | "286 printed for G.G. and J. Robinson, Paternoster-Row English \n", 2859 | "287 Printed for William Williamson, bookseller, at... English \n", 2860 | "288 Printed for Peter Wilson, bookseller, in Dame-... English \n", 2861 | "289 Printed by Robert Dapper, for B. 
Dugdale, No. ... English \n", 2862 | "290 Printed for Woodman, and Lyon, in Russel-Stree... English \n", 2863 | "291 printed for T. Cadell, in the Strand English \n", 2864 | "292 printed for T. Cadell, in the Strand English \n", 2865 | "293 printed for T. Cadell, in the Strand English \n", 2866 | "294 printed for J. Dodsley, in Pall-Mall English \n", 2867 | "295 printed for J. Dodsley, in Pall-Mall English \n", 2868 | "296 printed for G.G.J. and J. Robinson, Pater-Nost... English \n", 2869 | "297 Printed for T. Cadell Jun. and W. Davies, Strand English \n", 2870 | "298 Printed for the author; and sold by J. Axtell,... English \n", 2871 | "299 Printed by R. Cruttwell, for the author; and s... English \n", 2872 | "300 Printed by R. Cruttwell, for the author; and s... English \n", 2873 | "301 printed for W. Watson, D. Chamberlaine, J. Pot... English \n", 2874 | "302 printed for W. Watson, D. Chamberlaine, J. Pot... English \n", 2875 | "303 printed for W. Watson, D. Chamberlaine, J. Pot... English \n", 2876 | "304 printed for the author by Robert and Andrew Fo... English \n", 2877 | "305 printed for the author by Robert and Andrew Fo... English \n", 2878 | "306 Printed by assignment from Messrs. Churchill, ... English \n", 2879 | "307 Printed by assignment from Messrs. Churchill, ... English \n", 2880 | "308 Printed by assignment from Messrs. Churchill, ... English \n", 2881 | "309 Printed by assignment from Messrs. Churchill, ... English \n", 2882 | "310 Printed by assignment from Messrs. Churchill, ... English \n", 2883 | "311 Printed by assignment from Messrs. Churchill, ... English \n", 2884 | "\n", 2885 | " locSubject module \\\n", 2886 | "0 Voyages and travels History and Geography \n", 2887 | "1 Bath (England), Description and travel, Early ... History and Geography \n", 2888 | "2 Explorers, Portugal, Early works to 1800, Expl... History and Geography \n", 2889 | "3 Netherlands, Description and travel, Early wor... 
History and Geography \n", 2890 | "4 Netherlands, Description and travel, Early wor... History and Geography \n", 2891 | "5 Netherlands, Description and travel, Early wor... History and Geography \n", 2892 | "6 Voyages and travels, Early works to 1800 History and Geography \n", 2893 | "7 Italy, Description and travel, Early works to ... History and Geography \n", 2894 | "8 Europe, Description and travel History and Geography \n", 2895 | "9 Europe, Description and travel History and Geography \n", 2896 | "10 Hawkins, Joseph,, b. 1772, Portraits, Voyages ... History and Geography \n", 2897 | "11 Voyages and travels, Early works to 1800 History and Geography \n", 2898 | "12 Voyages and travels, Early works to 1800 History and Geography \n", 2899 | "13 Voyages and travels, Early works to 1800 History and Geography \n", 2900 | "14 Voyages and travels, Early works to 1800 History and Geography \n", 2901 | "15 Great Britain, Description and travel, Early w... History and Geography \n", 2902 | "16 United States, Description and travel, Early w... History and Geography \n", 2903 | "17 Sicily (Italy), Description and travel, Early ... History and Geography \n", 2904 | "18 Holland (Netherlands : Province), Description ... History and Geography \n", 2905 | "19 Great Britain, Description and travel, Early w... History and Geography \n", 2906 | "20 Voyages and travels History and Geography \n", 2907 | "21 Voyages and travels History and Geography \n", 2908 | "22 Voyages and travels History and Geography \n", 2909 | "23 Voyages and travels History and Geography \n", 2910 | "24 Voyages and travels History and Geography \n", 2911 | "25 Voyages and travels History and Geography \n", 2912 | "26 Voyages and travels History and Geography \n", 2913 | "27 Voyages and travels History and Geography \n", 2914 | "28 Voyages and travels History and Geography \n", 2915 | "29 Voyages and travels History and Geography \n", 2916 | ".. ... ... 
\n", 2917 | "282 Pilot guides, Florida, Straits of History and Geography \n", 2918 | "283 Voyages and travels, England, London, Early wo... History and Geography \n", 2919 | "284 Sussex (England), Description and travel, Earl... History and Geography \n", 2920 | "285 Europe, Description and travel History and Geography \n", 2921 | "286 Europe, Description and travel History and Geography \n", 2922 | "287 Voyages and travels, Early works to 1800 History and Geography \n", 2923 | "288 Ireland, Description and travel History and Geography \n", 2924 | "289 Voyages and travels, Personal narratives History and Geography \n", 2925 | "290 Voyages and travels History and Geography \n", 2926 | "291 Switzerland, Description and travel History and Geography \n", 2927 | "292 Switzerland, Description and travel History and Geography \n", 2928 | "293 Switzerland, Description and travel History and Geography \n", 2929 | "294 United States, Description and travel, Early w... History and Geography \n", 2930 | "295 United States, Description and travel, Early w... History and Geography \n", 2931 | "296 Syria, Description and travel, Early works to ... History and Geography \n", 2932 | "297 England, Description and travel, Early works t... History and Geography \n", 2933 | "298 Middle East, Description and travel History and Geography \n", 2934 | "299 France, Description and travel, Early works to... History and Geography \n", 2935 | "300 France, Description and travel, Early works to... 
History and Geography \n", 2936 | "301 Italy, Description and travel History and Geography \n", 2937 | "302 Italy, Description and travel History and Geography \n", 2938 | "303 Italy, Description and travel History and Geography \n", 2939 | "304 Asia, Description and travel History and Geography \n", 2940 | "305 Asia, Description and travel History and Geography \n", 2941 | "306 Voyages and travels History and Geography \n", 2942 | "307 Voyages and travels History and Geography \n", 2943 | "308 Voyages and travels History and Geography \n", 2944 | "309 Voyages and travels History and Geography \n", 2945 | "310 Voyages and travels History and Geography \n", 2946 | "311 Voyages and travels History and Geography \n", 2947 | "\n", 2948 | " notes pubDate \\\n", 2949 | "0 17670101 \n", 2950 | "1 With an additional titlepage for pt. 1: 'An es... 17420101 \n", 2951 | "2 17900101 \n", 2952 | "3 Coriat Junior = Samuel Paterson?. 17820101 \n", 2953 | "4 Coriat Junior = Samuel Paterson?. 17820101 \n", 2954 | "5 Coriat Junior = Samuel Paterson?. 17820101 \n", 2955 | "6 William Mavor is the editor of the 'Historical... 17960101 \n", 2956 | "7 17560101 \n", 2957 | "8 Translated by Stephen Whatley. In this editio... 17390101 \n", 2958 | "9 Translated by Stephen Whatley. In this editio... 17390101 \n", 2959 | "10 \"The author relating the history of his travel... 17970101 \n", 2960 | "11 The imprint in vol. 2 is enlarged by the addit... 17900101 \n", 2961 | "12 The imprint in vol. 2 is enlarged by the addit... 17900101 \n", 2962 | "13 17100101 \n", 2963 | "14 Price in square brackets: (Price Two Shillings... 17890101 \n", 2964 | "15 With an initial copyright leaf. Also publishe... 17990101 \n", 2965 | "16 Advertisement for new books printed for John S... 17920101 \n", 2966 | "17 With a final errata leaf. P. iv misnumbered v... 17880101 \n", 2967 | "18 Price on title page: (Price 2s.) 17750101 \n", 2968 | "19 18000101 \n", 2969 | "20 Compiled by Christopher Smart, Oliver Goldsmit... 
17790101 \n", 2970 | "21 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2971 | "22 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2972 | "23 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2973 | "24 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2974 | "25 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2975 | "26 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2976 | "27 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2977 | "28 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2978 | "29 Compiled by Christopher Smart, Oliver Goldsmit... 17790101 \n", 2979 | ".. ... ... \n", 2980 | "282 At foot of titlepage: \"N.B. These directions a... 17960101 \n", 2981 | "283 Written in the first person by David Lowellin.... 17910101 \n", 2982 | "284 17990101 \n", 2983 | "285 17960101 \n", 2984 | "286 17960101 \n", 2985 | "287 With an index. 17510101 \n", 2986 | "288 Anonymous. By William Rufus Chetwood. A reiss... 17480101 \n", 2987 | "289 Written by Newton himself. First published as... 17960101 \n", 2988 | "290 Titlepage in red and black, preceding a black ... 17270101 \n", 2989 | "291 17910101 \n", 2990 | "292 17910101 \n", 2991 | "293 17910101 \n", 2992 | "294 Anonymous. Probably a collaboration by William... 17700101 \n", 2993 | "295 Anonymous. Probably a collaboration by William... 17700101 \n", 2994 | "296 Largely a reimpression of the 1787 London edit... 17880101 \n", 2995 | "297 The same setting of type was also printed with... 17980101 \n", 2996 | "298 Horizontal chain lines. 17740101 \n", 2997 | "299 With a list of subscribers. 17770101 \n", 2998 | "300 With a list of subscribers. 17770101 \n", 2999 | "301 An English woman = Anna, wife of Sir John Rigg... 17760101 \n", 3000 | "302 An English woman = Anna, wife of Sir John Rigg... 17760101 \n", 3001 | "303 An English woman = Anna, wife of Sir John Rigg... 
17760101 \n", 3002 | "304 With a list of subscribers in vol.1. \"Volume ... 17630101 \n", 3003 | "305 With a list of subscribers in vol.1. \"Volume ... 17630101 \n", 3004 | "306 Compiled by Awnsham Churchill and John Churchi... 17440101 \n", 3005 | "307 Compiled by Awnsham Churchill and John Churchi... 17440101 \n", 3006 | "308 Compiled by Awnsham Churchill and John Churchi... 17440101 \n", 3007 | "309 Compiled by Awnsham Churchill and John Churchi... 17440101 \n", 3008 | "310 Compiled by Awnsham Churchill and John Churchi... 17440101 \n", 3009 | "311 Compiled by Awnsham Churchill and John Churchi... 17440101 \n", 3010 | "\n", 3011 | " publicationPlace totalVolumes \n", 3012 | "0 London 1 \n", 3013 | "1 Bath 0 \n", 3014 | "2 London 0 \n", 3015 | "3 London 3 \n", 3016 | "4 London 3 \n", 3017 | "5 London 3 \n", 3018 | "6 London 0 \n", 3019 | "7 London 0 \n", 3020 | "8 London 2 \n", 3021 | "9 London 2 \n", 3022 | "10 Philadelphia 0 \n", 3023 | "11 Dublin 2 \n", 3024 | "12 Dublin 2 \n", 3025 | "13 London 0 \n", 3026 | "14 London 0 \n", 3027 | "15 London 0 \n", 3028 | "16 London 0 \n", 3029 | "17 London 0 \n", 3030 | "18 London 0 \n", 3031 | "19 London 0 \n", 3032 | "20 Dublin 20 \n", 3033 | "21 Dublin 20 \n", 3034 | "22 Dublin 20 \n", 3035 | "23 Dublin 20 \n", 3036 | "24 Dublin 20 \n", 3037 | "25 Dublin 20 \n", 3038 | "26 Dublin 20 \n", 3039 | "27 Dublin 20 \n", 3040 | "28 Dublin 20 \n", 3041 | "29 Dublin 20 \n", 3042 | ".. ... ... 
\n", 3043 | "282 London 0 \n", 3044 | "283 London 0 \n", 3045 | "284 London 0 \n", 3046 | "285 London 2 \n", 3047 | "286 London 2 \n", 3048 | "287 Dublin 0 \n", 3049 | "288 Dublin 0 \n", 3050 | "289 Dublin 0 \n", 3051 | "290 London 0 \n", 3052 | "291 London 3 \n", 3053 | "292 London 3 \n", 3054 | "293 London 3 \n", 3055 | "294 London 2 \n", 3056 | "295 London 2 \n", 3057 | "296 London 2 \n", 3058 | "297 London 0 \n", 3059 | "298 London 0 \n", 3060 | "299 Bath 2 \n", 3061 | "300 Bath 2 \n", 3062 | "301 London 3 \n", 3063 | "302 London 3 \n", 3064 | "303 London 3 \n", 3065 | "304 Glasgow 2 \n", 3066 | "305 Glasgow 2 \n", 3067 | "306 London 6 \n", 3068 | "307 London 6 \n", 3069 | "308 London 6 \n", 3070 | "309 London 6 \n", 3071 | "310 London 6 \n", 3072 | "311 London 6 \n", 3073 | "\n", 3074 | "[312 rows x 16 columns]" 3075 | ] 3076 | }, 3077 | "execution_count": 21, 3078 | "metadata": {}, 3079 | "output_type": "execute_result" 3080 | } 3081 | ], 3082 | "source": [ 3083 | "dfHistAndGeo = pd.DataFrame(listofdicts)\n", 3084 | "dfHistAndGeo" 3085 | ] 3086 | }, 3087 | { 3088 | "cell_type": "code", 3089 | "execution_count": 22, 3090 | "metadata": { 3091 | "collapsed": true 3092 | }, 3093 | "outputs": [], 3094 | "source": [ 3095 | "dfHistAndGeo.to_csv('files/HistAndGeo.csv')" 3096 | ] 3097 | }, 3098 | { 3099 | "cell_type": "markdown", 3100 | "metadata": {}, 3101 | "source": [ 3102 | "Okay, so, it worked (hurrah!) but it did take quite a while - a few hours. At this rate, I might have to do multiple sections in order to gather them all, unless there is a faster method (which there almost certainly is)."
3103 | ] 3104 | }, 3105 | { 3106 | "cell_type": "markdown", 3107 | "metadata": {}, 3108 | "source": [] 3109 | }, 3110 | { 3111 | "cell_type": "markdown", 3112 | "metadata": {}, 3113 | "source": [] 3114 | } 3115 | ], 3116 | "metadata": { 3117 | "kernelspec": { 3118 | "display_name": "Python 3", 3119 | "language": "python", 3120 | "name": "python3" 3121 | }, 3122 | "language_info": { 3123 | "codemirror_mode": { 3124 | "name": "ipython", 3125 | "version": 3 3126 | }, 3127 | "file_extension": ".py", 3128 | "mimetype": "text/x-python", 3129 | "name": "python", 3130 | "nbconvert_exporter": "python", 3131 | "pygments_lexer": "ipython3", 3132 | "version": "3.6.2" 3133 | } 3134 | }, 3135 | "nbformat": 4, 3136 | "nbformat_minor": 2 3137 | } 3138 | --------------------------------------------------------------------------------
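The notebook's closing note suspects that a faster method exists. One candidate, offered only as a minimal sketch, is to read each file in a single streaming pass, stop as soon as every wanted tag has been seen, and clear elements along the way so memory stays flat. The sketch uses the standard library's `xml.etree.ElementTree` so that it runs without extra dependencies; `lxml.etree.iterparse` supports the same loop. The function name `extract_metadata`, the `WANTED` set, and the sample record are hypothetical, and the early exit assumes the metadata elements come before the page text in each file, which the notebook does not confirm.

```python
import io
import xml.etree.ElementTree as ET

# Tag names borrowed from the DataFrame columns above; the record layout
# in the demo below is invented for illustration.
WANTED = {"fullTitle", "imprintCity", "pubDate", "totalVolumes"}


def extract_metadata(source, wanted=WANTED):
    """Return {tag: text} for the first occurrence of each wanted tag.

    `source` is a path or a binary file-like object. The file is parsed
    once, and parsing stops when every wanted tag has been found.
    """
    found = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in wanted and elem.tag not in found:
            found[elem.tag] = (elem.text or "").strip()
            if len(found) == len(wanted):
                # Assumption: metadata precedes the body text, so the
                # rest of the file does not need to be read.
                break
        # Drop the element's text and children once it has been handled.
        elem.clear()
    return found


# Hypothetical, heavily shortened record standing in for one ECCO file.
sample = io.BytesIO(
    b"<book><citation>"
    b"<fullTitle>A tour through Ireland</fullTitle>"
    b"<imprintCity>Dublin</imprintCity>"
    b"<pubDate>17480101</pubDate>"
    b"<totalVolumes>0</totalVolumes>"
    b"</citation><text><page>...</page></text></book>"
)
record = extract_metadata(sample)
print(record)
```

Collecting one such dictionary per file into a list would feed `pd.DataFrame(...)` in the same way as `listofdicts` above. Whether this actually shortens the multi-hour run depends on where the time is going, which would need to be measured (for example with `%%time` on a handful of files).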