├── .gitignore
├── LICENSE
├── README.md
└── access_checker.rb


/.gitignore:
--------------------------------------------------------------------------------
data/
output/
*.csv
*.txt


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
Copyright (c) 2013 University Library, University of North Carolina.
Written by Kristina Spurgin.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see .


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Access checker
A simple JRuby script to check for full-text access to e-resource titles. Plain old URL/link checking won't alert you if one of your ebook links points to a valid HTML page reading "NO ACCESS." This script will.

I wrote an article about the Access Checker: [Getting What We Paid for: a Script to Verify Full Access to E-Resources](http://journal.code4lib.org/articles/9684)

NOTE: this Access Checker is unaware of the title > volume > issue > article hierarchy of e-journals, and doesn't have a way to input or make sense of holdings date ranges.
It was designed to check access in collections of discrete items, each of which has a distinct URL---mainly ebook, streaming media, etc. collections.

**For a list of collections/platforms/products supported, see [Access Checker wiki](https://github.com/UNC-Libraries/Access-Checker/wiki)**

# Requirements
- You must have [JRuby](http://jruby.org/) installed. This script has been tested on JRuby 1.7.3. Installing JRuby is super-easy; point-and-click .exe installers are available for Windows on the [JRuby homepage](http://jruby.org/).

- Once JRuby is installed, you will need to install the JRuby Gems Celerity and Highline.

To install these Gems, open the command line shell and type the following commands:
- jruby -S gem install celerity
- jruby -S gem install highline

# Set up before first-time use
## Prepare your script directory
Choose or create a directory/folder on your computer in which to place the access_checker.rb script. This directory can be called whatever you want, but here I'll call it the "rubyscripts" directory.

**For the rest of the instructions, we'll assume the path of the rubyscripts folder is:** C:\Users\you\rubyscripts

## Download the script and put it in the rubyscripts directory
* Go to https://github.com/UNC-Libraries/Access-Checker
* Download the ZIP file containing the files (bottom of right column)
* Unzip the ZIP file on your computer
* Put a copy of the access_checker.rb file from the unzipped directory into your rubyscripts directory: C:\Users\you\rubyscripts\access_checker.rb

# How to use
## Prepare your input file
The script expects a .csv file, with one header row, containing URLs for which to check access. The column containing the URL **MUST** be the last/right-most column. You may include any number of columns (RecordID#, Title, Publication Date, etc.) to the left of the URL column.
Make sure there is only **one** URL per row.
To use a tab-delimited file as input, see **Optional arguments** below.


All URLs/titles in one input file must be in/on the same package/platform.

If your URLs are prefixed with proxy strings, and you are running the script from a location where proxying isn't needed for access, deleting the proxy strings from the URLs first will speed up the script. Use Excel Replace All to do this.

**Put the input file in the rubyscripts directory. Example location: C:\Users\you\rubyscripts\inputfile.csv**

## Run the script
* Open your command line shell (this will be Windows PowerShell for most Windows users)
* In the shell, move to the rubyscripts directory. Given the example locations listed above, you will type the following and then hit Enter:
```cd C:\Users\you\rubyscripts```

In your command line shell, type (substitute in the name of your actual input file and the desired name for your actual output file):

```jruby -S access_checker.rb inputfile.csv outputfile.csv```

You may run into trouble if the filenames or directory names you need to point the Access Checker to contain spaces. In this case, it may work if you enclose the input and output file names/paths with double quotes:

```jruby -S access_checker.rb "C:\Users\Your Name\access checker\inputfile.csv" "C:\Users\Your Name\access checker\outputfile.csv"```

### Optional arguments
Include optional arguments like so:
* jruby -S access_checker.rb [arguments] [input] [output]
* for example: jruby -S access_checker.rb -t -b inputfile.txt outputfile.csv

Options:
* -t (or --tab_delimited):

the input file is read as a tab-delimited file rather than a csv. If newlines or tabs are contained in the data fields themselves, this could cause errors.

* -b (or --write_utf8_bom)

when writing to a new (non-existing) output file, manually add a UTF-8 BOM (primary use case: allowing Excel to directly open the csv with proper encoding). Has no effect if appending to an existing output file.

When asked to input "Package?" enter the short code for your platform from the list above the input prompt.

## Output
The script will output a .csv file containing all data from the input file, with a new "access" column appended.

## If the script chokes/dies (or you need to otherwise stop it) while running...
You don't have to start over from the beginning. Remove all rows already checked (i.e. included in the output file) from the input file, keeping its header row, and restart the script, using the same output file location.

The header row will be inserted into the output file again, so watch for that in the final results.

# How it works
First, this script does not access, download, or touch *ANY* actual full-text content hosted by our providers.

It simply visits the landing/description/info page for each ostensibly full-text resource---the page a user clicking the link in a catalog record would be brought to, at the same URL that our ILS link checker would ping.

Depending on the platform/package, it checks for text indicating full or restricted access a) displayed on that page; OR b) buried in the page source code.
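
As a rough illustration of the check described above, here is a minimal sketch that classifies an already-fetched page the way the `ebr` (ebrary) branch of access_checker.rb does. The method name `classify_ebrary_page` and the sample HTML are made up for this example; the real script fetches each page with Celerity and covers many more platforms.

```ruby
# Hypothetical helper (not part of access_checker.rb): classify the HTML of
# one ebrary landing page. The strings tested are the ones the script's
# "ebr" branch looks for.
def classify_ebrary_page(page)
  if page.include?("Sorry, this ebook is not available at your library.")
    "No access"
  elsif page.match(/Your institution has (unlimited |)access/)
    "Full access"
  else
    # Neither marker was found, so a person should look at the page.
    "Check access manually"
  end
end

# Assumed sample input; a real page would come from the browser object.
sample_page = "<p>Your institution has unlimited access to this title.</p>"
puts classify_ebrary_page(sample_page)
```

Anything that matches neither marker falls through to "Check access manually", which is how the script flags pages whose layout has changed since the check was written.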

--------------------------------------------------------------------------------
/access_checker.rb:
--------------------------------------------------------------------------------
#!/usr/bin/env ruby

# Tested in JRuby 1.7.3
# Written by Kristina Spurgin

# Usage:
# jruby -S access_checker.rb [arguments] [inputfilelocation] [outputfilelocation]

# Input file:
# .csv file with:
# - one header row
# - any number of columns to left of final column
# - one URL in final column
# - accepts tab-delimited files through use of arguments

# Output file:
# .csv file with all the data from the input file, plus a new column containing
# access checker result

# Optional arguments:
# e.g. jruby -S access_checker.rb -t -b inputfile.txt outputfile.csv
#
# -t (or --tab_delimited):
#   The input file is read as a tab-delimited file rather than a csv. If
#   newlines or tabs are contained in the data fields themselves, this could
#   cause errors. Should work with utf-8 or unicode input files; may not work
#   with some other encodings
#
# -b (or --write_utf8_bom)
#   When writing to a new (non-existing) output file, manually add a UTF-8 BOM
#   (primary use case: allowing Excel to directly open the csv with proper
#   encoding). Has no effect if appending to an existing output file.
#

require 'celerity'
require 'csv'
require 'highline/import'
require 'open-uri'

puts "-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-="
puts "What platform/package are you access checking?"
puts "Type one of the following:"
puts " asp : Alexander Street Press links"
puts " alman : Al Manhal"
puts " apb : Apabi ebooks"
puts " brep : Brepols (brepolsonline.net)"
puts " cup : Cambridge University Press"
puts " ciao : Columbia International Affairs Online"
puts " cod : Criterion on Demand"
puts " dgry : De Gruyter ebook platform"
puts " dgtla : Digitalia ebooks"
puts " dram : DRAM"
puts " dupsc : Duke University Press (via Silverchair)"
puts " eai : Early American Imprints (Readex)"
puts " ebr : Ebrary links"
puts " ebs : EBSCOhost ebook collection"
puts " end : Endeca - Check for undeleted records"
puts " fmgfod : FMG Films on Demand"
puts " ieee : IEEE"
puts " igi : IGI Global"
puts " kan : Kanopy Streaming Video"
puts " knv : Knovel"
puts " lion : LIterature ONline (Proquest)"
puts " nccorv : NCCO - Check for related volumes"
puts " obo : Oxford Bibliographies Online"
puts " oho : Oxford Handbooks Online"
puts " psynet : Psychotherapy.net videos"
puts " sabov : Sabin Americana - Check for Other Volumes"
puts " skno : SAGE Knowledge links"
puts " srmo : SAGE Research Methods Online links"
puts " scid : ScienceDirect ebooks (Elsevier)"
puts " siam : SIAM: Society for Industrial and Applied Mathematics"
puts " ss : SerialsSolutions links"
puts " spr : SpringerLink links"
puts " uncfa : UNC Finding Aids"
puts " upso : University Press (inc. Oxford) Scholarship Online links"
puts " waf : Wright American Fiction"
puts " wol : Wiley Online Library"
puts "-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-="

package = ask("Package? ")
if package == "spr"
  get_ebk_pkg = ask("Do you also want to retrieve subject module/ebook package for each title? y/n ")
end

puts "\nPreparing to check access...\n"

if ARGV.include?('-t') || ARGV.include?('--tab_delimited')
  input_is_tab_delimited = true
  ARGV.delete('-t')
  ARGV.delete('--tab_delimited')
else
  input_is_tab_delimited = false
end

if ARGV.include?('-b') || ARGV.include?('--write_utf8_bom')
  write_utf8_bom = true
  ARGV.delete('-b')
  ARGV.delete('--write_utf8_bom')
else
  write_utf8_bom = false
end

input = ARGV[0]
output = ARGV[1]


if input_is_tab_delimited
  begin
    # attempt to read the file using default quote_char
    csv_data = CSV.read(input,
                        :headers => true,
                        :col_sep => "\t")
  rescue CSV::MalformedCSVError
    begin
      # CSV wants unescaped quote_char only around entire fields. So, try
      # giving it an unprintable char.
      csv_data = CSV.read(input,
                          :headers => true,
                          :col_sep => "\t",
                          :quote_char => "\x00")
    rescue CSV::MalformedCSVError
      # try to read the file as Unicode; will convert to utf-8
      csv_data = CSV.read(input,
                          :headers => true,
                          :col_sep => "\t",
                          :quote_char => "\x00",
                          :encoding => "BOM|UTF-16LE:UTF-8")
    end
  end
else
  csv_data = CSV.read(input, :headers => true)
end
headers = csv_data.headers


if write_utf8_bom and not File.exist?(output)
  File.open(output, 'w') do |file|
    file.write "\uFEFF"
  end
end


counter = 0
total = csv_data.count


headers << "access"

if get_ebk_pkg == "y"
  headers << "ebook package"
end

CSV.open(output, "a") do |c|
  c << headers
end


if package == "kan"
  agent_spoof = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
  b = Celerity::Browser.new(:browser => :firefox, :user_agent => agent_spoof)
elsif package == "asp" || package == "lion"
  # On 12/6/17, ASP was redirecting from http to https when using celerity
  # and for unknown reason causing SSL/certificate errors. Disabling
  # secure_ssl, which I don't know that we really care about, for ASP.
  # ASP should finish making changes to their site in Jan 2018, so some
  # time after that see if this exception can be removed. (At the time
  # this was happening visiting an http URL in firefox was not redirecting
  # and visiting an https URL was not resulting in certificate problems.)
  b = Celerity::Browser.new(:browser => :firefox, :secure_ssl => false)
else
  b = Celerity::Browser.new(:browser => :firefox)
  #b = Celerity::Browser.new(:browser => :firefox, :log_level => :all)
end


if package == "oho" || package == "obo"
  # unite Oxford logic under upso
  package = "upso"
end


b.css = false
b.javascript_enabled = false

# APPDEV-11425: Sage platforms require javascript to load
if package == "srmo" || package == "skno"
  b.javascript_enabled = true
end


csv_data.each do |r|
  row_array = r.to_csv.parse_csv
  url = row_array.pop
  rest_of_data = row_array

  if package == "ss"
    # this creates a new url based on the library code (e.g. VB3LK7EB4T)
    # and criteria (e.g. JC_005405622) to get around the angular.js
    # it may not work on all SerialsSolutions URLs. Sample, working urls:
    # url = 'http://VB3LK7EB4T.search.serialssolutions.com/?V=1.0&L=VB3LK7EB4T&S=JCs&C=JC_005405622&T=marc'
    # url = 'http://VB3LK7EB4T.search.serialssolutions.com/?V=1.0&L=VB3LK7EB4T&S=JCs&C=TC_026248270&T=marc'
    match = url.match('://([^.]*).*&C=([^&]*)')
    if match and match.size == 3
      lib, criteria = match[1..2]
      url2 = "http://%s.search.serialssolutions.com/ejp/api/1/libraries/%s/search/types/title_code/%s" % [lib, lib, criteria]
      page = open(url2).read
    else
      page = "This script is not configured to accept this URL structure."
    end
  else
    #
    # For every package but SerSol, do this:
    #
    b.goto(url)
    page = b.html
  end

  if package == "apb"
    sleeptime = 1
    if page.match(/type="onlineread"/)
      access = "Access probably ok"
    else
      access = "Check access manually"
    end

  elsif package == "alman"
    sleeptime = 1
    if page.include?("\"AvailabilityMode\":4")
      access = "Preview mode"
    elsif page.include?("\"AvailabilityMode\":2")
      access = "Full access"
    elsif page.include?("id=\"searchBox")
      access = "No access. Item not found"
    else
      access = "Check access manually"
    end

  elsif package == "asp"
    sleeptime = 1
    if page.include?("Page Not Found")
      access = "Page not found"
    elsif page.include?("This is a sample.
For full access:") 244 | access = "Sample" 245 | elsif page.include?("Trial login | Alexander Street") 246 | access = "Trial" 247 | elsif page.include?("Your institution does not have access to this particular content.") 248 | access = "Institution does not have access" 249 | elsif page.match(/Book not found./) 266 | access = "Page not found" 267 | elsif page.match(/title="Full Access"/) 268 | access = "Full Access" 269 | else 270 | access = "Check access manually" 271 | end 272 | 273 | elsif package == "ciao" 274 | sleeptime = 1 275 | if page.match(/
Download Pages/) 372 | access = "Full access" 373 | else 374 | access = "Check access manually" 375 | end 376 | 377 | elsif package == "ebr" 378 | sleeptime = 1 379 | if page.include?("Sorry, this ebook is not available at your library.") 380 | access = "No access" 381 | elsif page.match(/Your institution has (unlimited |)access/) 382 | access = "Full access" 383 | else 384 | access = "Check access manually" 385 | end 386 | 387 | elsif package == "ebs" 388 | sleeptime = 1 389 | #reformulate url and follow to actual results 390 | if page.match(/window.location.replace.'([^']*)/) 391 | query = page.match(/window.location.replace.'([^']*)/)[1] 392 | baseurl = b.url.gsub(/plink.*/,'') 393 | url = baseurl + query 394 | b.goto(url) 395 | page = b.html 396 | end 397 | if page.match(/class="std-warning-text">No results/) 398 | access = "No access" 399 | elsif page.match(/"available":"True"/) 400 | access = "Full access" 401 | else 402 | access = "check" 403 | end 404 | 405 | elsif package == "end" 406 | sleeptime = 1 407 | if page.include?("Invalid record") 408 | access = "deleted OK" 409 | else 410 | access = "possible ghost record - check" 411 | end 412 | 413 | elsif package == "fmgfod" 414 | sleeptime = 10 415 | if page.include?("The title you are looking for is no longer available") 416 | access = "No access" 417 | elsif page.include?("It looks like you were provided with the incorrect link.") 418 | access = "No access" 419 | elsif page.match(/player_id="divMedia"/) 420 | access = "Full access" 421 | elsif page.match(/class="now-playing-div/) 422 | access = "Full access" 423 | else 424 | access = "Check access manually" 425 | end 426 | 427 | elsif package == "ieee" 428 | sleeptime = 10 429 | if page.include?("Included in Your Digital Subscription") 430 | access = "Full access" 431 | elsif page.include?("This publication is an Open Access only journal. 
Open Access provides unrestricted online access to peer-reviewed journal articles.") 432 | access = "Full access" 433 | elsif page.include?("Freely Available from IEEE") 434 | access = "Full access" 435 | elsif page.include?("Page Not Found") 436 | access = "Check access manually" 437 | elsif page.include?("Full text access may be available. Click article title to sign in or learn about subscription options.") 438 | access = "Check access manually" 439 | else 440 | access = "Check access manually" 441 | end 442 | 443 | elsif package == "igi" 444 | sleeptime = 10 445 | if page.match(/title="Owned"/) 446 | access = "owned" 447 | elsif page.include?("Institution Prices") 448 | access = "not owned" 449 | else 450 | access = "check manually" 451 | end 452 | 453 | elsif package == "kan" 454 | sleeptime = 10 455 | if page.include?("Your institution has not licensed") 456 | access = "No access" 457 | elsif page.include?("This film is not available at your institution") 458 | access = "No access" 459 | elsif page.include?("Sorry, this video is not available in your territory") 460 | access = "No access" 461 | elsif page.match(/
/) 472 | access = "No access" 473 | else 474 | access = "Full access" 475 | end 476 | 477 | elsif package == "lion" 478 | sleeptime = 5 479 | if page.match(/Open access/) 480 | access = "Open access" 481 | elsif page.match(/Full text available/) 482 | access = "Full access" 483 | elsif page.match(/javascript:fulltext.*textsFT/) 484 | access = "Full access" 485 | elsif page.match(/
/) 486 | access = "Full access (Crit/Ref)" 487 | elsif page.match(/forward=critref_ft/) 488 | access = "Full access via browse list (crit/ref)" 489 | elsif page.match(//) 490 | access = "Full access (video content)" 491 | elsif page.include?("An error has occurred which prevents us from displaying this document") 492 | access = "Error" 493 | elsif page.match(/publication was not found/) 494 | access = "Not found" 495 | else 496 | access = "Check access manually" 497 | end 498 | 499 | elsif package == "nccorv" 500 | sleeptime = 1 501 | if page.match(/
/) 502 | access = "related volumes section present" 503 | else 504 | access = "no related volumes section" 505 | end 506 | 507 | # elsif package == "obo" 508 | # elsif package == "oho" 509 | # 510 | # Oxford logic is united under 'upso' 511 | # obo and oho remain as entries on the menu, but if selected are 512 | # reassigned to 'upso' before this conditional 513 | 514 | elsif package == "psynet" 515 | sleeptime = 1 516 | if page.match(/The stream you requested does not exist./) 517 | access = "Page not found" 518 | elsif page.match(/id="stream_content">/) 519 | access = "Full access likely" 520 | else 521 | access = "Check access manually" 522 | end 523 | 524 | elsif package == "sabov" 525 | sleeptime = 1 526 | if page.match(//) 527 | access = "other volumes section present" 528 | else 529 | access = "no other volumes section" 530 | end 531 | 532 | elsif package == "scid" 533 | sleeptime = 1 534 | if page.include?("(error 404)") 535 | access = "404 error" 536 | elsif page.match(/You are not entitled to access the full text/) 537 | access = "Restricted access" 538 | elsif page.include?("Sorry, your subscription does not entitle you to access this page") 539 | access = "Restricted access - cannot display page" 540 | elsif page.match(/class="offscreen">Entitled to full text<.+{4,}/) 541 | access = "Full access" 542 | elsif page.match(/"isEntitled\\?":false/) 543 | access = "Restricted access" 544 | elsif page.match(/"isEntitled\\?":true/) 545 | access = "Full access" 546 | elsif page.match(/class="mrwLeftLinks">You are not entitled to access the full text/) 552 | access = "Restricted access" 553 | elsif index_page.match(/class="offscreen">Entitled to full text<.+{4,}/) 554 | access = "Full access to 4 or more reference work articles" 555 | end 556 | elsif page.match(/You are not entitled to access the full text/) 562 | access = "Restricted access" 563 | elsif index_page.match(/class="offscreen">Entitled to full text<.+{4,}/) 564 | access = "Full access to 4 or more 
reference work articles" 565 | end 566 | else 567 | access = "check manually" 568 | end 569 | 570 | elsif package == "siam" 571 | sleeptime = 1 572 | if page.include?("Book not found") 573 | access = "Page not found" 574 | elsif page.include?('title="No Access"') 575 | access = "Restricted access" 576 | elsif page.include?('title="Full Access"') 577 | access = "Full access" 578 | elsif page.include?("DOI Not Found") 579 | access = "DOI error" 580 | else 581 | access = "Check access manually" 582 | end 583 | 584 | elsif package == "skno" 585 | sleeptime = 1 586 | if page.include?("Page Not Found") 587 | access = "No access - page not found" 588 | elsif page.match(/
\s*
/) 589 | access = "Restricted access" 590 | elsif page.match(/
\s*/) 603 | access = "Restricted access" 604 | elsif page.match(/
<\/div>/) 605 | access = "Restricted access" 606 | else 607 | access = "Check access manually" 608 | end 609 | 610 | elsif package == "spr" 611 | sleeptime = 1 612 | case page 613 | when /.Open Access.:.Y./i 614 | access = "Open access" 615 | when /.HasAccess.:.Y./i 616 | access = "Full access" 617 | when /.hasAccess.:.N./i 618 | access = "Restricted access" 619 | when /viewType="Denial"/ 620 | access = "Restricted access" 621 | when /viewType="Full text download"/ 622 | access = "Full access" 623 | when /viewType="Book pdf download"/ 624 | access = "Full access" 625 | when /viewType="EPub download"/ 626 | access = "Full access" 627 | when /viewType="Chapter pdf download"/ 628 | access = "Full access (probably). Some chapters can be downloaded, but it appears the entire book cannot. May want to check manually." 629 | when /viewType="Reference work entry pdf download"/ 630 | access = "Reference work with access to PDF downloads. May want to check manually, as we have discovered some reference work entry PDFs contain no full text content." 631 | when /DOI Not Found/ 632 | access = "DOI error" 633 | no_spr_content = true 634 | when /

Page not found<\/h1>/ 635 | access = "Page not found (404) error" 636 | no_spr_content = true 637 | when /Bookshop, Wageningen/ 638 | access = "wageningenacademic.com" 639 | no_spr_content = true 640 | else 641 | access = "Check access manually" 642 | end 643 | 644 | if get_ebk_pkg == "y" 645 | if no_spr_content 646 | ebk_pkg = "n/a" 647 | else 648 | match_chk = /href="\/search\?facet-content-type=%22Book%22&package=\d+&facet-start-year=\d{4}&facet-end-year=\d{4}">([^<]+)<\/a>/.match(page) 649 | if match_chk 650 | ebk_pkg = match_chk[1] 651 | end 652 | end 653 | end 654 | 655 | elsif package == "srmo" 656 | sleeptime = 1 657 | if page.include?("Page Not Found") 658 | access = "No access - page not found" 659 | elsif page.include?("'access': 'false'") 660 | access = "Restricted access" 661 | elsif page.include?("'access': 'true'") 662 | access = "Full access" 663 | else 664 | access = "Check access manually" 665 | end 666 | 667 | elsif package == "ss" 668 | sleeptime = 1 669 | if page.match('{"titles":\[\],"pages":\[\]') 670 | access = "No access indicated" 671 | elsif page.match('{"titles":\[{"title":') 672 | access = "Access indicated" 673 | elsif page.match("This script is not configured to accept this URL structure.") 674 | access = "Unknown URL structure." 
675 | else 676 | access = "Check access manually" 677 | end 678 | 679 | elsif package == "uncfa" 680 | sleeptime = 1 681 | if page.match("404 Not Found") 682 | access = "Page not found" 683 | elsif page.match('Collection') 684 | access = "Finding aid accessible" 685 | else 686 | access = "Check access manually" 687 | end 688 | 689 | elsif package == "upso" 690 | sleeptime = 1 691 | if page.include?("DOI Not Found") 692 | access = "DOI error" 693 | elsif page.match(/pf:authorized" ?: ?"authorized/) 694 | access = "Full access" 695 | elsif page.match(/pf:authorized" ?: ?"not-authorized/) 696 | if page.match(/Page Not Found/) 697 | access = "Page not found" 698 | else 699 | access = "Restricted" 700 | end 701 | else 702 | access = "Check manually" 703 | end 704 | 705 | elsif package == "waf" 706 | sleeptime = 1 707 | if page.include?("title=\"View the entire text of the document. NOTE: Text might be very lengthy.\">Entire Document") 708 | access = "Full access" 709 | else 710 | access = "Check access manually" 711 | end 712 | 713 | elsif package == 'wol' 714 | sleeptime = 1 715 | if page.include?("You have full text access to this content

") 716 | access = "Full access" 717 | if page.include?("agu_logo.jpg") 718 | access += " - AGU" 719 | end 720 | elsif page.include?("You have full text access to this content") 721 | access = "Full text access to partial contents" 722 | elsif page.match(/You have free access to this content<\/span>