├── .gitignore
├── Classifiers
│   ├── MDW_by_subject.csv
│   ├── Naive-Bayes-Classifier.py
│   └── PracticePDFExtractor.py
├── LICENSE
├── README.md
├── UI
│   └── SimpleQuerier.html
├── cc-by-nc-sa-3.0.md
├── extractor_research
│   ├── extractors
│   │   ├── __init__.py
│   │   ├── miner.py
│   │   ├── pdf2.py
│   │   ├── pdfbox-app-1.8.5.jar
│   │   ├── pdfbox.py
│   │   ├── textstream.py
│   │   ├── textstream
│   │   │   ├── LICENSE
│   │   │   ├── PDFTextStream.jar
│   │   │   ├── TextStream.class
│   │   │   └── TextStream.java
│   │   └── xpdf.py
│   ├── input
│   │   ├── American_Opera_Rev_Syllabus.pdf
│   │   ├── Leonard_Intro_Musicology_syllabus(2).pdf
│   │   ├── Leonard_Intro_Musicology_syllabus.pdf
│   │   ├── Leonard_Victorian_Music_syllabus_docx.pdf
│   │   ├── Leonard_Women_Music_syllabus.pdf
│   │   ├── Music_since_1900_syllabus.pdf
│   │   └── pride_and_prej
│   │       ├── 1.pdf
│   │       ├── 2.pdf
│   │       ├── 3.pdf
│   │       └── plain.txt
│   ├── main.py
│   ├── output
│   │   ├── American_Opera_Rev_Syllabus.html
│   │   ├── American_Opera_Rev_Syllabus.xml
│   │   ├── Leonard_Intro_Musicology_syllabus(2).html
│   │   ├── Leonard_Intro_Musicology_syllabus(2).xml
│   │   ├── Leonard_Intro_Musicology_syllabus.html
│   │   ├── Leonard_Intro_Musicology_syllabus.xml
│   │   ├── Leonard_Victorian_Music_syllabus_docx.html
│   │   ├── Leonard_Victorian_Music_syllabus_docx.xml
│   │   ├── Leonard_Women_Music_syllabus.html
│   │   ├── Leonard_Women_Music_syllabus.xml
│   │   ├── Music_since_1900_syllabus.html
│   │   ├── Music_since_1900_syllabus.xml
│   │   └── pride_and_prej
│   │       ├── miner_with_layout
│   │       │   └── 1.txt
│   │       ├── miner_without_layout
│   │       │   └── 1.txt
│   │       ├── pdf2_default
│   │       │   └── 1.txt
│   │       ├── pdfbox_default
│   │       │   └── 1.txt
│   │       ├── textstream_default
│   │       │   └── 1.txt
│   │       ├── xpdf_with_layout
│   │       │   └── 1.txt
│   │       └── xpdf_without_layout
│   │           └── 1.txt
│   ├── stats
│   │   └── pride_and_prej
│   │       └── 1_speed_log.txt
│   └── visualize
│       ├── __init__.py
│       └── html_parser.py
├── gpl-3.0.md
├── opensyllabus
│   ├── __init__.py
│   ├── config.py
│   ├── core
│   │   ├── __init__.py
│   │   ├── extractor.py
│   │   ├── ingestion.py
│   │   ├── mongo.py
│   │   ├── ocr.py
│   │   └── utils.py
│   ├── run_getemptydocs.py
│   ├── run_getstats.py
│   └── run_ingestion.py
├── sanitize.py
└── twitter
    ├── .gitignore
    └── twitter.py
/.gitignore:
--------------------------------------------------------------------------------
1 | .project
2 | .pydevproject
3 | *.pyc
4 | scrapes/mobileread/data/*
5 |
--------------------------------------------------------------------------------
/Classifiers/MDW_by_subject.csv:
--------------------------------------------------------------------------------
1 | SUBJECT,anthropology,art,biology,chemistry,economics,english,french,geology,german,history,italian,latin,math,music,philosophy,physical-education,physics,political-science,psychology,sociology,spanish,theatre
2 | 0,christmas,sculpture,genetic,chem,econ,paraphrase,vous,tectonics,frei,hist,katerinov,bayerle,tutoring,archetto,meditations,vigorous,erasing,ucsd,psychodynamic,soc,curso,rob
3 | 1,neandertals,stokstad,genetics,aqueous,maddox,informing,fren,sedimentary,ferien,aftermath,clotilde,hbayerl,exponential,concert,hume,initiate,frosso,gospel,psypos,mcquaide,repaso,mowen
4 | 2,archaeology,marilyn,cells,bonding,monopoly,aivey,devoir,rocks,und,criticizing,boriosi,fianl,logarithmic,marchetto,hackett,gym,reproducible,enemy,improperly,findings,tulo,onstage
5 | 3,kalahari,studio,mutation,guilty,microeconomics,ivey,sur,geologic,schoene,dleinwe,katerin,mowethfr,trigonometric,cough,nicomachean,fitness,planner,lenin,conviction,stratification,para,accommodated
6 | 4,kinship,diverging,transport,mole,aplia,adriane,cours,metamorphic,deutschkurs,youngblood,vacanze,mowefr,thoughts,melody,phil_ox,gymnasium,reproducibility,mussolini,patti,deviance,composici,husband
7 | 5,anthro,interpretative,outcrops,reactions,unemployment,persuasively,mercredi,minerals,lehrbuch,sashmor,istruzione,whiteley,predicate,governs,enquiry,worn,seitaridou,donald,infancy,ehrenreich,hispanic,instill
8 | 6,abu,sketchbook,outcrop,stoichiometry,athletes,disruption,pour,weathering,wie,strives,italiano,gould,trigonometry,choral,unjust,jewelry,momentum,democracies,psychologist,medicalization,por,script
9 | 7,biocultural,patron,respiration,chemist,oligopoly,liveliness,jours,sanctioned,arbeitsbuch,ashmore,introduzione,conspicuous,theorem,poise,phil,fax,serway,election,freud,dimed,enero,imaginatively
10 | 8,anthropologist,cubism,membrane,expects,monopolistic,pertinence,expos,mineral,wiederholung,hst,albergo,ovid,binomial,piano,eewilso,adhering,eseitar,unfamiliar,mindedness,notions,introducci,talented
11 | 9,symbols,camille,chromosome,thermochemistry,scholastic,drama,lundi,igneous,ueber,constituting,letture,punctual,emergencies,skirt,mistaken,wellness,electricity,jeffrey,unguarded,interactionism,abril,clog
12 | 10,anthropology,cottrell,inheritance,configuration,macroeconomics,adherence,semaine,glaciation,abschlusspruefungen,impermissible,esame,linguist,unaided,accompanist,kant,sweat,optics,hardship,adolescence,mcknight,febrero,proximity
13 | 11,goldschmidt,receptivity,knisely,carbon,mankiw,midsummer,tre,volcanism,kapitel,scrupulous,citta,metamorphoses,optimization,receipt,phaedo,disclose,magnetism,electoral,proclaims,homicide,presentaci,finalize
14 | 12,aping,egg,mitosis,programmable,converted,tangential,avril,henderson,ich,archival,varie,informs,antiderivatives,eraser,realizing,offensive,vuille,cutting,psychologists,orgins,leer,experiencing
15 | 13,caveman,dialogic,meiosis,chromatography,preclude,typo,bureau,deformation,zum,instant,viaggio,rapid,polynomial,ticket,decipher,pant,torque,bargaining,testable,looks,puntos,clock
16 | 14,genital,fauvism,biodiversity,hybridization,ninkovic,refusing,detectable,crustal,dien,mississippi,testi,translating,wherever,repertoire,kress,probation,rocket,partisan,deception,cockerham,trabajo,entertain
17 | 15,schick,hardback,photosynthesis,spectroscopy,taxation,persuasiveness,mardi,hydrocarbon,stadt,lowing,capitolo,save,separable,chorale,influencing,taker,emailing,madison,humanistic,lindsey,todo,unapproved
18 | 16,hijras,transportable,nitya,organic,jninkov,neglect,jsvient,earthquakes,diens,footnoted,merc,latin,nit,chamber,unfortunately,inappropriately,weighting,ssb,abnormal,mckinlay,lengua,experiential
19 | 17,hodder,globe,vascular,solvent,curved,haphazard,printemps,busch,fruehlingssemester,awake,corso,consequently,inverse,intonation,philosophers,reoccurring,farhan,pols,prenatal,outlaw,otro,mysteries
20 | 18,knauft,criticality,njacob,gases,breech,harassment,svienty,volcanoes,sommerferien,curse,eserciziario,kept,poisson,blouse,groundwork,endurance,llewellyn,bshapir,differentiating,gabe,sobre,warmups
21 | 19,mayr,tempera,transmission,processed,jasminka,proceeding,vrier,deserve,den,sponsorship,vacanza,henry,sigma,ankle,dictated,assess,segre,votes,therapies,catches,las,suzan
22 |
--------------------------------------------------------------------------------
/Classifiers/Naive-Bayes-Classifier.py:
--------------------------------------------------------------------------------
1 | import textblob
2 | import numpy
3 |
4 | class Document(object):
5 |
6 |     STOPWORDS = "are you my I a and these to it with me your not but him do so"
7 |
8 |     @classmethod
9 |     def make_stop_words(cls, stopwords):
10 |         return stopwords.lower().split()
11 |
12 |     def __init__(self, text, label=None):
13 |         self.text = text
14 |         self.label = label
15 |         self.stopwords = Document.make_stop_words(Document.STOPWORDS)
16 |         self.wordVector = None
17 |
18 |     def get_label(self):
19 |         return self.label
20 |
21 |     def split_and_remove_stop_words(self):
22 |         ## split and make all the words lower case
23 |         splitText = self.text.lower().split()
24 |         scrubbedText = []
25 |         for word in splitText:
26 |             if word not in self.stopwords:
27 |                 scrubbedText.append(word)
28 |         self.wordVector = scrubbedText
29 |
30 |     def count_tokens(self):
31 |         return len(self.wordVector)
32 |
33 |     def get_word_frequencies(self):
34 |         wordFreq = {}
35 |         for word in self.wordVector:
36 |             if word not in wordFreq:
37 |                 wordFreq[word] = 1
38 |             else:
39 |                 wordFreq[word] += 1
40 |         return wordFreq
41 |
42 |     def get_vocabulary(self):
43 |         wordFreq = self.get_word_frequencies()
44 |         return wordFreq.keys()
45 |
46 | class DocDatabase(object):
47 |
48 |     def __init__(self, documents):
49 |         self.documents = documents
50 |         self.classes = self.get_classes()
51 |         self.vocabulary = self.construct_complete_vocabulary()
52 |         self.priorProbs = self.calc_prior_probs()
53 |         self.conditionalProbs = self.calc_conditional_prob_per_word()
54 |
55 |     def get_classes(self):
56 |         classes = []
57 |         for d in self.documents:
58 |             label = d.get_label()
59 |             if label not in classes:
60 |                 classes.append(label)
61 |         return classes
62 |
63 |     def count_docs_per_class(self):
64 |         """ Determine the number of documents per class """
65 |         classCounts = { c:0 for c in self.classes }
66 |         for d in self.documents:
67 |             label = d.get_label()
68 |             classCounts[label] += 1
69 |         return classCounts
70 |
71 |     def calc_prior_probs(self):
72 |         """ Determine the probability of each class. This is also known as the
73 |         prior probability. """
74 |         classCounts = self.count_docs_per_class()
75 |         totalNumTexts = sum(classCounts.values())
76 |         classProbs = { c:( classCounts[c] / float(totalNumTexts) ) for c in classCounts.keys() }
77 |         return classProbs
78 |
79 |     def construct_complete_vocabulary(self):
80 |         """ Generate a complete list of vocabulary words across all documents """
81 |         vocab = set([])
82 |         for d in self.documents:
83 |             vocab = vocab.union(set(d.get_vocabulary()))
84 |         return vocab
85 |
86 |     def calc_word_freq_per_class(self):
87 |         """ Determine the word frequencies for each class """
88 |         classVocab = {}
89 |         for c in self.classes:
90 |             ## initialize the word frequencies to 0
91 |             classVocab[c] = { word:0 for word in self.vocabulary }
92 |         for d in self.documents:
93 |             myClass = classVocab[d.get_label()]
94 |             myFrequencies = d.get_word_frequencies()
95 |             for word in myFrequencies.keys():
96 |                 myClass[word] += myFrequencies[word]
97 |         return classVocab
98 |
99 |     def count_tokens_per_class(self):
100 |         countTokens = { c:0 for c in self.classes }
101 |         for d in self.documents:
102 |             countTokens[d.get_label()] += d.count_tokens()
103 |         return countTokens
104 |
105 |     def calc_conditional_prob_per_word(self):
106 |         """ We will use LAPLACE ADD-1 SMOOTHING:
107 |         p(word | class ) = [ ( # of tokens of word in class ) + 1 ] / [ ( total number of tokens in class ) + VOCAB_SIZE ] """
108 |         conditionalProbs = self.calc_word_freq_per_class()
109 |         countTokens = self.count_tokens_per_class()
110 |         for c in conditionalProbs.keys():
111 |             for w in conditionalProbs[c].keys():
112 |                 conditionalProbs[c][w] = float( conditionalProbs[c][w] + 1) / float( countTokens[c] + len(self.vocabulary))
113 |         return conditionalProbs
114 |
115 |     def prior_prob(self, givenClass):
116 |         return self.priorProbs[givenClass]
117 |
118 |     def conditional_prob(self, givenClass, word):
119 |         ## if the word is actually contained in the known vocabulary for the class,
120 |         ## return the conditional probability
121 |         if word in self.conditionalProbs[givenClass].keys():
122 |             return self.conditionalProbs[givenClass][word]
123 |         ## if the word is unknown, then use the following smoothing approximation
124 |         ## Pr(word) = 1 / ( VOCAB-SIZE + 1 )
125 |         else:
126 |             return 1 / float(len(self.vocabulary) + 1)
127 |
128 |
129 |     def classify(self, testDoc):
130 |         """ Given a test document, determine the most probable classification """
131 |         ## Get the word frequencies for the document
132 |         doc = Document(testDoc)
133 |         doc.split_and_remove_stop_words()
134 |         docWordFreqs = doc.get_word_frequencies()
135 |         docWords = docWordFreqs.keys()
136 |         ## P(c|doc) is proportional to P(c) * product over words w of P(w|c) ^ (count_w)
137 |         results = {}
138 |         for c in self.classes:
139 |             productOfConditionals = numpy.prod([self.conditional_prob(c, x) ** docWordFreqs[x] for x in docWords])
140 |             probOfClass = productOfConditionals * self.prior_prob(c)
141 |             results[c] = probOfClass
142 |         bestLabel = max( results.items(), key=lambda x: x[1])
143 |         return bestLabel[0]
144 |
145 |     def classify_test_set(self, testSet):
146 |         return [self.classify(x) for x in testSet]
147 |
148 |
149 | def test_doc():
150 |     class1 = [ "How are you my friends I brought you a sandwich",
151 |                "I found a sandwich and these beers and I wanted to know you wanted to share it with me",
152 |                "Listen my friend I going to get a beer tonight you want to join me" ]
153 |     class2 = [ "Friends Romans countryman lend me your ears",
154 |                "I come not to praise caesar but to bury him gentle romans",
155 |                "mighty caesar do you lie so low" ]
156 |     testSet = [ "Beers sandwich tonight", "caesar romans beers", "bury bury friends sandwiches share" ]
157 |     documents = []
158 |     for doc in class1:
159 |         docObject = Document(doc, 'class1')
160 |         docObject.split_and_remove_stop_words()
161 |         documents.append(docObject)
162 |     for doc in class2:
163 |         docObject = Document(doc, 'class2')
164 |         docObject.split_and_remove_stop_words()
165 |         documents.append(docObject)
166 |
167 |     myDD = DocDatabase(documents)
168 |     myDD.construct_complete_vocabulary()
169 |     print(myDD.classify_test_set(testSet))
170 |     #print(myDD.documents)
171 |     #print(myDD.get_classes())
172 |     #print(myDD.vocabulary)
173 |     #print(myDD.calc_word_freq_per_class())
174 |     #print(myDD.count_tokens_per_class())
175 |     #print(myDD.calc_conditional_prob_per_word())
176 |
177 |     """
178 |     doc1 = Document(class1[0], 'class1')
179 |     doc1.split_and_remove_stop_words()
180 |     print(doc1.wordVector)
181 |     print(doc1.count_tokens())
182 |     print(doc1.get_word_frequencies())
183 |     print(doc1.get_vocabulary())
184 |     """
185 |
186 | test_doc()
187 |
--------------------------------------------------------------------------------
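The classifier in Naive-Bayes-Classifier.py multiplies per-word conditional probabilities directly, which can underflow to 0.0 on long documents. Below is a minimal sketch of the same scoring rule computed in log space. The names `log_score` and `classify` and the toy probability tables are hypothetical and not part of the repository; the unknown-word fallback of 1 / (VOCAB_SIZE + 1) is taken from `conditional_prob()` in that file.

```python
import math

def log_score(word_counts, prior, cond_probs, vocab_size):
    """Return log P(c) + sum over words of count_w * log P(w|c) for one class.

    Words missing from cond_probs fall back to 1 / (vocab_size + 1),
    mirroring the fallback used by conditional_prob() in the repository file.
    """
    score = math.log(prior)
    for word, count in word_counts.items():
        p = cond_probs.get(word, 1.0 / (vocab_size + 1))
        score += count * math.log(p)
    return score

def classify(word_counts, priors, cond_probs_by_class, vocab_size):
    """Pick the class whose log score is highest."""
    return max(
        priors,
        key=lambda c: log_score(word_counts, priors[c],
                                cond_probs_by_class[c], vocab_size),
    )

# Hypothetical toy tables, for illustration only.
priors = {"class1": 0.5, "class2": 0.5}
cond = {
    "class1": {"sandwich": 0.2, "caesar": 0.05},
    "class2": {"sandwich": 0.05, "caesar": 0.2},
}
label = classify({"sandwich": 2, "caesar": 1}, priors, cond, vocab_size=2)
```

Because the logarithm is monotonic, the class ranking is the same as with the direct product; only the numerical range changes.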
/Classifiers/PracticePDFExtractor.py:
--------------------------------------------------------------------------------
1 | import PyPDF2
2 | import csv
3 | import pdfminer
4 | import os
5 |
6 |
7 | def extract_text_using_pypdf(fileName):
8 |     pdf = PyPDF2.PdfFileReader(open(fileName, "rb"))
9 |     allText = []
10 |     for page in pdf.pages:
11 |         allText.append(page.extractText())
12 |     return allText
13 |
14 | def extract_text_using_pdf_miner(inputFile, outputFile):
15 |     """ This method calls a command line argument from the pdfminer library
16 |     Indicate txt or html by the file name output.txt or output.html
17 |     For more commands, see http://www.unixuser.org/~euske/python/pdfminer/
18 |     """
19 |     commandString = "pdf2txt.py -o " + outputFile + " " + inputFile
20 |     os.system(commandString)
21 |
22 | def export_to_csv(inputFileName, csvFileName):
23 |     with open(inputFileName, 'r') as f:
24 |         with open(csvFileName, 'wb') as csvfile:
25 |             myWriter = csv.writer(csvfile, delimiter='\t')
26 |             myWriter.writerow(f.readlines())
27 |
28 | def test():
29 |     #textVector = extract_text_using_pypdf("Lunch-Money.pdf")
30 |     #export_to_csv(textVector[2:], "Lunch-Money.csv")
31 |     extract_text_using_pdf_miner("E3562014236085.pdf", "E3562014236085.html")
32 |     export_to_csv("E3562014236085.html", "E3562014236085.csv")
33 |
34 |
35 | if __name__ == '__main__':
36 |     test()
37 |
--------------------------------------------------------------------------------
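`extract_text_using_pdf_miner` builds its shell command by string concatenation, so a file name containing spaces or shell metacharacters would be misparsed. Below is a minimal sketch of one alternative, assuming the same `pdf2txt.py -o OUTPUT INPUT` invocation and that `pdf2txt.py` is executable and on the PATH. The helper names `build_pdf2txt_command` and `run_pdf2txt` are hypothetical and not part of the repository.

```python
import subprocess

def build_pdf2txt_command(input_file, output_file):
    """Return the argument list for pdf2txt.py; no shell parsing is involved."""
    return ["pdf2txt.py", "-o", output_file, input_file]

def run_pdf2txt(input_file, output_file):
    """Run pdf2txt.py and raise CalledProcessError on a non-zero exit code."""
    subprocess.check_call(build_pdf2txt_command(input_file, output_file))

# Building the command does not require pdfminer to be installed.
cmd = build_pdf2txt_command("my syllabus.pdf", "my syllabus.html")
```

Passing a list to `subprocess.check_call` hands each argument to the program unchanged, so quoting is not needed, and a failed conversion raises an exception instead of passing silently.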
/LICENSE:
--------------------------------------------------------------------------------
1 | Code License
2 |
3 | This code is released under the [GPL], version 2 or later:
4 |
5 | This program is free software; you can redistribute it and/or modify
6 | it under the terms of the GNU General Public License as published by
7 | the Free Software Foundation; either version 2 of the License, or
8 | (at your option) any later version.
9 |
10 | This program is distributed in the hope that it will be useful,
11 | but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 | GNU General Public License for more details.
14 |
15 | You should have received a copy of the GNU General Public License
16 | along with this program; if not, write to the Free Software
17 | Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
18 |
19 | The GNU General Public License is available in the file COPYING in
20 | the source distribution. On Debian systems, the complete text of the
21 | GPL can be found in `/usr/share/common-licenses/GPL`.
22 |
23 | [GPL]: http://www.gnu.org/copyleft/gpl.html
24 |
25 | Documentation License
26 |
27 | All documentation contained in this repository is licensed under a
28 | Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
29 | For details, please see the license file located in this directory (./cc-by-nc-sa-3.0.md) or
30 | visit [the link](http://creativecommons.org/licenses/by-nc-sa/3.0/deed.en_US).
31 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | opensyllabus
2 | ============
3 |
4 | *The Open Syllabus Project* seeks to promote institutional cooperation in the task of gathering and analyzing a significant corpus of syllabi. Our principles are as follows:
5 |
6 | *We believe in openness and transparency while respecting the intellectual property rights of all constituents.* We create and make publicly available a rich dataset of metadata, while protecting the original documents in a secure “research sandbox” environment.
7 |
8 | *We believe in data-driven innovation.* A critical mass of documents can foster new tools, drive policy change, enable best-practices, provide metrics, and aid in search, discovery, and the creation of new course materials.
9 |
10 | *We invite participating scholars and institutions to collaborate and benefit* from the project’s research, platform- and tool-development experiments. Our team includes the nation’s leading librarians and legal scholars who are committed to an ongoing dialog about knowledge sharing, preservation, and accessibility.
11 |
12 | Email:
13 | share [at] opensyllabusproject [dot] org
14 |
15 | Twitter:
16 | @opensyllabus
17 |
--------------------------------------------------------------------------------
/UI/SimpleQuerier.html:
--------------------------------------------------------------------------------
Simple Page for Querying Open Syllabus Project REST API

Instructions: (1) specify parameter values, (2) click the Query button, (3) scroll down to review the results.

Parameter Name | Parameter Value | Parameter Information
Base URL | | The URL for querying the Open Syllabus Project MongoDB. (Get this from the OSP.)
Username | | Your username. (Get this from the OSP.)
Password | | Your password. (Get this from the OSP.)
criteria | | Search criteria, formatted as a JSON object.
fields | | Names of fields, formatted into a JSON object, that you want in the results.
sort | | Names of fields, formatted into a JSON object, by which you want the results to be sorted.
skip | | number
limit | | number
explain | |
batch_size | | number to return
Constructed URL for query | | You do not need to edit this because it is dynamically constructed from the preceding parameter values you provide. However, if desired, you can skip providing parameter values other than base URL, username, and password and provide your own constructed URL directly.

Query Results
--------------------------------------------------------------------------------
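SimpleQuerier.html describes a query URL that is constructed from the parameter values the user enters. Below is a minimal Python 3 sketch of that kind of assembly, under stated assumptions: that the parameters travel as URL query-string arguments, that `criteria`, `fields`, and `sort` are JSON-encoded, and that `explain` is sent as `true` or `false`. The function name `build_query_url`, the example base URL, and the example field names are hypothetical; the real endpoint and schema come from the OSP.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

def build_query_url(base_url, criteria=None, fields=None, sort=None,
                    skip=None, limit=None, explain=None, batch_size=None):
    """Assemble a query URL from the parameters named on the page.

    Parameters left as None are omitted from the URL.
    """
    params = []
    # JSON-object parameters are serialized before URL encoding.
    for name, value in (("criteria", criteria), ("fields", fields), ("sort", sort)):
        if value is not None:
            params.append((name, json.dumps(value, sort_keys=True)))
    # Numeric parameters are sent as plain integers.
    for name, value in (("skip", skip), ("limit", limit), ("batch_size", batch_size)):
        if value is not None:
            params.append((name, str(int(value))))
    # Assumption: the boolean flag is spelled "true"/"false".
    if explain is not None:
        params.append(("explain", "true" if explain else "false"))
    if not params:
        return base_url
    return base_url + "?" + urlencode(params)

# Hypothetical endpoint and field names, for illustration only.
url = build_query_url("https://example.org/osp/syllabi",
                      criteria={"subject": "music"},
                      fields={"title": 1},
                      limit=10)

# Round-trip check: decode the query string back into its parameters.
parsed = parse_qs(urlsplit(url).query)
```

`urlencode` percent-encodes the braces and quotes in the JSON values, so the resulting URL can be pasted into the constructed-URL field as is.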
/cc-by-nc-sa-3.0.md:
--------------------------------------------------------------------------------
1 | # Creative Commons
2 |
3 |
4 | ## Creative Commons Legal Code
5 |
6 | ### Attribution-NonCommercial-ShareAlike 3.0 Unported
7 |
8 |
9 | CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS LICENSE DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE INFORMATION PROVIDED, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM ITS USE.
10 |
11 | *License*
12 |
13 | THE WORK (AS DEFINED BELOW) IS PROVIDED UNDER THE TERMS OF THIS CREATIVE COMMONS PUBLIC LICENSE ("CCPL" OR "LICENSE"). THE WORK IS PROTECTED BY COPYRIGHT AND/OR OTHER APPLICABLE LAW. ANY USE OF THE WORK OTHER THAN AS AUTHORIZED UNDER THIS LICENSE OR COPYRIGHT LAW IS PROHIBITED.
14 |
15 | BY EXERCISING ANY RIGHTS TO THE WORK PROVIDED HERE, YOU ACCEPT AND AGREE TO BE BOUND BY THE TERMS OF THIS LICENSE. TO THE EXTENT THIS LICENSE MAY BE CONSIDERED TO BE A CONTRACT, THE LICENSOR GRANTS YOU THE RIGHTS CONTAINED HERE IN CONSIDERATION OF YOUR ACCEPTANCE OF SUCH TERMS AND CONDITIONS.
16 |
17 | **1. Definitions**
18 |
19 | a. "Adaptation" means a work based upon the Work, or upon the Work and other pre-existing works, such as a translation, adaptation, derivative work, arrangement of music or other alterations of a literary or artistic work, or phonogram or performance and includes cinematographic adaptations or any other form in which the Work may be recast, transformed, or adapted including in any form recognizably derived from the original, except that a work that constitutes a Collection will not be considered an Adaptation for the purpose of this License. For the avoidance of doubt, where the Work is a musical work, performance or phonogram, the synchronization of the Work in timed-relation with a moving image ("synching") will be considered an Adaptation for the purpose of this License.
20 | b. "Collection" means a collection of literary or artistic works, such as encyclopedias and anthologies, or performances, phonograms or broadcasts, or other works or subject matter other than works listed in Section 1(g) below, which, by reason of the selection and arrangement of their contents, constitute intellectual creations, in which the Work is included in its entirety in unmodified form along with one or more other contributions, each constituting separate and independent works in themselves, which together are assembled into a collective whole. A work that constitutes a Collection will not be considered an Adaptation (as defined above) for the purposes of this License.
21 | c. "Distribute" means to make available to the public the original and copies of the Work or Adaptation, as appropriate, through sale or other transfer of ownership.
22 | d. "License Elements" means the following high-level license attributes as selected by Licensor and indicated in the title of this License: Attribution, Noncommercial, ShareAlike.
23 | e. "Licensor" means the individual, individuals, entity or entities that offer(s) the Work under the terms of this License.
24 | f. "Original Author" means, in the case of a literary or artistic work, the individual, individuals, entity or entities who created the Work or if no individual or entity can be identified, the publisher; and in addition (i) in the case of a performance the actors, singers, musicians, dancers, and other persons who act, sing, deliver, declaim, play in, interpret or otherwise perform literary or artistic works or expressions of folklore; (ii) in the case of a phonogram the producer being the person or legal entity who first fixes the sounds of a performance or other sounds; and, (iii) in the case of broadcasts, the organization that transmits the broadcast.
25 | g. "Work" means the literary and/or artistic work offered under the terms of this License including without limitation any production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression including digital form, such as a book, pamphlet and other writing; a lecture, address, sermon or other work of the same nature; a dramatic or dramatico-musical work; a choreographic work or entertainment in dumb show; a musical composition with or without words; a cinematographic work to which are assimilated works expressed by a process analogous to cinematography; a work of drawing, painting, architecture, sculpture, engraving or lithography; a photographic work to which are assimilated works expressed by a process analogous to photography; a work of applied art; an illustration, map, plan, sketch or three-dimensional work relative to geography, topography, architecture or science; a performance; a broadcast; a phonogram; a compilation of data to the extent it is protected as a copyrightable work; or a work performed by a variety or circus performer to the extent it is not otherwise considered a literary or artistic work.
26 | h. "You" means an individual or entity exercising rights under this License who has not previously violated the terms of this License with respect to the Work, or who has received express permission from the Licensor to exercise rights under this License despite a previous violation.
27 | i. "Publicly Perform" means to perform public recitations of the Work and to communicate to the public those public recitations, by any means or process, including by wire or wireless means or public digital performances; to make available to the public Works in such a way that members of the public may access these Works from a place and at a place individually chosen by them; to perform the Work to the public by any means or process and the communication to the public of the performances of the Work, including by public digital performance; to broadcast and rebroadcast the Work by any means including signs, sounds or images.
28 | j. "Reproduce" means to make copies of the Work by any means including without limitation by sound or visual recordings and the right of fixation and reproducing fixations of the Work, including storage of a protected performance or phonogram in digital form or other electronic medium.
29 |
30 | **2. Fair Dealing Rights.** Nothing in this License is intended to reduce, limit, or restrict any uses free from copyright or rights arising from limitations or exceptions that are provided for in connection with the copyright protection under copyright law or other applicable laws.
31 |
32 | **3. License Grant.** Subject to the terms and conditions of this License, Licensor hereby grants You a worldwide, royalty-free, non-exclusive, perpetual (for the duration of the applicable copyright) license to exercise the rights in the Work as stated below:
33 |
34 | a. to Reproduce the Work, to incorporate the Work into one or more Collections, and to Reproduce the Work as incorporated in the Collections;
35 | b. to create and Reproduce Adaptations provided that any such Adaptation, including any translation in any medium, takes reasonable steps to clearly label, demarcate or otherwise identify that changes were made to the original Work. For example, a translation could be marked "The original work was translated from English to Spanish," or a modification could indicate "The original work has been modified.";
36 | c. to Distribute and Publicly Perform the Work including as incorporated in Collections; and,
37 | d. to Distribute and Publicly Perform Adaptations.
38 | The above rights may be exercised in all media and formats whether now known or hereafter devised. The above rights include the right to make such modifications as are technically necessary to exercise the rights in other media and formats. Subject to Section 8(f), all rights not expressly granted by Licensor are hereby reserved, including but not limited to the rights described in Section 4(e).
39 |
40 | **4. Restrictions.** The license granted in Section 3 above is expressly made subject to and limited by the following restrictions:
41 |
42 | a. You may Distribute or Publicly Perform the Work only under the terms of this License. You must include a copy of, or the Uniform Resource Identifier (URI) for, this License with every copy of the Work You Distribute or Publicly Perform. You may not offer or impose any terms on the Work that restrict the terms of this License or the ability of the recipient of the Work to exercise the rights granted to that recipient under the terms of the License. You may not sublicense the Work. You must keep intact all notices that refer to this License and to the disclaimer of warranties with every copy of the Work You Distribute or Publicly Perform. When You Distribute or Publicly Perform the Work, You may not impose any effective technological measures on the Work that restrict the ability of a recipient of the Work from You to exercise the rights granted to that recipient under the terms of the License. This Section 4(a) applies to the Work as incorporated in a Collection, but this does not require the Collection apart from the Work itself to be made subject to the terms of this License. If You create a Collection, upon notice from any Licensor You must, to the extent practicable, remove from the Collection any credit as required by Section 4(d), as requested. If You create an Adaptation, upon notice from any Licensor You must, to the extent practicable, remove from the Adaptation any credit as required by Section 4(d), as requested.
43 | b. You may Distribute or Publicly Perform an Adaptation only under: (i) the terms of this License; (ii) a later version of this License with the same License Elements as this License; (iii) a Creative Commons jurisdiction license (either this or a later license version) that contains the same License Elements as this License (e.g., Attribution-NonCommercial-ShareAlike 3.0 US) ("Applicable License"). You must include a copy of, or the URI, for Applicable License with every copy of each Adaptation You Distribute or Publicly Perform. You may not offer or impose any terms on the Adaptation that restrict the terms of the Applicable License or the ability of the recipient of the Adaptation to exercise the rights granted to that recipient under the terms of the Applicable License. You must keep intact all notices that refer to the Applicable License and to the disclaimer of warranties with every copy of the Work as included in the Adaptation You Distribute or Publicly Perform. When You Distribute or Publicly Perform the Adaptation, You may not impose any effective technological measures on the Adaptation that restrict the ability of a recipient of the Adaptation from You to exercise the rights granted to that recipient under the terms of the Applicable License. This Section 4(b) applies to the Adaptation as incorporated in a Collection, but this does not require the Collection apart from the Adaptation itself to be made subject to the terms of the Applicable License.
44 | c. You may not exercise any of the rights granted to You in Section 3 above in any manner that is primarily intended for or directed toward commercial advantage or private monetary compensation. The exchange of the Work for other copyrighted works by means of digital file-sharing or otherwise shall not be considered to be intended for or directed toward commercial advantage or private monetary compensation, provided there is no payment of any monetary compensation in connection with the exchange of copyrighted works.
45 | d. If You Distribute, or Publicly Perform the Work or any Adaptations or Collections, You must, unless a request has been made pursuant to Section 4(a), keep intact all copyright notices for the Work and provide, reasonable to the medium or means You are utilizing: (i) the name of the Original Author (or pseudonym, if applicable) if supplied, and/or if the Original Author and/or Licensor designate another party or parties (e.g., a sponsor institute, publishing entity, journal) for attribution ("Attribution Parties") in Licensor's copyright notice, terms of service or by other reasonable means, the name of such party or parties; (ii) the title of the Work if supplied; (iii) to the extent reasonably practicable, the URI, if any, that Licensor specifies to be associated with the Work, unless such URI does not refer to the copyright notice or licensing information for the Work; and, (iv) consistent with Section 3(b), in the case of an Adaptation, a credit identifying the use of the Work in the Adaptation (e.g., "French translation of the Work by Original Author," or "Screenplay based on original Work by Original Author"). The credit required by this Section 4(d) may be implemented in any reasonable manner; provided, however, that in the case of a Adaptation or Collection, at a minimum such credit will appear, if a credit for all contributing authors of the Adaptation or Collection appears, then as part of these credits and in a manner at least as prominent as the credits for the other contributing authors. For the avoidance of doubt, You may only use the credit required by this Section for the purpose of attribution in the manner set out above and, by exercising Your rights under this License, You may not implicitly or explicitly assert or imply any connection with, sponsorship or endorsement by the Original Author, Licensor and/or Attribution Parties, as appropriate, of You or Your use of the Work, without the separate, express prior written permission of the Original Author, Licensor and/or Attribution Parties.
46 | e. For the avoidance of doubt:
47 |
48 | i. Non-waivable Compulsory License Schemes. In those jurisdictions in which the right to collect royalties through any statutory or compulsory licensing scheme cannot be waived, the Licensor reserves the exclusive right to collect such royalties for any exercise by You of the rights granted under this License;
49 | ii. Waivable Compulsory License Schemes. In those jurisdictions in which the right to collect royalties through any statutory or compulsory licensing scheme can be waived, the Licensor reserves the exclusive right to collect such royalties for any exercise by You of the rights granted under this License if Your exercise of such rights is for a purpose or use which is otherwise than noncommercial as permitted under Section 4(c) and otherwise waives the right to collect royalties through any statutory or compulsory licensing scheme; and,
50 | iii. Voluntary License Schemes. The Licensor reserves the right to collect royalties, whether individually or, in the event that the Licensor is a member of a collecting society that administers voluntary licensing schemes, via that society, from any exercise by You of the rights granted under this License that is for a purpose or use which is otherwise than noncommercial as permitted under Section 4(c).
51 | f. Except as otherwise agreed in writing by the Licensor or as may be otherwise permitted by applicable law, if You Reproduce, Distribute or Publicly Perform the Work either by itself or as part of any Adaptations or Collections, You must not distort, mutilate, modify or take other derogatory action in relation to the Work which would be prejudicial to the Original Author's honor or reputation. Licensor agrees that in those jurisdictions (e.g. Japan), in which any exercise of the right granted in Section 3(b) of this License (the right to make Adaptations) would be deemed to be a distortion, mutilation, modification or other derogatory action prejudicial to the Original Author's honor and reputation, the Licensor will waive or not assert, as appropriate, this Section, to the fullest extent permitted by the applicable national law, to enable You to reasonably exercise Your right under Section 3(b) of this License (right to make Adaptations) but not otherwise.
52 |
53 | **5. Representations, Warranties and Disclaimer**
54 |
55 | UNLESS OTHERWISE MUTUALLY AGREED TO BY THE PARTIES IN WRITING AND TO THE FULLEST EXTENT PERMITTED BY APPLICABLE LAW, LICENSOR OFFERS THE WORK AS-IS AND MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THE WORK, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OF ABSENCE OF ERRORS, WHETHER OR NOT DISCOVERABLE. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO THIS EXCLUSION MAY NOT APPLY TO YOU.
56 |
57 | **6. Limitation on Liability.** EXCEPT TO THE EXTENT REQUIRED BY APPLICABLE LAW, IN NO EVENT WILL LICENSOR BE LIABLE TO YOU ON ANY LEGAL THEORY FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR EXEMPLARY DAMAGES ARISING OUT OF THIS LICENSE OR THE USE OF THE WORK, EVEN IF LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
58 |
59 | **7. Termination**
60 |
61 | This License and the rights granted hereunder will terminate automatically upon any breach by You of the terms of this License. Individuals or entities who have received Adaptations or Collections from You under this License, however, will not have their licenses terminated provided such individuals or entities remain in full compliance with those licenses. Sections 1, 2, 5, 6, 7, and 8 will survive any termination of this License.
62 | Subject to the above terms and conditions, the license granted here is perpetual (for the duration of the applicable copyright in the Work). Notwithstanding the above, Licensor reserves the right to release the Work under different license terms or to stop distributing the Work at any time; provided, however that any such election will not serve to withdraw this License (or any other license that has been, or is required to be, granted under the terms of this License), and this License will continue in full force and effect unless terminated as stated above.
63 |
64 | **8. Miscellaneous**
65 |
66 | Each time You Distribute or Publicly Perform the Work or a Collection, the Licensor offers to the recipient a license to the Work on the same terms and conditions as the license granted to You under this License.
67 | Each time You Distribute or Publicly Perform an Adaptation, Licensor offers to the recipient a license to the original Work on the same terms and conditions as the license granted to You under this License.
68 | If any provision of this License is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this License, and without further action by the parties to this agreement, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable.
69 | No term or provision of this License shall be deemed waived and no breach consented to unless such waiver or consent shall be in writing and signed by the party to be charged with such waiver or consent.
70 | This License constitutes the entire agreement between the parties with respect to the Work licensed here. There are no understandings, agreements or representations with respect to the Work not specified here. Licensor shall not be bound by any additional provisions that may appear in any communication from You. This License may not be modified without the mutual written agreement of the Licensor and You.
71 | The rights granted under, and the subject matter referenced, in this License were drafted utilizing the terminology of the Berne Convention for the Protection of Literary and Artistic Works (as amended on September 28, 1979), the Rome Convention of 1961, the WIPO Copyright Treaty of 1996, the WIPO Performances and Phonograms Treaty of 1996 and the Universal Copyright Convention (as revised on July 24, 1971). These rights and subject matter take effect in the relevant jurisdiction in which the License terms are sought to be enforced according to the corresponding provisions of the implementation of those treaty provisions in the applicable national law. If the standard suite of rights granted under applicable copyright law includes additional rights not granted under this License, such additional rights are deemed to be included in the License; this License is not intended to restrict the license of any rights under applicable law.
72 | Creative Commons Notice
73 |
74 | Creative Commons is not a party to this License, and makes no warranty whatsoever in connection with the Work. Creative Commons will not be liable to You or any party on any legal theory for any damages whatsoever, including without limitation any general, special, incidental or consequential damages arising in connection to this license. Notwithstanding the foregoing two (2) sentences, if Creative Commons has expressly identified itself as the Licensor hereunder, it shall have all rights and obligations of Licensor.
75 |
76 | Except for the limited purpose of indicating to the public that the Work is licensed under the CCPL, Creative Commons does not authorize the use by either party of the trademark "Creative Commons" or any related trademark or logo of Creative Commons without the prior written consent of Creative Commons. Any permitted use will be in compliance with Creative Commons' then-current trademark usage guidelines, as may be published on its website or otherwise made available upon request from time to time. For the avoidance of doubt, this trademark restriction does not form part of this License.
77 |
78 | Creative Commons may be contacted at http://creativecommons.org/.
79 |
--------------------------------------------------------------------------------
/extractor_research/extractors/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/extractors/__init__.py
--------------------------------------------------------------------------------
/extractor_research/extractors/miner.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 | from pdfminer.layout import LAParams
3 | from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
4 | from pdfminer.converter import TextConverter, HTMLConverter, XMLConverter
5 | from pdfminer.pdfpage import PDFPage
6 |
7 | class Miner:
8 |     def __init__(self, pdf_file, txt_file, file_format='txt', layout_analysis=True):
9 |         # reject unknown formats up front so self.device is always defined
10 |         if file_format not in ('txt', 'html', 'xml'):
11 |             raise ValueError('unsupported file_format: %s' % file_format)
12 |
13 |         # open() replaces the Python 2-only file() builtin
14 |         self.pdf_file = open(pdf_file, 'rb')
15 |         self.outfp = open(txt_file, 'w')
16 |
17 |         # LAParams enables pdfminer's layout analysis; None disables it
18 |         if layout_analysis:
19 |             laparams = LAParams()
20 |         else:
21 |             laparams = None
22 |
23 |         self.rsrcmgr = PDFResourceManager(caching=True)
24 |
25 |         if file_format == 'txt':
26 |             self.device = TextConverter(self.rsrcmgr, self.outfp, codec='utf-8',
27 |                                         laparams=laparams, imagewriter=None)
28 |         elif file_format == 'html':
29 |             self.device = HTMLConverter(self.rsrcmgr, self.outfp, codec='utf-8',
30 |                                         laparams=laparams, imagewriter=None)
31 |         elif file_format == 'xml':
32 |             self.device = XMLConverter(self.rsrcmgr, self.outfp, codec='utf-8',
33 |                                        laparams=laparams, imagewriter=None)
34 |
35 |     def extract(self):
36 |         interpreter = PDFPageInterpreter(self.rsrcmgr, self.device)
37 |         # an empty set of page numbers means "process every page"
38 |         pagenos = set()
39 |         for page in PDFPage.get_pages(self.pdf_file, pagenos, maxpages=0,
40 |                                       password=None, caching=True, check_extractable=True):
41 |             interpreter.process_page(page)
42 |         self.pdf_file.close()
43 |         self.device.close()
44 |         self.outfp.close()
45 |
46 | if __name__ == '__main__':
47 |     import os
48 |
49 |     # converts pdfs in the input directory into xml format
50 |     pdfList = [('../input/%s' % f) for f in os.listdir('../input/') if f.endswith('.pdf')]
51 |     # swap only the trailing .pdf extension and write into ../output/
52 |     xmlList = [('../output/%s.xml' % os.path.basename(f)[:-len('.pdf')]) for f in pdfList]
53 |
54 |     for pdf_path, xml_path in zip(pdfList, xmlList):
55 |         print('converting: %s to %s' % (pdf_path, xml_path))
56 |         miner = Miner(pdf_path, xml_path, file_format='xml')
57 |         miner.extract()
--------------------------------------------------------------------------------
/extractor_research/extractors/pdf2.py:
--------------------------------------------------------------------------------
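1 | #!/usr/bin/python
2 | import io
3 |
4 | from PyPDF2 import PdfFileReader
5 |
6 | class PDF2:
7 |     def __init__(self, pdf_file, txt_file):
8 |         # keep a handle on the input stream so it can be closed after extraction
9 |         self.pdf_stream = open(pdf_file, 'rb')
10 |         self.doc = PdfFileReader(self.pdf_stream)
11 |         # extractText() returns unicode text, so write it through an explicit UTF-8 encoder
12 |         self.output = io.open(txt_file, 'w', encoding='utf-8')
13 |
14 |     def extract(self):
15 |         for page in self.doc.pages:
16 |             self.output.write(page.extractText())
17 |         self.output.close()
18 |         self.pdf_stream.close()
19 |
20 | if __name__ == '__main__':
21 |     pdf = PDF2('../input/pride_and_prej/1.pdf', '../output/pride_and_prej/pdf2/1.txt')
22 |     pdf.extract()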
--------------------------------------------------------------------------------
/extractor_research/extractors/pdfbox-app-1.8.5.jar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/extractors/pdfbox-app-1.8.5.jar
--------------------------------------------------------------------------------
/extractor_research/extractors/pdfbox.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 | import os
3 | import subprocess
4 |
5 | # the jar ships alongside this module, so resolve it relative to this file
6 | # instead of hardcoding a path under the home directory
7 | PDFBOX_JAR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'pdfbox-app-1.8.5.jar')
8 |
9 | class PDFBox:
10 |     def __init__(self, pdf_file, txt_file):
11 |         self.pdf_file = pdf_file
12 |         self.txt_file = txt_file
13 |
14 |     def extract(self):
15 |         # pass arguments as a list so paths with spaces or parentheses are not parsed by a shell
16 |         command = ['java', '-jar', PDFBOX_JAR, 'ExtractText', self.pdf_file, self.txt_file]
17 |         subprocess.call(command)
18 |
19 | if __name__ == '__main__':
20 |     pdf = PDFBox('../input/pride_and_prej/1.pdf', '../output/pride_and_prej/pdfbox/1.txt')
21 |     pdf.extract()
--------------------------------------------------------------------------------
/extractor_research/extractors/textstream.py:
--------------------------------------------------------------------------------
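1 | #!/usr/bin/python
2 | import os
3 | import subprocess
4 |
5 | # TextStream.class and PDFTextStream.jar live in the textstream/ directory next to this module
6 | TEXTSTREAM_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'textstream')
7 |
8 | class TextStream:
9 |     def __init__(self, pdf_file, txt_file):
10 |         self.pdf_file = pdf_file
11 |         self.txt_file = txt_file
12 |
13 |     def extract(self):
14 |         # java expands the trailing * itself to pick up every jar in the directory
15 |         classpath = os.pathsep.join([TEXTSTREAM_DIR, os.path.join(TEXTSTREAM_DIR, '*')])
16 |         # pass arguments as a list so paths with spaces or parentheses are not parsed by a shell
17 |         command = ['java', '-cp', classpath, 'TextStream', self.pdf_file, self.txt_file]
18 |         subprocess.call(command)
19 |
20 | if __name__ == '__main__':
21 |     pdf = TextStream('../input/pride_and_prej/1.pdf', '../output/pride_and_prej/textstream/1.txt')
22 |     pdf.extract()
23 |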
--------------------------------------------------------------------------------
/extractor_research/extractors/textstream/LICENSE:
--------------------------------------------------------------------------------
1 | A copy of this license is included with your distribution of PDFTextStream™ Software.
2 |
3 | SNOWTIDE™ INFORMATICS SYSTEMS, INC.
4 | PDFTEXTSTREAM™ SOFTWARE END USER LICENSE AGREEMENT ("EULA")
5 |
6 | IMPORTANT:
7 |
8 | • NEVER USE THE SOFTWARE TO VIOLATE ANYONE’S INTELLECTUAL PROPERTY OR OTHER RIGHTS.
9 |
10 | • THIS EULA IS A CONTRACT BETWEEN YOU AND SNOWTIDE. READ IT CAREFULLY BEFORE COMPLETING THE INSTALLATION PROCESS AND USING THE SOFTWARE. IT PROVIDES A LICENSE TO USE THE SOFTWARE AND CONTAINS WARRANTY INFORMATION AND LIABILITY DISCLAIMERS. BY INSTALLING AND USING THE SOFTWARE, YOU CONFIRM YOUR ACCEPTANCE OF THE SOFTWARE AND AGREE TO BE BOUND BY THIS EULA. IF YOU DO NOT AGREE TO BE BOUND, DO NOT INSTALL OR USE THE SOFTWARE.
11 |
12 | 1. Definitions
13 | (a) "Snowtide" means Snowtide Informatics Systems, Inc. and its suppliers and licensors, if any.
14 | (b) "Trial Version" means a version of the Software to be used only to review
15 | and evaluate the Software. The Trial Version may have limited functionality and
16 | may alter its output or behaviour.
17 | (c) "Software" means the PDFTextStream™ software program and third party software program(s) supplied by Snowtide with it, which may also include associated media, printed materials, and electronic documentation. The Trial Version is also “Software” under this EULA.
18 | (d) "Production Environment" means a single computer system used to provide capabilities and features to the public or to the end users of your products or services.
19 |
20 | 2. License
21 | This EULA allows you to:
22 | (a) Absent a purchased license key, either:
23 | (i) evaluate the Software for potential future inclusion in your products or services, where such evaluation may not include using the Software in any Production Environment
24 | (ii) Install and use the Software in Production Environments in
25 | "single-threaded applications", as defined and described in the technical
26 | documentation of the Software.
27 | (b) Install and use the Software on a single Production Environment OR store the Software on a storage device, like a network server, used only to install the Software on other Production Environments over an internal network, provided you have a license key for each Production Environment on which the Software is installed and run. A license key for the Software may not be shared or used concurrently on different computers.
28 | (c) Make one copy of the Software in machine-readable form only for backup purposes. You must reproduce all copyright notices and other proprietary legends on the original copy of the Software on each copy.
29 |
30 | 3. License Restrictions
31 | (a) Except as allowed by Section 2, you may not make or distribute copies of the Software or electronically transfer the Software from one computer to another or over a network.
32 | (b) You may not decompile, reverse engineer, disassemble, or otherwise reduce the Software to a human-perceivable form.
33 | (c) You may not rent, lease, or sublicense the Software.
34 | (d) You may not redistribute the Software as part of another application.
35 | (e) You may permanently transfer all of your rights under this EULA provided you keep no copies, you transfer all of the Software (including all component parts, the media and printed materials, any upgrades, this EULA, and the serial numbers or license files), and the recipient agrees to this EULA. If the Software is an upgrade, any transfer must include all prior versions of the Software. You may not sell or transfer any Software purchased under a volume discount.
36 | (f) You may not modify the Software or create derivative works based upon the Software.
37 | (g) You may not export the Software into any country prohibited by the United States Export Administration Act and the regulations thereunder.
38 | (h) It is possible to use the Software (and many other programs) to violate the intellectual property and other rights of others. No permission for any such use is given by this EULA. You agree to hold Snowtide harmless and pay all costs including attorney fees because of your use of the Software.
39 | (i) If you fail to comply with this EULA, Snowtide may terminate the license and you must destroy all copies of the Software.
40 |
41 | 4. Upgrades
42 | If this copy of the Software is an upgrade from an earlier version of the Software, it is provided to you on a license exchange basis. Your installation and use of this copy of the Software means you have voluntarily terminated your earlier EULA and that you will not continue to use the earlier version of the Software or transfer it unless the transfer complies with Section 3.
43 |
44 | 5. Ownership
45 | This EULA gives you a limited license to use the Software. Snowtide and its suppliers retain all right, title and interest, including all intellectual property rights, in and to the Software and all copies. All rights not specifically granted in this EULA, including U.S. and International Copyrights, are reserved by Snowtide and its suppliers.
46 |
47 | 6. LIMITED WARRANTY AND DISCLAIMER
48 | (a) LIMITED WARRANTY. Snowtide warrants that for ninety (90) days from the date of delivery (as evidenced by a copy of your purchase receipt): (i) when used with a recommended hardware and software configuration, the Software will perform substantially according to the documentation supplied with the Software; and (ii) any physical media on which the Software is furnished is free from defects in materials and workmanship under normal use.
49 | (b) NO OTHER WARRANTY. EXCEPT AS SET FORTH ABOVE, SNOWTIDE DISCLAIMS ALL OTHER WARRANTIES, WHETHER EXPRESS, IMPLIED, OR OTHERWISE, INCLUDING WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THERE IS NO WARRANTY OF NONINFRINGEMENT, TITLE OR QUIET ENJOYMENT. IF APPLICABLE LAW REQUIRES ANY WARRANTIES OTHER THAN WHAT IS GRANTED HERE, ALL SUCH WARRANTIES ARE LIMITED TO NINETY (90) DAYS FROM THE DATE OF DELIVERY. NO ORAL OR WRITTEN INFORMATION OR ADVICE GIVEN BY SNOWTIDE, ITS DEALERS, DISTRIBUTORS, AGENTS OR EMPLOYEES CREATES A WARRANTY OR INCREASES THE SCOPE OF THIS WARRANTY.
50 | (c) (USA ONLY) SOME STATES DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO THE ABOVE EXCLUSION MAY NOT APPLY TO YOU. THIS WARRANTY GIVES YOU SPECIFIC LEGAL RIGHTS. YOU MAY ALSO HAVE OTHER LEGAL RIGHTS THAT VARY FROM STATE TO STATE.
51 |
52 | 7. Exclusive Remedy
53 | Your exclusive remedy under Section 6 is to return the Software to the place you acquired it, with a copy of your receipt and a description of the problem. Snowtide will use reasonable commercial efforts to supply you with a replacement copy of the Software that substantially conforms to the documentation, provide a replacement for defective media, or refund your purchase price for the Software, at its option. Snowtide will not be liable under this provision if the Software has been altered in any way, if the media has been damaged by accident, abuse or misapplication, or if the failure arises out of using the Software with other than a recommended hardware and software configuration.
54 |
55 | 8. LIMITATION OF LIABILITY.
56 | (a) SNOWTIDE SHALL NOT BE LIABLE TO ANYONE FOR ANY INDIRECT, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES (INCLUDING DAMAGES FOR LOSS OF BUSINESS, LOSS OF PROFITS, BUSINESS INTERRUPTION OR THE LIKE) ARISING FROM USING OR NOT BEING ABLE TO USE, THE SOFTWARE AND BASED ON ANY THEORY OF LIABILITY INCLUDING BREACH OF CONTRACT, BREACH OF WARRANTY, TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY OR OTHERWISE, EVEN IF SNOWTIDE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES AND EVEN IF THE REMEDY ALLOWED IN THIS EULA FAILED OF ITS ESSENTIAL PURPOSE.
57 | (b) SNOWTIDE'S TOTAL LIABILITY FOR ACTUAL DAMAGES FOR ANY REASON WILL BE LIMITED TO THE GREATER OF $500 US DOLLARS OR THE AMOUNT YOU PAID FOR THE SOFTWARE THAT CAUSED THE DAMAGE.
58 | (c) (USA only) SOME STATES DO NOT ALLOW THE LIMITATION OR EXCLUSION OF LIABILITY FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES, SO THIS LIMITATION OR EXCLUSION MAY NOT APPLY TO YOU. YOU MAY ALSO HAVE OTHER LEGAL RIGHTS THAT VARY FROM STATE TO STATE.
59 |
60 | 9. Basis of Bargain
61 | The Limited Warranty, Exclusive Remedies and Limited Liability set forth above are fundamental elements of this EULA. Snowtide would not be able to provide the Software on an economical basis without such limitations.
62 |
63 | 10. U.S. GOVERNMENT RESTRICTED RIGHTS LEGEND
64 | This Software and the documentation are provided with RESTRICTED RIGHTS. Use, duplication, or disclosure by the U.S. Government is subject to restrictions as set forth in this EULA and as provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013 (c)(1)(ii)(OCT 1988), FAR 12.212(a)(1995), FAR 52.227-19, or FAR 52.227-14, as applicable. Manufacturer: Snowtide Informatics Systems, Inc, 243 King Street, Suite 248, Northampton, MA 01060.
65 |
66 | 11. (Outside of the USA) Consumer End Users Only
67 | The limitations or exclusions of warranties and liability contained in this EULA do not affect or prejudice the statutory rights of a consumer, i.e., a person acquiring goods otherwise than in the course of a business, subject to Section 12.
68 |
69 | 12. General Provisions
70 | This EULA is governed by the laws of the State of Massachusetts, without giving effect to principles of conflict of laws. If the Software is delivered outside the USA, the UN Convention for the International Sale of Goods does not apply to this EULA. This EULA is the full agreement between the parties and supersedes all other agreements or understandings, whether oral or written. All questions about this EULA must be directed to: Snowtide Informatics Systems, Inc, 243 King Street, Suite 248, Northampton, MA 01060, Attention: General Counsel.
71 |
72 | 13. Third Party Software
73 | Certain third party software is incorporated into the Software, as enumerated in Appendix A of this EULA.
74 |
75 | Snowtide and PDFTextStream are trademarks or registered trademarks of Snowtide Informatics Systems, Inc. in the United States and/or other countries. Third party trademarks, trade names, product names and logos may be the trademarks or registered trademarks of their respective owners.
76 |
77 | 14. Delivery
78 | The Software has been delivered to you by internet transmission at the address you registered with Snowtide. If the law of the jurisdiction where the Software was delivered to requires the payment of any sales, use, VAT or other tax on the purchase, use or ownership of the Software, you are responsible for reporting the purchase and paying any such taxes and you agree to indemnify Snowtide from any liability or expense in connection with your failure to do so.
79 |
80 | 15. ARBITRATION
81 | Any action arising out of this EULA, its formation, validity, breach or relating to the subject of this EULA must be filed within one year after the cause of action accrues through the American Arbitration Association under its Commercial Arbitration Rules and the optional rules for emergency measures of protection before one arbitrator who shall be an attorney experienced in trade secret law and matters relating to computer software. All hearings shall be in Northampton, Massachusetts or by telephone or videoconference at the order of the arbitrator. The arbitrator is authorized to issue equitable and legal remedies including preliminary and permanent injunctions. If an action is brought to compel arbitration or enforce the terms of any interim or final order of the arbitrator it may be brought in any court with jurisdiction over the person or property of either party and each party hereby irrevocably agrees to be subject to the jurisdiction of any such court. Any action to compel arbitration or enforce an arbitration order or award shall be governed by the Federal Arbitration Act, this EULA being in interstate commerce. The arbitrator is not empowered to grant damages in any form or amount in excess of the payments required under this EULA. This arbitration provision shall be a complete defense to any suit, action or proceeding. Nothing in this arbitration provision shall give the arbitrator any authority to alter, change, amend, modify, add to or subtract from any provision of this EULA. ALL PARTIES WAIVE ANY RIGHT TO A TRIAL BY JURY.
82 |
83 | *Appendix A - Third Party Software*
84 |
85 | The following is a list of certain third party software incorporated in the Software; any additional terms and conditions solely associated with such third party software are further indicated:
86 |
87 | (a) TIFFFaxDecompressor, Copyright (c) 2005 Sun Microsystems, Inc. All Rights Reserved.
88 |
89 | Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
90 |
91 | - Redistribution of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
92 |
93 | - Redistribution in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
94 |
95 | Neither the name of Sun Microsystems, Inc. or the names of contributors may be used to endorse or promote products derived from this software without specific prior written permission.
96 |
97 | This software is provided "AS IS," without a warranty of any kind. ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED. SUN MICROSYSTEMS, INC. ("SUN") AND ITS LICENSORS SHALL NOT BE LIABLE FOR ANY DAMAGES SUFFERED BY LICENSEE AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THIS SOFTWARE OR ITS DERIVATIVES. IN NO EVENT WILL SUN OR ITS LICENSORS BE LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS SOFTWARE, EVEN IF SUN HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
98 |
99 | You acknowledge that this software is not designed or intended for use in the design, construction, operation or maintenance of any nuclear facility.
100 |
101 | (b) Apache Commons-Logging, Licensed under the Apache License, Version 2.0, available at http://commons.apache.org/license.html
102 |
--------------------------------------------------------------------------------
/extractor_research/extractors/textstream/PDFTextStream.jar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/extractors/textstream/PDFTextStream.jar
--------------------------------------------------------------------------------
/extractor_research/extractors/textstream/TextStream.class:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/extractors/textstream/TextStream.class
--------------------------------------------------------------------------------
/extractor_research/extractors/textstream/TextStream.java:
--------------------------------------------------------------------------------
1 | import java.io.*;
2 |
3 | import com.snowtide.pdf.PDFTextStream;
4 | import com.snowtide.pdf.OutputTarget;
5 |
6 | public class TextStream {
7 |
8 |     public static void main (String[] args) throws IOException {
9 |         File pdfFile = new File(args[0]);
10 |         File textFile = new File(args[1]);
11 |
12 |         PDFTextStream stream = new PDFTextStream(pdfFile);
13 |         BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(textFile)));
14 |         OutputTarget tgt = new OutputTarget(writer);
15 |         stream.pipe(tgt);
16 |
17 |         writer.flush();
18 |         writer.close();
19 |         stream.close();
20 |     }
21 | }
22 |
--------------------------------------------------------------------------------
/extractor_research/extractors/xpdf.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 | import subprocess
3 |
4 | class XPDF:
5 |     def __init__(self, pdf_file, txt_file, layout=True):
6 |         self.pdf_file = pdf_file
7 |         self.txt_file = txt_file
8 |         # -layout : maintain original physical layout
9 |         self.layout = layout
10 |
11 |     def extract(self):
12 |         # build the argument list directly so paths with spaces or
13 |         # parentheses are not parsed by a shell
14 |         command = ['pdftotext']
15 |         if self.layout:
16 |             command.append('-layout')
17 |         command += [self.pdf_file, self.txt_file]
18 |         subprocess.call(command)
19 |
20 | if __name__ == '__main__':
21 |     pdf = XPDF('../input/pride_and_prej/1.pdf', '../output/pride_and_prej/xpdf/1.txt')
22 |     pdf.extract()
--------------------------------------------------------------------------------
/extractor_research/input/American_Opera_Rev_Syllabus.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/American_Opera_Rev_Syllabus.pdf
--------------------------------------------------------------------------------
/extractor_research/input/Leonard_Intro_Musicology_syllabus(2).pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/Leonard_Intro_Musicology_syllabus(2).pdf
--------------------------------------------------------------------------------
/extractor_research/input/Leonard_Intro_Musicology_syllabus.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/Leonard_Intro_Musicology_syllabus.pdf
--------------------------------------------------------------------------------
/extractor_research/input/Leonard_Victorian_Music_syllabus_docx.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/Leonard_Victorian_Music_syllabus_docx.pdf
--------------------------------------------------------------------------------
/extractor_research/input/Leonard_Women_Music_syllabus.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/Leonard_Women_Music_syllabus.pdf
--------------------------------------------------------------------------------
/extractor_research/input/Music_since_1900_syllabus.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/Music_since_1900_syllabus.pdf
--------------------------------------------------------------------------------
/extractor_research/input/pride_and_prej/1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/pride_and_prej/1.pdf
--------------------------------------------------------------------------------
/extractor_research/input/pride_and_prej/2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/pride_and_prej/2.pdf
--------------------------------------------------------------------------------
/extractor_research/input/pride_and_prej/3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xpmethod/opensyllabus/693e1304e2293515ff2817a86778ea6dde165515/extractor_research/input/pride_and_prej/3.pdf
--------------------------------------------------------------------------------
/extractor_research/main.py:
--------------------------------------------------------------------------------
#!/usr/bin/python

'''
https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script
(use with a second file option)
http://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/
normal sort + token sort (seems less accurate)
'''

from extractors import miner, pdf2, pdfbox, textstream, xpdf
import os
import cProfile
import pstats
import StringIO


def miner_with_layout(pdf_file, txt_file):
    pdf = miner.Miner(pdf_file, txt_file)
    pdf.extract()

def miner_without_layout(pdf_file, txt_file):
    pdf = miner.Miner(pdf_file, txt_file, layout_analysis=False)
    pdf.extract()

def xpdf_with_layout(pdf_file, txt_file):
    pdf = xpdf.XPDF(pdf_file, txt_file)
    pdf.extract()

def xpdf_without_layout(pdf_file, txt_file):
    pdf = xpdf.XPDF(pdf_file, txt_file, layout=False)
    pdf.extract()

def textstream_default(pdf_file, txt_file):
    pdf = textstream.TextStream(pdf_file, txt_file)
    pdf.extract()

def pdf2_default(pdf_file, txt_file):
    pdf = pdf2.PDF2(pdf_file, txt_file)
    pdf.extract()

def pdfbox_default(pdf_file, txt_file):
    pdf = pdfbox.PDFBox(pdf_file, txt_file)
    pdf.extract()

def run_all(pdf_file, txt_file):
    miner_with_layout(pdf_file, txt_file)
    miner_without_layout(pdf_file, txt_file)
    xpdf_with_layout(pdf_file, txt_file)
    xpdf_without_layout(pdf_file, txt_file)
    textstream_default(pdf_file, txt_file)
    pdf2_default(pdf_file, txt_file)
    pdfbox_default(pdf_file, txt_file)

def time_all(pdf_file):
    methods = ['miner_with_layout', 'miner_without_layout', 'xpdf_with_layout',
               'xpdf_without_layout', 'textstream_default', 'pdf2_default', 'pdfbox_default']

    base_name = os.path.basename(pdf_file)
    directory_name = os.path.dirname(pdf_file)

    # i.e. 'pride_and_prej' from './input/pride_and_prej/1.pdf'
    shorter_directory_name = os.path.basename(directory_name)
    # i.e. '1' from './input/pride_and_prej/1.pdf'
    file_base_name = os.path.splitext(base_name)[0]

    output = ''

    for method in methods:
        # build file path based on source text, input PDF, and method employed
        txt_file = './output/' + shorter_directory_name + '/' + method + '/' + file_base_name + '.txt'

        command = method + '(\'%s\', \'%s\')' % (pdf_file, txt_file)
        temp = 'statsfile'
        cProfile.run(command, temp)

        stream = StringIO.StringIO()
        stats = pstats.Stats(temp, stream=stream)
        # sort before printing; sorting after print_stats() does not change the report
        stats.sort_stats('time')
        stats.print_stats()
        output = output + method + '\n-------------------------------------\n' + stream.getvalue()

        # clean up intermediary file
        os.remove('statsfile')

    # write results to log file
    with open('./stats/' + shorter_directory_name + '/' + file_base_name + '_speed_log.txt', "w") as log_file:
        log_file.write(output)

if __name__ == '__main__':
    pdf_file = './input/pride_and_prej/1.pdf'
    time_all(pdf_file)
--------------------------------------------------------------------------------
/extractor_research/output/American_Opera_Rev_Syllabus.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
9 | Week 7: March 9, 11, 12
10 | John Adams: The Death of Klinghoffer (1996)
11 | Reading: Kraft,“The Death of Klinghoffer,” Perspectives of New Music, Vol. 30, No. 1 (Winter, 1992),
12 | 300-302
13 | Fink, “Klinghoffer in Brooklyn Heights,” Cambridge Opera Journal, 2005, 17:2, 173-213.
14 | Listening/Viewing: DVD 299
15 |
16 | Week 8: March 16, 18, 19
17 | SPRING BREAK—NO CLASS
18 |
19 | Week 9: March 23, 25, 26
20 | March 23 & 25: William Bolcom: A View from the Bridge (2001)
21 | Reading: Herwitz, “Notes from the Stage: William Bolcom,” Opera Quarterly, Vol. 22, No. 3–4, 521–
22 | 533.
23 | Listening/Viewing: CD 3255
24 | March 26: In-class conversation with Kiya Heartwood, composer of the opera Lying to the Sea Gypsy.
25 | Listening: “Safe Harbor,” http://www.lyingtotheseagypsy.com/
26 |
27 | Week 10: March 30, April 1, 2
28 | NO CLASS
29 |
30 | Week 11: April 6, 8, 9
31 | Green Day: American Idiot (2007/2009)
32 | Reading: http://www.nytimes.com/2009/09/18/theater/18greenday.html?_r=1&hpw
33 | Listening/Viewing: “American Idiot: The Musical Trailer”:
34 | http://www.youtube.com/watch?v=egGARtwaFEo
35 | “Green Day's American Idiot Musical: Michael Mayer's Introduction”:
36 | http://www.youtube.com/watch?v=EwhvAPSOrH0
37 | “‘Whatshername’ from American Idiot The Musical”:
38 | http://www.youtube.com/watch?v=IC44dUEwdiw&feature=related
39 |
40 | Week 12: April 13, 15, 16
41 | Final written paper due 4/13/10
42 | April 13: Garfein: Rosencrantz and Guildenstern are Dead (2009)
43 | Reading: Excerpts from Stoppard, Rosencrantz and Guildenstern are Dead (New York: Grove Press,
44 | 1967), 41-44.
45 | Listening/Viewing: R&G stage play, Questions: http://www.youtube.com/watch?v=y-Sx4W2cKlU
46 | R&G opera, “Questions”:
47 | http://www.youtube.com/watch?v=PxEzopBC2oA&feature=player_embedded#
48 | April 15 & 16: Operas on the Edge:
49 | Wrath of Khan-the Opera: http://trekmovie.com/2009/01/25/watch-wrath-of-khan-the-opera-via-robot-
50 | chicken/
51 | Repo-the Genetic Opera: trailer: http://www.youtube.com/watch?v=MzgpU25C6fg
52 | “Zydrate Anatomy”: http://www.youtube.com/watch?v=tevg_jT5Sco&feature=fvw
53 | “Infected”: http://www.youtube.com/watch?v=Ik9JjRqXc-8&feature=channel
54 | “Chromaggia”: http://www.youtube.com/watch?v=dxhPX0q0kpg&feature=related
55 |
Victorian Music and Culture/MH733, Spring 2011
9 |
Syllabus
10 |
11 |
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
Office hours: Tuesday, 9:30 – 10:30 a.m. in
24 | the library; and by appointment
25 | 513. 238. 8031 (C)
26 |
27 |
28 | Professor Kendra Leonard
29 | kleonard@rider.edu
30 |
31 | Class meeting: TTHF 8:00-9:30 a.m.
32 |
33 | Course Objectives
34 | 1. To provide students with an understanding of music and culture during the Victorian period in Great
35 | Britain.
36 | 2. To introduce students to the major composers, lyricists, and performers of the Victorian period.
37 | 3. To explore musical and other artistic developments taking place during the Victorian period.
38 | 4. To encourage students to think critically about reading the musical works, criticism, and practices of
39 | the past as a way of better understanding those works.
40 |
41 | Email
42 |
43 | Your Rider email account is your email address for all official email communications from the
44 | University. You are expected to check your Rider email account on a frequent and consistent basis in
45 | order to stay current with University-related communications. Any email from me about this course
46 | will only be sent to your official Rider email address. Any communication from you to me must come
47 | from your Rider email address, must contain the course name in the “Subject” line, and must use
48 | proper spelling and capitalization. I try to respond to all emails within 24 hours.
49 |
50 | Academic Code of Conduct
51 |
52 | submission of academic work. In all written work, whether in class or out of class, the student’s name
53 | on the work is considered to be a statement that the work is his or hers alone, except as otherwise
54 | indicated. Students are expected to provide proper citations for the statements and ideas of others
55 | whether submitted word for word or paraphrased. Failure to provide proper citations will be considered
56 | plagiarism and offenders will be subject to the charge of plagiarism specified in the statement of
57 | regulations.
58 |
Academic dishonesty includes any unauthorized collaboration or misrepresentation in the
59 |
Similarly, students are expected to adhere to all regulations pertaining to examination conduct.
60 | These regulations are designed to insure that the work submitted by the student on examinations is an
61 | honest representation of that student’s effort and that it does not involve unauthorized collaboration,
62 | unauthorized use of notes during the exam, or unauthorized access to prior information about the
63 | examination.
64 |
In this course, the first instance of plagiarism will result in a grade of 0 for the assignment, and
65 | a report will be sent to the dean; a second instance of plagiarism will result in an F for the course, and
66 | charges of academic dishonesty will be brought to the Academic Integrity Committee. See The Source,
67 | pages 10-16, for full information on the academic code of conduct.
68 |
69 | Required Text/Materials
70 | Solie, Ruth A. Music in Other Words: Victorian Conversations (Berkeley: University of California
71 | Press, 2004). Available used on amazon.com and abebooks.com.
72 | Vaughan Williams, Ralph. National Music and Other Essays (Oxford: Clarendon Press, 1996).
73 | Available used on amazon.com and abebooks.com.
74 | Other readings and links to listening are posted on Blackboard and are titled by author name. We will
75 |
watch some films in class.
78 |
79 | Style Manual
80 | Chicago Manual of Style, 15th ed. Chicago: University of Chicago Press, 2003.
81 |
82 | Recommended Web Sites
83 | Music, Theater, and Popular Entertainment in Victorian Britain:
84 | http://www.victorianweb.org/mt/index.html
85 | Punch archives:
86 | http://onlinebooks.library.upenn.edu/webbin/serial?id=punch
87 | Gilbert & Sullivan Archive: http://math.boisestate.edu/GaS/
88 | Victorian Resources Online: http://www2.iath.virginia.edu/bpn2f/victorian/bibliog.html
89 | The 1900 House site: http://www.pbs.org/wnet/1900house/index.html
90 |
91 | Technology Requirements
92 |
This course is on-line on Blackboard. The web address is Blackboard.rider.edu. You will need
93 |
to have regular access to the internet and a word-processing program to complete many elements of this
94 | course. Students who do not have this access at home will need to scheduled time to do so either at the
95 | library or at other campus computing locations. Always back up your work, whether on a flash drive,
96 | via email, or through an online back-up service such as Mozy or Google Documents. Assignments will
97 | not be accepted late because of computer or printer problems.
98 |
99 | In this course, we will be building a digital exhibit of Victorian music and culture at the website
100 |
101 | http://victorianmusic.omeka.net/. All students will have access to the site via a login and password I
102 | will provide on the first day of class. If you have trouble accessing the site, logging in, or other
103 | problems, it is your responsibility to contact me right away so we can get it fixed.
104 |
105 |
Attendance is expected. Attending class will help you learn the material and be better prepared
106 |
We will also be blogging throughout this course. You can sign up for a free blog at Blogger or
107 | Wordpress. You must send me the url of your blog no later than 5 p.m. January 27. The blog must be
108 | open to all other students, although you are free to use a pseudonym.
109 |
110 | Attendance
111 |
112 | for assignments. If you miss class, it is your responsibility to get notes from a classmate and be
113 | prepared for the next class. Unexcused and undocumented absences will affect your participation grade.
114 | If you miss class because of an illness, I will need a doctor’s note.
115 |
116 | Students with Disabilities
117 |
118 | learning disability, please provide me with your university documentation during the first week of class
119 | or as soon as you are documented. If you think you might have such needs, but have no documentation,
120 | please contact the E.O.P. office in the basement of Taylor.
121 |
122 | Assignment Policies
123 |
124 | turned in after 8:10 a.m. will not be accepted.
125 |
126 | Assignments and Evaluation
127 | Participation: 20%
128 |
If you have special needs that will affect performance in this class, such as a documented
129 |
Assignments are due by the beginning of class (8:00 a.m.) on the day specified. Assignments
130 |
Participation is expected in every class. You are expected to have completed the assigned
139 |
140 | reading and to be able to discuss it in class. I will keep track of your participation. Missing class will
141 | affect your participation grade.
142 | Blogging:15%
143 |
You will be keeping a blog of your thoughts and comments, such as reactions to the readings,
144 |
Each student will post two visual items (such as photos or other visual artwork) worth 5% each;
145 |
drafts and ideas for paper topics, and other relevant thoughts and links, over the course of the semester.
146 | To get full credit, you need to post at least two separate posts of 350 words each per week. The last day
147 | to post blog posts is April 7.
148 | Omeka exhibit items: 25%
149 |
150 | two audio/video items worth 5% each; and two written items, such as short essays, a review of a
151 | recording, film, book, or similar item of no fewer than 1000 words, worth 5% each to the exhibit over
152 | the course of the term. The last day to post items to the site is April 7.
153 |
154 | Final Project: 40%
155 |
156 | of Victorian music or musical culture you find particularly interesting.
157 |
158 |
Your final project will be a 15-minute in-class presentation and 10-12 page paper on any aspect
159 |
Elements of the final project
160 | A proposal for your paper is due on March 1. The proposal should be a 1-page statement and
161 |
description of what you want to research for your paper and presentation. You will need to explain what
162 | about the topic is appealing to you, and provide a general outline of the paper. The proposal is worth
163 | 5% of your final grade.
164 |
An annotated bibliography of no fewer than ten scholarly sources is due March 22. The
165 |
annotated bibliography should be in Chicago Manual of Style format. Each bibliographical entry
166 | should include a description of the source, including its intended audience and why you think it will be
167 | helpful for your paper. The annotated bibliography is worth 5% of your final grade.
168 |
The final paper is due April 19. Your paper will be 10-12 pages long, not including the
169 |
bibliography. Please format it according to CMS guidelines. The final paper is worth 20% of your final
170 | project grade.
171 |
Presentations will take place in class during the last week of class. You should prepare a 15-
172 |
minute presentation of your research, using audio/visual materials as appropriate. In-class
173 | performances are encouraged, as are creative approaches. The presentation is worth 10% of your final
174 | grade.
175 |
176 |
Your work reflects directly on you: strive for a professional appearance and clear, well-written
177 |
prose in your assignments. Spelling and grammar errors will count against you; always spell-check and
178 | proofread your work prior to posting it or turning it in. All assignments must be submitted via email as
179 | .doc or .docx attachments and should be double-spaced and single-sided. Please use Times New Roman
180 | font in 12 point type in black ink for all assignments. Margins should not be more than 1 inch. Include
181 | your full name and the course name and number in the upper left hand corner of each assignment.
182 |
183 | Classroom Etiquette
184 |
185 | asked to do so in class). Please do not eat or drink anything really odiferous (pickled herring, kimchee,
186 | rotten bananas, etc.) in class.
187 |
188 |
Please do not use cell phones during class for calls, texting, or accessing the internet (unless
189 |
January 25: Introductions
195 | January 27: Sweet, Inventing the Victorians and Wilson, Victorians
196 |
197 | Feb 1: Victorian Visitors, “Introduction” and “Wagner”
198 | Feb 3: Solie, “Music in a Victorian Mirror: MacMillan’s Magazine in the Grove Years”
199 |
200 | Feb 8: Solie, “‘Girling’ at the Parlor Piano”
201 | Feb 10: Solie, “‘Tadpole Pleasures”: Daniel Deronda as Music Historiography”
202 |
203 | Feb 15: Solie, “Fictions of the Opera Box”
204 | Feb 17: Temperley, “The Lost Chord,” Victorian Studies, Vol. 30, No. 1, Music in Victorian Society
205 | and Culture (Autumn, 1986), pp. 7-23; Bashford, “Historiography and Invisible Musics: Domestic
206 | Chamber Music in Nineteenth-Century Britain,” Journal of the American Musicological Society, Vol.
207 | 63, No. 2 (Summer 2010), pp. 291-
208 | 360.
209 |
210 | Feb 22: Elgar: Adams, “Of Worcester and London: An Introduction;” Botstein, “transcending the
211 | Enigmas of Biography: The Cultural Context of Sir Edward Elgar’s Career” in EdwardElgar and His
212 | World; and Vaughan Williams, “What Have We Learnt From Elgar?”
213 | Feb 24: Elgar: Thomson, “Elgar’s Critical Critics,” and Fuller, “Elgar and the Salons: The Significance
214 | of a Private Musical World” in EdwardElgar and His World.
215 |
216 | Mar 1: Stanford, “Some Thoughts concerning Folk-Song and Nationality,” The Musical Quarterly, Vol.
217 | 1, No. 2 (Apr., 1915), pp. 232-245; Vaughan Williams, National Music chapters 1-3 and 8, and The
218 | Making of Music chapter 6
219 | Mar 3: Saylor, “Dramatic Applications of Folksong in Vaughan Williams’s Operas Hugh the Drover
220 | and Sir John in Love,” Journal of the Royal Musical Association, Vol. 134, no. 1, 37-83.
221 |
222 | Mar 8: Vaughan Williams, “Gustav Holst: An Essay and a Note” in National Music and Other Essays
223 | Mar 10: No class—instructor at SAM meeting
224 |
225 | Mar 15: Spring Break
226 | Mar 17: Spring Break
227 |
228 | Mar 22: Kift, The Victorian Music Hall, Chapters 1-3 (annotated bibliographies due)
229 | Mar 24: Faulk, Music Hall and Modernity, Introduction and Chapter 1
230 |
231 | Mar 29: Faulk, Chapters 4 and 5
232 | Mar 31: in-class viewing: Topsy-Turvey
233 |
234 | Apr 5: Fischler, “Dialectics of Social Class in the Gilbert and Sullivan Collaboration,” SEL Studies in
235 | English Literature 1500-1900, Volume 48, Number 4, Autumn 2008, pp. 829-837; and Fischler,
236 | “Princess Ida” (review), The Opera Quarterly, Volume 19, Number 4, Autumn 2003, pp. 817-821.
237 | Apr 7: in-class viewing: Princess Ida (partial)
238 |
239 | Apr 12: Princess Ida discussion and analysis
240 | Apr 14: Princess Ida discussion and analysis
241 |
242 | Apr 19: Victorian music and culture rebooted: an introduction to steampunk (final papers due)
243 |
Office hours: Tuesday, 9:30 – 10:30 a.m. in
24 | the library; and by appointment
25 | 513. 238. 8031 (C)
26 |
27 |
28 | Professor Kendra Leonard
29 | kleonard@rider.edu
30 |
31 | Class meeting: TTHF 4:30- 6:00 p.m.
32 |
33 | Course Objectives
34 | 1. To provide students with an understanding of the role of women as composers, performers, and
35 | patrons of music.
36 | 2. To introduce students to major female composers.
37 | 3. To explore works and other contributions of female composers and performers.
38 | 4. To encourage students to think critically about reading works, criticism, and practices of women in
39 | music as a way of better understanding those works.
40 |
41 | Email
42 |
43 | Your Rider email account is your email address for all official email communications from the
44 | University. You are expected to check your Rider email account on a frequent and consistent basis in
45 | order to stay current with University-related communications. Any email from me about this course
46 | will only be sent to your official Rider email address. Any communication from you to me must come
47 | from your Rider email address, must contain the course name in the “Subject” line, and must use
48 | proper spelling and capitalization. I try to respond to all emails within 24 hours.
49 |
50 | Academic Code of Conduct
51 |
52 | submission of academic work. In all written work, whether in class or out of class, the student’s name
53 | on the work is considered to be a statement that the work is his or hers alone, except as otherwise
54 | indicated. Students are expected to provide proper citations for the statements and ideas of others
55 | whether submitted word for word or paraphrased. Failure to provide proper citations will be considered
56 | plagiarism and offenders will be subject to the charge of plagiarism specified in the statement of
57 | regulations.
58 |
Academic dishonesty includes any unauthorized collaboration or misrepresentation in the
59 |
Similarly, students are expected to adhere to all regulations pertaining to examination conduct.
60 | These regulations are designed to insure that the work submitted by the student on examinations is an
61 | honest representation of that student’s effort and that it does not involve unauthorized collaboration,
62 | unauthorized use of notes during the exam, or unauthorized access to prior information about the
63 | examination.
64 |
In this course, the first instance of plagiarism will result in a grade of 0 for the assignment, and
65 | a report will be sent to the dean; a second instance of plagiarism will result in an F for the course, and
66 | charges of academic dishonesty will be brought to the Academic Integrity Committee. See The Source,
67 | pages 10-16, for full information on the academic code of conduct.
68 |
69 | Required Text/Materials
70 | Pendle, Karin. Women & Music. Bloomington: Indiana University Press, 2001.
71 | Additional readings listed are posted on Blackboard as PDFs.
72 | We will also watch some films and clips in class.
73 |
74 | Style Manual
75 |
Chicago Manual of Style, 15th ed. Chicago: University of Chicago Press, 2003.
78 |
79 | Technology Requirements
80 |
This course is on-line on Blackboard. The web address is Blackboard.rider.edu. You will need
81 |
to have regular access to the internet and a word-processing program to complete many elements of this
82 | course. Students who do not have this access at home will need to scheduled time to do so either at the
83 | library or at other campus computing locations. Always back up your work, whether on a flash drive,
84 | via email, or through an online back-up service such as Mozy or Google Documents. Assignments will
85 | not be accepted late because of computer or printer problems.
86 |
In this course, we will be building a digital exhibit of women in music at the website
87 |
http://wccwomeninmusic.omeka.net/. All students will have access to the site via a login and password
88 | I will provide on the first day of class. If you have trouble accessing the site, logging in, or other
89 | problems, it is your responsibility to contact me right away so we can get it fixed.
90 |
We will be blogging throughout this course. You can sign up for a free blog at Blogger or
91 |
If you have special needs that will affect performance in this class, such as a documented
92 |
Attendance is expected. Attending class will help you learn the material and be better prepared
93 |
Wordpress. You must send me the url of your blog no later than 5 p.m. January 27. The blog must be
94 | open to all other students, although you are free to use a pseudonym. The last day to post blog posts is
95 | April 7.
96 |
97 | Attendance
98 |
99 | for assignments. If you miss class, it is your responsibility to get notes from a classmate and be
100 | prepared for the next class. Unexcused and undocumented absences will affect your participation grade.
101 | If you miss class because of an illness, I will need a doctor’s note.
102 |
103 | Students with Disabilities
104 |
105 | learning disability, please provide me with your university documentation during the first week of class
106 | or as soon as you are documented. If you think you might have such needs, but have no documentation,
107 | please contact the Academic Student Services office in the basement of Taylor.
108 |
109 | Assignment Policies
110 |
111 | Assignments turned in after 4:40 p.m. will not be accepted.
112 |
113 | Assignments and Evaluation
114 | Participation: 20%
115 |
116 | reading and to be able to discuss it in class. I will keep track of your participation. Missing class will
117 | affect your participation grade.
118 | Blogging:15%
119 |
Assignments are due before or at the beginning of class (4:30 p.m.) on the day specified.
120 |
Participation is expected in every class. You are expected to have completed the assigned
121 |
You will be keeping a blog of your thoughts and comments, such as reactions to the readings,
122 |
drafts and ideas for paper topics, and other relevant thoughts and links, over the course of the semester.
123 | To get full credit, you need to post at least one post of at least 250 words per week. The last day to post
124 | blog posts for credit is April 7.
125 | Omeka exhibit items: 25%
126 |
127 | two audio/video items worth 5% each; and two written items, such as short essays, a review of a
128 | recording, film, book, or similar item of no fewer than 1000 words, worth 5% each to the exhibit over
129 | the course of the term. The last day to post items to the site is April 7.
130 |
Each student will post two visual items (such as photos or other visual artwork) worth 5% each;
131 |
Final Project: 40%
135 |
136 | of women in music you find particularly interesting.
137 |
138 |
Your final project will be a 15-minute in-class presentation and 8-10 page paper on any aspect
139 |
Elements of the final project
140 | A proposal for your paper is due on March 1. The proposal should be a 1-page statement and
141 |
description of what you want to research for your paper and presentation. You will need to explain what
142 | about the topic is appealing to you, and provide a general outline of the paper. The proposal is worth
143 | 5% of your final grade.
144 |
An annotated bibliography of no fewer than eight scholarly sources is due March 22. The
145 | annotated bibliography should be in Chicago Manual of Style format. Each bibliographical entry
146 | should include a description of the source, including its intended audience and why you think it will be
147 | helpful for your paper. The annotated bibliography is worth 5% of your final grade.
148 |
The final paper is due April 19. Your paper will be 8-10 pages long, not including the
149 |
bibliography. Please format it according to CMS guidelines. The final paper is worth 20% of your final
150 | grade.
151 |
Presentations will take place in class during the last week of class. You should prepare a 15-
152 |
minute presentation of your research, using audio/visual materials as appropriate. In-class
153 | performances are encouraged, as are creative approaches. The presentation is worth 10% of your final
154 | grade.
155 |
156 |
Your work reflects directly on you: strive for a professional appearance and clear, well-written
157 |
prose in your assignments. Spelling and grammar errors will count against you; always spell-check and
158 | proofread your work prior to posting it or turning it in. All assignments must be submitted via email
159 | (not Blackboard dropbox or in hard copy) as .doc or .docx attachments and should be double-spaced
160 | and single-sided. Please use Times New Roman font in 12 point type in black ink for all assignments.
161 | Margins should not be more than 1 inch. Include your full name and the course name and number in the
162 | upper left hand corner of each assignment.
163 |
164 | Classroom Etiquette
165 |
166 | asked to do so in class). Please do not eat or drink anything really odiferous (pickled herring, kimchee,
167 | rotten bananas, etc.) in class.
168 |
169 |
Please do not use cell phones during class for calls, texting, or accessing the internet (unless
170 |
(cid:1)(cid:1)(cid:1)
171 |
172 | Course Schedule
173 | All chapters refer to Pendle, Women in Music. Other readings are listed on Blackboard by author name.
174 |
175 | January 25: Introductions
176 | January 27: Preface and Chapter 1: Feminist Aesthetics
177 |
178 | Feb 1: Chapter 2: Women and Music in Ancient Greece and Rome
179 | Feb 3: Chapter 3: Women in Music to ca. 1450; Vision (Hildegard of Bingen film, partial, if available)
180 |
181 | Feb 8: “Ful weel she soong the service dyvyne’: The Cloistered Musician in the Middle Ages” in
182 | Women Making Music (Yardley) and “Jougleresses and Trobairitz: Secular Musicians in Medieval
183 | France” (Coldwell)
184 | Feb 10: Chapter 4: Musical Women in Early Modern Europe
185 |
188 | Feb 15: Chapter 5: Musical Women of the 17th and 18th Centuries
189 | Feb 17: “Courtesans, Muses, or Musicians? Professional Women Musicians in Sixteenth-Century
190 | Italy,” in Women Making Music (Newcomb)
191 |
192 | Feb 22: Chapter 6: European Composers and Musicians, ca. 1800-1890
193 | Feb 24: Chapter 7: European Composers and Musicians, 1880-1918
194 |
195 | Mar 1: Chapter 8: Women in American Music, 1800-1918 (paper proposals due)
196 | Mar 3: Chapter 11: North America Since 1920
197 |
198 | Mar 8 Kiya Heartwood visit (and catch-up day in case of snow cancellations)
199 | Mar 10 No class—instructor at SAM meeting
200 |
201 | Mar 15 Spring Break
202 | Mar 17 Spring Break
203 |
204 | Mar 22: Chapter 9: Contemporary British Composers and “‘Shout, Shout, Up with Your Song!’ Dame
205 | Ethel Smyth and the Changing Role of the British Composer” (Bernstein) (annotated bibliographies
206 | due)
207 | Mar 24: Chapter 10: Composers of Modern Europe, Israel, Australia, and New Zealand
208 |
209 | Mar 29: Chapter 12: American Popular Music
210 | Mar 31: in-class viewing: Lilith Fair movie
211 |
212 | Apr 5: Intro to Women in the World of Music and Chapter 13: Women and Music around the
213 | Mediterranean
214 | Apr 7: Video TBD (Slingshot Hip Hop, if available, or Jericho's Echo: Punk Rock in the Holy Land)
215 |
216 | Apr 12: Chapter 14: Women in the World of Music: Latin America, Native America, and the African
217 | Diaspora
218 | Apr 14: Chapter 15: American Women in Blues and Jazz; in-class viewing: Lady Sings the Blues
219 | (partial)
220 |
221 | Apr 19: in-class viewing: Girls Rock (final papers due)
222 | Apr 21: Chapter 16: Women’s Support and Encouragement of Women and Musicians
223 |
224 | Apr 26: in-class presentations
225 | Apr 28: in-class presentations
226 |
227 |
228 |