├── Official Documents
│   ├── NLPInitialPlan.pdf
│   └── TuneAFish_Progress.pdf
├── README.md
├── README.txt
├── answer
├── pos.py
├── sampleData
│   ├── cities
│   │   ├── cities_a1.htm
│   │   ├── cities_a10.htm
│   │   ├── cities_a2.htm
│   │   ├── cities_a3.htm
│   │   ├── cities_a4.htm
│   │   ├── cities_a5.htm
│   │   ├── cities_a6.htm
│   │   ├── cities_a7.htm
│   │   ├── cities_a8.htm
│   │   └── cities_a9.htm
│   ├── constellations
│   │   ├── constellations_a1.htm
│   │   ├── constellations_a10.htm
│   │   ├── constellations_a2.htm
│   │   ├── constellations_a3.htm
│   │   ├── constellations_a4.htm
│   │   ├── constellations_a5.htm
│   │   ├── constellations_a6.htm
│   │   ├── constellations_a7.htm
│   │   ├── constellations_a8.htm
│   │   └── constellations_a9.htm
│   ├── languages
│   │   ├── languages_a1.htm
│   │   ├── languages_a10.htm
│   │   ├── languages_a2.htm
│   │   ├── languages_a3.htm
│   │   ├── languages_a4.htm
│   │   ├── languages_a5.htm
│   │   ├── languages_a6.htm
│   │   ├── languages_a7.htm
│   │   ├── languages_a8.htm
│   │   └── languages_a9.htm
│   └── music_instruments
│       ├── music_instruments_a1.htm
│       ├── music_instruments_a10.htm
│       ├── music_instruments_a2.htm
│       ├── music_instruments_a3.htm
│       ├── music_instruments_a4.htm
│       ├── music_instruments_a5.htm
│       ├── music_instruments_a6.htm
│       ├── music_instruments_a7.htm
│       ├── music_instruments_a8.htm
│       └── music_instruments_a9.htm
├── simpleQueryAnswering.py
├── stanford-corenlp-python
│   ├── .gitignore
│   ├── LICENSE
│   ├── README.md
│   ├── changes.zip
│   ├── client.py
│   ├── convertfinal.py
│   ├── corenlp.py
│   ├── default.properties
│   ├── demoNew.py
│   ├── files
│   │   ├── extract.py
│   │   ├── extractCanonical.py
│   │   └── parse.py
│   ├── jsonrpc.py
│   ├── parseNLPNew.py
│   ├── progressbar.py
│   ├── simpleExtract.py
│   ├── testCoreNLP.py
│   ├── untitled
│   └── v1_modules
│       ├── demo1.py
│       ├── extractNLP1.py
│       └── parseNLP1.py
├── testCoreNLP.py
└── yesno.py

/Official Documents/NLPInitialPlan.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/raoariel/NLP-Question-Answer-System/38b95843c362eab660ab3d726498e562814166f7/Official Documents/NLPInitialPlan.pdf
--------------------------------------------------------------------------------
/Official Documents/TuneAFish_Progress.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/raoariel/NLP-Question-Answer-System/38b95843c362eab660ab3d726498e562814166f7/Official Documents/TuneAFish_Progress.pdf

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# NLP-Question-Answer-System
#### Plan for 11-411 Final Project

Week 12 Plans
---------------------------------
* 100 Generated Questions Review

Week 11 Plans
---------------------------------
* Reminder: Final Video (8 mins)
* Will be over Spring Carnival

Week 10 Plans
---------------------------------
* Final Project due April 14, midnight

Week 9 Plans
---------------------------------
* Reminder: Prepare for Dry Run
* Dry Run due April 7, before class

Week 8 Plans
---------------------------------
* Put together basic working system
* Complete simple tests on raw data (html files and our own questions)

Week 7 Plans
---------------------------------
* Ariel: Parse raw text for question generation
* Vijay: Question generation template filling
* Caitlin: Yes/No question answering
* Emily: Factoid question answering
* Reminder: Progress Report 2 Due (video)

Week 4 Plans
---------------------------------
* Continue to make progress
* Prepare for Progress Report 2 (video)

Week 2 Plans
---------------------------------
* Vijay: Question Templates
* Caitlin/Emily: Pronoun Resolution
* Ariel: Entity-Factoid Database (consider issues with synonyms)
* Reminder: Submit Progress Report 1
* March 4: Instructor Meeting

Week 1 Plans
---------------------------------
* Vijay: Question generation using templating
* Caitlin: Research useful capabilities of NLTK/Stanford, write code to parse text in basic ways
* Ariel: Entity-factoid database
* Emily: Answer generation

Parsing the Text
----------------------------------
We will use NLTK to apply part-of-speech labeling and the Stanford parser for entity relationship modeling. Using an external package to take care of the details of implementation will allow us to focus on tweaking our algorithms on a higher level. We will use these tools to parse text before distinguishing between the tasks of asking and answering.
Because pronouns are fundamentally ambiguous, we will consider using a probabilistic model for anaphora resolution. Then, we can build an offline database of entity-factoid pairs that we can query for answering.

Asking
----------------------------------
Since we have already tagged the text when parsing, we can then identify the candidate subjects of each question in an article and build a collection of question templates. We will then extract meta-information (such as the “Categories” section) from the Wikipedia HTML structure to topically generate questions for sections of text. Given a new article, we will consider the attributes of each subject in the text and apply the most probable question template based on all critical words in each sentence.

Answering
----------------------------------
Figuring out what type of question (yes/no, location, date, etc.) is being asked will be useful for determining which relationships between words we should be considering.
Parsing sentences into phrases and then deciding the function of each phrase will be useful for answering questions based on their type. For example, a prepositional phrase that describes location will be useful for answering a location question while maintaining proper grammar.
If information or relationships in the article are successfully extracted, then this information will be delivered as the answer. Otherwise we will retrieve the most likely sentence, treating the keywords of the question as a vector that we are trying to match, and extract the most salient section of this retrieved sentence as the answer. We will likely use term frequency to rank sentences within a document.
Both the asking and answering modules will share the structured data extracted from parsing text and use wrappers over useful NLP algorithms from NLTK/other libraries. Otherwise the two components’ system designs will be independent.

Evaluation
----------------------------------
We will be automatically evaluating each question generated for grammar and syntax, to ensure fluency. We will also evaluate answers for surface-level factual accuracy and adherence to the information need of the corresponding question, in addition to grammar and syntax. We will use the quality of candidate answers as a ranking criterion for the questions we generate.

Team Coordination
----------------------------------
Our team will be using Git for version control and to share code/data. Because our group is divided among people with a background in programming vs. experience with linguistics, we will be dividing tasks accordingly. One of our team expectations by the first progress report will be to have a full “skeleton” of the functional system, and we will then assign specific coding tasks accordingly. We also have a standing weekly meeting (Thursdays) to delineate tasks and manage our progress.
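
The fallback retrieval step described under "Answering" (match the question's keywords against each sentence and rank by term frequency) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in `simpleQueryAnswering.py`: the names `STOPWORDS`, `tokenize`, `split_sentences`, `rank_sentences`, and `best_sentence` are hypothetical, the stopword list is a tiny placeholder, and sentence splitting is a naive regex rather than an NLTK tokenizer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "of", "in", "on",
             "to", "what", "who", "when", "where", "which", "how", "does", "did"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naively split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def rank_sentences(question, document):
    """Score each sentence by how often the question's keywords occur in it."""
    keywords = {t for t in tokenize(question) if t not in STOPWORDS}
    scored = []
    for sentence in split_sentences(document):
        counts = Counter(tokenize(sentence))
        score = sum(counts[k] for k in keywords)
        scored.append((score, sentence))
    # The sort is stable, so ties keep their original document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def best_sentence(question, document):
    """Return the top-ranked sentence, or None if no keyword matches at all."""
    ranked = rank_sentences(question, document)
    if ranked and ranked[0][0] > 0:
        return ranked[0][1]
    return None
```

For example, calling `best_sentence("What is the brightest star in Leo?", article_text)` would return the sentence of `article_text` that mentions "brightest", "star", and "Leo" most often. Raw counts favor long sentences, so a fuller version might normalize by sentence length or weight rare terms more heavily.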

Team: Tune a Fish
----------------------------------
```
Emily Bram
Caitlin Lohman
Ariel Rao
Vijay Viswanathan
```

--------------------------------------------------------------------------------
/README.txt:
--------------------------------------------------------------------------------
|===============================================|

Project Directory:
    /afs/andrew.cmu.edu/usr3/vijayyyyyyyyyy........

|===============================================|
Requires:
    Python

Python Dependencies:
    Basic: os, sys, ast,
           re, string, random,
           collections
    Additional: json, yaml, pattern,
                nltk, corenlp, bs4

|===============================================|

Generating Questions:
    ./ask file.txt number_of_questions > tune_a_fish_question.txt


Answering Questions:
    ./answer file.txt question.txt > tune_a_fish_answer.txt

|===============================================|

--------------------------------------------------------------------------------
/answer:
--------------------------------------------------------------------------------
python simpleQueryAnswering.py "$1" "$2"

--------------------------------------------------------------------------------
/pos.py:
--------------------------------------------------------------------------------
import sys

import nltk
from nltk.corpus import treebank

# Read and tokenize the input file (assumption: its path is the first command-line argument)
with open(sys.argv[1]) as f:
    text = nltk.word_tokenize(f.read())
# This is the built-in POS tagger
tagged = nltk.pos_tag(text)
# This uses bigrams to tag based on training sentences
# (assumption: the Penn Treebank sample stands in for the undefined train_sents)
train_sents = treebank.tagged_sents()
bigram_tagger = nltk.BigramTagger(train_sents)
bigram_tagger.tag(text)
# This line would tag named entities
nltk.ne_chunk(tagged)
# Converts a list of words into a list of bigrams
list(nltk.bigrams(text))

# Basic chunking of NPs
grammar = "NP: {
Leo /ˈliːoʊ/ is one of the constellations of the zodiac, lying between Cancer to the west and Virgo to the east. Its name is Latin for lion, and to the ancient Greeks represented the Nemean Lion killed by the mythical Greek hero Heracles (known to the ancient Romans as Hercules) as one of his twelve labors. Its symbol is ( ♌). One of the 48 constellations described by the 2nd century astronomer Ptolemy, Leo remains one of the 88 modern constellations today, and one of the most easily recognizable due to its many bright stars and a distinctive shape that is reminiscent of the crouching lion it depicts. The lion's mane and shoulders also form an asterism known as "the Sickle," which to modern observers may resemble a backwards "question mark."
Leo contains many bright stars, many of which were individually identified by the ancients. There are four stars of first or second magnitude, which render this constellation especially prominent:
16 |The other named stars in Leo are Mu Leonis, Rasalas (an abbreviation of "Al Ras al Asad al Shamaliyy," meaning "The Lion's Head Toward the South"); and Theta Leonis, Chertan or Coxa ("hip").
24 |Leo is also home to one bright variable star, the red giant R Leonis. It is a Mira variable with a minimum magnitude of 10 and normal maximum magnitude of 6; it periodically brightens to magnitude 4.4. R Leonis, 330 light-years from Earth, has a period of 310 days and a diameter of 450 solar diameters.
25 |The star Wolf 359 (CN Leonis), one of the nearest stars to Earth at 7.8 light-years away, is in Leo. Wolf 359 is a red dwarf of magnitude 13.5; it periodically brightens by one magnitude or less because it is a flare star. Gliese 436, a faint star in Leo about 33 light years away from the Sun, is orbited by a transiting Neptune-mass extrasolar planet.
26 |The carbon star CW Leo (IRC +10216) is the brightest star in the night sky at the infrared N-band (10 μm wavelength).
27 |The star SDSS J102915+172927 (Caffau's star) is a population II star in the galactic halo seen in Leo. It is about 13 billion years old, making it one of the oldest stars in the Galaxy. It has the lowest metallicity of any known star.
28 |Modern astronomers, including Tycho Brahe in 1602, excised a group of stars that once made up the "tuft" of the lion's tail and used them to form the new constellation Coma Berenices (Berenice's hair), although there was precedent for that designation among the ancient Greeks and Romans.
29 |Leo contains many bright galaxies; Messier 65, Messier 66, Messier 95, Messier 96, Messier 105, and NGC 3628 are the most famous, the first two being part of the Leo Triplet.
32 |The Leo Ring, a cloud of hydrogen and helium gas, is found in orbit of two galaxies found within this constellation.
33 |M66 is a spiral galaxy that is part of the Leo Triplet, whose other two members are M65 and NGC 3628. It is at a distance of 37 million light-years and has a somewhat distorted shape due to gravitational interactions with the other members of the Triplet, which are pulling stars away from M66. Eventually, the outermost stars may form a dwarf galaxy orbiting M66. Both M65 and M66 are visible in large binoculars or small telescopes, but their concentrated nuclei and elongation are only visible in large amateur instruments.
34 | 35 |M95 and M96 are both spiral galaxies 20 million light-years from Earth. Though they are visible as fuzzy objects in small telescopes, their structure is only visible in larger instruments. M95 is a barred spiral galaxy. M105 is about a degree away from the M95/M96 pair; it is an elliptical galaxy of the 9th magnitude, also about 20 million light-years from Earth.
36 |NGC 2903 is a barred spiral galaxy discovered by William Herschel in 1784. It is very similar in size and shape to the Milky Way and is located 25 million light-years from Earth. In its core, NGC 2903 has many "hotspots", which have been found to be near regions of star formation. The star formation in this region is thought to be due to the presence of the dusty bar, which sends shock waves through its rotation to an area with a diameter of 2,000 light-years. The outskirts of the galaxy have many young open clusters.
37 |Leo is also home to some of the largest structures in the observable universe. Some of the structures found in the constellation are the Clowes–Campusano LQG, U1.11, U1.54, and the Huge-LQG, which are all large quasar groups; the latter being the second largest structure known (see also Hercules–Corona Borealis Great Wall).
38 |The Leonids occur in November, peaking on November 14–15, and have a radiant close to Gamma Leonis. Its parent body is Comet Tempel-Tuttle, which causes significant outbursts every 35 years. The normal peak rate is approximately 10 meteors per hour.
40 |The January Leonids are a minor shower that peaks between January 1 and 7.
41 |Leo was one of the earliest recognized constellations, with archaeological evidence that the Mesopotamians had a similar constellation as early as 4000 BCE. The Persians called Leo Ser or Shir; the Turks, Artan; the Syrians, Aryo; the Jews, Arye; the Indians, Simha, all meaning "lion".
43 | 44 |Some mythologists believe that in Sumeria, Leo represented the monster Khumbaba, who was killed by Gilgamesh.
45 |In Babylonian astronomy, the constellation was called UR.GU.LA, the "Great Lion"; the bright star Regulus was known as "the star that stands at the Lion's breast." Regulus also had distinctly regal associations, as it was known as the King Star.
46 |In Greek mythology, Leo was identified as the Nemean Lion which was killed by Heracles (Hercules to the Romans) during the first of his twelve labours. The Nemean Lion would take women as hostages to its lair in a cave, luring warriors from nearby towns to save the damsel in distress, to their misfortune. The Lion was impervious to any weaponry; thus, the warriors' clubs, swords, and spears were rendered useless against it. Realizing that he must defeat the Lion with his bare hands, Hercules slipped into the Lion's cave and engaged it at close quarters. When the Lion pounced, Hercules caught it in midair, one hand grasping the Lion's forelegs and the other its hind legs, and bent it backwards, breaking its back and freeing the trapped maidens. Zeus commemorated this labor by placing the Lion in the sky.
47 |The Roman poet Ovid called it Herculeus Leo and Violentus Leo. Bacchi Sidus (star of Bacchus) was another of its titles, the god Bacchus always being identified with this animal. However, Manilius called it Jovis et Junonis Sidus (Star of Jupiter and Juno).
Early Hindu astronomers knew it as Ashlesha (a nakshatra, or sub-constellation) and as Simha, the Tamil Simham.
As of 2002, the Sun appears in the constellation Leo from August 10 to September 10. In tropical astrology, the Sun is considered to be in the sign Leo from July 23 to August 23, and in sidereal astrology, from August 16 to September 17.
52 |Leo is commonly represented as if the sickle-shaped asterism of stars is the back of the Lion's head. The sickle is marked by six stars: Epsilon Leonis, Mu Leonis, Zeta Leonis, Gamma Leonis, Eta Leonis, and Alpha Leonis. The lion's tail is marked by Beta Leonis (Denebola) and the rest of his body is delineated by Delta Leonis and Theta Leonis.
55 | 56 |H.A. Rey has suggested an alternative way to connect the stars, which graphically shows a lion walking. The stars delta Leonis, gamma Leonis, eta Leonis, and theta Leonis form the body of the lion, with gamma Leonis being of the second magnitude and delta Leonis and theta Leonis being of the third magnitude. The stars gamma Leonis, zeta Leonis, mu Leonis, epsilon Leonis, and eta Leonis form the lion's neck, with epsilon Leonis being of the third magnitude. The stars mu Leonis, kappa Leonis, lambda Leonis, and epsilon Leonis form the head of the lion. Delta Leonis and beta Leonis form the lion's tail: beta Leonis, also known as Denebola, is the bright tip of the tail with a magnitude of two. The stars theta Leonis, iota Leonis, and sigma Leonis form the left hind leg of the lion, with sigma Leonis being the foot. The stars theta Leonis and rho Leonis form the right hind leg, with rho Leonis being the foot. The stars eta Leonis and Alpha Leonis mark the lion's heart, with alpha Leonis, also known as Regulus, being the bright star of magnitude one. The stars eta Leonis and omicron Leonis form the right front foot of the Lion.
Creatively redrawn, the zodiac sign Leo fits the chalk lines of the galloping White Horse of Uffington remarkably well.
This was discovered in the late winter of 2014, with Leo high in the sky and spring approaching, by Josef Krem, a Bavarian hobby astronomer and scientist from Munich, Germany, who was examining the zodiac stars and other constellations cycling through the year for similarities to the symbols on Celtic coinage, where the horse appears very often.
Lengyel describes the horse on Celtic coinage as a dynamic symbol of human existence, from procreation through life to death and resurrection (the horse running from dock in the east to muzzle in the west), ever repeating. Rhiannon or Epona is the corresponding horse goddess of fertility, or even mother goddess, associated with both the sun and the moon.
The White Horse may therefore have been a place of seasonal celebrations more than 3000 years ago, associated with an otherwise unknown Celtic zodiac sign of the horse. Due to the Earth's axial precession, Regulus in the horse had its midnight culmination around the winter solstice about 3400 years ago. Nowadays this position is nearly taken by the Winter Hexagon, with its brightest star Sirius.
Terry Pratchett describes this Celtic artwork in A Hat Full of Sky: "Not what a horse looks like, but what a horse be ..." Many more redrawings are possible, showing a horse in other kinds of movement and with various surrounding objects.
63 |USS Leonis (AK-128) was a United States Navy Crater class cargo ship named after the Latin version of this constellation name.
65 |Media related to Leo (constellation) at Wikimedia Commons
74 |Gemini is one of the constellations of the zodiac. It was one of the 48 constellations described by the 2nd century AD astronomer Ptolemy and it remains one of the 88 modern constellations today. Its name is Latin for "twins," and it is associated with the twins Castor and Pollux in Greek mythology. Its symbol is (Unicode ♊).
10 | 11 |Gemini lies between Taurus to the west and Cancer to the east, with Auriga and Lynx to the north and Monoceros and Canis Minor to the south.
15 |The Sun resides in the astrological sign of Gemini from June 20 to July 20 each year (though the zodiac dates it May 22 - June 21). By mid August, Gemini will appear along the eastern horizon in the morning sky prior to sunrise. The best time to observe Gemini at night is overhead during the months of January and February. By April and May, the constellation will be visible soon after sunset in the west.
The easiest way to locate the constellation is to find its two brightest stars, Castor and Pollux, eastward from the familiar “V” shaped asterism of Taurus and the three stars of Orion’s belt. Another way is to mentally draw a line from the Pleiades star cluster in Taurus to the brightest star in Leo, Regulus. In doing so, you are drawing an imaginary line that is relatively close to the ecliptic, a line which intersects Gemini roughly at the midpoint of the constellation, just below Castor and Pollux.
17 |The constellation contains 85 stars visible to observation on Earth without a telescope.
20 |The brightest star in Gemini is Pollux, and the second brightest is Castor. Castor's Bayer designation as "Alpha" is attributable to a mistake by Johann Bayer, who gave his eponymous designations in 1603.
21 |α Gem (Castor): the second brightest in the constellation after Pollux. Castor is a sextuple star system 52 light-years from Earth, which appears as a magnitude 1.6 blue-white star to the unaided eye. Two spectroscopic binaries are visible at magnitudes 1.9 and 3.0 with a period of 470 years. A wide-set red dwarf star is also a part of the system; this star is an Algol-type eclipsing binary star with a period of 19.5 hours; its minimum magnitude is 9.8 and its maximum magnitude is 9.3.
22 |β Gem (Pollux): the brightest star in Gemini, it is an orange-hued giant star of magnitude 1.2, 34 light-years from Earth. Pollux has an extrasolar planet revolving around it, as do two other stars in Gemini, HD 50554, and HD 59686.
23 |γ Gem (Alhena): a blue-white hued star of magnitude 1.9, 105 light-years from earth.
24 |δ Gem (Wasat): a long-period binary star 59 light-years from Earth. The primary is a white star of magnitude 3.5, and the secondary is an orange dwarf star of magnitude 8.2. The period is over 1000 years; it is divisible in medium amateur telescopes.
25 |ε Gem (Mebsuta): a double star, the primary is a yellow supergiant of magnitude 3.1, 900 light-years from Earth. The optical companion, of magnitude 9.2, is visible in binoculars and small telescopes.
ζ Gem (Mekbuda): a double star, the primary is a Cepheid variable star with a period of 10.2 days; its minimum magnitude is 4.2 and its maximum magnitude is 3.6. It is a yellow supergiant, 1200 light-years from Earth, with a radius that is 60 times solar, making it approximately 220,000 times the volume of the Sun (60³ = 216,000). The companion, a magnitude 7.6 star, is visible in binoculars and small amateur telescopes.
27 |η Gem: a binary star with a variable component. 350 light-years away, it has a period of 500 years and is only divisible in large amateur telescopes. The primary is a semi-regular red giant with a period of 233 days; its minimum magnitude is 3.9 and its maximum magnitude is 3.1. The secondary is of magnitude 6.
28 |κ Gem: a binary star 143 light-years from Earth. The primary is a yellow giant of magnitude 3.6; the secondary is of magnitude 8. The two are only divisible in larger amateur instruments because of the discrepancy in brightness.
29 |ν Gem: a double star divisible in binoculars and small amateur telescopes. The primary is a blue giant of magnitude 4.1, 500 light-years from Earth, and the secondary is of magnitude 8.
30 |38 Gem: a binary star divisible in small amateur telescopes, 91 light-years from Earth. The primary is a white star of magnitude 4.8 and the secondary is a yellow star of magnitude 7.8.
31 |U Gem: a dwarf nova type cataclysmic variable discovered by J.R. Hind in 1855.
32 |To look at Gemini is to look away from the Milky Way; as a result, there are comparatively few deep-sky objects of note. The Eskimo Nebula and Medusa Nebula, Messier object M35, and Geminga are those that attract the most attention. The Eskimo and Medusa nebulae are both planetary nebulae, the one approximately 2,870 light years away and the other 1,500 light years distant. M35 is an open star cluster which was discovered in the year 1745 by Swiss astronomer Philippe Loys de Chéseaux. And Geminga is a neutron star approximately 550 light years from Earth. Other objects of note are NGC 2129, NGC 2158, NGC 2266, NGC 2331, NGC 2355, and NGC 2395.
34 |M35 (NGC 2168) is a large, elongated open cluster of magnitude 5; it has an area of approximately 0.2 square degrees, the same size as the full Moon. Its high magnitude means that M35 is visible to the unaided eye under dark skies; under brighter skies it is discernible in binoculars. The 200 stars of M35 are arranged in chains that curve throughout the cluster; it is 2800 light-years from Earth. Another open cluster in Gemini is NGC 2158. Visible in large amateur telescopes and very rich, it is more than 12,000 light-years from Earth.
35 |The Eskimo Nebula or Clown Face Nebula (NGC 2392) is a planetary nebula with an overall magnitude of 9.2, located 4000 light-years from Earth. In a small amateur telescope, its 10th magnitude central star is visible, along with its blue-green elliptical disk. It is named for its resemblance to the head of a person wearing a parka.
36 |The Geminids are a prominent, bright meteor shower that peaks on December 13–14. It has a maximum rate of approximately 100 meteors per hour, making it one of the richest meteor showers. The Epsilon Geminids peak between October 18 and October 29 and have only been recently confirmed. They overlap with the Orionids, which make the Epsilon Geminids difficult to detect visually. Epsilon Geminid meteors have a higher velocity than Orionids.
38 |In Babylonian astronomy, the stars Castor and Pollux were known as the Great Twins (MUL.MASH.TAB.BA.GAL.GAL). The Twins were regarded as minor gods and were called Meshlamtaea and Lugalirra, meaning respectively 'The One who has arisen from the Underworld' and the 'Mighty King'. Both names can be understood as titles of Nergal, the major Babylonian god of plague and pestilence, who was king of the Underworld.
In Greek mythology, Gemini was associated with the myth of Castor and Pollux, who were both children of Leda and both Argonauts. Pollux was the son of Zeus, who seduced Leda, while Castor was the son of Tyndareus, king of Sparta and Leda's husband. Castor and Pollux were also mythologically associated with St. Elmo's fire in their role as the protectors of sailors. When Castor died, because he was mortal, Pollux begged his father Zeus to give Castor immortality, and he did, by uniting them together in the heavens.
Gemini is dominated by Castor and Pollux, two bright stars that appear relatively close together, forming an o shape, encouraging the mythological link between the constellation and twinship. The twin above and to the right (as seen from the Northern Hemisphere) is Castor, whose brightest star is α Gem; it is a second magnitude star and represents Castor's head. The twin below and to the left is Pollux, whose brightest star is β Gem (more commonly called Pollux); it is of the first magnitude and represents Pollux's head. Furthermore, the other stars can be visualized as two parallel lines descending from the two main stars, making it look like two figures.
46 |H.A. Rey has suggested an alternative to the traditional visualization that connected the stars of Gemini to show twins holding hands. Pollux's torso is represented by the star υ Gem, Pollux's right hand by ι Gem, Pollux's left hand by κ Gem; all three of these stars are of the fourth magnitude. Pollux's pelvis is represented by the star δ Gem, Pollux's right knee by ζ Gem, Pollux's right foot by γ Gem, Pollux's left knee by λ Gem, and Pollux's left foot by ξ Gem. γ Gem is of the second magnitude, while δ and ξ Gem are of the third magnitude. Castor's torso is represented by the star τ Gem, Castor's left hand by ι Gem (which he shares with Pollux), Castor's right hand by θ Gem; all three of these stars are of the fourth magnitude. Castor's pelvis is represented by the star ε Gem, Castor's left foot by ν Gem, and Castor's right foot by μ Gem and η Gem; ε, μ, and η Gem are of the third magnitude. The brightest star in this constellation is Pollux.
47 |In Meteorologica (1 343b30) Aristotle mentions that he observed Jupiter in conjunction with and then occulting a star in Gemini. This is the earliest known observation of this nature. A study published in 1990 suggests the star involved was 1 Geminorum and the event took place on 5 December 337 BC.
49 |When William Herschel discovered Uranus on 13 March 1781 it was located near η Gem. In 1930 Clyde Tombaugh exposed a series of photographic plates centred on δ Gem and discovered Pluto.
50 |In Chinese astronomy, the stars that correspond to Gemini are located in two areas: the White Tiger of the West (西方白虎, Xī Fāng Bái Hǔ) and the Vermillion Bird of the South (南方朱雀, Nán Fāng Zhū Què).
52 |As of 2008, the Sun appears in the constellation Gemini from June 20 to July 20. In tropical astrology, the Sun is considered to be in the sign Gemini from May 21 to June 22, and in sidereal astrology, from June 16 to July 15.
55 |Cancer is one of the twelve constellations of the zodiac. Its name is Latin for crab and it is commonly represented as one. Its astrological symbol is (Unicode ♋). Cancer is a medium-size constellation with an area of 506 square degrees and its stars are rather faint, its brightest star Beta Cancri having an apparent magnitude of 3.5. It contains two stars with known planets, including 55 Cancri, which has five: one super-earth and four gas giants, one of which is in the habitable zone and as such has expected temperatures similar to Earth. Located at the center of the constellation is Praesepe (Messier 44), one of the closest open clusters to Earth and a popular target for amateur astronomers.
Cancer is a medium-sized constellation that is bordered by Gemini to the west, Lynx to the north, Leo Minor to the northeast, Leo to the east, Hydra to the south, and Canis Minor to the southwest. The three-letter abbreviation for the constellation, as adopted by the International Astronomical Union in 1922, is 'Cnc'. The official constellation boundaries, as set by Eugène Delporte in 1930, are defined by a polygon of 10 sides. In the equatorial coordinate system, the right ascension coordinates of these borders lie between 07h 55m 19.7973s and 09h 22m 35.0364s, while the declination coordinates are between 33.1415138° and 6.4700689°. Covering 506 square degrees or 1.23% of the sky (the whole sky spans about 41,253 square degrees), it ranks 31st of the 88 constellations in size. It can be seen at latitudes between +90° and -60° and is best visible at 9 p.m. during the month of March.
13 |Cancer is the dimmest of the zodiacal constellations, having only two stars above the fourth magnitude. The German cartographer Johann Bayer used the Greek letters Alpha through Omega to label the most prominent stars in the constellation, followed by the letters A, then lowercase b, c and d.
Also known as Altarf, Beta Cancri is the brightest star in Cancer at apparent magnitude 3.5 and located 290 light-years from Earth. It is a binary star system, its main component an orange giant of spectral type K4III that varies slightly from a baseline magnitude of 3.53, dipping by 0.005 magnitude over a period of 6 days. An ageing star, it has expanded to around 50 times the Sun's diameter and shines with 660 times its luminosity. It has a faint magnitude 14 red dwarf companion located 29 seconds away that takes 76,000 years to complete an orbit. Altarf represents a part of Cancer's body.
19 |Delta Cancri, also known as Asellus Australis, is an orange-hued giant star of magnitude 3.9, 136 light-years from Earth. Its common name means "southern donkey". The star also holds a record for the longest name, "Arkushanangarushashutu," derived from ancient Babylonian language, which translates to "the southeast star in the Crab." Delta Cancri also makes it easy to find X Cancri, the reddest star in the sky.
20 |Gamma Cancri, or Asellus Borealis ("northern donkey colt"), is a white-hued A-type subgiant of magnitude 4.7, 158 light-years from Earth.
21 |Iota Cancri is a wide double star. The primary is a yellow-hued G-type bright giant star of magnitude 4.0, 298 light-years from Earth. The secondary is a white A-type main sequence star of magnitude 6.6. Both can easily be seen with a small telescope.
22 |Alpha Cancri (Acubens) is a double star with a primary of magnitude 4.26, 173 light-years from Earth. The secondary is of magnitude 12.0 and is visible in small amateur telescopes. Its common name means "the claw". The star’s primary component is a white A-type main sequence dwarf, while the companion is a magnitude 11 star.
23 |Zeta Cancri or Tegmine ("the shell") is a multiple star system that contains at least four stars located 83 light-years from Earth. The two brightest components are a binary star with an orbital period of 1100 years; the brighter component is a yellow-hued star of magnitude 5.0 and the dimmer component is a yellow-hued star of magnitude 6.2. The brighter component is itself a binary star with a period of 59.5 years; its primary is of magnitude 5.6 and its secondary is of magnitude 6.0. This pair will be at its greatest separation in 2018.
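When a close pair such as the inner binary of Zeta Cancri is not resolved, the two stars appear as one point whose magnitude comes from adding their fluxes, not their magnitudes. The sketch below applies the standard relation between magnitude and flux (a difference of 5 magnitudes is a factor of 100 in brightness); the function name `combined_magnitude` is hypothetical.

```python
import math

def combined_magnitude(m1, m2):
    """Magnitude of two unresolved stars seen as a single point of light.

    Magnitudes are logarithmic, so each is first converted to a
    relative flux (10 ** (-0.4 * m)), the fluxes are summed, and the
    total is converted back to a magnitude.
    """
    total_flux = 10 ** (-0.4 * m1) + 10 ** (-0.4 * m2)
    return -2.5 * math.log10(total_flux)

# Inner pair of Zeta Cancri as quoted in the text: magnitudes 5.6 and 6.0.
inner_pair = combined_magnitude(5.6, 6.0)
print(f"Combined magnitude of the inner pair: {inner_pair:.2f}")
```

Because smaller magnitudes mean brighter objects, the combined value is always lower than either input.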
24 |Rho-1 Cancri or 55 Cancri is a binary star approximately 40.9 light-years distant from Earth. 55 Cancri consists of a yellow dwarf and a smaller red dwarf, with five planets orbiting the primary star; one low-mass planet that may be either a hot, water-rich world or a carbon planet and four gas giants. 55 Cancri A, classified as a rare "super metal-rich" star, is one of the top 100 target stars for NASA’s Terrestrial Planet Finder mission, ranked 63rd on the list. The red dwarf 55 Cancri B, a suspected binary, appears to be gravitationally bound to the primary star, as the two share common proper motion.
25 |Cancer is best known among stargazers as the home of Praesepe (Messier 44), an open cluster also called the Beehive Cluster, located right in the centre of the constellation. Located 577 light-years from Earth, it is one of the nearest open clusters to our Solar System. M 44 contains about 50 stars, the brightest of which are of the sixth magnitude. Epsilon Cancri is the brightest member at magnitude 6.3. Praesepe is also one of the larger open clusters visible; it has an area of 1.5 square degrees, or three times the size of the full Moon. It is most easily observed when Cancer is high in the sky. North of the Equator, this period stretches from February to May. Ptolemy described the Beehive Cluster as "the nebulous mass in the breast of Cancer." It was one of the first objects Galileo observed with his telescope in 1609, spotting 40 stars in the cluster. Today, there are about 1010 high-probability members, most of them (68 percent) red dwarfs. The Greeks and Romans identified the nebulous object as a manger from which two donkeys, represented by the neighbouring stars Asellus Borealis and Asellus Australis, were eating. The stars represent the donkeys that the god Dionysus and his tutor Silenus rode in the war against the Titans. The ancient Chinese interpreted the object as a ghost or demon riding in a carriage, calling it a "cloud of pollen blown from under willow catkins."
27 |The smaller, denser open cluster Messier 67 can also be found in Cancer, 2500 light-years from Earth. It has an area of approximately 0.5 square degrees, the size of the full Moon. It contains approximately 200 stars, the brightest of which are of the tenth magnitude.
28 |QSO J0842+1835 is a quasar used to measure the speed of gravity in a VLBI experiment conducted by Edward Fomalont and Sergei Kopeikin in September 2002.
29 |OJ 287 is a BL Lac object located 3.5 billion light years away that has produced quasi-periodic optical outbursts going back approximately 120 years, as first apparent on photographic plates from 1891. It was first detected at radio wavelengths during the course of the Ohio Sky Survey. Its central supermassive black hole is among the largest known, with a mass of 18 billion solar masses, more than six times the value calculated for the previous largest object.
30 |Cancer is said to have been the place for the Akkadian Sun of the South, perhaps from its position at the summer solstice in very remote antiquity. But afterwards it was associated with the fourth month Duzu (June–July in the modern western calendar), and was known as the Northern Gate of Sun.
32 |Showing but few stars, and its brightest stars being of only 4th magnitude, Cancer was often considered the "Dark Sign", quaintly described as black and without eyes. Dante, alluding to this faintness and position of heavens, wrote in Paradiso:
33 |Cancer was the location of the Sun's most northerly position in the sky (the summer solstice) in ancient times, though this position now occurs in Taurus due to the precession of the equinoxes, around June 21. This is also the time that the Sun is directly overhead at 23.5°N, a parallel now known as the Tropic of Cancer.
34 |In Greek mythology, Cancer is identified with the crab that appeared while Hercules was fighting the many-headed Hydra. The crab bit Hercules on the foot and Hercules crushed it; the goddess Hera, a sworn enemy of Hercules, then placed the crab among the stars.
35 |The modern symbol for Cancer represents the pincers of a crab, but Cancer has been represented as various types of creatures, usually those living in the water, and always those with an exoskeleton.
38 |In the Egyptian records of about 2000 BC it was described as Scarabaeus (Scarab), the sacred emblem of immortality. In Babylonia the constellation was known as MUL.AL.LUL, a name which can refer to both a crab and a snapping turtle. On boundary stones, the image of a turtle or tortoise appears quite regularly, and it is believed that this represents Cancer, as a conventional crab has not so far been discovered on any of these monuments. There also appears to be a strong connection between the Babylonian constellation and ideas of death and a passage to the underworld, which may be the origin of these ideas in later Greek myths associated with Hercules and the Hydra. In the 12th century, an illustrated astronomical manuscript shows it as a water beetle. Albumasar writes of this sign in Flowers of Abu Ma'shar. A 1488 Latin translation depicts Cancer as a large crayfish, which also is the constellation's name in most Germanic languages. Jakob Bartsch and Stanislaus Lubienitzki, in the 17th century, described it as a lobster.
39 |In Ancient Greece, Aratus called the crab Καρκινος (Karkinos), which was followed by Hipparchus and Ptolemy. The Alfonsine tables called it Carcinus, a Latinized form of the Greek word. Eratosthenes extended this as Καρκινος, Ονοι, και Φατνη (Karkinos, Onoi, kai Fatne): the Crab, Asses, and Crib.
41 |The Indian language Sanskrit shares a common ancestor with Greek, and the Sanskrit name of Cancer is Karka and Karkata. In Telugu it is "Karkatakam", in Kannada "Karkataka" or "Kataka", in Tamil Karkatan, and in Sinhalese Kagthaca. The later Hindus knew it as Kulira, from the Greek Κολουρος (Koloyros), the term originated by Proclus.
42 |In Ancient Rome, Manilius and Ovid called the constellation Litoreus (shore-inhabiting). Astacus and Cammarus appear in various classic writers, while it is called Nepa in Cicero's De Finibus and the works of Columella, Plautus, and Varro; all of these words signify crab, lobster, or scorpion.
43 |Athanasius Kircher said that in Coptic Egypt it was Κλαρια (Klaria), the Bestia seu Statio Typhonis (the Power of Darkness). Jérôme Lalande identified this with Anubis, one of the Egyptian divinities commonly associated with Sirius.
44 |In most Germanic and Slavic languages, the constellation is known as "The Crayfish".
45 |The creation of the constellation is explained in Greek mythology by the short-lived association of the crab Karkinos with one of the Twelve Labors of Heracles, in which Heracles battled the multi-headed Lernaean Hydra. Hera had sent Karkinos to distract Heracles and put him at a disadvantage during the battle, but Heracles quickly dispatched the crab by kicking it with such force that it was propelled into the sky. Other accounts had Karkinos grabbing onto Heracles's toe with its claws, but Heracles simply crushed the crab under his foot. Hera, grateful for Karkinos's effort, gave it a place in the sky. Some scholars have suggested that Karkinos was a late addition to the myth of Heracles in order to make the Twelve Labors correspond to the twelve signs of the Zodiac.
48 |As of 2002, the Sun appears in the constellation Cancer from July 21 to August 9. In tropical astrology, the Sun is considered to be in the sign Cancer from June 21 to July 22, and in sidereal astrology, from July 16 to August 15.
51 |In Chinese astronomy, the stars of Cancer lie within the Vermillion Bird of the South (南方朱雀, Nán Fāng Zhū Què).
55 |Ophiuchus /ɒfiˈjuːkəs/ is a large constellation located around the celestial equator. Its name is from the Greek Ὀφιοῦχος "serpent-bearer", and it is commonly represented as a man grasping the snake that is represented by the constellation Serpens. Ophiuchus was one of the 48 constellations listed by the 2nd-century astronomer Ptolemy, and it remains one of the 88 modern constellations. It was formerly referred to as Serpentarius /sɜrpənˈtɛəriəs/ and Anguitenens.
10 | 11 |Ophiuchus is located between Aquila, Serpens and Hercules, northwest of the center of the Milky Way. The southern part lies between Scorpius to the west and Sagittarius to the east. In the northern hemisphere, it is best visible in summer. It is located opposite Orion in the sky. Ophiuchus is depicted as a man grasping a serpent; the interposition of his body divides the snake constellation Serpens into two parts, Serpens Caput and Serpens Cauda, which are nonetheless counted as one constellation.
14 |Ophiuchus straddles the equator but lies predominately to its south. However, Rasalhague, a fairly conspicuous star in its north, is circumpolar north of 78° north latitude. The constellation extends southward to −30° declination. Segments of the ecliptic that lie within Ophiuchus lie south of −20° declination. A determination of exactly where these stars are visible on Earth would depend on atmospheric refraction, the Novaya Zemlya effect, mountains and clouds.
15 |In contrast to Orion, Ophiuchus is in the daytime sky during the period November–January (summer in the Southern Hemisphere, winter in the Northern Hemisphere) and thus not visible at most latitudes. However, for much of the Arctic Circle in the Northern Hemisphere's winter months, the Sun is below the horizon even at midday. Stars (and thus parts of Ophiuchus, especially Rasalhague) are then visible at twilight for a few hours around local noon, low in the south. In the Northern Hemisphere's spring and summer months, when Ophiuchus is normally visible in the night sky, the constellation is not visible at those times and places in the Arctic where the midnight sun obscures the stars. In countries close to the equator, Ophiuchus appears overhead in June around midnight and in the October evening sky.
16 |The brightest stars in Ophiuchus include α Ophiuchi, called Ras Alhague ("head of the serpent charmer"), at magnitude 2.07, and η Ophiuchi, known as Sabik ("the preceding one"), at magnitude 2.43. Other bright stars in the constellation include β Ophiuchi, Cebalrai ("heart of the shepherd") and λ Ophiuchi, or Marfik ("the elbow").
22 |RS Ophiuchi is part of a class called recurrent novae, whose brightness increases at irregular intervals by hundreds of times in a period of just a few days. It is thought to be at the brink of becoming a type-1a supernova.
23 |Barnard's Star, one of the nearest stars to the Solar System (the only stars closer are the Alpha Centauri binary star system and Proxima Centauri), lies in Ophiuchus. It is located to the left of β and just north of the V-shaped group of stars in an area that was once occupied by the now-obsolete constellation of Taurus Poniatovii (Poniatowski's Bull).
24 |In 2005, astronomers using data from the Green Bank Telescope discovered a superbubble so large that it extends beyond the plane of the galaxy. It is called the Ophiuchus Superbubble.
25 |In April 2007, astronomers announced that the Swedish-built Odin satellite had made the first detection of clouds of molecular oxygen in space, following observations in the constellation Ophiuchus.
26 |The supernova of 1604 was first observed on 9 October 1604, near θ Ophiuchi. Johannes Kepler saw it first on 16 October and studied it so extensively that the supernova was subsequently called Kepler's Supernova. He published his findings in a book titled De stella nova in pede Serpentarii (On the New Star in Ophiuchus' Foot). Galileo used its brief appearance to counter the Aristotelian dogma that the heavens are changeless.
27 |In 2009 it was announced that GJ 1214, a star in Ophiuchus, undergoes repeated, cyclical dimming with a period of about 1.5 days consistent with the transit of a small orbiting planet. The planet's low density (about 40% that of Earth) suggests that the planet may have a substantial component of low-density gas—possibly hydrogen or steam. The proximity of this star to Earth (42 light years) makes it a tempting target for further observations.
28 |In April 2010, the naked-eye star ζ Ophiuchi was occulted by the asteroid 824 Anastasia.
29 |Ophiuchus contains several star clusters, such as IC 4665, NGC 6633, M9, M10, M12, M14, M19, M62, and M107, as well as the nebula IC 4603-4604.
31 |M10 is a fairly close globular cluster, only 20,000 light-years from Earth. It has a magnitude of 6.6 and is a Shapley class VII cluster. This means that it has "intermediate" concentration; it is only somewhat concentrated towards its center.
32 | 33 | 34 |The unusual galaxy merger remnant and starburst galaxy NGC 6240 is also in Ophiuchus. At a distance of 400 million light-years, this "butterfly-shaped" galaxy has two supermassive black holes 3,000 light-years apart. Confirmation of the fact that both nuclei contain black holes was obtained by spectra from the Chandra X-ray Observatory. Astronomers estimate that the black holes will merge in another billion years. NGC 6240 also has an unusually high rate of star formation, classifying it as a starburst galaxy. This is likely due to the heat generated by the orbiting black holes and the aftermath of the collision.
35 |In 2006, a new nearby star cluster was discovered associated with the 4th magnitude star Mu Ophiuchi. The Mamajek 2 cluster appears to be a poor cluster remnant analogous to the Ursa Major Moving Group, but 7 times more distant (approximately 170 parsecs away). Mamajek 2 appears to have formed in the same star-forming complex as the NGC 2516 cluster roughly 135 million years ago.
36 |Barnard 68 is a large dark nebula, located 410 light-years from Earth. Despite its diameter of 0.4 light-years, Barnard 68 only has twice the mass of the Sun, making it both very diffuse and very cold - about 16 kelvins. Though it is currently stable, Barnard 68 will eventually collapse, inciting the process of star formation. One unusual feature of Barnard 68 is its vibrations, which have a period of 250,000 years. Astronomers speculate that this phenomenon is caused by the shock wave from a supernova.
37 |There is no evidence of the constellation preceding the classical era, and in Babylonian astronomy, a "Sitting Gods" constellation seems to have been located in the general area of Ophiuchus. However, Gavin White proposes that Ophiuchus may in fact be remotely descended from this Babylonian constellation, representing Nirah, a serpent-god who was sometimes depicted with his upper half human but with serpents for legs.
41 |The earliest mention of the constellation is in Aratus, informed by the lost catalogue of Eudoxus of Cnidus (4th century BC):
42 |To the ancient Greeks, the constellation represented the god Apollo struggling with a huge snake that guarded the Oracle of Delphi. Later myths identified Ophiuchus with Laocoön, the Trojan priest of Poseidon, who warned his fellow Trojans about the Trojan Horse and was later slain by a pair of sea serpents sent by the gods to punish him.
44 |According to Roman era mythography, the figure represents the healer Asclepius, who learned the secrets of keeping death at bay after observing one serpent bringing another healing herbs. To prevent the entire human race from becoming immortal under Asclepius' care, Jupiter killed him with a bolt of lightning, but later placed his image in the heavens to honor his good works.
45 |In medieval Islamic astronomy (Azophi's Uranometry, 10th century), the constellation was known as Al-Ḥawwaʾ "the snake-charmer".
46 |Aratus describes Ophiuchus as trampling on Scorpio with his feet. This is depicted in Renaissance to Early Modern star charts, beginning with Albrecht Dürer in 1515; in some depictions (such as that of Johannes Kepler, 1604), Scorpio also seems to threaten to sting Serpentarius in the foot. This is consistent with Azophi, who already included ψ Oph and ω Oph as the snake-charmer's "left foot", and θ Oph and ο Oph as his "right foot", making Ophiuchus a zodiacal constellation at least as regards his feet. This arrangement has been taken as symbolic in later literature, and placed in relation to the words spoken by God to the serpent in the Garden of Eden (Genesis 3:15).
47 |John Milton used Ophiuchus as the vehicle for an epic simile in Book 2 of Paradise Lost, comparing Satan to a comet burning across the length of Ophiuchus (lines 706-10): 'on th' other side / Incensed with indignation Satan stood / Unterrified, and like a comet burned / That fires the length of Opiuchus huge / In th' arctic sky'.
49 |Ophiuchus is one of thirteen constellations that cross the ecliptic. It has therefore been called the '13th sign of the zodiac'. However, this confuses sign with constellation.
52 |The signs of the zodiac are a twelve-fold division of the ecliptic, so that each sign spans 30° of celestial longitude, approximately the distance the Sun travels in a month, and (in the Western tradition) are aligned with the seasons so that currently the March equinox falls on the boundary between Pisces and Aries.
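The twelve-fold division described above amounts to integer division of ecliptic longitude by 30°. The sketch below is a minimal illustration, assuming longitude is measured from the March equinox so that 0° marks the start of Aries (the Western convention the text describes); the names `SIGNS` and `tropical_sign` are hypothetical.

```python
# Signs in order of increasing ecliptic longitude, starting at the
# March equinox (0 degrees = boundary between Pisces and Aries).
SIGNS = [
    "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
    "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
]

def tropical_sign(ecliptic_longitude_deg):
    """Return the zodiac sign containing the given ecliptic longitude.

    Each sign spans exactly 30 degrees. The longitude is wrapped into
    [0, 360) first, and the index is taken modulo 12 to stay in range
    even if floating-point wrapping lands exactly on 360.
    """
    lon = ecliptic_longitude_deg % 360.0
    return SIGNS[int(lon // 30) % 12]

print(tropical_sign(95.0))   # a longitude inside the fourth 30-degree slice
print(tropical_sign(-1.0))   # just before the March equinox
```

This maps longitudes to signs only; it says nothing about which constellation the Sun actually appears in, which is the distinction the surrounding text draws.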
53 |Constellations, on the other hand, are unequal in size and are based on the positions of the stars. The constellations of the zodiac have only a loose association with the signs of the zodiac, and do not in general coincide with them. In Western astrology the constellation of Aquarius, for example, largely corresponds to the sign of Pisces. Similarly, the constellation of Ophiuchus occupies most of the sign of Sagittarius. The differences are due to the fact that the time of year that the sun passes through a particular zodiac constellation's position has slowly changed over the centuries from when the Greeks and Babylonians originally developed the Zodiac.
62 |Hercules is a constellation named after Hercules, the Roman mythological hero adapted from the Greek hero Heracles. Hercules was one of the 48 constellations listed by the 2nd century astronomer Ptolemy, and it remains one of the 88 modern constellations today. It is the fifth largest of the modern constellations.
10 | 11 | 12 |Hercules has no first or second magnitude stars. However, it does have several stars above magnitude 4. Alpha Herculis, traditionally called Rasalgethi, is a binary star resolvable in small amateur telescopes, 400 light-years from Earth. The primary is an irregular variable star; it is a red giant with a minimum magnitude of 4 and a maximum magnitude of 3. It has a diameter of 400 solar diameters. The secondary, which orbits every 3600 years, is a blue-green hued star of magnitude 5.4. Its common name means "the kneeler's head". Beta Herculis, also called Kornephoros, is the brightest star in Hercules. It is a yellow giant of magnitude 2.8, 148 light-years from Earth. Its traditional name means "club-bearer". Delta Herculis is a double star divisible in small amateur telescopes. The primary is a blue-white star of magnitude 3.1 and is 78 light-years from Earth. The optical companion is of magnitude 8.2. Gamma Herculis is also a double star divisible in small amateur telescopes. The primary is a white giant of magnitude 3.8, 195 light-years from Earth. The optical companion, widely separated, is 10th magnitude. Zeta Herculis is a binary star that is becoming divisible in medium-aperture amateur telescopes, as the components widen to their peak in 2025. The system, 35 light-years from Earth, has a period of 34.5 years. The primary is a yellow-tinged star of magnitude 2.9 and the secondary is an orange star of magnitude 5.7.
15 |There are several dimmer variable stars in Hercules. 30 Herculis, also called g Herculis, is a semiregular red giant with a period of 3 months. 361 light-years from Earth, it has a minimum magnitude of 6.3 and a maximum magnitude of 4.3. 68 Herculis, also called u Herculis, is a Beta Lyrae-type eclipsing binary star. 865 light-years from Earth, it has a period of 2 days; its minimum magnitude is 5.4 and its maximum magnitude is 4.7.
16 |Hercules is also home to many double stars and binary stars. Kappa Herculis is a double star divisible in small amateur telescopes. The primary is a yellow giant of magnitude 5.0, 388 light-years from Earth; the secondary is an orange giant of magnitude 6.3, 470 light-years from Earth. Rho Herculis is a binary star 402 light-years from Earth, divisible in small amateur telescopes. Both components are blue-green giant stars; the primary is magnitude 4.5 and the secondary is magnitude 5.5. 95 Herculis is a binary star divisible in small telescopes, 470 light-years from Earth. The primary is a silvery giant star of magnitude 4.9 and the secondary is an old giant star of magnitude 5.2. 100 Herculis is a double star easily divisible in small amateur telescopes. Both components are magnitude 5.8 blue-white stars; they are 165 and 230 light-years from Earth.
17 |Mu Herculis is 27.4 light-years from Earth. The solar apex, i.e., the point on the sky which marks the direction that the Sun is moving in its orbit around the center of the Milky Way, is located within Hercules, close to Vega in neighboring Lyra.
18 |Fifteen stars in Hercules are known to be orbited by extrasolar planets.
20 |Hercules contains two bright globular clusters: M13, the brightest globular cluster in the northern hemisphere, and M92. It also contains the nearly spherical planetary nebula Abell 39. M13 lies between the stars η Her and ζ Her; it is dim, but may be detected by the unaided eye on a very clear night.
31 |M13, visible to both the naked eye and binoculars, is a globular cluster of the 6th magnitude that contains more than 300,000 stars and is 25,200 light-years from Earth. It is also very large, with an apparent diameter of over 0.25 degrees, half the size of the full moon; its physical diameter is more than 100 light-years. Individual stars in M13 are resolvable in a small amateur telescope.
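The physical diameter quoted for M13 can be checked from its distance and apparent diameter with basic trigonometry. The sketch below is a rough consistency check under the assumption that the cluster is a sphere seen face-on; the function name `physical_diameter` is hypothetical.

```python
import math

def physical_diameter(distance, angular_diameter_deg):
    """Physical size of an object from its distance and apparent size.

    Uses d = 2 * D * tan(theta / 2), where D is the distance and theta
    the apparent (angular) diameter. The result is in the same unit
    as the distance.
    """
    half_angle = math.radians(angular_diameter_deg) / 2
    return 2 * distance * math.tan(half_angle)

# Values for M13 as quoted in the text: 25,200 light-years away,
# apparent diameter of 0.25 degrees.
m13_diameter_ly = physical_diameter(25_200, 0.25)
print(f"M13 is roughly {m13_diameter_ly:.0f} light-years across")
```

The result agrees with the text's statement that the cluster is more than 100 light-years in diameter.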
32 |M92 is a globular cluster of magnitude 6.4, 26,000 light-years from Earth. It is a Shapley class IV cluster, indicating that it is quite concentrated at the center; it has a very clear nucleus. M92 is visible as a fuzzy star in binoculars, like M13; it is denser and smaller than the more celebrated cluster. The oldest globular cluster known at 14 billion years, its stars are resolvable in a medium-aperture amateur telescope.
33 |NGC 6229 is a dimmer globular cluster; with a magnitude of 9.4, it is the third-brightest globular in the constellation. 100,000 light-years from Earth, it is a Shapley class IV cluster, meaning that it is fairly rich in the center and quite concentrated at the nucleus.
34 |NGC 6210 is a planetary nebula of the 9th magnitude, 4000 light-years from Earth, visible as a blue-green elliptical disk in amateur telescopes larger than 75 mm in aperture.
35 |The Hercules Cluster (Abell 2151) is a cluster of galaxies in Hercules.
36 |The Hercules–Corona Borealis Great Wall, the largest structure in the universe, is in Hercules.
37 |The traditional visualization imagines α Herculis as Hercules's head; its name, Ras Algethi, literally means "head of the kneeling one". Hercules's left hand then points toward Lyra from his shoulder (δ Herculis), and β Herculis, or Kornephoros ("club-bearer") forms his other shoulder. His narrow waist is formed by ε Herculis and ζ Herculis. Finally, his left leg (with θ Herculis as the knee and ι Herculis the foot) is stepping on the head of Draco, the dragon/snake whom Hercules has vanquished and perpetually gloats over.
41 |A common form found in modern star charts uses the quadrangle formed by π Her, η Her, ζ Her and ε Her (known as the "Keystone" asterism) as Hercules's torso.
45 |H. A. Rey has suggested an alternative visualization in which the "Keystone" becomes Hercules's head. This quadrangle lies between two very bright stars: Vega in the constellation Lyra and α CrB (Gemma, or Alphecca) in the constellation Corona Borealis. The hero's right leg contains two bright stars of the third magnitude: α Her (Ras Algethi) and δ Her (Sarin). The latter is the right knee. The hero's left leg contains dimmer stars of the fourth magnitude which do not have Bayer designations but which do have Flamsteed numbers. The star β Her belongs to the hero's outstretched right hand, and is also called Kornephoros.
47 |According to Gavin White, the Greek constellation of Hercules is a distorted version of the Babylonian constellation known as the "Standing Gods" (MUL.DINGIR.GUB.BA.MESH). White argues that this figure was, like the similarly named "Sitting Gods", depicted as a man with a serpent's body instead of legs (the serpent element now being represented on the Greek star map by the figure of Draco that Hercules crushes beneath his feet). He further argues that the original name of Hercules - the 'Kneeler' (see below) - is a conflation of the two Babylonian constellations of the Sitting and Standing Gods.
49 |The earliest Greek references to the constellation do not refer to it as Hercules. Aratus describes it as follows:
50 |51 |53 |Right there in its [Draco's] orbit wheels a Phantom form, like to a man that strives at a task. That sign no man knows how to read clearly, nor what task he is bent, but men simply call him On His Knees [Ἐγγόνασιν "the Kneeler"].
52 |
54 |56 |Now that Phantom, that toils on his knees, seems to sit on bended knee, and from both his shoulders his hands are upraised and stretch, one this way, one that, a fathom's length. Over the middle of the head of the crooked Dragon, he has the tip of his right foot. Here too that Crown [Corona], which glorious Dionysus set to be memorial of the dead Ariadne, wheels beneath the back of the toil-spent Phantom. To the Phantom’s back the Crown is near, but by his head mark near at hand the head of Ophiuchus [...] Yonder, too, is the tiny Tortoise, which, while still beside his cradle, Hermes pierced for stings and bade it be called the Lyre [Lyra]: and he brought it into heaven and set it in front of the unknown Phantom. That Croucher on his Knees comes near the Lyre with his left knee, but the top of the Bird’s head wheels on the other side, and between the Bird’s head and the Phantom’s knee is enstarred the Lyre.
55 |
The story connecting Hercules with the constellation is recounted by Dionysius of Halicarnassus:
57 |58 |60 |On his way back to Mycenae from Iberia having obtained the Cattle of Geryon as his tenth labour Heracles came to Liguria in North-Western Italy where he engaged in battle with two giants, Albion and Bergion or Dercynus. The opponents were strong; Hercules was in a difficult position so he prayed to his father Zeus for help. With the aegis of Zeus, Heracles won the battle. It was this kneeling position of Heracles when he prayed to his father Zeus that gave the name "the Kneeler". and Hyginus
59 |
Hercules is also sometimes associated with Gilgamesh, a Sumerian mythological hero.
61 |In Chinese astronomy, the stars that correspond to Hercules are located in two areas: the Purple Forbidden enclosure (紫微垣, Zǐ Wēi Yuán) and the Heavenly Market enclosure (天市垣, Tiān Shì Yuán).
63 |Taurus is one of the constellations of the zodiac, which means it is crossed by the plane of the ecliptic. Its name is a Latin word meaning "bull", and its astrological symbol is a stylized bull's head: (Unicode ♉). Taurus is a large and prominent constellation in the northern hemisphere's winter sky. It is one of the oldest constellations, dating back to at least the Early Bronze Age when it marked the location of the Sun during the spring equinox. Taurus came to symbolize the bull in the mythologies of Ancient Babylon, Egypt, and Greece.
10 |There are a number of features of interest to astronomers. Taurus hosts two of the nearest open clusters to Earth, the Pleiades and the Hyades, both of which are visible to the naked eye. At first magnitude, the red giant Aldebaran is the brightest star in the constellation. In the northwest part of Taurus is the supernova remnant Messier 1, more commonly known as the Crab Nebula. One of the closest regions of active star formation, the Taurus-Auriga complex, crosses into the northern part of the constellation. The variable star T Tauri is the prototype of a class of pre-main-sequence stars.
11 | 12 |Taurus is a big and prominent constellation in the northern hemisphere's winter sky, between Aries to the west and Gemini to the east; to the north lie Perseus and Auriga, to the southeast Orion, to the south Eridanus, and to the southwest Cetus. In September and October, Taurus is visible in the evening along the eastern horizon. The most favorable time to observe Taurus in the night sky is during the months of December and January. By March and April, the constellation will appear to the west during the evening twilight.
14 |This constellation forms part of the zodiac, and hence is intersected by the ecliptic. This circle across the celestial sphere forms the apparent path of the Sun as the Earth completes its annual orbit. As the orbital planes of the Moon and the planets lie near the ecliptic, they can usually be found in the constellation Taurus during some part of each year. The galactic plane of the Milky Way intersects the northeast corner of the constellation and the galactic anticenter is located near the border between Taurus and Auriga. Taurus is the only constellation crossed by all three of the galactic equator, celestial equator, and ecliptic. A ring-like galactic structure known as Gould's Belt passes through the Taurus constellation.
15 |The recommended three-letter abbreviation for the constellation, as adopted by the International Astronomical Union in 1922, is "Tau". The official constellation boundaries, as set by Eugène Delporte in 1930, are defined by a polygon of 26 segments. In the equatorial coordinate system, the right ascension coordinates of these borders lie between 03h 23.4m and 05h 53.3m, while the declination coordinates are between 31.10° and −1.35°. Because a small part of the constellation lies to the south of the celestial equator, this cannot be a completely circumpolar constellation at any latitude.
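Right ascension is given in hours and minutes, while declination is given in degrees, so comparing the two extents requires a unit conversion: 24 hours of right ascension span the full 360° circle, or 15° per hour. The sketch below converts the boundary values quoted above; the function name `ra_to_degrees` is hypothetical.

```python
def ra_to_degrees(hours, minutes=0.0, seconds=0.0):
    """Convert right ascension from hours/minutes/seconds to degrees.

    24 hours of right ascension correspond to 360 degrees, so one
    hour is 15 degrees.
    """
    total_hours = hours + minutes / 60 + seconds / 3600
    return total_hours * 15

# Boundary values for Taurus as quoted in the text.
ra_west = ra_to_degrees(3, 23.4)   # 03h 23.4m
ra_east = ra_to_degrees(5, 53.3)   # 05h 53.3m
print(f"Taurus spans RA {ra_west:.2f} to {ra_east:.2f} degrees")
```

Note that this width is measured along the celestial equator; the on-sky width at a given declination is smaller by a factor of the cosine of that declination.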
16 |During November, the Taurid meteor shower appears to radiate from the general direction of this constellation. The Beta Taurid meteor shower occurs during the months of June and July in the daytime, and is normally observed using radio techniques. In October, between the 18th and the 29th, both the Northern Taurids and the Southern Taurids are active, though the latter stream is stronger. However, between November 1 and 10, the two streams equalize.
19 |The brightest member of this constellation is Aldebaran, an orange-hued, spectral class K5 III giant star. Its name derives from الدبران al-dabarān, Arabic for "the follower", probably from the fact that it follows the Pleiades during the nightly motion of the celestial sphere across the sky. Forming the profile of a Bull's face is a V or A-shaped asterism of stars. This outline is created by prominent members of the Hyades, the nearest distinct open star cluster after the Ursa Major Moving Group. In this profile, Aldebaran forms the bull's bloodshot eye, which has been described as "glaring menacingly at the hunter Orion", a constellation that lies just to the southwest. The Hyades span about 5° of the sky, so that they can only be viewed in their entirety with binoculars or the unaided eye. It includes a naked eye double star, Theta Tauri, with a separation of 5.6 arcminutes.
20 |In the northeastern quadrant of the Taurus constellation lie the Pleiades (M45), one of the best known open clusters, easily visible to the naked eye. The seven most prominent stars in this cluster are at least visual magnitude six, and so the cluster is also named the "Seven Sisters". However, many more stars are visible with even a modest telescope. Astronomers estimate that the cluster has approximately 500-1,000 stars, all of which are around 100 million years old. However, they vary considerably in type. The Pleiades themselves are represented by large, bright stars; there are also many small brown dwarfs and white dwarfs. The cluster is estimated to dissipate in another 250 million years. The Pleiades cluster is classified as a Shapley class c and Trumpler class I 3 r n cluster, indicating that it is irregularly shaped and loose, though concentrated at its center and detached from the star field.
21 |In the northern part of the constellation to the northwest of the Pleiades lies the Crystal Ball Nebula, known by its catalogue designation of NGC 1514. This planetary nebula is of historical interest following its discovery by German-born English astronomer William Herschel in 1790. Prior to that time, astronomers had assumed that nebulae were simply unresolved groups of stars. However, Herschel could clearly resolve a star at the center of the nebula that was surrounded by a nebulous cloud of some type. In 1864, English astronomer William Huggins used the spectrum of this nebula to deduce that the nebula is a luminous gas, rather than stars.
22 |To the west, the two horns of the bull are formed by Beta (β) Tauri and Zeta (ζ) Tauri, two star systems that are separated by 8°. Beta is a white, spectral class B7 III giant star known as El Nath, which comes from the Arabic phrase "the butting", as in butting by the horns of the bull. At magnitude 1.65, it is the second brightest star in the constellation, and shares the border with the neighboring constellation of Auriga. As a result, it also bears the designation Gamma Aurigae. Zeta Tauri is an eclipsing binary star that completes an orbit every 133 days.
23 |A degree to the northwest of ζ Tauri is the Crab Nebula (M1), a supernova remnant. This expanding nebula was created by a Type II supernova explosion, which was seen from Earth on July 4, 1054. It was bright enough to be observed during the day, and is mentioned in Chinese historical texts. At its peak the supernova reached magnitude −4, but the nebula is currently magnitude 8.4 and requires a telescope to observe. North American peoples also observed the supernova, as evidenced by a painting in a New Mexican canyon and various pieces of pottery that depict the event. However, the remnant itself was not discovered until 1731, when John Bevis found it.
24 |The star Lambda (λ) Tauri is an eclipsing binary star. This system consists of a spectral class B3 star being orbited by a less massive class A4 star. The plane of their orbit lies almost along the line of sight to the Earth. Every 3.953 days the system temporarily decreases in brightness by 1.1 magnitudes as the brighter star is partially eclipsed by the dimmer companion. The two stars are separated by only 0.1 astronomical units, so their shapes are modified by mutual tidal interaction. This results in a variation of their net magnitude throughout each orbit.
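The 1.1-magnitude dip quoted above can be turned into a brightness ratio using the standard astronomical magnitude relation, in which a difference of Δm magnitudes corresponds to a flux ratio of 10^(0.4·Δm). The Python sketch below is only an illustration; the function name `flux_ratio` is hypothetical and not part of this repository.

```python
def flux_ratio(delta_mag):
    """Brightness (flux) ratio for a given magnitude difference.

    Uses the standard relation ratio = 10 ** (0.4 * delta_mag).
    """
    return 10 ** (0.4 * delta_mag)


# Dimming of 1.1 magnitudes quoted for Lambda Tauri during an eclipse
ratio = flux_ratio(1.1)
print(f"At minimum the system is about {ratio:.2f} times fainter")
```

By the same relation, a difference of 5 magnitudes is a factor of 100 in brightness.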
25 |Located about 1.8° west of Epsilon (ε) Tauri is T Tauri, the prototype of a class of variable stars called T Tauri stars. This star undergoes erratic changes in luminosity, varying between magnitude 9 and 13 over a period of weeks or months. This is a newly formed stellar object that is just emerging from its envelope of gas and dust, but has not yet become a main sequence star. The surrounding reflection nebula NGC 1555 is illuminated by T Tauri, and thus is also variable in luminosity.
26 |This constellation includes part of the Taurus-Auriga complex, or Taurus dark clouds, a star forming region of sparse, filamentary clouds. This spans a diameter of 98 light-years (30 parsecs) and contains 35,000 solar masses of material, which is both larger and less massive than the Orion Nebula. At a distance of 490 light-years (150 parsecs), this is one of the nearest active star forming regions. Located in this region, about 10° to the northeast of Aldebaran, is an asterism NGC 1746 spanning a width of 45 arcminutes.
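The paired light-year and parsec figures above can be checked with a simple unit conversion. This is a minimal sketch, assuming the commonly used approximation of 3.2616 light-years per parsec; the constant and function names are illustrative only.

```python
LY_PER_PARSEC = 3.2616  # approximate number of light-years in one parsec


def light_years_to_parsecs(light_years):
    """Convert a distance in light-years to parsecs."""
    return light_years / LY_PER_PARSEC


# Figures quoted for the Taurus dark clouds: diameter and distance
for label, ly in [("diameter", 98), ("distance", 490)]:
    print(f"{label}: {ly} ly is about {light_years_to_parsecs(ly):.0f} pc")
```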
27 |The identification of the constellation of Taurus with a bull is very old, certainly dating to the Chalcolithic, and perhaps even to the Upper Paleolithic. Michael Rappenglück of the University of Munich believes that Taurus is represented in a cave painting at the Hall of the Bulls in the caves at Lascaux (dated to roughly 15,000 BC), which he believes is accompanied by a depiction of the Pleiades. The name "seven sisters" has been used for the Pleiades in the languages of many cultures, including indigenous groups of Australia, North America and Siberia. This suggests that the name may have a common ancient origin.
31 |Taurus marked the point of vernal (spring) equinox in the Chalcolithic and the Early Bronze Age, from about 4000 BC to 1700 BC, after which it moved into the neighboring constellation Aries. The Pleiades were closest to the Sun at vernal equinox around the 23rd century BC. In Babylonian astronomy, the constellation was listed in the MUL.APIN as GU4.AN.NA, "The Heavenly Bull". As this constellation marked the vernal equinox, it was also the first constellation in the Babylonian zodiac and they described it as "The Bull in Front". The Akkadian name was Alu.
32 |In the Mesopotamian Epic of Gilgamesh, one of the earliest works of literature, the goddess Ishtar sends Taurus, the Bull of Heaven, to kill Gilgamesh for spurning her advances. Gilgamesh is depicted as the neighboring constellation of Orion, and in the sky they face each other as if engaged in combat. In early Mesopotamian art, the Bull of Heaven was closely associated with Inanna, the Sumerian goddess of sexual love, fertility, and warfare. One of the oldest depictions shows the bull standing before the goddess' standard; since it has 3 stars depicted on its back (the cuneiform sign for "star-constellation"), there is good reason to regard this as the constellation later known as Taurus.
33 |The same iconic representation of the Heavenly Bull was depicted in the Dendera zodiac, an Egyptian bas-relief carving in a ceiling that depicted the celestial hemisphere using a planisphere. In these ancient cultures, the orientation of the horns was portrayed as upward or backward. This differed from the later Greek depiction where the horns pointed forward. To the Egyptians, the constellation Taurus was a sacred bull that was associated with the renewal of life in spring. When the spring equinox entered Taurus, the constellation would become covered by the Sun in the western sky as spring began. This "sacrifice" led to the renewal of the land. To the early Hebrews, Taurus was the first constellation in their zodiac and consequently it was represented by the first letter in their alphabet, Aleph.
34 |In Greek mythology, Taurus was identified with Zeus, who assumed the form of a magnificent white bull to abduct Europa, a legendary Phoenician princess. In illustrations of Greek mythology, only the front portion of this constellation is depicted; this was sometimes explained as Taurus being partly submerged as he carried Europa out to sea. A second Greek myth portrays Taurus as Io, a mistress of Zeus. To hide his lover from his wife Hera, Zeus changed Io into the form of a heifer. Greek mythographer Acusilaus marks the bull Taurus as the same that formed the myth of the Cretan Bull, one of The Twelve Labors of Heracles.
35 |Taurus became an important object of worship among the Druids. Their Tauric religious festival was held while the Sun passed through the constellation. In Buddhism, legends hold that Gautama Buddha was born when the Full Moon was in Vaisakha, or Taurus. Buddha's birthday is celebrated with the Wesak Festival, or Vesākha, which occurs on the first or second Full Moon when the Sun is in Taurus.
36 |As of 2008, the Sun appears in the constellation Taurus from May 13 to June 21. In tropical astrology, the Sun is considered to be in the sign Taurus from April 20 to May 20.
39 |The space probe Pioneer 10 is moving in the direction of this constellation, though it will not be nearing any of the stars in this constellation for many thousands of years, by which time its batteries will be long dead.
41 |Hindi (हिन्दी), or more precisely Modern Standard Hindi (मानक हिन्दी), is a standardised and Sanskritised register of the Hindustani language. Hindustani is the native language of most people living in Delhi, Uttar Pradesh, Uttarakhand, Chhattisgarh, Himachal Pradesh, Chandigarh, Bihar, Jharkhand, Madhya Pradesh, Haryana, and Rajasthan. Modern Standard Hindi is one of the official languages of India.
10 |As of 2009, the best figure Ethnologue could find for speakers of actual Hindustani Hindi was 180 million in 1991. In the 2001 Indian census, 258 million (258,000,000) people in India reported Hindi to be their native language; this figure also includes people in the Hindi belt who identify as native speakers of related languages but consider their speech to be a dialect of Hindi.
11 | 12 |Article 343 (1) of the Indian constitution states "The official language of the Union shall be Hindi in Devanagari script. The form of numerals to be used for the official purposes of the Union shall be the international form of Indian numerals."
15 |Article 351 of the Indian constitution states "It shall be the duty of the Union to promote the spread of the Hindi language, to develop it so that it may serve as a medium of expression for all the elements of the composite culture of India and to secure its enrichment by assimilating without interfering with its genius, the forms, style and expressions used in Hindustani and in the other languages of India specified in the Eighth Schedule, and by drawing, wherever necessary or desirable, for its vocabulary, primarily on Sanskrit and secondarily on other languages." The trend is different in Hindi cinema where more and more English, Persian, Turkish and Arabic vocabulary is preferred.
16 |It was envisioned that Hindi would become the sole working language of the Union Government by 1965 (per directives in Article 344 (2) and Article 351), with state governments being free to function in the language of their own choice. However, widespread resistance to the imposition of Hindi on non-native speakers, especially in South India (such as those in Tamil Nadu), Maharashtra, and West Bengal, led to the passage of the Official Languages Act of 1963, which provided for the continued use of English indefinitely for all official purposes, although the constitutional directive for the Union Government to encourage the spread of Hindi was retained and has strongly influenced its policies.
17 |At the state level, Hindi is the official language of the following states: Bihar, Chhattisgarh, Haryana, Himachal Pradesh, Jharkhand, Madhya Pradesh, Rajasthan, Uttar Pradesh, and Uttarakhand. Each may also designate a "co-official language"; in Uttar Pradesh, for instance, depending on the political formation in power, this language is generally Urdu. Similarly, Hindi is accorded the status of official language in the following Union Territories: Andaman & Nicobar Islands, Chandigarh, Dadra & Nagar Haveli, Daman & Diu, National Capital Territory.
18 |National-language status for Hindi is a long-debated theme. An Indian court clarified that Hindi is not the national language of India because the constitution does not mention it as such.
19 |Outside of Asia, Hindi is also an official language in Fiji. The Constitution of Fiji declares three official languages: English, Fijian, and Hindi. The Hindi spoken there is Fiji Hindi, a form of Awadhi, not Modern Standard Hindi's Hindustani.
20 |The dialect of Hindustani on which Standard Hindi is based is Khariboli, the vernacular of Delhi and the surrounding western Uttar Pradesh and southern Uttarakhand. This dialect acquired linguistic prestige in the Mughal Empire (1600s) and became known as Urdu, "the language of the court". In the late 19th century, the movement standardising a written language from Khariboli, for the Indian masses in North India, started to standardise Hindi as a separate language from Urdu, which was learnt by the elite. In 1881 Bihar accepted Hindi as its sole official language, replacing Urdu, and thus became the first state of India to adopt Hindi.
23 |After independence, the government of India instituted the following conventions:
24 |The Constituent Assembly adopted Hindi as the Official Language of the Union on 14 September 1949. Hence, it is celebrated as Hindi Day.
27 |Linguistically, Hindi and Urdu are the same language. Hindi is written in the Devanagari script and uses more Sanskrit words, whereas Urdu is written in the Perso-Arabic script and uses more Arabic and Persian words.
30 |Hindi is written in Devanagari script (देवनागरी लिपि devanāgarī lipi) also called Nagari. Devanagari consists of 11 vowels and 33 consonants and is written from left to right.
33 |Formal Standard Hindi draws much of its academic vocabulary from Sanskrit. Loanwords in Standard Hindi are divided into five principal categories:
36 |The Hindi standard, from which much of the Persian, Arabic and English vocabulary has been purged and replaced by neologisms compounding tatsam words, is called Shuddha Hindi (pure Hindi), and is viewed as a more prestigious dialect than other, more colloquial forms of Hindi.
42 |Excessive use of tatsam words creates problems for native speakers. They may have Sanskrit consonant clusters which do not exist in native Hindi. The educated middle class of India may be able to pronounce such words, but others have difficulty. Persian and Arabic vocabulary given 'authentic' pronunciations causes similar difficulty.
43 |Hindi literature is broadly divided into four prominent forms or styles, being Bhakti (devotional – Kabir, Raskhan); Shringar (beauty – Keshav, Bihari); Virgatha (extolling brave warriors); and Adhunik (modern).
47 |Medieval Hindi literature is marked by the influence of the Bhakti movement and the composition of long, epic poems. It was primarily written in other varieties of Hindi, particularly Avadhi and Braj Bhasha, but also in Khariboli. During the British Raj, Hindustani became the prestige dialect. Hindustani with heavily Sanskritised vocabulary, or Sahityik Hindi (Literary Hindi), was popularised by the writings of Swami Dayananda Saraswati, Bhartendu Harishchandra and others. The rising numbers of newspapers and magazines made Hindustani popular with educated people. Chandrakanta, written by Devaki Nandan Khatri, is considered the first authentic work of prose in modern Hindi. The person who brought realism to Hindi prose literature was Munshi Premchand, who is considered the most revered figure in the world of Hindi fiction and the progressive movement.
48 |The Dwivedi Yug ("Age of Dwivedi") in Hindi literature lasted from 1900 to 1918. It is named after Mahavir Prasad Dwivedi, who played a major role in establishing the Modern Hindi language in poetry and broadening the acceptable subjects of Hindi poetry from the traditional ones of religion and romantic love.
49 |In the 20th century, Hindi literature saw a romantic upsurge. This is known as Chhayavaad (shadowism) and the literary figures belonging to this school are known as Chhayavaadi. Jaishankar Prasad, Suryakant Tripathi 'Nirala', Mahadevi Varma and Sumitranandan Pant, are the four major Chhayavaadi poets.
50 |Uttar Adhunik is the post-modernist period of Hindi literature, marked by a questioning of early trends that copied the West as well as the excessive ornamentation of the Chhayavaadi movement, and by a return to simple language and natural themes.
51 |Hindi has a presence on the internet, but due to the lack of a standard encoding, search engines cannot be used to locate text. Hindi is one of the seven languages of India that can be used to make web addresses (URLs). Hindi has also impacted the language of technology, with words such as 'avatar' (meaning a spirit taking a new form) used in computer science, artificial intelligence and even robotics.
55 |The following is a sample text in High Hindi, of Article 1 of the Universal Declaration of Human Rights (by the United Nations):
58 |The trapezoidal yangqin (simplified Chinese: 扬琴; traditional Chinese: 揚琴; pinyin: yángqín) is a Chinese hammered dulcimer, originally from Persia (modern-day Iran). It used to be written with the characters 洋琴 (lit. "foreign zither"), but over time the first character changed to 揚 (also pronounced "yáng"), which means "acclaimed". It is also spelled yang quin or yang ch'in. Hammered dulcimers of various types are now very popular not only in China, but also Eastern Europe, the Middle East, India, Iran, and Pakistan. The instruments are also sometimes known by the names "santur" and "cymbalom".
10 |The yangqin was traditionally fitted with bronze strings (though older Chinese stringed instruments used silk strings, resulting in their, and the yangqin's, categorisation as a silk, or "si" instrument), which gave the instrument a soft timbre. This form of instrument is still occasionally heard today in the "hudie qin" (蝴蝶琴, lit. "butterfly zither") played in the traditional silk and bamboo genre from the Shanghai region known as Jiangnan sizhu (江南絲竹), as well as in some Cantonese music groups. The Thai and Cambodian khim are nearly identical in their construction, having been introduced to those nations by southern Chinese musicians. Since the 1950s, however, steel alloy strings (in conjunction with copper-wound steel strings for the bass notes) have been used, in order to give the instrument a brighter, and louder tone. The modern yangqin can have as many as five courses of bridges and may be arranged chromatically. Traditional instruments, with three or more courses of bridges, are also still widely in use. The instrument's strings are struck with two lightweight bamboo beaters (also known as hammers) with rubber tips. A professional musician often carries several sets of beaters, each of which draws a slightly different tone from the instrument, much like the drum sticks of Western percussionists. The yangqin is used both as a solo instrument and in ensembles.
11 | 12 |Historians offer several theories to explain how the instrument was introduced to China:
15 |The word "yangqin" has historically been written in two different ways, using different Chinese characters for "yang". The "yang" in the earlier version was written with the character 洋, meaning "foreign". In 1910 it was changed to the character 揚 (also "yang"), meaning "acclaimed", which is also the first character of the name of Yangzhou (揚州). Some Chinese linguistic scholars have stated that the change was made because the latter term was more politically correct during a period when China was resisting foreign cultural influences.
19 |Another theory of how the yangqin came into contact with the Chinese is through the Silk Road. The Silk Road stretches almost 5,000 miles, reaching from China to the Middle East, including Iran (Persia). The Iranian santur, a dulcimer, has existed since ancient times. If any dulcimer were to have influenced China by land, it is likely to have been this instrument.
21 |The technical structure of the santur differs from that of the yangqin in the placement of the tuning pegs, the bridges, and the mallets. The yangqin's tuning pins are set in parallel instead of at a 90-degree angle down at the side. The mallets of the santur also differ from those of the yangqin: they are made of wood with finger grips, designed to let players perform by gripping the two mallets between their fore and middle fingers. Neither modern nor earlier yangqin mallets include finger grips.
22 |The bridges of the yangqin consist of long, single pieces of wood with many protruding "stubs" supporting the strings, unlike the santur, which uses a number of small, individual chesspiece-like bridges.
23 |The port at Canton/Guangzhou attracted traders from all over Asia: from Japan, India, Southeast Asia, and the Middle East. The ships from this region brought back precious stones, slaves, exotic wares, fruits, spices, etc. Along with trade, businesses, ideas, philosophies and scientific knowledge were exchanged, including religion (principally Buddhism).
25 |During the 16th century, the Age of Exploration in Europe reached its climax and soon trade was established between China and Europe. Historians state that Portuguese, and later, English and Dutch ships, had brisk trade with China. Portuguese trading in Chinese waters began in the 16th century according to historians. Music historians report that the salterio, a hammered dulcimer, was played in Portugal, Spain, and Italy during this period. Historians say it is possible that the yangqin originated when the Portuguese, the English or the Dutch brought a dulcimer player to China who performed for locals.
26 |Some historians have stated that the European clavichord is another possible precursor to the Yangqin. These historians state that an Italian missionary, Matteo Ricci, had brought a clavichord from Europe to China, and that the Chinese court had many clavichords and harpsichords in the palace, given as gifts by various European nations. However, as the locals could not duplicate the striking mechanism, they reverted to using hammers to hit the strings instead, resulting in the Yangqin.
28 |This explanation, saying that the locals could not duplicate [...], seems very improbable, even insulting, because the tangents of a clavichord, q.v., are far too simple to be considered "mechanisms" in the usual sense. They could easily be made by amateur metalworkers.
29 |Some music scholars support the theory that the Chinese dulcimer, the yangqin, was developed within China itself, devoid of all foreign influence. These historians give two possible explanations for the instrument's native origins: that the yangqin developed from an ancient string instrument called the zhu (筑), or that it originated in Yangzhou (揚州), China.
31 |Some music scholars state that the yangqin developed from the ancient musical instrument zhu (筑). The zhu is shaped like the guqin - rectangular, with one side wider than the other. It had 12 to 13 strings (the earliest variant only had 5 strings), assumed to have been made of silk or gut, with resemblance to the guqin. It was performed using techniques quite similar to the guqin - one hand pressing the strings while the other plucked. However, in the case of the zhu, instead of plucking the strings, the strings were struck using a slender bamboo hammer.
33 |Another theory supported by some music scholars is that the yangqin was developed in Yangzhou, a city in Jiangsu Province. According to one thesis written by Mr Chew in 1921, "Yangqin was named Yangqin because it was invented in Yangzhou. Different variants came about after it was introduced into Guangzhou."
35 |As the yangqin is a type of hammered dulcimer, it shares many elements of construction with other instruments in the hammered dulcimer family:
37 |Modern yangqin usually have 144 strings in total, with each pitch running in courses, with up to 5 strings per course, in order to boost the volume. The strings come in various thicknesses, and are tied at one end by screws, and at the other with tuning pegs. The pegs and screws are covered during playing by a hinged panel/board. This panel is opened up during tuning to access the tuning pegs.
40 |There are usually four to five bridges on a yangqin. From right to left, they are: bass bridge, "right bridge", tenor bridge, "left bridge", and the chromatic bridge. During playing, one is supposed to strike the strings on the left side of the bridges. However, the strings on the "chromatic bridge" are struck on the right, and strings on the "left bridge" can be struck on both sides of the bridge.
42 |The hammers are made of flexible bamboo, and one end is half covered by rubber. Due to their unique construction, there are two ways to play: with the rubber side for a softer sound, and with the bamboo side for a crisper, more percussive sound. This technique, known as 反竹 (fǎnzhú), is best utilized in the higher ranges of the yangqin. Additionally, the ends of the sticks can be used to pluck the strings, producing a sharp, clear sound. Glissandos can also be achieved in this way by running the ends of the sticks up or down the strings.
47 |Furthermore, some songs require the use of "雙音琴竹" (shuāng yīn qín zhú), literally "double-note yangqin hammers". These specially-constructed hammers have 2 striking surfaces, allowing the player to play up to 4 notes simultaneously (or even 8 notes, if the strings of the "left bridge" and "tenor bridge" are struck at a point where they intersect each other), resulting in a rich, powerful tone, which is especially pronounced in the lower registers due to the strings' long echoes. 林沖夜奔 (Lin Chong Flees In The Night), composed by 項祖華 (Xiang Zu Hua), is a representative solo piece which utilizes 雙音琴竹.
48 |When using 雙音琴竹, the left hand holds a beater that plays intervals of a perfect fourth, while the right hand's beater plays thirds. These intervals are standard over most of the yangqin's range, due to the positioning of its strings.
49 |On both sides of the yangqin, aside from the tuning screws, are numerous cylindrical metal nuts that can be moved for fine tuning the strings or to raise the strings slightly to eliminate unwanted vibrations that may occur. More modern designs also have moveable ball-shaped nuts that can be adjusted on the fly with the fingers; this provides some microtuning and additional dynamics during performances, such as portamentos and vibratos (see below: "Manner of Performance").
52 |The sticks are held, one in each hand, and hit the strings alternately. In the orchestra, the yangqin often adds to the harmony by playing chords or arpeggios. As the yangqin is softer than other Chinese instruments, it is usually positioned at the front of the orchestra, in the row just in front of the conductor. However, this is not a rule: the Singapore Chinese Orchestra positions the yangqin close to the percussion section. As the yangqin's tones sustain long after they have been played, such an arrangement minimizes the dissonance that results. If the hands are free (e.g. in periods of rest), covering the strings with the hands quickly dampens the vibrations. The yangqin has been called the "Chinese piano" as it has an indispensable role in the accompaniment of Chinese string and wind instruments.
55 |The yangqin's solo repertoire calls for more techniques than are usually required in orchestral pieces. Examples include pressing down on the strings to produce vibrato effects, similar to that of a guzheng, as well as harmonics and 顫竹 (chàn zhú), which involves flicking the sticks lightly over the strings, causing them to vibrate, which results in a short, quick tremolo. Numerous other techniques, such as portamento - a glide from one note to another (accomplished through 2 methods, both involving the lengthening or shortening of strings: the first is by sliding the fine-tuning devices on the sides of the instrument by hand, and the second is by wearing a metallic "ring" - known as a 滑音指套 [huá yīn zhǐ tào] - and sliding it along the length of the indicated string) - are also used.
56 |The yangqin is a chromatic instrument with a range of slightly over four octaves. Middle C is located on the tenor bridge, third course from the bottom.
59 |The pitches are arranged so that in general, moving one section away from the player's body corresponds to a transposition of a whole tone upwards. Similarly, moving one section towards the left of the performer generally corresponds to a transposition of a perfect fifth upwards. These are only rules of thumb since the arrangement has to be modified towards the extremes of the pitch range to fill out notes in the chromatic scale. Such an arrangement facilitates transposition.
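The rule of thumb above can be sketched in Python to show why such a layout makes transposition easy. This is only an approximation under stated assumptions: pitches are written as MIDI note numbers (middle C = 60, a convention not used in the source), the function names are hypothetical, and the adjustments made near the extremes of the range are ignored.

```python
WHOLE_TONE = 2      # semitones gained per section away from the player
PERFECT_FIFTH = 7   # semitones gained per section towards the player's left

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def approx_pitch(base_midi, sections_away, sections_left):
    """Approximate pitch reached by moving across the layout from a base note."""
    return base_midi + WHOLE_TONE * sections_away + PERFECT_FIFTH * sections_left


def note_name(midi):
    """Name of a MIDI note number, with middle C (60) written as C4."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


middle_c = 60
print(note_name(approx_pitch(middle_c, 1, 0)))  # one section away from the player
print(note_name(approx_pitch(middle_c, 0, 1)))  # one section to the left
```

Because every step adds a fixed interval, playing the same hand pattern from a different starting section shifts every note by the same amount, which is one way to read the claim that the arrangement facilitates transposition.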
60 |In the playing of traditional Chinese music, most Chinese yangqin players use a numerical notation system called jianpu, rather than Western staff notation.
61 |The yangqin has also been modified, much like an electric guitar, to be an amplified electronic instrument.
63 |British rock band Coldplay also features the Yangqin in their 2008 single Life in Technicolor II.
89 |The kaval is a chromatic end-blown flute traditionally played throughout Azerbaijan, Turkey, Hungary, Bulgaria, Macedonia, Albania, southern Serbia (кавал), Ukraine, Moldova, northern Greece (καβάλι or τζαμάρα), Romania (caval), and Armenia (Բլուլ or blul). The kaval is primarily associated with mountain shepherds throughout the Balkans and Anatolia.
10 |Unlike the transverse flute, the kaval is fully open at both ends, and is played by blowing on the sharpened edge of one end. The kaval has 8 playing holes (7 in front and 1 in the back for the thumb) and usually four more unfingered intonation holes near the bottom of the kaval. As a wooden rim-blown flute, the kaval is similar to the ney of the Arab world. The name kaval may once have referred to various Balkan duct and rim-blown flutes, accounting for the present day diversity of the term’s usage.
11 | 12 |While typically made of wood (cornel cherry, apricot, plum, boxwood, mountain ash, etc.), kavals are also made from water buffalo horn, Arundo donax Linnaeus 1753 (Persian reed), metal and plastic.
15 |A kaval made without joints is usually mounted on a wooden holder, which protects it from warping and helps keep the interior walls oiled. According to the key, the kaval can be in the high register (C, C#), middle (D, H) or low (A, B). The kaval plays two octaves and a fifth, in the chromatic scale. Its sound is warm, melancholic and pleasant.
16 |The kaval is primarily associated with mountain shepherds throughout the Balkans and Anatolia. In the book Kaval: Traditional Folk Melodies for Balkan & Anatolian Folk Flute, musician Pat MacSwyney suggests that the kaval spread with the Yoruks from the Taurus mountains of southern Anatolia into the southern Balkans of southeast Europe.
18 |While in the past it was almost entirely a shepherd's instrument, today it is widely used in folk songs and dances as part of ensembles or solo.
19 |Unlike the transverse flute, the kaval is fully open at both ends, and is played by blowing on the sharpened edge of one end. The kaval has 8 playing holes (7 in front and 1 in the back for the thumb) and usually 4 more near the bottom of the kaval. These holes are not used for playing the instrument, but determine the lowest tone's pitch and timbre and are supposed to improve tone and intonation. In Bulgaria they are known as "devil's holes", based on a folk tale in which the devil tries to out-play a shepherd in a musical duel. While the shepherd is sleeping, the devil drills holes in the shepherd's kaval, but instead of ruining it, this only serves to enhance the shepherd's playing, thus thwarting the devil. In Macedonia they are known as "glasnici" (гласници) meaning "giving voice to/of".
21 |When played, the kaval is held with both hands at an angle of approximately 45° from the body, with four fingers of one hand covering the lower holes; the upper three holes and the thumbhole are covered with the other hand. The mouth covers ~¾ of the end. Changing the breath pressure also changes the pitch.
22 |The kaval most common in Bulgaria is the one in the middle (D) register. The kaval in the lower (C) register is also not uncommon in this country. Characteristic of the Bulgarian style of kaval performance is its great diversity of sound shades and techniques. According to pitch, four different registers can be achieved with the Bulgarian kaval. The register in which the performer plays is controlled mostly by the air flow and, to some extent, by the position of the mouth and lips on the end of the kaval. A very characteristic sound of the kaval is achieved in the lowest register. It can sound very mild and gentle if blown lightly, while changing the air stream produces a deeper (flageolet-like) sound. This sound is so distinctive that some consider it another register, which they call kaba. The technique of circular breathing is also successfully used when playing the kaval. This technique lets the performer play without interrupting the air flow, while taking a breath through the nose. In the past it was considered an extraordinary skill, while nowadays it is used by more and more young performers.
25 |The Bulgarian kaval, once made of a single piece of wood, is now constructed of three separate sections (of cornel, plum or boxwood), with a total length of 60 to 90 cm. Bone rings cover the joints, to prevent the wood from cracking. Metal decoration is also found. The finger-holes are located in the central section, while the lower (shorter) section has four additional holes called dushnitsi or dyavolski dupki (‘devil’s holes’); these are not covered in performance. The kaval can be made in various tunings, D being the most common.
26 |In the south-west Rhodope mountains, two kavals in the same tuning (called chifte kavali) are played together, one performing the melody, the other a drone. This type of kaval is made from one piece of wood. A similar use of the kaval is also known in Macedonia and Kosovo, where one kaval of the pair (usually a lower one of a same key) is ‘male’, the other ‘female’.
27 |In Turkey the term ‘kaval’ is used generally to refer to all shepherd’s pipes and more particularly (though not invariably) to ductless flutes. The presence or absence of a duct is sometimes specified by the addition of a qualification: dilsiz kaval (‘kaval without a tongue’), dilli kaval (‘kaval with a tongue’). Other qualifications may be added to describe materials, size or constructional features: kamiş kavalı (‘reed kaval’), çam kavalı (‘pine kaval’), madenı kavalı (‘metal kaval’); cura kavalı (‘small kaval’), çoban kavalı (‘shepherd’s kaval’, i.e. long kaval); üç parçalı kavalı (‘kaval with three parts’). The Turkish kaval can be made of wood, cane, bone or metal (usually brass) and has five or more finger-holes, one thumb-hole and sometimes additional unfingered holes like the Bulgarian instrument.
29 |In Thrace and some of the Aegean Islands the term ‘kavali’ refers to an end-blown flute of the flogera family. It has seven finger-holes and sometimes an additional thumb-hole. In northern Greece the term kavali is also used to denote the souravli. In Epirus the end-blown kaval is known as dzhamara.
31 |There are five types of kavals in Macedonia, according to their length and register:
33 |The smallest and the "no-name" kavals are the most used in the Macedonian music tradition.
39 |The Macedonian šupelka is similar to the kaval (open on both ends), except that it is shorter (240–350 mm). It can be made of either walnut, barberry, ash wood, maple or other wood. The šupelka plays the chromatic scale (two octaves), except the first note of the lower octave. In the low register, its sound is soft and pleasant, while in the upper register it is sharp and shrill.
40 |The svirka (or tsafara, svorche, or little kaval) is a Bulgarian shepherds' flute, consisting of one wooden tube 25 to 50 cm long with six or seven holes for fingers, and a bone lip where it is endblown. It is played much like the kaval.
43 |An open end-blown flute similar to those used by the Bashkirs and the Caucasians, called by such terms as khobyrakh, Quray and choor or shoor.
45 |A typical khobyrakh is a 70 cm-wide, smooth, hollow pipe made of an umbel (hollow stem of a big, parasol-like umbelifer) or wood, with 3 or sometimes 6 finger-holes. Nowadays, it is also made of plastic.
46 |The Romanian caval dobrogean ("Dobrujan caval") or caval bulgăresc is a similar instrument to the Bulgarian kaval. The instrument known simply as caval, however, is instead a large duct flute. It has five finger-holes arranged in groups of two and three, counting from the distal end.
48 |The Moldovan kaval is a Hungarian-Romanian instrument played in Csángó music; however, it is a fipple flute rather than end-blown like the kaval.
50 |The piccolo (Italian for "small", but named ottavino in Italy) is a half-size flute, and a member of the woodwind family of musical instruments. The modern piccolo has most of the same fingerings as its larger sibling, the standard transverse flute, but the sound it produces is an octave higher than written. This gave rise to the name ottavino (Italian for "little octave"), the name by which the instrument is referred to in the scores of Italian composers.
10 |Piccolos are now only manufactured in the key of C; however, they were once also available in D♭. It was for this D♭ piccolo that John Philip Sousa wrote the famous solo in the final repeat of the closing section (trio) of his march "The Stars and Stripes Forever".
11 |In the orchestral setting, the piccolo player is often designated as "piccolo/flute III", or even "assistant principal". The larger orchestras have designated this position as a solo position due to the demands of the literature. Piccolos are often orchestrated to double (play together with) the violins or the flutes, adding sparkle and brilliance to the overall sound because of the aforementioned one-octave transposition upwards. In concert band settings, the piccolo is almost always used and a piccolo part is almost always available.
12 |The first known use of the word piccolo was circa 1854, though the English were using the term already some thirteen years earlier.
13 | 14 |Historically, the piccolo had no keys, and should not be confused with the fife, which has a smaller bore and is therefore more strident. The piccolo is used in conjunction with marching drums in traditional formations at the Carnival of Basel, Switzerland.
17 |It is a myth that one of the earliest pieces to use the piccolo was Beethoven's Symphony No. 5 in C Minor, premiered in December 1808. Although neither Joseph Haydn nor Mozart used it in their symphonies, some of their contemporaries did, including Hoffmeister, Süssmayr and Michael Haydn. Also, Mozart used the piccolo in his opera Idomeneo. Opera orchestras in Paris sometimes included small transverse flutes at the octave as early as 1735 as existing scores by Rameau show.
18 |Although once made of various kinds of wood, glass or ivory, piccolos today are made from a range of materials, including plastic, resin, brass, nickel silver, silver, and a variety of hardwoods, most commonly grenadilla. Finely made piccolos are often available with a variety of options similar to the flute, such as the split-E mechanism. Most piccolos have a conical body with a cylindrical head, which is like the Baroque flute and later flutes before the popularization of the Boehm bore used in modern flutes. Unlike other woodwind instruments, in most wooden piccolos the tenon joint connecting the head to the body has two interference fit points which surround both the cork and metal side of the piccolo body joint.
19 |There are a number of pieces for piccolo alone, by such composers as Samuel Adler, Michael Isaacson, David Loeb, Polly Moller, Vincent Persichetti, and Karlheinz Stockhausen.
21 |Works for piccolo and piano, many of which are sonatas, have been composed by Robert Baksa, Robert Beaser, Howard J. Buss, Eugene Damare, Pierre Max Dubois, Raymond Guiot, Lowell Liebermann, Peter Schickele, Michael Daugherty, and Gary Schocker.
22 |Concertos have been composed for piccolo, including those by Lowell Liebermann, Sir Peter Maxwell Davies, Todd Goodman, Martin Amlin, Will Gay Bottje, Bruce Broughton, Valentino Bucchi, Avner Dorman, Jean Doué, Michael Easton, Egil Hovland, Guus Janssen, Daniel Pinkham and Jeff Manookian.
23 |Additionally, there is a small selection of chamber music that uses the piccolo. One example is the Quintet for Piccolo and String Quartet by Graham Waterhouse. Another is Stockhausen's Zungenspitzentanz, for piccolo and two euphoniums (or one synthesizer), with optional percussionist and dancer.
24 |The Indian santoor is an ancient string musical instrument native to Jammu and Kashmir, with origins in Persia. A primitive ancestor of this type of instrument was invented in Mesopotamia (1600-911 BC).
10 |The santoor is a trapezoid-shaped hammered dulcimer often made of walnut, with seventy two strings. The special-shaped mallets (mezrab) are lightweight and are held between the index and middle fingers. A typical santoor has two sets of bridges, providing a range of three octaves.
11 |The Indian santoor is more rectangular and can have more strings than the Persian counterpart, which generally has 72 strings.
12 | 13 |In ancient Sanskrit texts, it has been referred to as shatatantri vina (100-stringed vina). In India, the santoor was used as an accompaniment instrument to the folk music of Kashmir. It is played in a style of music known as the Sufiana Mausiqi. The Sufi mystics used it as an accompaniment to their hymns.
15 |The trapezoid framework is generally made out of either walnut or maple wood. The top and bottom boards can sometimes be either plywood or veneer. On the top board, also known as the sound board, wooden bridges are placed in order to seat the stretched metal strings. The strings, grouped in units of 3 or 4, are tied on nails or pins on the left side of the instrument and are stretched over the sound board on top of the bridges to the right side. On the right side there are steel tuning pegs, or tuning pins as they are commonly known, that allow each unit of strings to be tuned to a desired musical note, frequency or pitch.
17 |The santoor is played while sitting in an asana called ardha-padmasana position and placing it on top of the lap. While playing, the broad side is closer to the waist of the musician and the shorter side is away from the musician. It is played with a pair of light wooden mallets or hammers with both hands. The santoor is a very delicate instrument and is very sensitive to light strokes and glides. The strokes are played always on the strings either closer to the bridges or a little away from bridges. Both styles result in different tones. Sometimes strokes by one hand can be muffled by the other hand by using the face of the palm just to create variety.
19 |The tabla (or tabl, tabla) (Punjabi: ਤਬਲਾ, Hindi: तबला, Bengali: তবলা, Tamil: தபலா, Urdu: طبلہ) is a membranophone percussion instrument (similar to bongos) which is often used in Hindustani classical music and in traditional music of India, Pakistan, Afghanistan, Nepal, Bangladesh and Sri Lanka. The instrument consists of a pair of hand drums of contrasting sizes and timbres.
10 |Tabla playing is a mathematically calculated process. The right hand drum is called a tabla and the left hand drum is called a dagga or baya. It is claimed that the term tabla is derived from an Arabic word, tabl, which simply means "drum." The tabla is used in some other Asian musical traditions outside the Indian subcontinent, such as in the Indonesian dangdut genre. Playing technique involves extensive use of the fingers and palms in various configurations to create a wide variety of different sounds and rhythms, reflected in the mnemonic syllables (bol). The heel of the hand is used to apply pressure or in a sliding motion on the larger drum so that the pitch is changed during the sound's decay. There are two ways to play the tabla: band bol and khula bol. In classical music these are termed "tali" and "khali".
11 | 12 |The roots of the tabla's invention are found in India. The carvings in the Bhaja Caves in the state of Maharashtra in India show a woman playing tabla and another woman performing a dance, dating back to 200 BC. Taals have developed since the Vedic or Upanishad eras in India. As a result, Pushkar was in existence long before even the Pakhawaj. It is quite likely that an instrument resembling the tabla was in existence much earlier. It was popular during the Yadava rule (1210 to 1247) in the south, at the time when Sangeeta Ratnakara was written by Sharngadeva. There is also a myth that the tabla was invented by the Indian Sufi poet and musician Amir Khusro in the 13th century, originating from the need to have a drum that could be played from the top in the sitting position to enable the more complex rhythm structures that were required for the new Indian Sufi vocal style of chanting and Zikr. Its invention would also have complemented the complex early Sitar melodies that Amir Khusro was composing. However, none of his writings on music mention the drum. A temple known as Eklingaji in Jaipur, Rajasthan shows carvings of the tabla being played. There is recent iconography of the tabla dating back to 1799. This theory is now obsolete, with carvings found in the Bhaja Caves providing solid proof that the tabla was used in ancient India. There are Hindu temple carvings of double hand drums resembling the tabla that date back to 500 BC. The tabla was widely spread across ancient India. A Hosaleshvara temple in Karnataka shows a carving of a woman playing tabla in a dance performance.
15 |The Tabla uses a "complex finger tip and hand percussive" technique played from the top unlike the Pakhawaj and mridangam which mainly use the full palm and are sideways in motion and are more limited in terms of sound complexity.
16 |Rebecca Stewart has suggested that the tabla was most likely a product of experiments with existing drums such as pakhawaj, mridang, dholak and naqqara. The origins of tabla repertoire and technique may be found in all three and in physical structure there are also similar elements: the smaller pakhawaj head for the dayan, the naqqara kettledrum for the bayan, and the flexible use of the bass of the dholak.
17 |The smaller drum, played with the dominant hand, is sometimes called dayan (literally "right"), dāhina, siddha or chattū, but is correctly called the "tabla." It is made from a conical piece of mostly teak and rosewood hollowed out to approximately half of its total depth. The drum is tuned to a specific note, usually either the tonic, dominant or subdominant of the soloist's key and thus complements the melody. The tuning range is limited although different dāyāñs are produced in different sizes, each with a different range. Cylindrical wood blocks, known as ghatta, are inserted between the strap and the shell allowing tension to be adjusted by their vertical positioning. Fine tuning is achieved while striking vertically on the braided portion of the head using a small, heavy hammer.
21 |The larger drum, played with the other hand, is called bāyāñ (literally "left") or sometimes dagga, duggī or dhāmā. The bāyāñ has a much deeper bass tone, much like its distant cousin, the kettle drum. The bāyāñ may be made of any of a number of materials. Brass is the most common, copper is more expensive, but generally held to be the best, while aluminum and steel are often found in inexpensive models. Sometimes wood is used, especially in old bāyāñs from the Punjab. Clay is also used, although not favored for durability; these are generally found in the North-East region of Bengal.
22 |The names of the head areas are:
23 |Both drum shells are covered with a head (puri) constructed from goat or cow skin. An outer ring of skin (keenar) is overlaid on the main skin and serves to suppress some of the natural overtones. These two skins are bound together with a complex woven braid that gives the assembly enough strength to be tensioned on the shell. The head is affixed to the drum shell with a single cow or camel hide strap laced between the braid of the head assembly and another ring (made from the same strap material) placed on the bottom of the drum.
27 |The head of each drum has a central area of "tuning paste" called the syahi (lit. "ink"; a.k.a. shāī or gāb). This is constructed using multiple layers of a paste made from starch (rice or wheat) mixed with a black powder of various origins. The precise construction and shaping of this area is responsible for modification of the drum's natural overtones, resulting in the clarity of pitch (see inharmonicity) and variety of tonal possibilities unique to this instrument which has a bell-like sound. The skill required for the proper construction of this area is highly refined and is the main differentiating factor in the quality of a particular instrument.
28 |For stability while playing, each drum is positioned on a toroidal bundle called chutta or guddi, consisting of plant fiber or another malleable material wrapped in cloth.
29 |Indian music is traditionally practice-oriented and until the 20th century did not employ written notations as the primary media of instruction, understanding, or transmission. The rules of Indian music and compositions themselves are taught from a guru to a shishya, in person. Thus oral notation, such as the Tabla stroke names, is very developed and exact. However, written notation is regarded as a matter of taste and is not standardized. Thus there is no universal system of written notation for the rest of the world to study Indian music.
31 |Maula Bakhsh (born as Chole Khan in 1833) was an Indian musician, singer and poet. His grandson was Hazrat Inayat Khan, founder of Universal Sufism. He developed the "first system of notation for Indian music". He also founded the "first Academy of Music in India" in 1886, based in Baroda, which encompassed both Eastern and Western musical cultural traditions.
32 |Hindustani classical music has two standard notation systems, one designed by V. N. Bhatkhande and the other by V. D. Paluskar. These notation systems are used for Indian instruments including the tabla.
33 |Some basic strokes with dayan on right side and bayan on left side are:
35 |Some tals, for example Dhamaar, Ek, Jhoomra and Chau tals, lend themselves better to slow and medium tempos. Others flourish at faster speeds, like Jhap or Rupak talas. Trital or Teental is one of the most popular, since it is as aesthetic at slower tempos as it is at faster speeds.
46 |There are many tals in Hindustani music; some of the more popular ones are:
47 |The term gharānā is used to specify a lineage of teaching and repertoire in Indian classical music. Most performers and scholars recognize two styles of tabla gharana: Dilli Baj and Purbi Baj. Dilli (or Delhi) baj comes from the style that developed in Delhi, and Purbi (meaning eastern) baj developed in the area east of Delhi. Delhi Baj is also known as Chati baj (Chati is a part of Tabla from where special tone can be produced).
51 |Musicians then recognize six gharānās – schools or traditions – of tabla. These traditions appeared or evolved in presumably the following order:
52 |Some traditions have sub-lineages and sub-styles that may meet the criteria to warrant a separate gharānā name, but such socio-musical identities have not taken hold in the public discourse of Hindustani art music, such as the Qasur lineage of tabla players of the Punjab region.
59 |Each gharānā is traditionally set apart from the others by unique aspects of the compositional and playing styles of its exponents. For instance, some gharānās have different tabla positioning and bol techniques. In the days of court patronage the preservation of these distinctions was important in order to maintain the prestige of the sponsoring court. Gharānā secrets were closely guarded and often only passed along family lines. Being born into or marrying into a lineage holding family was often the only way to gain access to this knowledge.
60 |Today many of these gharānā distinctions have been blurred as information has been more freely shared and newer generations of players have learned and combined aspects from multiple gharānās to form their own styles. There is much debate as to whether the concept of gharānā even still applies to modern players. Some think the era of gharānā has effectively come to an end as the unique aspects of each gharānā have been mostly lost through the mixing of styles and the socio-economic difficulties of maintaining lineage purity through rigorous training.
61 |Next to the contemporary common style of tabla, there exist older styles in which the bayan (called dhama or dhamma) is often made out of wood. Instead of having a thin dry syahi, this style of tabla uses a wet wheat dough on the bass drum's skin, applied shortly before playing. These types of Jori tabla are used by qawwali ensembles (notably Dildar Hussain), as well as in the Sikh tabla gharanas, Punjabi dhrupad, gurbani kirtan, and Afghan traditional music. A reminder that this style of tabla was used all over India not long ago is that many modern brass tuning hammers still have a dough removal spatula on the reverse end.
63 |", "
" etc.
11 | #Will have no effect if the "cleanxml" annotator is used
12 | #ssplit.htmlBoundariesToDiscard = p,text
13 |
14 | #
15 | # None of these paths are necessary anymore: we load all models from the JAR file
16 | #
17 |
18 | #pos.model = /u/nlp/data/pos-tagger/wsj3t0-18-left3words/left3words-distsim-wsj-0-18.tagger
19 | ## slightly better model but much slower:
20 | ##pos.model = /u/nlp/data/pos-tagger/wsj3t0-18-bidirectional/bidirectional-distsim-wsj-0-18.tagger
21 |
22 | #ner.model.3class = /u/nlp/data/ner/goodClassifiers/all.3class.distsim.crf.ser.gz
23 | #ner.model.7class = /u/nlp/data/ner/goodClassifiers/muc.distsim.crf.ser.gz
24 | #ner.model.MISCclass = /u/nlp/data/ner/goodClassifiers/conll.distsim.crf.ser.gz
25 |
26 | #regexner.mapping = /u/nlp/data/TAC-KBP2010/sentence_extraction/type_map_clean
27 | #regexner.ignorecase = false
28 |
29 | #nfl.gazetteer = /scr/nlp/data/machine-reading/Machine_Reading_P1_Reading_Task_V2.0/data/SportsDomain/NFLScoring_UseCase/NFLgazetteer.txt
30 | #nfl.relation.model = /scr/nlp/data/ldc/LDC2009E112/Machine_Reading_P1_NFL_Scoring_Training_Data_V1.2/models/nfl_relation_model.ser
31 | #nfl.entity.model = /scr/nlp/data/ldc/LDC2009E112/Machine_Reading_P1_NFL_Scoring_Training_Data_V1.2/models/nfl_entity_model.ser
32 | #printable.relation.beam = 20
33 |
34 | #parser.model = /u/nlp/data/lexparser/englishPCFG.ser.gz
35 |
36 | #srl.verb.args=/u/kristina/srl/verbs.core_args
37 | #srl.model.cls=/u/nlp/data/srl/trainedModels/englishPCFG/cls/train.ann
38 | #srl.model.id=/u/nlp/data/srl/trainedModels/englishPCFG/id/train.ann
39 |
40 | #coref.model=/u/nlp/rte/resources/anno/coref/corefClassifierAll.March2009.ser.gz
41 | #coref.name.dir=/u/nlp/data/coref/
42 | #wordnet.dir=/u/nlp/data/wordnet/wordnet-3.0-prolog
43 |
44 | #dcoref.demonym = /scr/heeyoung/demonyms.txt
45 | #dcoref.animate = /scr/nlp/data/DekangLin-Animacy-Gender/Animacy/animate.unigrams.txt
46 | #dcoref.inanimate = 
/scr/nlp/data/DekangLin-Animacy-Gender/Animacy/inanimate.unigrams.txt
47 | #dcoref.male = /scr/nlp/data/Bergsma-Gender/male.unigrams.txt
48 | #dcoref.neutral = /scr/nlp/data/Bergsma-Gender/neutral.unigrams.txt
49 | #dcoref.female = /scr/nlp/data/Bergsma-Gender/female.unigrams.txt
50 | #dcoref.plural = /scr/nlp/data/Bergsma-Gender/plural.unigrams.txt
51 | #dcoref.singular = /scr/nlp/data/Bergsma-Gender/singular.unigrams.txt
52 |
53 |
54 | # This is the regular expression that describes which xml tags to keep
55 | # the text from. In order to on off the xml removal, add cleanxml
56 | # to the list of annotators above after "tokenize".
57 | #clean.xmltags = .*
58 | # A set of tags which will force the end of a sentence. HTML example:
59 | # you would not want to end on , but you would want to end on.
60 | # Once again, a regular expression.
61 | # (Blank means there are no sentence enders.)
62 | #clean.sentenceendingtags =
63 | # Whether or not to allow malformed xml
64 | # StanfordCoreNLP.properties
65 | #wordnet.dir=models/wordnet-3.0-prolog
66 |
--------------------------------------------------------------------------------
/stanford-corenlp-python/demoNew.py:
--------------------------------------------------------------------------------
1 | from parseNLPNew import *
2 | file = Parse("languages_a9.htm")
3 | a = file.getContent()
4 | b = Extract(a)
5 | b.getPhrases()
6 |
7 |
8 | ## todo since there are NP in VP's maybe just pick an NP that is not a subset of the vp...
9 | # or filter them out..
10 | # mehh idk
--------------------------------------------------------------------------------
/stanford-corenlp-python/files/extract.py:
--------------------------------------------------------------------------------
1 | import os, sys, re, string, ast, json
2 | from nltk import *
3 | from corenlp import *
4 | from random import randint
5 | from random import choice
6 | from bs4 import BeautifulSoup
7 | from collections import Counter
8 |
9 | class Extract(object):
10 |     def __init__(self,parse):
11 |         """Extracts all sub phrase strings from list at the given level in the parse tree.
12 |         If no phrase exists, will return None type."""
13 |         try:
14 |             self.parseList = parse["parse"]
15 |             self.raw = parse["raw"]
16 |         except:
17 |             self.parseList = []
18 |             self.raw = ""
19 |         return
20 |
21 |     def getAllSub(self,l):
22 |         a = []
23 |         for sub in l:
24 |             if type(sub) == list:
25 |                 a.append(sub)
26 |                 a += self.getAllSub(sub)
27 |         return a
28 |
29 |     def filterSub(self,l):
30 |         goodSub = []
31 |         for sub in l:
32 |             if ('NP' in sub) or ('VP' in sub):
33 |                 goodSub.append(sub)
34 |         return goodSub
35 |
36 |     def getString(self,l):
37 |         s = ''
38 |         try:
39 |             if type(l[1]) == list:
40 |                 for x in l[1:]:
41 |                     s += self.getString(x)
42 |             else:
43 |                 if l[0] != l[1]:
44 |                     s += l[1] + " "
45 |         except: pass
46 |         return s
47 |
48 |     def getPhrases(self):
49 |         if ((self.parseList == []) or (self.raw == "")): return None
50 |         allSub = self.getAllSub(self.parseList)
51 |         goodSub = self.filterSub(allSub)
52 |         res = dict()
53 |         for sub in goodSub:
54 |             pos = sub[0]
55 |             phrase = self.getString(sub)
56 |             if pos in res:
57 |                 res[pos].append(phrase)
58 |             else:
59 |                 res[pos] = [phrase]
60 |         for key in res:
61 |             res[key].sort(key = lambda s: len(s))
62 |         return res
63 |
64 |
--------------------------------------------------------------------------------
/stanford-corenlp-python/files/extractCanonical.py:
--------------------------------------------------------------------------------
1 | import os, sys, re, string, ast, json
2 | from nltk import *
3 | from corenlp import *
4 | from random import randint
5 | from random import choice
6 | from bs4 import BeautifulSoup
7 | from collections import Counter
8 |
9 | # Return largest NP VP pair for the given sentence
10 |
11 | class ExtractCanonical(object):
12 |     def __init__(self,parse,level=1):
13 |         """Extracts largest phrase strings from list at the given level
14 |         in the parse tree.
15 |         If no phrase exists, will return None type."""
16 |         try:
17 |             self.parseList = parse["parse"]
18 |             self.raw = parse["raw"]
19 |         except:
20 |             self.parseList = []
21 |             self.raw = ""
22 |         return
23 |
24 |     def getString(self,subParse,acc):
25 |         if (subParse == []): return ""
26 |         elif all((type(x) == str) for x in subParse):
27 |             try:
28 |                 return (subParse[1] + " ")
29 |             except:
30 |                 pass
31 |         for tok in subParse:
32 |             if (type(tok) == list):
33 |                 acc += self.getString(tok," ")
34 |         return acc
35 |
36 |
37 |     def getText(self):
38 |         """Returns dictionary of (POS,phrase) kv pairs."""
39 |         # Removes 'S' tag and [] end of line
40 |         if ((self.parseList == []) or (self.raw == "")): return None
41 |         self.results = dict()
42 |         innerList = []
43 |         sub = []
44 |         for stree in self.parseList[1:]:
45 |             try:
46 |                 if (stree[0] == 'S'):
47 |                     innerList += stree[1:]
48 |                 else:
49 |                     innerList.append(stree)
50 |             except:
51 |                 pass
52 |         for sub in innerList:
53 |             if (sub == []): pass
54 |             else:
55 |                 pos = sub[0]
56 |                 string = self.getString(sub[1:],"")
57 |                 string = string.replace('  ', ' ')
58 |                 try:
59 |                     if string[0] == " ":
60 |                         string = string[1:]
61 |                 except: pass
62 |                 try:
63 |                     tmp = self.results[pos]
64 |                     tmp.append(string)  # list.append returns None, so do not reassign
65 |                 except KeyError:
66 |                     self.results[pos] = [string]
67 |                 for i in sub[1:]:
68 |                     if (type(i) == list):
69 |                         for j in i:
70 |                             if "VP" in j:
71 |                                 sub_vp = self.getString(j,"")
72 |                                 sub.append(sub_vp)
73 |         return self.results
74 |
75 |
76 | def tag(s):
77 |     import nltk
78 |     text = nltk.word_tokenize(s)
79 |     return nltk.pos_tag(text)
80 |
81 | def supersense(entity):
82 |     import os
83 |     return map(lambda y: y.split("\t"),
84 |         os.popen("cd SupersenseTagger && ./run.sh <<< \"" + entity+ "\" cd ..").read().split("\n"))
85 |
86 | def get_questions(self):
87 |     z = self.getText()
88 |     (subj,vp) = (z['NP'][0], z['VP'][0])
89 |     from pattern.en import lexeme, lemma, tenses
90 |     import nltk, re
91 |     tagged = nltk.pos_tag(nltk.word_tokenize(subj + " " + vp))
92 |     verb = ""
93 |     sense = supersense(subj)
94 |     if(sense[0][2][-6:] == 'person' or sense[0][1] == 'PRP'): return ("Who " + vp + "?")
95 |     elif(sense[0][2][-4:] == 'time' or re.match("[1|2]\d\d\d", subj)): return ("When " + vp + "?")
96 |     elif(sense[0][2][-8:] == 'location' and
97 |         ('PP' in z and z['PP'][0].split()[0].lower() in ["on", "in", "at", "over", "to"])):
98 |         return ("Where " + vp + "?")
99 |     aux = ["Will","Shall","May","Might","Can","Could","Must","Should","Would","Do","Does","Did"]
100 |     for i in reversed(tagged):
101 |         if(i[1][0] == 'V'):
102 |             verb = i[0]
103 |     if((u'' + verb) in lexeme("is")):
104 |         return (verb.capitalize() + " " + subj.lower() + vp[len(verb):] + "?")
105 |     else:
106 |         for x in aux:
107 |             if(tenses(x)[0] == tenses(verb)[0]):
108 |                 return (x + " " + subj.lower() + " " + lemma(verb) + vp[len(verb):] + "?")
109 |
110 |
--------------------------------------------------------------------------------
/stanford-corenlp-python/files/parse.py:
--------------------------------------------------------------------------------
1 | import os, sys, re, string, ast, json
2 | from nltk import *
3 | from corenlp import *
4 | from random import randint
5 | from random import choice
6 | from bs4 import BeautifulSoup
7 | from collections import Counter
8 |
9 | class Parse(object):
10 |     def __init__(self,fileName,dataDir='../sampleData/languages/'):
11 |         self.fileName = fileName
12 |         self.dataDir = dataDir
13 |         self.readFile()
14 |         self.tokenize()
15 |         self.corenlp = StanfordCoreNLP()
16 |         self.getMain()
17 |         self.rem = []
18 | self.parseList = [] 19 | self.line = "" 20 | return 21 | 22 | def readFile(self): 23 | html = BeautifulSoup(open(self.dataDir+self.fileName)) 24 | self.raw = html.get_text() 25 | return 26 | 27 | def tokenize(self): 28 | try: 29 | misc = self.raw.index("See also") 30 | self.raw = self.raw[:misc] 31 | except: 32 | pass 33 | self.text = tokenize.sent_tokenize(self.raw) 34 | self.textLen = len(self.text) 35 | return 36 | 37 | def getMain(self): 38 | main = self.text[0] 39 | self.topic = Counter(main.split()).most_common(1)[0][0] 40 | return 41 | 42 | def getLine(self): 43 | found = False 44 | if self.topic == None: pass 45 | while not found: 46 | if self.text == []: 47 | self.line = self.rem[0] 48 | self.rem = self.rem[1:] 49 | return 50 | r = randint(0,max(0,len(self.text)-1)) 51 | self.line = self.text[r] 52 | if (self.topic in self.line): 53 | if (len(self.line) > 80): 54 | if ('\n' not in self.line): 55 | found = True 56 | self.text.remove(self.line) 57 | self.rem.append(self.line) 58 | return 59 | 60 | def treeToList(self): 61 | invalidChar = u'!"#%\'*+,-./:;<=>?@[\]^_`{|}~' 62 | translateTo = u'' 63 | translateTable = dict((ord(char), translateTo) for char in invalidChar) 64 | self.parse = self.parse.translate(translateTable) 65 | self.parse = self.parse.replace('(', '[') 66 | self.parse = self.parse.replace(')', ']') 67 | self.parse = self.parse.replace('] [', '], [') 68 | self.parse = re.sub(r'(\w+)', r'"\1",', self.parse) 69 | self.parse = self.parse.replace(',]', ']') 70 | self.parse = self.parse.replace(', [ ]', '') 71 | try: 72 | return ast.literal_eval(self.parse) 73 | except: 74 | return [] 75 | 76 | def getContent(self): 77 | self.getLine() 78 | try: 79 | self.coreParse = json.loads(self.corenlp.parse(self.line)) 80 | self.parse = self.coreParse['sentences'][0]['parsetree'] 81 | except: 82 | return self.getContent 83 | self.parseList = self.treeToList() 84 | if self.parseList == []: 85 | return self.getContent 86 | return {"parse":self.parseList, 
"raw":self.line} 87 | -------------------------------------------------------------------------------- /stanford-corenlp-python/parseNLPNew.py: -------------------------------------------------------------------------------- 1 | import os, sys, re, string, ast, json 2 | from nltk import * 3 | from corenlp import * 4 | from random import randint 5 | from random import choice 6 | from bs4 import BeautifulSoup 7 | from collections import Counter 8 | # fall back for if so many question self.text = [] 9 | # have another loop guard for while not found 10 | 11 | 12 | # maybe add condition for pronoun???? 13 | class Parse(object): 14 | def __init__(self,fileName,dataDir='../sampleData/languages/'): 15 | self.fileName = fileName 16 | self.dataDir = dataDir 17 | self.readFile() 18 | self.tokenize() 19 | self.corenlp = StanfordCoreNLP() 20 | self.getMain() 21 | self.rem = [] 22 | return 23 | 24 | def readFile(self): 25 | html = BeautifulSoup(open(self.dataDir+self.fileName)) 26 | self.raw = html.get_text() 27 | return 28 | 29 | def tokenize(self): 30 | try: 31 | misc = self.raw.index("See also") 32 | self.raw = self.raw[:misc] 33 | except: 34 | pass 35 | self.text = tokenize.sent_tokenize(self.raw) 36 | self.textLen = len(self.text) 37 | return 38 | 39 | def getMain(self): 40 | main = self.text[0] 41 | self.topic = Counter(main.split()).most_common(1)[0][0] 42 | return 43 | 44 | def getMain2(self): 45 | main = self.text[0] 46 | #self.text = self.text[1:] 47 | term = re.match(r'\n\n\n(.*)\n\n\n\n',main) 48 | try: 49 | self.topic = term.group(1) 50 | except: 51 | self.topic = None 52 | return 53 | 54 | def getLine(self): 55 | found = False 56 | if self.topic == None: pass # format assumption failed, need backup :( 57 | while not found: 58 | if self.text == []: 59 | self.line = self.rem[0] 60 | self.rem = self.rem[1:] 61 | return 62 | r = randint(0,max(0,len(self.text)-1)) 63 | self.line = self.text[r] 64 | # arbitrary semi-educated conditions :D 65 | if (self.topic in 
self.line): 66 | if (len(self.line) > 80): 67 | if ('\n' not in self.line): 68 | found = True 69 | self.text.remove(self.line) 70 | self.rem.append(self.line) 71 | return 72 | 73 | def treeToList(self): 74 | invalidChar = u'!"#%\'*+,-./:;<=>?@[\]^_`{|}~' 75 | translateTo = u'' 76 | translateTable = dict((ord(char), translateTo) for char in invalidChar) 77 | self.parse = self.parse.translate(translateTable) 78 | self.parse = self.parse.replace('(', '[') 79 | self.parse = self.parse.replace(')', ']') 80 | self.parse = self.parse.replace('] [', '], [') 81 | self.parse = re.sub(r'(\w+)', r'"\1",', self.parse) 82 | self.parse = self.parse.replace(',]', ']') 83 | self.parse = self.parse.replace(', [ ]', '') 84 | try: 85 | return ast.literal_eval(self.parse) 86 | except: 87 | return [] 88 | 89 | def getContent(self): 90 | self.getLine() 91 | try: 92 | self.coreParse = json.loads(self.corenlp.parse(self.line)) 93 | self.parse = self.coreParse['sentences'][0]['parsetree'] 94 | except: 95 | return self.getContent() 96 | self.parseList = self.treeToList() 97 | if self.parseList == []: 98 | return self.getContent() 99 | return {"parse":self.parseList, "raw":self.line} 100 | 101 | 102 | class Extract(object): 103 | def __init__(self,parse): 104 | """Extracts phrase strings from list at the given level in the parse tree. If provided level is invalid, will return phrases of highest order.
105 | If no phrase exists, will return None type.""" 106 | try: 107 | self.parseList = parse["parse"] 108 | self.raw = parse["raw"] 109 | except: 110 | pass 111 | return 112 | 113 | def getAllSub(self,l): 114 | a = [] 115 | for sub in l: 116 | if type(sub) == list: 117 | a.append(sub) 118 | a += self.getAllSub(sub) 119 | return a 120 | 121 | def filterSub(self,l): 122 | goodSub = [] 123 | for sub in l: 124 | if ('NP' in sub) or ('VP' in sub): 125 | goodSub.append(sub) 126 | return goodSub 127 | 128 | def getString(self,l): 129 | s = '' 130 | try: 131 | if type(l[1]) == list: 132 | for x in l[1:]: 133 | s += self.getString(x) 134 | else: 135 | if l[0] != l[1]: 136 | s += l[1] + " " 137 | except: pass 138 | return s 139 | 140 | def getPhrases(self): 141 | allSub = self.getAllSub(self.parseList) 142 | goodSub = self.filterSub(allSub) 143 | res = dict() 144 | for sub in goodSub: 145 | pos = sub[0] 146 | phrase = self.getString(sub) 147 | if pos in res: 148 | res[pos].append(phrase) 149 | else: 150 | res[pos] = [phrase] 151 | for key in res: 152 | res[key].sort(key = lambda s: len(s)) 153 | return res 154 | 155 | -------------------------------------------------------------------------------- /stanford-corenlp-python/progressbar.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | # -*- coding: iso-8859-1 -*- 3 | # 4 | # progressbar - Text progressbar library for python. 5 | # Copyright (c) 2005 Nilton Volpato 6 | # 7 | # This library is free software; you can redistribute it and/or 8 | # modify it under the terms of the GNU Lesser General Public 9 | # License as published by the Free Software Foundation; either 10 | # version 2.1 of the License, or (at your option) any later version. 11 | # 12 | # This library is distributed in the hope that it will be useful, 13 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU 15 | # Lesser General Public License for more details. 16 | # 17 | # You should have received a copy of the GNU Lesser General Public 18 | # License along with this library; if not, write to the Free Software 19 | # Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA 20 | 21 | 22 | """Text progressbar library for python. 23 | 24 | This library provides a text mode progressbar. This is typically used 25 | to display the progress of a long running operation, providing a 26 | visual clue that processing is underway. 27 | 28 | The ProgressBar class manages the progress, and the format of the line 29 | is given by a number of widgets. A widget is an object that may 30 | display diferently depending on the state of the progress. There are 31 | three types of widget: 32 | - a string, which always shows itself; 33 | - a ProgressBarWidget, which may return a diferent value every time 34 | it's update method is called; and 35 | - a ProgressBarWidgetHFill, which is like ProgressBarWidget, except it 36 | expands to fill the remaining width of the line. 37 | 38 | The progressbar module is very easy to use, yet very powerful. And 39 | automatically supports features like auto-resizing when available. 40 | """ 41 | 42 | __author__ = "Nilton Volpato" 43 | __author_email__ = "first-name dot last-name @ gmail.com" 44 | __date__ = "2006-05-07" 45 | __version__ = "2.2" 46 | 47 | # Changelog 48 | # 49 | # 2006-05-07: v2.2 fixed bug in windows 50 | # 2005-12-04: v2.1 autodetect terminal width, added start method 51 | # 2005-12-04: v2.0 everything is now a widget (wow!) 
52 | # 2005-12-03: v1.0 rewrite using widgets 53 | # 2005-06-02: v0.5 rewrite 54 | # 2004-??-??: v0.1 first version 55 | 56 | import sys 57 | import time 58 | from array import array 59 | try: 60 | from fcntl import ioctl 61 | import termios 62 | except ImportError: 63 | pass 64 | import signal 65 | 66 | 67 | class ProgressBarWidget(object): 68 | """This is an element of ProgressBar formatting. 69 | 70 | The ProgressBar object will call it's update value when an update 71 | is needed. It's size may change between call, but the results will 72 | not be good if the size changes drastically and repeatedly. 73 | """ 74 | def update(self, pbar): 75 | """Returns the string representing the widget. 76 | 77 | The parameter pbar is a reference to the calling ProgressBar, 78 | where one can access attributes of the class for knowing how 79 | the update must be made. 80 | 81 | At least this function must be overriden.""" 82 | pass 83 | 84 | 85 | class ProgressBarWidgetHFill(object): 86 | """This is a variable width element of ProgressBar formatting. 87 | 88 | The ProgressBar object will call it's update value, informing the 89 | width this object must the made. This is like TeX \\hfill, it will 90 | expand to fill the line. You can use more than one in the same 91 | line, and they will all have the same width, and together will 92 | fill the line. 93 | """ 94 | def update(self, pbar, width): 95 | """Returns the string representing the widget. 96 | 97 | The parameter pbar is a reference to the calling ProgressBar, 98 | where one can access attributes of the class for knowing how 99 | the update must be made. The parameter width is the total 100 | horizontal width the widget must have. 
101 | 102 | At least this function must be overriden.""" 103 | pass 104 | 105 | 106 | class ETA(ProgressBarWidget): 107 | "Widget for the Estimated Time of Arrival" 108 | def format_time(self, seconds): 109 | return time.strftime('%H:%M:%S', time.gmtime(seconds)) 110 | 111 | def update(self, pbar): 112 | if pbar.currval == 0: 113 | return 'ETA: --:--:--' 114 | elif pbar.finished: 115 | return 'Time: %s' % self.format_time(pbar.seconds_elapsed) 116 | else: 117 | elapsed = pbar.seconds_elapsed 118 | eta = elapsed * pbar.maxval / pbar.currval - elapsed 119 | return 'ETA: %s' % self.format_time(eta) 120 | 121 | 122 | class FileTransferSpeed(ProgressBarWidget): 123 | "Widget for showing the transfer speed (useful for file transfers)." 124 | def __init__(self): 125 | self.fmt = '%6.2f %s' 126 | self.units = ['B', 'K', 'M', 'G', 'T', 'P'] 127 | 128 | def update(self, pbar): 129 | if pbar.seconds_elapsed < 2e-6: # == 0: 130 | bps = 0.0 131 | else: 132 | bps = float(pbar.currval) / pbar.seconds_elapsed 133 | spd = bps 134 | for u in self.units: 135 | if spd < 1000: 136 | break 137 | spd /= 1000 138 | return self.fmt % (spd, u + '/s') 139 | 140 | 141 | class RotatingMarker(ProgressBarWidget): 142 | "A rotating marker for filling the bar of progress." 143 | def __init__(self, markers='|/-\\'): 144 | self.markers = markers 145 | self.curmark = -1 146 | 147 | def update(self, pbar): 148 | if pbar.finished: 149 | return self.markers[0] 150 | self.curmark = (self.curmark + 1) % len(self.markers) 151 | return self.markers[self.curmark] 152 | 153 | 154 | class Percentage(ProgressBarWidget): 155 | "Just the percentage done." 156 | def update(self, pbar): 157 | return '%3d%%' % pbar.percentage() 158 | 159 | 160 | class Fraction(ProgressBarWidget): 161 | "Just the fraction done." 162 | def update(self, pbar): 163 | return "%d/%d" % (pbar.currval, pbar.maxval) 164 | 165 | 166 | class Bar(ProgressBarWidgetHFill): 167 | "The bar of progress. It will strech to fill the line." 
168 | def __init__(self, marker='#', left='|', right='|'): 169 | self.marker = marker 170 | self.left = left 171 | self.right = right 172 | 173 | def _format_marker(self, pbar): 174 | if isinstance(self.marker, (str, unicode)): 175 | return self.marker 176 | else: 177 | return self.marker.update(pbar) 178 | 179 | def update(self, pbar, width): 180 | percent = pbar.percentage() 181 | cwidth = width - len(self.left) - len(self.right) 182 | marked_width = int(percent * cwidth / 100) 183 | m = self._format_marker(pbar) 184 | bar = (self.left + (m * marked_width).ljust(cwidth) + self.right) 185 | return bar 186 | 187 | 188 | class ReverseBar(Bar): 189 | "The reverse bar of progress, or bar of regress. :)" 190 | def update(self, pbar, width): 191 | percent = pbar.percentage() 192 | cwidth = width - len(self.left) - len(self.right) 193 | marked_width = int(percent * cwidth / 100) 194 | m = self._format_marker(pbar) 195 | bar = (self.left + (m * marked_width).rjust(cwidth) + self.right) 196 | return bar 197 | 198 | default_widgets = [Percentage(), ' ', Bar()] 199 | 200 | 201 | class ProgressBar(object): 202 | """This is the ProgressBar class, it updates and prints the bar. 203 | 204 | The term_width parameter may be an integer. Or None, in which case 205 | it will try to guess it, if it fails it will default to 80 columns. 206 | 207 | The simple use is like this: 208 | >>> pbar = ProgressBar().start() 209 | >>> for i in xrange(100): 210 | ... # do something 211 | ... pbar.update(i+1) 212 | ... 213 | >>> pbar.finish() 214 | 215 | But anything you want to do is possible (well, almost anything). 216 | You can supply different widgets of any type in any order. And you 217 | can even write your own widgets! There are many widgets already 218 | shipped and you should experiment with them. 219 | 220 | When implementing a widget update method you may access any 221 | attribute or function of the ProgressBar object calling the 222 | widget's update method. 
The most important attributes you would 223 | like to access are: 224 | - currval: current value of the progress, 0 <= currval <= maxval 225 | - maxval: maximum (and final) value of the progress 226 | - finished: True if the bar is have finished (reached 100%), False o/w 227 | - start_time: first time update() method of ProgressBar was called 228 | - seconds_elapsed: seconds elapsed since start_time 229 | - percentage(): percentage of the progress (this is a method) 230 | """ 231 | def __init__(self, maxval=100, widgets=default_widgets, term_width=None, 232 | fd=sys.stderr, force_update=False): 233 | assert maxval > 0 234 | self.maxval = maxval 235 | self.widgets = widgets 236 | self.fd = fd 237 | self.signal_set = False 238 | if term_width is None: 239 | try: 240 | self.handle_resize(None, None) 241 | signal.signal(signal.SIGWINCH, self.handle_resize) 242 | self.signal_set = True 243 | except: 244 | self.term_width = 79 245 | else: 246 | self.term_width = term_width 247 | 248 | self.currval = 0 249 | self.finished = False 250 | self.prev_percentage = -1 251 | self.start_time = None 252 | self.seconds_elapsed = 0 253 | self.force_update = force_update 254 | 255 | def handle_resize(self, signum, frame): 256 | h, w = array('h', ioctl(self.fd, termios.TIOCGWINSZ, '\0' * 8))[:2] 257 | self.term_width = w 258 | 259 | def percentage(self): 260 | "Returns the percentage of the progress." 
261 | return self.currval * 100.0 / self.maxval 262 | 263 | def _format_widgets(self): 264 | r = [] 265 | hfill_inds = [] 266 | num_hfill = 0 267 | currwidth = 0 268 | for i, w in enumerate(self.widgets): 269 | if isinstance(w, ProgressBarWidgetHFill): 270 | r.append(w) 271 | hfill_inds.append(i) 272 | num_hfill += 1 273 | elif isinstance(w, (str, unicode)): 274 | r.append(w) 275 | currwidth += len(w) 276 | else: 277 | weval = w.update(self) 278 | currwidth += len(weval) 279 | r.append(weval) 280 | for iw in hfill_inds: 281 | r[iw] = r[iw].update(self, 282 | (self.term_width - currwidth) / num_hfill) 283 | return r 284 | 285 | def _format_line(self): 286 | return ''.join(self._format_widgets()).ljust(self.term_width) 287 | 288 | def _need_update(self): 289 | if self.force_update: 290 | return True 291 | return int(self.percentage()) != int(self.prev_percentage) 292 | 293 | def reset(self): 294 | if not self.finished and self.start_time: 295 | self.finish() 296 | self.finished = False 297 | self.currval = 0 298 | self.start_time = None 299 | self.seconds_elapsed = None 300 | self.prev_percentage = None 301 | return self 302 | 303 | def update(self, value): 304 | "Updates the progress bar to a new value." 305 | assert 0 <= value <= self.maxval 306 | self.currval = value 307 | if not self._need_update() or self.finished: 308 | return 309 | if not self.start_time: 310 | self.start_time = time.time() 311 | self.seconds_elapsed = time.time() - self.start_time 312 | self.prev_percentage = self.percentage() 313 | if value != self.maxval: 314 | self.fd.write(self._format_line() + '\r') 315 | else: 316 | self.finished = True 317 | self.fd.write(self._format_line() + '\n') 318 | 319 | def start(self): 320 | """Start measuring time, and prints the bar at 0%. 321 | 322 | It returns self so you can use it like this: 323 | >>> pbar = ProgressBar().start() 324 | >>> for i in xrange(100): 325 | ... # do something 326 | ... pbar.update(i+1) 327 | ... 
328 | >>> pbar.finish() 329 | """ 330 | self.update(0) 331 | return self 332 | 333 | def finish(self): 334 | """Used to tell the progress is finished.""" 335 | self.update(self.maxval) 336 | if self.signal_set: 337 | signal.signal(signal.SIGWINCH, signal.SIG_DFL) 338 | 339 | 340 | def example1(): 341 | widgets = ['Test: ', Percentage(), ' ', Bar(marker=RotatingMarker()), 342 | ' ', ETA(), ' ', FileTransferSpeed()] 343 | pbar = ProgressBar(widgets=widgets, maxval=10000000).start() 344 | for i in range(1000000): 345 | # do something 346 | pbar.update(10 * i + 1) 347 | pbar.finish() 348 | return pbar 349 | 350 | 351 | def example2(): 352 | class CrazyFileTransferSpeed(FileTransferSpeed): 353 | "It's bigger between 45 and 80 percent" 354 | def update(self, pbar): 355 | if 45 < pbar.percentage() < 80: 356 | return 'Bigger Now ' + FileTransferSpeed.update(self, pbar) 357 | else: 358 | return FileTransferSpeed.update(self, pbar) 359 | 360 | widgets = [CrazyFileTransferSpeed(), ' <<<', 361 | Bar(), '>>> ', Percentage(), ' ', ETA()] 362 | pbar = ProgressBar(widgets=widgets, maxval=10000000) 363 | # maybe do something 364 | pbar.start() 365 | for i in range(2000000): 366 | # do something 367 | pbar.update(5 * i + 1) 368 | pbar.finish() 369 | return pbar 370 | 371 | 372 | def example3(): 373 | widgets = [Bar('>'), ' ', ETA(), ' ', ReverseBar('<')] 374 | pbar = ProgressBar(widgets=widgets, maxval=10000000).start() 375 | for i in range(1000000): 376 | # do something 377 | pbar.update(10 * i + 1) 378 | pbar.finish() 379 | return pbar 380 | 381 | 382 | def example4(): 383 | widgets = ['Test: ', Percentage(), ' ', 384 | Bar(marker='0', left='[', right=']'), 385 | ' ', ETA(), ' ', FileTransferSpeed()] 386 | pbar = ProgressBar(widgets=widgets, maxval=500) 387 | pbar.start() 388 | for i in range(100, 500 + 1, 50): 389 | time.sleep(0.2) 390 | pbar.update(i) 391 | pbar.finish() 392 | return pbar 393 | 394 | 395 | def example5(): 396 | widgets = ['Test: ', Fraction(), ' ', 
Bar(marker=RotatingMarker()), 397 | ' ', ETA(), ' ', FileTransferSpeed()] 398 | pbar = ProgressBar(widgets=widgets, maxval=10, force_update=True).start() 399 | for i in range(1, 11): 400 | # do something 401 | time.sleep(0.5) 402 | pbar.update(i) 403 | pbar.finish() 404 | return pbar 405 | 406 | 407 | def main(): 408 | example1() 409 | print 410 | example2() 411 | print 412 | example3() 413 | print 414 | example4() 415 | print 416 | example5() 417 | print 418 | 419 | if __name__ == '__main__': 420 | main() 421 | -------------------------------------------------------------------------------- /stanford-corenlp-python/simpleExtract.py: -------------------------------------------------------------------------------- 1 | import os, sys, re, string, ast, json, yaml 2 | import convert 3 | from nltk import * 4 | from corenlp import * 5 | from bs4 import BeautifulSoup 6 | from collections import Counter 7 | 8 | debug = False 9 | 10 | dataDir = '../NLP-Question-Answer-System/sampleData/' 11 | htmlFile = sys.argv[1] 12 | html = BeautifulSoup(open(dataDir+htmlFile)) 13 | raw = html.get_text() 14 | # print raw 15 | 16 | ## WRITE ACTUAL MODULE ie Parse.extractPhrase(n), Parse.getTree(raw string) etc. 17 | 18 | try: 19 | misc = raw.index("See also") 20 | raw = raw[:misc] 21 | except: 22 | pass 23 | 24 | text = tokenize.sent_tokenize(raw) 25 | 26 | 27 | 28 | corenlp = StanfordCoreNLP() 29 | for i in range(10): 30 | text0 = text[i] # filler for now 31 | if debug: print text0 32 | 33 | # NTS use json to work with unicode to match raw text 34 | parse = yaml.load(corenlp.parse(text0)) 35 | if debug: print parse 36 | 37 | try: 38 | coref = parse['coref'] # see testCoreNLP.py script 39 | except: 40 | if debug: print "No unresolved pronouns." 
41 | coref = None 42 | pass 43 | 44 | parseTree = parse['sentences'][0]['parsetree'] ### 45 | originalText = parse['sentences'][0]['text'] 46 | dependencies = parse['sentences'][0]['dependencies'] 47 | words = parse['sentences'][0]['words'] 48 | #print (' ') 49 | #print (originalText) 50 | #print(parseTree) 51 | a = convert.treeToList(parseTree) 52 | #print a 53 | 54 | print (' ') 55 | print (' ') 56 | print (' ') 57 | print (' ') 58 | while (type(a[1]) == list) and (a[1][0] != 'NP'): 59 | a = a[1] 60 | try: 61 | print(a[1]) 62 | print (' ') 63 | print (a[2]) 64 | print (' ') 65 | except: 66 | print (' ') 67 | print (' ') 68 | print (' ') 69 | 70 | # Question Gen: 71 | # use Counter to find most freq Named Entities 72 | # randint to get sentence with it, 73 | # Parse tree to extract necessary components 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | 82 | 83 | -------------------------------------------------------------------------------- /stanford-corenlp-python/testCoreNLP.py: -------------------------------------------------------------------------------- 1 | from corenlp import * 2 | import json 3 | 4 | corenlp = StanfordCoreNLP() 5 | sentence = "Bob is a dog who loves cats. Emily is a carrot and she hates bongo drums." 
6 | 7 | corefs = json.loads(corenlp.parse(sentence))["coref"] 8 | print sentence 9 | 10 | print corefs 11 | 12 | corefDict = dict() 13 | for references in corefs: 14 | coreferent = references[0][1][0] 15 | refs = [] 16 | for tuple in references: 17 | pronoun = tuple[0][0] 18 | sentenceNum = tuple[0][1] 19 | refs.append((pronoun, sentenceNum)) 20 | corefDict[coreferent] = refs 21 | 22 | print corefDict 23 | 24 | -------------------------------------------------------------------------------- /stanford-corenlp-python/untitled: -------------------------------------------------------------------------------- 1 | import os, sys, re, string, ast, json, yaml 2 | import convert 3 | from nltk import * 4 | from corenlp import * 5 | from random import randint 6 | from bs4 import BeautifulSoup 7 | from collections import Counter 8 | 9 | debug = False 10 | 11 | class Parse(object): 12 | def __init__(self,fileName,textRange=15): 13 | self.fileName = fileName 14 | self.range = textRange+1 # lines of text to search over, +1 for range offset 15 | readFile() 16 | tokenize() 17 | self.corenlp = StanfordCoreNLP() 18 | parse() 19 | return 20 | 21 | def readFile(self): 22 | dataDir = '../NLP-Question-Answer-System/sampleData/' 23 | htmlFile = self.fileName 24 | html = BeautifulSoup(open(dataDir+htmlFile)) 25 | self.raw = html.get_text() 26 | return 27 | 28 | def tokenize(self): 29 | try: 30 | misc = self.raw.index("See also") 31 | self.raw = self.raw[:misc] 32 | except: 33 | pass 34 | self.text = tokenize.sent_tokenize(self.raw) 35 | return 36 | 37 | 38 | 39 | def parse(self): 40 | self.parsedText = [] 41 | for line in self.text: 42 | self.parsedText.append(json.load(self.corenlp.parse(line))) 43 | self.textLen = len(self.parsedText) 44 | return 45 | 46 | def getNERCounts(self,line): pass 47 | sentence = line['sentences'][0]['words'] 48 | for token in sentence: 49 | word = token[0] 50 | info = token[1] 51 | if info['NamedEntityTag'] != 'O': 52 | self.NEcounts[word] += 1 53 | return 54 | 
55 | def getTopicSentence(self,topicNE,startInd): 56 | self.topicInd = -1 #get sentence with topic in it 57 | ind = startInd 58 | while (self.topicInd < 0): 59 | if topicNE in self.text[ind]: 60 | self.topicInd = ind 61 | else: ind += 1 62 | return 63 | 64 | def treeToList(self,parseTree): 65 | validChar = string.ascii_letters + string.digits + "() " 66 | removeInvalidChar = all.translate(string.maketrans('',''), validChar) 67 | parseTree = parseTree.translate(all, removeInvalidChar) 68 | parseTree = parseTree.replace('(', '[') 69 | parseTree = parseTree.replace(')', ']') 70 | parseTree = parseTree.replace('] [', '], [') 71 | parseTree = parseTree.replace('[]', '') 72 | parseTree = re.sub(r'(\w+)', r'"\1",', parseTree) 73 | return ast.literal_eval(parseTree) 74 | 75 | def decomposePhrase(self): 76 | while (type(phrase) == list) and (phase[1][0] != 'NP'): 77 | phrase = phrase[1] 78 | print(phrase[1],phrase[3]) 79 | #FINISH LATER IF NEEDED 80 | return 81 | 82 | def getContent(self,line): 83 | # assumption, python PRNG goodish 84 | #assumption, something worthwhile ever 15 sentences 85 | lineFound = false 86 | startInd = randint(0,self.textLen-self.textRange) 87 | self.NEcounts = Counter() 88 | for lineInd in xrange(startInd,startInd+self.textRange): 89 | getNECounts(self.parsedText[lineInd]) 90 | uniqueNE = len(list(self.NEcounts)) 91 | topNE = uniqueNE // 4 92 | mostCommonNE = [k for (k,v) in Counter(words).most_common(topNE)] 93 | 94 | while not lineFound: 95 | topicNE = mostCommonNE[randint(0,topNE)] 96 | getTopicSentence(self,topicNE,startInd) 97 | parseTree = self.parsedText[self.topicInd]['sentences'][0]['parsetree'] 98 | rawSentence = self.parsedText[self.topicInd]['sentences'][0]['text'] 99 | self.parseTree = convert.treeToList(self,parseTree) 100 | phrase = self.parseTree[1] 101 | if len(rawSentence) > 6: 102 | lineFound = True 103 | 104 | #(phraseNP,phraseVP) = decomposePhrase(self) 105 | #return raw line with parse, NP pharse, VP phrase 106 | #return 
(phrase,rawSentence,phraseNP,phraseVP) 107 | return (phrase,rawSentence) 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 | 121 | 122 | 123 | 124 | -------------------------------------------------------------------------------- /stanford-corenlp-python/v1_modules/demo1.py: -------------------------------------------------------------------------------- 1 | # Note to self: MUST deal with non-ascii; too limiting in content and time consuming to restart search each time 2 | # Note to self: CAN find way to clean up raw input and optimize for given htmls 3 | # currently removing index but should also remove charts, examples, etc. 4 | 5 | 6 | 7 | 8 | # Demo for parse & extract modules 9 | # Dependencies: 10 | # convert.py parseNLP.py extractNLP.py 11 | # os, sys, re, string, ast, json 12 | # nltk, corenlp, random, bs4, collections 13 | 14 | from parseNLP import Parse 15 | from extractNLP import Extract 16 | 17 | file = Parse("languages_a1.htm") 18 | 19 | parsed = file.getContent() 20 | word = "hello" 21 | parsed.wordNE[word] # returns Named Entity of 'hello' 22 | # tuple of parse tree as list and raw string 23 | # Example: 24 | # (['S', ['NP', ['DT', 'The'], ['NN', 'palace']], ['VP', ['VBD', 'was'], ['NP', ['NP', ['DT', 'an'], ['NN', 'act']], ['PP', ['IN', 'of'], ['NP', ['NN', 'charity']]]], ['PP', ['IN', 'by'], ['NP', ['NP', ['DT', 'the'], ['NNP', 'Sultan']], ['SBAR', ['WHNP', ['WP', 'who']], ['S', ['VP', ['VBD', 'wanted'], ['S', ['VP', ['TO', 'to'], ['VP', ['VB', 'help'], ['NP', ['DT', 'the'], ['JJ', 'poor']], ['PP', ['IN', 'in'], ['NP', ['NP', ['DT', 'the'], ['JJ', 'neighbouring'], ['NNS', 'areas']], ['PP', ['IN', 'of'], ['NP', ['NNP', 'Pune']]], [], ['SBAR', ['WHNP', ['WP', 'who']], ['S', ['VP', ['VBD', 'were'], ['ADVP', ['RB', 'drastically']], ['VP', ['VBN', 'hit'], ['PP', ['IN', 'by'], ['NP', ['NN', 'famine']]]]]]]]]]]]]]]]]], []], u'The palace was an act of charity by the Sultan who wanted to help the poor in the neighbouring areas of 
Pune, who were drastically hit by famine.') 25 | 26 | phrases = Extract(parsed) 27 | 28 | phraseDict = phrases.getText() 29 | # dictionary with (key: POS tag, value: raw string associated) 30 | # Example: 31 | # {'NP': ['The palace '], 'VP': ['was an act of charity by the Sultan who wanted to help the poor in the neighbouring areas of Pune who were drastically hit by famine ']} 32 | # NOTE: OUTPUT MAY BE NONE: INPUT IS A FRAGMENT NOT SENTENCE: NEED TO GET NEW SENTENCE 33 | # I should handle it for you but check in case... 34 | -------------------------------------------------------------------------------- /stanford-corenlp-python/v1_modules/extractNLP1.py: -------------------------------------------------------------------------------- 1 | # REQUIRES: input is Parse List 2 | 3 | class Extract(object): 4 | def __init__(self,parse,level=1): 5 | """Extracts phrase strings from list at the given level is the parse tree. If provided level is invalid, will return phrases of highest order. 6 | If no phrase exists, will return None type.""" 7 | (self.parseList,self.raw) = parse 8 | self.level = level 9 | # self.getText() 10 | # if ((self.text == None) and self.level > 1): 11 | # print("No phrases at level %d, retrieving for level 1...", level) 12 | # self.level = 1 13 | # self.getText() 14 | # if (self.text == None): 15 | # print("No phrases in sentence.") 16 | return 17 | 18 | def getString(self,subParse,acc): 19 | if (subParse == []): return "" 20 | elif all((type(x) == str) for x in subParse): 21 | try: 22 | return (subParse[1] + " ") 23 | except: 24 | print subParse 25 | print "Something Failed :(" 26 | for tok in subParse: 27 | if (type(tok) == list): 28 | acc += self.getString(tok," ") 29 | return acc 30 | 31 | 32 | def getText(self): 33 | """Returns dictionay of (POS,phrase) kv pairs.""" 34 | ## ONLY IMPLEMENTED FOR LEVEL 1 NOW... 35 | print("Retrieving phrases at level %d..." 
% self.level) 36 | # if ('S' not in self.parseList): 37 | # print("This is a fragment, try again.") 38 | # return None 39 | 40 | # # Removes 'S' tag and [] end of line 41 | self.results = dict() 42 | innerList = [] 43 | for stree in self.parseList[1:-1]: 44 | try: 45 | if (stree[0] == 'S'): 46 | innerList += stree[1:] 47 | else: 48 | innerList.append(stree) 49 | except: 50 | # print stree 51 | # Random [] in sentences... IDK why 52 | # also random ['POS'] ohh is punctuation like '.' 53 | print "Empty.. ignore" 54 | for sub in innerList: 55 | if (sub == []): pass 56 | else: 57 | pos = sub[0] 58 | string = self.getString(sub[1:],"") 59 | string = string.replace('  ', ' ') 60 | try: 61 | if string[0] == " ": 62 | string = string[1:] 63 | except: pass 64 | try: 65 | tmp = self.results[pos] 66 | tmp.append(string) 67 | except: 68 | self.results[pos] = [string] 69 | return self.results 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | -------------------------------------------------------------------------------- /stanford-corenlp-python/v1_modules/parseNLP1.py: -------------------------------------------------------------------------------- 1 | # Load in dependencies 2 | import os, sys, re, string, ast, json 3 | from nltk import * 4 | from corenlp import * 5 | from random import randint 6 | from bs4 import BeautifulSoup 7 | from collections import Counter 8 | 9 | debug = False 10 | # non deterministically find important content 11 | 12 | class Parse(object): 13 | # textRange 15 14 | def __init__(self,fileName,dataDir='../sampleData/languages/',textRange=5,ignoreUnicode=True): 15 | """Parse(fileName,dataDir='../NLP-Question-Answer-System/sampleData/languages/',textRange=5)""" 16 | self.fileName = fileName 17 | self.dataDir = dataDir 18 | self.textRange = textRange+1 # lines of text to search over, +1 for range offset 19 | self.ignoreUnicode = ignoreUnicode 20 | print("Reading file..."),
        self.readFile()
        print("OK!")
        print("Tokenizing file..."),
        self.tokenize()
        print("OK!")
        print("Starting StanfordCoreNLP..."),
        self.corenlp = StanfordCoreNLP()
        print("OK!")
        return

    def readFile(self):
        """Load in html file and extract raw text."""
        try:
            html = BeautifulSoup(open(self.dataDir+self.fileName))
            self.raw = html.get_text()
        except:
            raise Exception("Could not read file. Check that the file name and directory are correct. " +
                            "The file extension should be .htm.")
        return

    def tokenize(self):
        """Tokenize raw text to prepare for parsing."""
        # Drop everything from the "See also" section onward, if present
        try:
            misc = self.raw.index("See also")
            self.raw = self.raw[:misc]
        except:
            pass
        self.text = tokenize.sent_tokenize(self.raw)
        self.textLen = len(self.text)
        return

    def getNECounts(self,line):
        """Collect and store counts of named entities within the sentence."""
        try:
            sentence = line[0]['words']
            for token in sentence:
                word = token[0]
                info = token[1]
                if info['NamedEntityTag'] != 'O':
                    self.NEcounts[word] += 1
        except:
            raise Exception ("Failed coreNLP parse. \n Text: ", line)
        return

    def getTopicSentence(self,topicNE):
        """Select a sentence with a given named entity."""
        self.topicInd = -1 #get sentence with topic in it
        ind = 0
        # Stop at the end of the parsed block; topicInd stays -1 if no
        # sentence contains the topic (previously this looped forever)
        while ((self.topicInd < 0) and (ind < len(self.parsedText))):
            try:
                tmp = self.parsedText[ind]['sentences'][0]['text']
                if topicNE in tmp:
                    self.topicInd = ind
                else: ind += 1
            except:
                ind += 1
        return

    def treeToList(self,parseTree):
        """Convert parse tree from string to nested array"""
        validChar = string.ascii_letters + string.digits + "() "
        # Keep non-punctuation characters plus parentheses and spaces
        filterChars = (lambda x: ((x not in string.punctuation) or
                                  (x in "() ")))
        parseTree = ''.join(filter(filterChars, parseTree))
        parseTree = parseTree.replace('(', '[')
        parseTree = parseTree.replace(')', ']')
        parseTree = parseTree.replace('] [', '], [')
        parseTree = parseTree.replace('[]', '')
        parseTree = re.sub(r'(\w+)', r'"\1",', parseTree)
        return ast.literal_eval(parseTree)

    def selectLine(self):
        (lineFound,attempts) = (False,0)
        uniqueNE = len(list(self.NEcounts))
        topNE = uniqueNE // 4
        mostCommonNE = [k for (k,v) in self.NEcounts.most_common(topNE)]
        topicNE = None

        while ((not lineFound) and (attempts < 5)):
            while (topicNE is None):
                try:
                    topicNE = mostCommonNE[randint(0,max(topNE-1,1))]
                except: pass
            #print topicNE
            #print "here"
            self.getTopicSentence(topicNE)
            #print "not here"
            try:
                parseTree = self.parsedText[self.topicInd]['sentences'][0]['parsetree']
                rawSentence = self.parsedText[self.topicInd]['sentences'][0]['text']
            except:
                raise Exception ("Invalid parse. Could not decode results.")
            #print "at A"
            self.parseTree = self.treeToList(parseTree)
            selectedPhrase = self.parseTree[1] # ROOT extracted
            # try:
            #     if selectedPhrase[0] == 'FRAG':
            #         print("This is a fragment. Ignoring...")
            #         self.getContent()
            # except: pass

            #print "at B"
            #print selectedPhrase
            #print rawSentence
            if len(rawSentence.split()) > 6:
                lineFound = True
            attempts += 1
        # The loop exits with attempts <= 5, so test lineFound directly
        if (not lineFound):
            print "Timeout finding line with good content. Trying new block..."
            self.getContent()
            return # keep the result chosen by the new block
        #print rawSentence
        # As written, the empty literal matches every sentence
        if ((unicode("") in rawSentence) and self.ignoreUnicode):
            print "Sentence had non-ascii. Ignoring..."
            self.getContent()
            return # keep the result chosen by the new block
        self.result = (selectedPhrase,rawSentence)

        return

    def getContent(self):
        """Return sentence and corresponding parse tree with important content."""
        # Assumption: python PRNG goodish
        # there is some worthwhile content every 15 sentences
        startInd = randint(0,self.textLen-self.textRange)
        self.parsedText = []
        self.NEcounts = Counter()
        print ("Parsing lines: "),
        for lineInd in xrange(startInd,startInd+self.textRange):
            print lineInd,
            line = self.text[lineInd]
            if isinstance(line, unicode):
                strLine = line.encode('ascii', 'xmlcharrefreplace')
                try:
                    res = json.loads(self.corenlp.parse(strLine))
                    self.parsedText.append(res)
                    self.getNECounts(res['sentences'])
                except: pass
            else: raise Exception ("Invalid encoding for text file.")

        print "Block Complete."
        self.selectLine()
        return self.result

--------------------------------------------------------------------------------
/testCoreNLP.py:
--------------------------------------------------------------------------------
from corenlp import *
import json

corenlp = StanfordCoreNLP()
sentence = "Bob is a dog who loves cats. Emily is a carrot and she hates bongo drums."

corefs = json.loads(corenlp.parse(sentence))["coref"]
print sentence

print corefs

# Map each coreferent to the (mention, sentence number) pairs that refer to it
corefDict = dict()
for references in corefs:
    coreferent = references[0][1][0]
    refs = []
    for pair in references:  # renamed from `tuple` to avoid shadowing the builtin
        pronoun = pair[0][0]
        sentenceNum = pair[0][1]
        refs.append((pronoun, sentenceNum))
    corefDict[coreferent] = refs

print corefDict

--------------------------------------------------------------------------------
/yesno.py:
--------------------------------------------------------------------------------
import sys
import nltk.data
import nltk

def answeryesno(article, question):
    prev = "no"
    questionstr = ' '.join(question)
    questionstr = questionstr.lower()
    question = nltk.pos_tag(question)
    answer = "no"
    keyword = ""
    # Use the last noun in the question as the search keyword
    for (word,pos) in question:
        if (pos == 'NN' or pos == 'NNS' or pos == 'NNP' or pos == 'NNPS'):
            keyword = word.lower()
    answer = "no"
    for sentence in article:
        # print sentence
        if answer == "yes":
            break
        s = nltk.word_tokenize(sentence.lower())
        if keyword in s:
            #print sentence
            answer = "yes"
            for (word,pos) in question:
                if answer == 'no':
                    break
                # A content word missing from the sentence rejects the match
                if (pos != '.') and (word.lower() not in s) and (pos != 'DT') and (word != 'does') and (word != 'do'):
                    answer = 'no'
                    #print word, pos
                    if pos[0] == 'V':
                        # Verbs may still match on their lemma
                        tempword = nltk.stem.wordnet.WordNetLemmatizer().lemmatize(word,'v')
                        for (w,p) in nltk.pos_tag(s):
                            if p[0] == 'V':
                                tempword2 = nltk.stem.wordnet.WordNetLemmatizer().lemmatize(w,'v')
                                if tempword == tempword2:
                                    answer = 'yes'
                    elif word in article[0]:
                        answer = "yes"
                # A negation right after a verb flips the answer
                if prev == "yes":
                    if (word == "no" or word =="not"):
                        answer = "no"
                if pos[0] == 'V':
                    prev = "yes"
                else:
                    prev = "no"

    #print questionstr,answer
    print answer
--------------------------------------------------------------------------------
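The keyword-matching idea in yesno.py can be tried without NLTK or CoreNLP installed. The following is a minimal sketch, assuming Python 3 and a plain regex tokenizer in place of the POS tagger and WordNet lemmatizer; it is a simplified variant, not the repository's implementation. The names `tokenize`, `answer_yes_no`, `IGNORED`, and `NEGATIONS` are hypothetical and do not appear in the original code.

```python
import re

# Hypothetical word lists (assumptions for this sketch): yesno.py instead
# relies on POS tags to skip determiners and on "do"/"does" checks.
IGNORED = {"do", "does", "did", "is", "are", "was", "were", "a", "an", "the"}
NEGATIONS = {"no", "not", "never"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def answer_yes_no(article_sentences, question):
    """Return "yes" if some sentence contains every content word of the
    question and no negation word; otherwise return "no"."""
    content = [w for w in tokenize(question) if w not in IGNORED]
    if not content:
        return "no"
    for sentence in article_sentences:
        words = set(tokenize(sentence))
        if all(w in words for w in content) and not (words & NEGATIONS):
            return "yes"
    return "no"


article = [
    "Python is a programming language.",
    "Swift is not a scripting language.",
]
print(answer_yes_no(article, "Is Python a programming language?"))
print(answer_yes_no(article, "Is Swift a scripting language?"))
```

Unlike yesno.py, this sketch does no lemmatization, so "love" in a question will not match "loves" in the article; swapping the tokenizer for a lemmatizing one is the natural next step.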