├── examples
    ├── animation.gif
    ├── wordmesh_law.png
    ├── wordmesh_risk.png
    ├── Belgium-Brazil.png
    ├── wordmesh_trump.png
    ├── Jobs-speech-scores.png
    ├── Jobs-speech-cooccurence.png
    ├── Belgium-Brazil-cooccurence.png
    ├── trump_hillary_debate_adj.png
    ├── Jobs-speech-cooccurence-demo.png
    ├── Belgium-Brazil.txt
    ├── Jobs-speech.txt
    └── trump_hillary_debate.txt
├── requirements.txt
├── test
    ├── __init__.py
    ├── test_wordmesh.py
    ├── sample_text.txt
    └── sample_speech.txt
├── wordmesh
    ├── __init__.py
    ├── stopwords.txt
    ├── force_directed_model.py
    ├── text_processing.py
    ├── utils.py
    └── static_wordmesh.py
├── setup.py
├── LICENSE
├── .gitignore
└── README.md

/examples/animation.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/animation.gif

--------------------------------------------------------------------------------
/examples/wordmesh_law.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/wordmesh_law.png

--------------------------------------------------------------------------------
/examples/wordmesh_risk.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/wordmesh_risk.png

--------------------------------------------------------------------------------
/examples/Belgium-Brazil.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/Belgium-Brazil.png

--------------------------------------------------------------------------------
/examples/wordmesh_trump.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/wordmesh_trump.png

--------------------------------------------------------------------------------
/examples/Jobs-speech-scores.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/Jobs-speech-scores.png

--------------------------------------------------------------------------------
/examples/Jobs-speech-cooccurence.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/Jobs-speech-cooccurence.png

--------------------------------------------------------------------------------
/examples/Belgium-Brazil-cooccurence.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/Belgium-Brazil-cooccurence.png

--------------------------------------------------------------------------------
/examples/trump_hillary_debate_adj.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/trump_hillary_debate_adj.png

--------------------------------------------------------------------------------
/examples/Jobs-speech-cooccurence-demo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mukund109/word-mesh/HEAD/examples/Jobs-speech-cooccurence-demo.png

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | textacy==0.6.1
2 | plotly==2.0.11
3 | numpy==1.12.0
4 | colorlover==0.2.1
5 | pandas==0.19.2
6 | scipy==0.18.1
7 | scikit_learn==0.19.1
8 |

--------------------------------------------------------------------------------
/test/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Sun May 27 20:38:27 2018
5 |
6 | @author: mukund
7 | """
8 |
9 |

--------------------------------------------------------------------------------
/wordmesh/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Sun May 27 20:38:01 2018
5 |
6 | @author: mukund
7 | """
8 | __version__ = '0.1.0b2'
9 |
10 | from .static_wordmesh import Wordmesh, LabelledWordmesh
11 |
12 | from .text_processing import STOPWORDS
13 |
14 | from .force_directed_model import ForceDirectedModel
15 |

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | import setuptools
2 | import re
3 | import io
4 |
5 | __version__ = re.search(
6 |     r'__version__\s*=\s*[\'"]([^\'"]*)[\'"]',  # It excludes inline comment too
7 |     io.open('wordmesh/__init__.py', encoding='utf_8').read()
8 |     ).group(1)
9 |
10 | setuptools.setup(
11 |     name="wordmesh",
12 |     version=__version__,
13 |     author="Mukund Chaudhry",
14 |     author_email="mukund.chaudhry@gmail.com",
15 |     description="A wordcloud generator which allows for meaningful word clustering",
16 |     license = 'MIT',
17 |     url="https://github.com/mukund109/word-mesh",
18 |     install_requires=['textacy>=0.6.1', 'plotly>=2.0.11', 'numpy>=1.12.0', 'colorlover==0.2.1', 'pandas>=0.19.2', 'scipy>=0.18.1', 'scikit_learn'],
19 |     python_requires='>=3',
20 |     packages=['wordmesh'],
21 |     package_data={'wordmesh': ['stopwords.txt']}
22 |
23 | )
24 |

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 mukund109
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |

--------------------------------------------------------------------------------
/test/test_wordmesh.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Sun May 27 19:22:42 2018
5 |
6 | @author: mukund
7 | """
8 | import sys
9 | import os
10 |
11 | sys.path.append(os.path.join(os.path.dirname(os.getcwd()), 'wordmesh'))
12 |
13 | from wordmesh import Wordmesh, LabelledWordmesh
14 | import unittest
15 |
16 | with open('sample_text.txt') as f:
17 |     test_text = f.read()
18 |
19 |
20 | class TestWordmesh(unittest.TestCase):
21 |
22 |     def test_default_constructor(self):
23 |         wm = Wordmesh(test_text)
24 |         self.assertEqual(['stories', 'hall', 'hours', 'dystopias', 'close',
25 |                           'new mother', 'mrs', 'feel', 'sexuality',
26 |                           'eliciting deep', 'indulgence despite', 'woman',
27 |                           'turning', 'wild', 'goes', 'crafting compelling'],
28 |                          wm.keywords)
29 |
30 |     def test_lemmatized_find_all(self):
31 |         """
32 |         test if ALL the normalised_keywords can be found in the normalized_text
33 |         """
34 |         pass
35 |
36 | if __name__ == '__main__':
37 |     unittest.main()
38 |

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 |
2 | # Created by https://www.gitignore.io/api/python
3 |
4 | ### Python ###
5 | # Byte-compiled / optimized / DLL files
6 | __pycache__/
7 | *.py[cod]
8 | *$py.class
9 |
10 | # C extensions
11 | *.so
12 |
13 | # Distribution / packaging
14 | .Python
15 | build/
16 | develop-eggs/
17 | dist/
18 | downloads/
19 | eggs/
20 | .eggs/
21 | lib/
22 | lib64/
23 | parts/
24 | sdist/
25 | var/
26 | wheels/
27 | *.egg-info/
28 | .installed.cfg
29 | *.egg
30 |
31 | # PyInstaller
32 | # Usually these files are written by a python script from a template
33 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
34 | *.manifest 35 | *.spec 36 | 37 | # Installer logs 38 | pip-log.txt 39 | pip-delete-this-directory.txt 40 | 41 | # Unit test / coverage reports 42 | htmlcov/ 43 | .tox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | .pytest_cache/ 48 | nosetests.xml 49 | coverage.xml 50 | *.cover 51 | .hypothesis/ 52 | 53 | # Translations 54 | *.mo 55 | *.pot 56 | 57 | # Flask stuff: 58 | instance/ 59 | .webassets-cache 60 | 61 | # Scrapy stuff: 62 | .scrapy 63 | 64 | # Sphinx documentation 65 | docs/_build/ 66 | 67 | # PyBuilder 68 | target/ 69 | 70 | # Jupyter Notebook 71 | .ipynb_checkpoints 72 | 73 | # pyenv 74 | .python-version 75 | 76 | # celery beat schedule file 77 | celerybeat-schedule.* 78 | 79 | # SageMath parsed files 80 | *.sage.py 81 | 82 | # Environments 83 | .env 84 | .venv 85 | env/ 86 | venv/ 87 | ENV/ 88 | env.bak/ 89 | venv.bak/ 90 | 91 | # Spyder project settings 92 | .spyderproject 93 | .spyproject 94 | 95 | # Rope project settings 96 | .ropeproject 97 | 98 | # mkdocs documentation 99 | /site 100 | 101 | # mypy 102 | .mypy_cache/ 103 | 104 | # NLTK resource 105 | wordmesh/tokenizers 106 | 107 | # TODO file 108 | TODO 109 | 110 | #for experimentation 111 | examples/Untitled.ipynb 112 | wordmesh/utils_legacy.py 113 | 114 | # Temporary plotly plots 115 | temp-plot.html 116 | wordmesh.html 117 | 118 | 119 | # End of https://www.gitignore.io/api/python 120 | -------------------------------------------------------------------------------- /wordmesh/stopwords.txt: -------------------------------------------------------------------------------- 1 | a 2 | about 3 | above 4 | after 5 | again 6 | against 7 | all 8 | also 9 | am 10 | an 11 | and 12 | any 13 | are 14 | aren't 15 | as 16 | at 17 | be 18 | because 19 | been 20 | before 21 | being 22 | below 23 | between 24 | both 25 | but 26 | by 27 | can 28 | can't 29 | cannot 30 | com 31 | could 32 | couldn't 33 | did 34 | didn't 35 | do 36 | does 37 | doesn't 38 | doing 39 | don't 40 | down 41 | during 42 | each 43 
| else 44 | ever 45 | few 46 | for 47 | from 48 | further 49 | get 50 | had 51 | hadn't 52 | has 53 | hasn't 54 | have 55 | haven't 56 | having 57 | he 58 | he'd 59 | he'll 60 | he's 61 | her 62 | here 63 | here's 64 | hers 65 | herself 66 | him 67 | himself 68 | his 69 | how 70 | however 71 | how's 72 | http 73 | i 74 | i'd 75 | i'll 76 | i'm 77 | i've 78 | if 79 | in 80 | into 81 | is 82 | isn't 83 | it 84 | it's 85 | its 86 | itself 87 | just 88 | k 89 | let's 90 | like 91 | me 92 | more 93 | most 94 | mustn't 95 | my 96 | myself 97 | no 98 | nor 99 | not 100 | of 101 | off 102 | on 103 | once 104 | only 105 | or 106 | other 107 | otherwise 108 | ought 109 | our 110 | ours 111 | ourselves 112 | out 113 | over 114 | own 115 | r 116 | same 117 | shall 118 | shan't 119 | she 120 | she'd 121 | she'll 122 | she's 123 | should 124 | shouldn't 125 | since 126 | so 127 | some 128 | such 129 | than 130 | that 131 | that's 132 | the 133 | their 134 | theirs 135 | them 136 | themselves 137 | then 138 | there 139 | there's 140 | these 141 | they 142 | they'd 143 | they'll 144 | they're 145 | they've 146 | this 147 | those 148 | through 149 | to 150 | too 151 | under 152 | until 153 | up 154 | very 155 | was 156 | wasn't 157 | we 158 | we'd 159 | we'll 160 | we're 161 | we've 162 | were 163 | weren't 164 | what 165 | what's 166 | when 167 | when's 168 | where 169 | where's 170 | which 171 | while 172 | who 173 | who's 174 | whom 175 | why 176 | why's 177 | with 178 | won't 179 | would 180 | wouldn't 181 | www 182 | you 183 | you'd 184 | you'll 185 | you're 186 | you've 187 | your 188 | yours 189 | yourself 190 | yourselves 191 | -------------------------------------------------------------------------------- /test/sample_text.txt: -------------------------------------------------------------------------------- 1 | Law is a word that means different things at different times. 
Black’s Law Dictionary says that law is “a body of rules of action or conduct prescribed by controlling authority, and having binding legal force. That which must be obeyed and followed by citizens subject to sanctions or legal consequence is a law.”Black’s Law Dictionary, 6th ed., s.v. “law.” 2 | Functions of the Law 3 | 4 | In a nation, the law can serve to (1) keep the peace, (2) maintain the status quo, (3) preserve individual rights, (4) protect minorities against majorities, (5) promote social justice, and (6) provide for orderly social change. Some legal systems serve these purposes better than others. Although a nation ruled by an authoritarian government may keep the peace and maintain the status quo, it may also oppress minorities or political opponents (e.g., Burma, Zimbabwe, or Iraq under Saddam Hussein). Under colonialism, European nations often imposed peace in countries whose borders were somewhat arbitrarily created by those same European nations. Over several centuries prior to the twentieth century, empires were built by Spain, Portugal, Britain, Holland, France, Germany, Belgium, and Italy. With regard to the functions of the law, the empire may have kept the peace—largely with force—but it changed the status quo and seldom promoted the native peoples’ rights or social justice within the colonized nation. 5 | 6 | In nations that were former colonies of European nations, various ethnic and tribal factions have frequently made it difficult for a single, united government to rule effectively. In Rwanda, for example, power struggles between Hutus and Tutsis resulted in genocide of the Tutsi minority. (Genocide is the deliberate and systematic killing or displacement of one group of people by another group. In 1948, the international community formally condemned the crime of genocide.) In nations of the former Soviet Union, the withdrawal of a central power created power vacuums that were exploited by ethnic leaders. 
When Yugoslavia broke up, the different ethnic groups—Croats, Bosnians, and Serbians—fought bitterly for home turf rather than share power. In Iraq and Afghanistan, the effective blending of different groups of families, tribes, sects, and ethnic groups into a national governing body that shares power remains to be seen. 7 | Law and Politics 8 | 9 | In the United States, legislators, judges, administrative agencies, governors, and presidents make law, with substantial input from corporations, lobbyists, and a diverse group of nongovernment organizations (NGOs) such as the American Petroleum Institute, the Sierra Club, and the National Rifle Association. In the fifty states, judges are often appointed by governors or elected by the people. The process of electing state judges has become more and more politicized in the past fifteen years, with growing campaign contributions from those who would seek to seat judges with similar political leanings. 10 | 11 | In the federal system, judges are appointed by an elected official (the president) and confirmed by other elected officials (the Senate). If the president is from one party and the other party holds a majority of Senate seats, political conflicts may come up during the judges’ confirmation processes. Such a division has been fairly frequent over the past fifty years. 12 | 13 | In most nation-states (as countries are called in international law), knowing who has power to make and enforce the laws is a matter of knowing who has political power; in many places, the people or groups that have military power can also command political power to make and enforce the laws. Revolutions are difficult and contentious, but each year there are revolts against existing political-legal authority; an aspiration for democratic rule, or greater “rights” for citizens, is a recurring theme in politics and law. 
14 | -------------------------------------------------------------------------------- /examples/Belgium-Brazil.txt: -------------------------------------------------------------------------------- 1 | World Cup 2018: Belgium out-tactic Brazil to reach semi-finals as Roberto Martinez channels his Everton days 2 | 3 | Deploying Romelu Lukaku on the right in a false-nine 4-3-3 formation confused the Brazilian defence that left them with too little time to adapt 4 | 5 | By Michael Cox. 6 | 7 | Matches like Belgium’s 2-1 victory over Brazil serve to underline the difficulty of assessing managers’ strategic decisions. 8 | 9 | On one hand, Belgium coach Roberto Martinez stunned Brazil by using an entirely unexpected system, deploying key players in new roles and shifting smoothly between a three-man defence and a back four. Belgium went ahead early, counter-attacked for a second goal, then dropped deep to defend their lead. It was strategic genius. 10 | 11 | On the other, Belgium started nervously, were twice fortunate not to go behind, were gifted their opener by an unforced Brazilian error, then conceded too much possession and space in wide areas, were outfoxed by Tite’s tactical changes, and relied upon debatable refereeing decisions and poor Brazilian finishing to scrape through. The ‘expected goals’ models suggest this should have been a comfortable Brazil win. 12 | 13 | What is undeniable, however, is that Martinez’s tactical decisions made for fascinating viewing. Having previously used a 3-4-3 formation throughout this tournament, here he switched to a 4-3-1-2, or arguably a 4-3-3 with a false nine, and used Romelu Lukaku as a right-sided forward, a position he’d occasionally played under Martinez at Everton with great success. Eden Hazard was therefore playing the part of Kevin Mirallas, which roughly works based upon nationality alone, and left Kevin De Bruyne as the Steven Naismith figure, a less obvious comparison. 
14 | 15 | After Belgium had survived a couple of set-piece scares and Fernandinho had diverted a corner into his own net, Belgium’s system truly came alive on the counter-attack at 1-0 up. As the first half continued, it became clear Belgium’s approach had a further nuance: although it was a four-man defence without possession, when Belgium won the ball they reverted to a back three, with Nacer Chadli shuttling to the left, and Thomas Meunier pushing forward down the right. 16 | 17 | Belgium’s Martinez gamble pays off as players pull in same direction 18 | 19 | The runs of the latter became particularly important, because Brazil simply weren’t defending the left flank. Marcelo, when not caught upfield at turnovers, was dragged inside by Lukaku. Meunier was free on the overlap on two notable occasions in the first half, first when he played a low cross towards Lukaku, then when his presence meant the overloaded Marcelo couldn’t close down De Bruyne properly, allowing the midfielder to drive home Belgium’s second. 20 | 21 | Brazil appeared utterly incapable of defending counter-attacks throughout the first half, with Fernandinho proving a disastrous replacement for the suspended Casemiro. De Bruyne was continually left unmarked at turnovers, allowed to bring the ball down and dribble at the Brazilian defence, with Hazard to his left and Lukaku breaking inside from the right, Martinez’s dream scenario. Fernandinho was often covering for Fagner at right-back, and Marcelo was too high up the pitch. Belgium, in truth, didn’t take full advantage of these situations, with a couple of underhit passes from De Bruyne preventing his side from converting promising counter-attacks into goalscoring chances. 
22 | 23 | Romelu Lukaku's presence on the right wing caught out Brazil and helped Belgium take a surprise lead (EPA) 24 | 25 | Brazil may have failed to defend the left flank properly, but the combination of Marcelo, Philippe Coutinho and Neymar looked dangerous going forward, often overloading Meunier, excellent offensively but overrun defensively. Marouane Fellaini lacked the mobility to support him and Lukaku remained in an attacking position, a calculated gamble. Brazil made inroads. 26 | 27 | Tite proves his importance to Brazil’s future World Cup cause 28 | 29 | Tite changed things at half-time, introducing Roberto Firmino in place of the underwhelming Willian, with Gabriel Jesus moving to the right. Tite subsequently brought on Douglas Costa in place of Jesus, ending up with the Neymar-Firmino-Costa forward trio that many observers would have preferred from the outset. Tite remained loyal to Jesus, who starred in qualification, but in Russia Brazil haven’t stretched opponents enough without the presence of Costa down the right. 30 | 31 | Somewhat surprisingly, however, it was Brazil’s third substitute, Renato Augusto, who provided the greatest goal threat having replaced Paulinho. With Jan Vertonghen now dragged wide because of Costa’s threat, and Kompany too attracted to Firmino’s movement into clever inside-left positions, space opened up between them. Augusto twice charged into that space, first to nod home Coutinho’s excellent chip, then receiving a low pass, again from Coutinho, before sidefooting wide of the far post. 32 | 33 | This was where Belgium’s hybrid defensive approach fell down: at transitions they were sometimes caught between taking the positions of a back four and a back three. Gaps appeared, and they should have been punished long before Martinez eventually switched to a 5-3-1-1 for the final 10 minutes. 34 | 35 | Equally, Belgium should have exploited their own counter-attacking chances. 
After an hour, they again found themselves in Martinez’s ideal situation: De Bruyne dribbling through the centre, Hazard free left and Lukaku available on the right. De Bruyne switched the ball left to Hazard, who shot across goal when he might have squared for Lukaku. 36 | 37 | Ultimately it was Brazil who will rue their misses, with Meunier increasingly incapable of stopping Neymar. Brazil’s star man squared for Firmino, who shot over on the spin, then he pulled the ball back for Coutinho, who sliced hopelessly wide. Thibaut Courtois palmed over his curled effort in stoppage time, but Belgium’s goalkeeper was forced into relatively few saves considering Belgium’s number of chances. 38 | Martinez has guided Belgium into the World Cup quarter-finals for the first time (AP) 39 | 40 | Afterwards, Martinez reminded television viewers that matches are won on the pitch, rather than the tactics board. Rarely in this tournament, however, has a tactics board been so crucial in understanding the nature of a game. 41 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # word-mesh 2 | A wordcloud/wordmesh generator that allows users to extract keywords from text, and create a simple and interpretable wordcloud. 3 | 4 | ## Why word-mesh? 5 | 6 | Most popular open-source wordcloud generators ([word_cloud](https://github.com/amueller/word_cloud), [d3-cloud](https://github.com/jasondavies/d3-cloud), [echarts-wordcloud](https://github.com/ecomfe/echarts-wordcloud)) focus more on the aesthetics of the visualization than on effectively conveying textual features. **word-mesh** strikes a balance between the two and uses the various statistical, semantic and grammatical features of the text to inform visualization parameters. 
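As a rough illustration of what "using text features to inform visualization parameters" can mean, the sketch below linearly rescales keyword scores into a font-size range. It is only a minimal sketch: the helper name `scores_to_fontsizes` and the default size range are hypothetical and are not taken from word-mesh's own code.

```python
def scores_to_fontsizes(scores, min_size=15, max_size=50):
    """Linearly rescale keyword scores into a font-size range.

    Hypothetical helper for illustration only; not part of the wordmesh API.
    """
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Every keyword has the same score: give them all the midpoint size.
        return [(min_size + max_size) / 2] * len(scores)
    span = max_size - min_size
    # The lowest score maps to min_size, the highest to max_size.
    return [min_size + span * (s - lo) / (hi - lo) for s in scores]


print(scores_to_fontsizes([1, 4, 7]))  # [15.0, 32.5, 50.0]
```

The same pattern could be applied to word frequencies instead of ranking scores; only the input list changes.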
7 |
8 | #### Features:
9 | - *keyword extraction*: In addition to 'word frequency' based extraction techniques, word-mesh supports graph-based methods like **textrank**, **sgrank** and **bestcoverage**.
10 |
11 | - *word clustering*: Words can be grouped together on the canvas based on their semantic similarity, co-occurrence frequency, and other properties.
12 |
13 | - *keyword filtering*: Extracted keywords can be filtered based on their POS tags or whether they are named entities.
14 |
15 | - *font colors and font sizes*: These can be set based on word frequency, POS tags, or ranking algorithm score.
16 |
17 |
18 | ## How does it work?
19 | **word-mesh** uses [spaCy](https://spacy.io/)'s pretrained language models to gather textual features, graph-based algorithms to extract keywords, [Multidimensional Scaling](https://en.wikipedia.org/wiki/Multidimensional_scaling) to place these keywords on the canvas and a force-directed algorithm to optimize inter-word spacing.
20 |
21 |
22 | ## Examples
23 |
24 | Here's a visualization of the force-directed algorithm. The words are extracted using *textrank* from a textbook on international law, and are grouped together on the canvas based on their co-occurrence frequency. The colors indicate the POS tags of the words.
25 |
26 | ![animation](examples/animation.gif)
27 |
28 | This wordmesh was created from Steve Jobs's famous commencement speech at Stanford. The keywords are extracted using *textrank* and clustered based on their *scores*. The font colors and font sizes are also a function of the scores. *[Code](examples/examples.ipynb)*
29 |
30 | ![jobs-scores](examples/Jobs-speech-scores.png)
31 |
32 | This is from the same text, but the clustering has been done based on the *co-occurrence frequency* of keywords. The colors have been assigned using the same criterion used to cluster them.
33 |
34 | This is quite apparent from the positions of the words. You can see that words like 'hungry' and 'foolish' have been grouped together, since they occur close to each other in the text as part of the famous quote **"Stay hungry. Stay foolish"**. *[Code](examples/examples.ipynb)*
35 |
36 | ![jobs-co-occurence](examples/Jobs-speech-cooccurence-demo.png)
37 |
38 | This is a wordmesh of all the *adjectives* used in a 2016 US Presidential Debate between Donald Trump and Hillary Clinton. The words are clustered based on their *meaning*, with the font size indicating the usage frequency, and the color corresponding to which candidate used them. *[Code](examples/examples_labelled.ipynb)*
39 |
40 | ![debate](examples/trump_hillary_debate_adj.png)
41 |
42 | This example is taken from a news article on the Brazil vs Belgium quarter-final of the 2018 World Cup in Russia. The colors correspond to the *POS tags* of the words. The second figure is the same wordmesh clustered based on the words' *co-occurrence frequency*. *[Code](examples/examples.ipynb)*
43 |
44 | ![fifa](examples/Belgium-Brazil.png) ![fifa2](examples/Belgium-Brazil-cooccurence.png)
45 |
46 | ## Installation
47 |
48 | Install the package using pip:
49 |
50 |     pip install wordmesh
51 |
52 | You will also need to download the following language model (about 115 MB):
53 |
54 |     python -m spacy download en_core_web_md
55 |
56 | This is required for POS tagging and for accessing word vectors. For more information on the download, or for help with the installation, see [spaCy's model documentation](https://spacy.io/usage/models).
57 |
58 | ## Tutorial
59 |
60 | All functionality is contained within the 'Wordmesh' class.
61 |
62 | ```python
63 | from wordmesh import Wordmesh
64 |
65 | #Create a Wordmesh object by passing the constructor the text that you wish to summarize
66 | with open('sample.txt', 'r') as f:
67 |     text = f.read()
68 | wm = Wordmesh(text)
69 |
70 | #Save the plot
71 | wm.save_as_html(filename='my-wordmesh.html')
72 | #You can now open it in the browser, and subsequently save it in jpeg format if required
73 |
74 | #If you are using a Jupyter notebook, you can plot it inline
75 | wm.plot()
76 | ```
77 | The Wordmesh object offers three 'set' methods for setting the font size, the font color and the clustering criteria. **Check the inline documentation for details**.
78 |
79 | ```python
80 | wm.set_fontsize(by='scores')
81 | wm.set_fontcolor(by='random')
82 | wm.set_clustering_criteria(by='meaning')
83 | ```
84 |
85 | You can access keywords, pos_tags, keyword scores and other important features of the text. These may be used to set custom visualization parameters.
86 |
87 | ```python
88 | print(wm.keywords, wm.pos_tags, wm.scores)
89 |
90 | #set NOUNs to red and all else to green
91 | f = lambda x: (200,0,0) if (x=='NOUN') else (0,200,0)
92 | colors = list(map(f, wm.pos_tags))
93 |
94 | wm.set_fontcolor(custom_colors=colors)
95 | ```
96 |
97 | For more examples check out [this](examples/examples.ipynb) notebook.
98 |
99 | If you are working with text which is composed of various labelled sections (e.g. a conversation transcript), the LabelledWordmesh class (which inherits from Wordmesh) can be useful if you wish to treat those sections separately. Check out [this](examples/examples_labelled.ipynb) notebook for an example.
100 |
101 | ## Notes
102 |
103 | - The code isn't optimized for large texts, so watch the memory usage when processing text with more than 100,000 characters.
104 | - Currently, [Plotly](https://plot.ly/) is being used as the visualization backend.
However, if you wish to use another tool, you can use the positions of the keywords and the sizes of their bounding boxes, which are available as Wordmesh object attributes, to render the words yourself.
105 | - As of now, POS-based filtering and multi-gram extraction cannot be done when using graph-based extraction algorithms. This is due to some problems with underlying libraries which will hopefully be fixed in the future.
106 | - Even though you have the option of choosing 'TSNE' as the clustering algorithm, I would advise against it since it still needs to be tested thoroughly.
107 |

--------------------------------------------------------------------------------
/wordmesh/force_directed_model.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Mon May 28 17:30:38 2018
5 |
6 | @author: mukund
7 | """
8 |
9 | import numpy as np
10 | from scipy.spatial import Delaunay
11 | from scipy.spatial.distance import pdist
12 | import random
13 |
14 | ATTRACTION_MULTIPLIER = 1
15 | COLLISION_MULTIPLIER = 5
16 | COLLISION_CAP = 1000
17 | DELAUNAY_MULTIPLIER = 3
18 | DR_CONSTANT = 150
19 |
20 | def _fv_attraction(point, other_points, multiplier=ATTRACTION_MULTIPLIER,
21 |                    inverse_distance_proportionality=False, normalised=False):
22 |
23 |     """
24 |     other_points must be an ndarray with shape: (num_points, 2)
25 |     """
26 |     x1, y1 = point
27 |     x2, y2 = other_points[:, 0], other_points[:, 1]
28 |     d = ((x2-x1)**2 + (y2-y1)**2)**(1/2)
29 |
30 |     x_component = (x2-x1)
31 |     y_component = (y2-y1)
32 |
33 |
34 |     if inverse_distance_proportionality:
35 |         x_component = x_component/(d+1)
36 |         y_component = y_component/(d+1)
37 |
38 |     force_vectors = np.asarray([multiplier*x_component, multiplier*y_component]).swapaxes(0,1)
39 |
40 |     #normalised to account for variable number of keywords
41 |     if normalised:
42 |         force_vectors = force_vectors/(other_points.shape[0]+1)
43 |
44 |     return force_vectors
45 |
46 | def _fv_collision(point, box_size, other_points, other_box_sizes,
47 |                   multiplier=COLLISION_MULTIPLIER):
48 |     """
49 |     box_size = (width, height)
50 |     other_box_sizes shape = (num_points, 2)
51 |     """
52 |
53 |     b_x1, b_y1 = point[0]-box_size[0]/2, point[1]-box_size[1]/2
54 |     b_x2, b_y2 = point[0]+box_size[0]/2, point[1]+box_size[1]/2
55 |
56 |     o_x1 = other_points[:, 0] - other_box_sizes[:,0]/2
57 |     o_y1 = other_points[:, 1] - other_box_sizes[:,1]/2
58 |     o_x2 = other_points[:, 0] + other_box_sizes[:,0]/2
59 |     o_y2 = other_points[:, 1] + other_box_sizes[:,1]/2
60 |
61 |     inter_x1 = np.maximum(o_x1, b_x1)
62 |     inter_x2 = np.minimum(o_x2, b_x2)
63 |     inter_y1 = np.maximum(o_y1, b_y1)
64 |     inter_y2 = np.minimum(o_y2, b_y2)
65 |
66 |
67 |     force_vector = np.ones(shape=(inter_x1.shape[0], 2), dtype=np.float32)
68 |
69 |     force_vector[inter_x1>inter_x2, :] = 0
70 |     force_vector[inter_y1>inter_y2, :] = 0
71 |
72 |     overlapping_area = (inter_x2-inter_x1)*(inter_y2-inter_y1)
73 |     overlapping_area = np.stack([overlapping_area, overlapping_area], axis=1)
74 |
75 |
76 |     force_vector = force_vector*(-1*_fv_attraction(point,
77 |                                                     other_points,
78 |                                                     multiplier,
79 |                                                     True))*overlapping_area
80 |
81 |     #cap on how large the collision force vector is allowed to be
82 |     force_vector[force_vector>COLLISION_CAP]=COLLISION_CAP
83 |     force_vector[force_vector<-COLLISION_CAP]=-COLLISION_CAP
84 |     return force_vector
85 |
86 | def _delaunay_force(point_index, current_positions, simplices,
87 |                     initial_positions, multiplier=DELAUNAY_MULTIPLIER):
88 |
89 |     #get simplices which contain said point
90 |
91 |
92 |     mask = np.any(simplices==point_index, axis=1)
93 |     line_segs = current_positions[simplices[mask]].reshape(-1, 2)
94 |     line_segs = line_segs[~(line_segs== current_positions[point_index])].reshape(-1,4)
95 |
96 |     initial_line_segs = initial_positions[simplices[mask]].reshape(-1, 2)
97 |     initial_line_segs = initial_line_segs[~(initial_line_segs== initial_positions[point_index])].reshape(-1,4)
98 |
99 |     #find the foot of the perpendicular dropped from the given point onto the triangle side opposite to it
100 |
101 |     x1 = line_segs[:, 0]
102 |     y1 = line_segs[:, 1]
103 |     x2 = line_segs[:, 2]
104 |     y2 = line_segs[:, 3]
105 |
106 |     xp, yp = current_positions[point_index]
107 |     m = (y2-y1)/(x2-x1)
108 |     mp = (yp-y1)/(xp-x1)
109 |
110 |     X = (m*(yp-y1) + (m**2)*x1 + xp)/(m**2 + 1)
111 |     Y = (y1 + m*(xp-x1) + (m**2)*yp)/(m**2 + 1)
112 |
113 |     #initial slopes
114 |     x1_i = initial_line_segs[:, 0]
115 |     y1_i = initial_line_segs[:, 1]
116 |     x2_i = initial_line_segs[:, 2]
117 |     y2_i = initial_line_segs[:, 3]
118 |
119 |     m_i = (y2_i-y1_i)/(x2_i-x1_i)
120 |
121 |     xp_i, yp_i = initial_positions[point_index]
122 |     mp_i = (yp_i-y1_i)/(xp_i-x1_i)
123 |
124 |     #Calculate sign of force
125 |     sign = np.sign(m-mp)*np.sign(m_i-mp_i)
126 |     sign = np.repeat(sign[:, np.newaxis], 2, axis=1)
127 |     # find the force due to these
128 |     #force vector = sign * force of repulsion
129 |     force_vector = sign*(-1)*_fv_attraction(current_positions[point_index],
130 |                                              np.stack([X,Y]).swapaxes(0,1),
131 |                                              multiplier, True)
132 |
133 |     return force_vector
134 |
135 |
136 | def _update_positions(current_positions, bounding_box_dimensions, simplices,
137 |                       initial_positions, descent_rate, momentum=None, apply_delaunay=True):
138 |     """
139 |     Performs a single iteration of force directed displacement for every word
140 |     """
141 |     updated_positions = current_positions.copy()
142 |     bbd = bounding_box_dimensions
143 |     num_particles = current_positions.shape[0]
144 |
145 |     force_memory = np.ndarray(shape=(num_particles, 2))
146 |
147 |     for i in random.sample(list(range(num_particles)), num_particles):
148 |
149 |         this_particle = updated_positions[i]
150 |         other_particles = updated_positions[~(np.arange(num_particles)==i)]
151 |
152 |         this_bbd = bbd[i]
153 |         other_bbds = bbd[~(np.arange(num_particles)==i)]
154 |
155 |         #Calculates all three forces
on ith particle due to all other particles 156 | aforce = _fv_attraction(this_particle, other_particles, normalised=True) 157 | cforce = _fv_collision(this_particle, this_bbd, other_particles, 158 | other_bbds) 159 | 160 | if apply_delaunay: 161 | dforce = _delaunay_force(i, updated_positions, simplices, 162 | initial_positions, multiplier=DELAUNAY_MULTIPLIER) 163 | total_force = np.sum(cforce+aforce, axis=0) + np.sum(dforce, axis=0) 164 | 165 | else: 166 | total_force = np.sum(cforce+aforce, axis=0) 167 | 168 | if momentum is not None: 169 | total_force = total_force + momentum[i] 170 | 171 | #updated_position = current_position + alpha*force 172 | #Not exactly Newtonian but works 173 | updated_positions[i] = updated_positions[i] + descent_rate*total_force 174 | force_memory[i] = total_force 175 | 176 | return updated_positions, force_memory 177 | 178 | class ForceDirectedModel(): 179 | 180 | def __init__(self, positions, bounding_box_dimensions, 181 | num_iters=1000, apply_delaunay=True, delaunay_multiplier=None): 182 | 183 | """ 184 | Parameters 185 | ---------- 186 | 187 | positions: numpy array 188 | A 2-D numpy array of shape (num_particles, 2), giving the x and y 189 | coordinates of all particles 190 | 191 | bounding_box_dimensions: numpy array 192 | A 2-D numpy array of shape (num_particles, 2), giving the width and 193 | height of each bounding box 194 | 195 | Returns 196 | ------- 197 | a ForceDirectedModel object 198 | """ 199 | if delaunay_multiplier is not None: 200 | globals()['DELAUNAY_MULTIPLIER'] = delaunay_multiplier #rebinds the module-level constant, so later models are affected too 201 | self.num_particles = positions.shape[0] 202 | self.initial_positions = positions 203 | self.bounding_box_dimensions = bounding_box_dimensions 204 | self.num_iters = num_iters 205 | self.apply_delaunay = apply_delaunay 206 | 207 | if apply_delaunay: 208 | self.simplices = Delaunay(positions).simplices 209 | else: 210 | self.simplices = None 211 | 212 | self.all_positions = self._run_algorithm() 213 | self.all_centered_positions = self._centered_positions() 214 |
215 | def equilibrium_position(self, centered=True): 216 | """ 217 | The equilibrium positions are calculated by applying the force directed 218 | algorithm on the particles surrounded by the given bounding boxes 219 | 220 | centered: bool, optional 221 | If you want the bounding boxes to be centered around the origin, 222 | you can set this to true. 223 | 224 | Returns 225 | ------- 226 | 227 | numpy array: 228 | A 2-D numpy array of shape (num_particles, 2) containing the 229 | x and y coordinates of the equilibrium positions of the particles 230 | """ 231 | if centered: 232 | return self.all_centered_positions[-1] 233 | else: 234 | return self.all_positions[-1] 235 | 236 | def _run_algorithm(self): 237 | 238 | 239 | position_i = self.initial_positions.copy() 240 | simplices = self.simplices 241 | bbd = self.bounding_box_dimensions 242 | 243 | 244 | all_positions = np.ndarray(shape=(self.num_iters, self.num_particles, 2)) 245 | all_positions[0] = position_i 246 | 247 | #make it a function of max radial distance or something 248 | avg_dist = pdist(position_i).sum(0).sum()/(self.num_particles**2) 249 | initial_dr = avg_dist/(DR_CONSTANT*self.num_iters) 250 | 251 | force_memory = np.zeros((self.num_particles, 2)) 252 | 253 | for i in range(1,self.num_iters): 254 | position_i, force_memory = _update_positions(position_i, 255 | bbd, simplices, 256 | self.initial_positions, 257 | initial_dr*(1-i*i/(self.num_iters*self.num_iters)), 258 | apply_delaunay=self.apply_delaunay)#)) 259 | all_positions[i] = position_i 260 | 261 | return all_positions 262 | 263 | def _centered_positions(self): 264 | 265 | bbd = self.bounding_box_dimensions.copy() 266 | bbd = np.repeat(bbd[np.newaxis, :, :], self.num_iters, axis=0) 267 | all_pos = self.all_positions 268 | 269 | x_left = np.min((all_pos[:, :, 0]-bbd[:,:,0]/2), axis=1) 270 | x_right = np.max((all_pos[:, :, 0]+bbd[:,:,0]/2), axis=1) 271 | y_bottom = np.min((all_pos[:, :, 1]-bbd[:,:,1]/2), axis=1) 272 | y_top = np.max((all_pos[:, :, 
1]+bbd[:,:,1]/2), axis=1) 273 | 274 | centered_positions = all_pos.copy() 275 | 276 | #broadcasting; com=center of mass 277 | com_x = np.repeat(((x_right+x_left)/2)[:, np.newaxis], self.num_particles, axis=1) 278 | com_y = np.repeat(((y_top+y_bottom)/2)[:, np.newaxis], self.num_particles, axis=1) 279 | 280 | centered_positions[:, :, 0] = all_pos[:, :, 0] - com_x 281 | centered_positions[:, :, 1] = all_pos[:, :, 1] - com_y 282 | 283 | return centered_positions 284 | 285 | 286 | 287 | -------------------------------------------------------------------------------- /wordmesh/text_processing.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Sun Jun 10 02:56:08 2018 5 | 6 | @author: mukund 7 | """ 8 | #Will throw an error if language model has not been downloaded 9 | try: 10 | print('Loading spaCy\'s language model...') 11 | import en_core_web_md 12 | NLP = en_core_web_md.load() 13 | print('Done') 14 | except ModuleNotFoundError as e: 15 | msg = 'word-mesh relies on spaCy\'s pretrained language models '\ 16 | 'for tokenization, POS tagging and accessing word embeddings. 
\n\n'\ 17 | 'Download the \'en_core_web_md\' model by following the instructions '\ 18 | 'given on: \n\nhttps://spacy.io/usage/models \n' 19 | raise Exception(msg).with_traceback(e.__traceback__) 20 | 21 | import os 22 | import textacy 23 | from textacy.extract import named_entities 24 | from textacy.keyterms import sgrank,key_terms_from_semantic_network 25 | import numpy as np 26 | 27 | 28 | FILE = os.path.dirname(__file__) 29 | STOPWORDS = set(map(str.strip, open(os.path.join(FILE,'stopwords.txt')).readlines())) 30 | NGRAM_LIMIT = 3 31 | 32 | def _text_preprocessing(text): 33 | """ 34 | Apostrophes not handled properly by spaCy 35 | https://github.com/explosion/spaCy/issues/685 36 | """ 37 | return text.replace('’s', '').replace('’m', '').replace('\'s','') 38 | 39 | def _text_postprocessing(doc, keywords, extract_ngrams=False): 40 | """ 41 | named entities are converted to uppercase 42 | """ 43 | ents = list(named_entities(doc)) 44 | ents_lemma = [entity.lemma_ for entity in ents] 45 | 46 | #if keyword is named entity it will be replaced by its uppercase form 47 | for i,word in enumerate(keywords.copy()): 48 | try: 49 | index = ents_lemma.index(word) 50 | keywords[i] = ents[index].text 51 | except ValueError: 52 | continue 53 | 54 | #if there is a redundant space character, it will be removed 55 | if not extract_ngrams: 56 | for i,word in enumerate(keywords): 57 | if word[0]==' ': 58 | keywords[i] = word.replace(' ','') 59 | 60 | return keywords 61 | 62 | def _filter(keywords, scores, filter_stopwords, num_terms): 63 | #temporary -PRON-, stopwords and ngram filter 64 | #stopwords are allowed in multigrams 65 | tempkw, temps = [],[] 66 | stopwords = STOPWORDS if filter_stopwords else set() 67 | for i,kw in enumerate(keywords): 68 | if kw.find('-PRON-')==-1 and (kw not in stopwords) \ 69 | and (kw!='') and (len(kw.split(' '))<=NGRAM_LIMIT) and (kw!='_'): 70 | tempkw.append(kw) 71 | temps.append(scores[i]) 72 | 73 | if len(tempkw)>num_terms: 74 | tempkw, temps = 
tempkw[:num_terms], temps[:num_terms] 75 | 76 | return tempkw, temps 77 | 78 | def extract_terms_by_score(text, algorithm, num_terms, extract_ngrams, 79 | ngrams=(1,2), lemmatize=True, filter_stopwords=True): 80 | 81 | if lemmatize: 82 | normalize = 'lemma' 83 | else : 84 | normalize = None 85 | 86 | #convert raw text into spaCy doc 87 | text = _text_preprocessing(text) 88 | doc = textacy.Doc(text, lang=NLP) 89 | 90 | #the number of extracted terms is twice num_keyterms since a lot of them 91 | #are filtered out by the filter 92 | if algorithm=='sgrank': 93 | ngrams = ngrams if extract_ngrams else (1,) 94 | keywords_scores = sgrank(doc, normalize=normalize, 95 | ngrams=ngrams, n_keyterms=num_terms*2) 96 | 97 | elif (algorithm=='pagerank') | (algorithm=='textrank'): 98 | keywords_scores = key_terms_from_semantic_network(doc, 99 | normalize=normalize, 100 | edge_weighting='cooc_freq', 101 | window_width=5, 102 | ranking_algo='pagerank', 103 | join_key_words=extract_ngrams, 104 | n_keyterms=num_terms*2) 105 | 106 | else: 107 | keywords_scores = key_terms_from_semantic_network(doc, 108 | normalize=normalize, 109 | edge_weighting='cooc_freq', 110 | window_width=5, 111 | ranking_algo=algorithm, 112 | join_key_words=extract_ngrams, 113 | n_keyterms=num_terms*2) 114 | 115 | keywords = [i[0] for i in keywords_scores] 116 | scores = [i[1] for i in keywords_scores] 117 | 118 | keywords, scores = _filter(keywords, scores, filter_stopwords, num_terms) 119 | 120 | scores = np.array(scores) 121 | #get pos tags for keywords, if keywords are ngrams, the 122 | #pos tag of the last word in the ngram is picked 123 | ending_tokens = [ngram.split(' ')[-1] for ngram in keywords] 124 | mapping = _get_pos_mapping(doc, normalize) 125 | pos_tags = [mapping[end] for end in ending_tokens] 126 | 127 | normalized_keywords = keywords.copy() 128 | keywords = _text_postprocessing(doc, keywords, extract_ngrams) 129 | return keywords, scores, pos_tags, normalized_keywords 130 | 131 | def 
_get_pos_mapping(doc, normalize, pos_filter=['NOUN','PROPN','ADJ', 132 | 'VERB','ADV','SYM','PUNCT']): 133 | 134 | """ 135 | Iterates through the doc and finds the pos_tag corresponding to each 136 | token. This is then mapped to the token's normalized form. 137 | Since the same word can have multiple different POS tags, only the tag 138 | present in pos_filter is kept. If multiple tags are present in pos_filter, 139 | then the update sequence of the dictionary determines which one is kept. 140 | """ 141 | mapping = dict() 142 | if normalize=='lemma': 143 | for token in doc: 144 | if (token.pos_ in pos_filter) or (token.lemma_ not in mapping.keys()): 145 | mapping.update({token.lemma_:token.pos_}) 146 | elif normalize=='lower': 147 | for token in doc: 148 | if (token.pos_ in pos_filter) or (token.lower_ not in mapping.keys()): 149 | mapping.update({token.lower_:token.pos_}) 150 | else: 151 | for token in doc: 152 | if (token.pos_ in pos_filter) or (token.text not in mapping.keys()): 153 | mapping.update({token.text:token.pos_}) 154 | 155 | return mapping 156 | 157 | def _get_frequency_mapping(doc): 158 | """ 159 | Need to take args like 'ngrams', 'normalize' 160 | and pass them on to to_bag_of_terms 161 | """ 162 | doc.to_bag_of_terms(as_strings=True) 163 | 164 | def normalize_text(text, lemmatize): 165 | if not lemmatize: 166 | return text 167 | 168 | text = _text_preprocessing(text) 169 | 170 | #convert raw text into spaCy doc 171 | text = _text_preprocessing(text) 172 | doc = textacy.Doc(text, lang=NLP) 173 | 174 | #pronouns need to be handled separately 175 | #https://github.com/explosion/spaCy/issues/962 176 | lemmatized_strings = [] 177 | for token in doc: 178 | if token.lemma_ == '-PRON-': 179 | lemmatized_strings.append(token.lower_) 180 | else: 181 | lemmatized_strings.append(token.lemma_) 182 | 183 | normalized_text = ' '.join(lemmatized_strings) 184 | return normalized_text 185 | 186 | def extract_terms_by_frequency(text, 187 | num_terms, 188 | 
pos_filter=['NOUN','ADJ','PROPN'], 189 | filter_nums=True, 190 | extract_ngrams = True, 191 | ngrams=(1,2), lemmatize=True, 192 | filter_stopwords=True): 193 | """ 194 | pos_filter : {'NOUN','PROPN','ADJ','VERB','ADV','SYM','PUNCT'} 195 | """ 196 | if lemmatize: 197 | normalize = 'lemma' 198 | else : 199 | normalize = None 200 | 201 | #convert raw text into spaCy doc 202 | text = _text_preprocessing(text) 203 | doc = textacy.Doc(text, lang=NLP) 204 | 205 | #get the frequencies of the filtered terms 206 | ngrams = ngrams if extract_ngrams else (1,) 207 | if 'PROPN' in pos_filter: 208 | frequencies = doc.to_bag_of_terms(ngrams, normalize=normalize, 209 | as_strings=True, 210 | include_pos=pos_filter, 211 | filter_nums=filter_nums, 212 | include_types=['PERSON','LOC','ORG']) 213 | else: 214 | frequencies = doc.to_bag_of_terms(ngrams, normalize=normalize, 215 | named_entities=False, 216 | as_strings=True, 217 | include_pos=pos_filter, 218 | filter_nums=filter_nums) 219 | 220 | #sort the terms based on the frequencies and 221 | #choose the top num_terms terms 222 | #NOTE: lots of redundant code here, cleanup required 223 | frequencies = list(frequencies.items()) 224 | 225 | keywords = [tup[0] for tup in frequencies] 226 | scores = [tup[1] for tup in frequencies] 227 | 228 | #applying filter; no truncation here, the top terms are picked after sorting 229 | keywords,scores = _filter(keywords, scores, filter_stopwords, len(keywords)) 230 | 231 | 232 | frequencies = list(zip(keywords,scores)) 233 | frequencies.sort(key=lambda x: x[1], reverse=True) 234 | top_terms = frequencies[:num_terms] 235 | 236 | keywords = [tup[0] for tup in top_terms] 237 | scores = np.array([tup[1] for tup in top_terms]) 238 | 239 | #get pos tags for keywords, if keywords are multi-grams, the 240 | #pos tag of the last word in the multi-gram is picked 241 | ending_tokens = [ngram.split(' ')[-1] for ngram in keywords] 242 | mapping = _get_pos_mapping(doc, normalize, pos_filter) 243 | pos_tags = [mapping[end] for end in ending_tokens] 244
| 245 | normalized_keywords = keywords.copy() 246 | keywords = _text_postprocessing(doc, keywords, extract_ngrams) 247 | return keywords, scores, pos_tags, normalized_keywords 248 | 249 | 250 | def get_semantic_similarity_matrix(keywords): 251 | text = ' '.join(keywords) 252 | doc = textacy.Doc(text, lang=NLP) 253 | 254 | #split the doc into a list of spaCy's 'spans' 255 | spans = [] 256 | i=0 257 | for term in keywords: 258 | delta = len(term.split(' ')) 259 | spans.append(doc[i:i+delta]) 260 | i = i+delta 261 | 262 | #find the similarity between each pair of spans 263 | similarity_scores = [] 264 | for span1 in spans: 265 | t1_scores = [] 266 | for i,span2 in enumerate(spans): 267 | t1_scores.append(span1.similarity(span2)) 268 | similarity_scores.append(t1_scores) 269 | 270 | #the values of the returned matrix vary from 0 to 1, 271 | #a smaller number means that the words are similar in meaning 272 | return (1-np.array(similarity_scores)) 273 | -------------------------------------------------------------------------------- /examples/Jobs-speech.txt: -------------------------------------------------------------------------------- 1 | I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories. 2 | 3 | The first story is about connecting the dots. 4 | 5 | I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out? 6 | 7 | It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. 
Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who were on a waiting list, got a call in the middle of the night asking: “We have an unexpected baby boy; do you want him?” They said: “Of course.” My biological mother later found out that my mother had never graduated from college and that my father had never graduated from high school. She refused to sign the final adoption papers. She only relented a few months later when my parents promised that I would someday go to college. 8 | 9 | And 17 years later I did go to college. But I naively chose a college that was almost as expensive as Stanford, and all of my working-class parents’ savings were being spent on my college tuition. After six months, I couldn’t see the value in it. I had no idea what I wanted to do with my life and no idea how college was going to help me figure it out. And here I was spending all of the money my parents had saved their entire life. So I decided to drop out and trust that it would all work out OK. It was pretty scary at the time, but looking back it was one of the best decisions I ever made. The minute I dropped out I could stop taking the required classes that didn’t interest me, and begin dropping in on the ones that looked interesting. 10 | 11 | It wasn’t all romantic. I didn’t have a dorm room, so I slept on the floor in friends’ rooms, I returned Coke bottles for the 5¢ deposits to buy food with, and I would walk the 7 miles across town every Sunday night to get one good meal a week at the Hare Krishna temple. I loved it. And much of what I stumbled into by following my curiosity and intuition turned out to be priceless later on. Let me give you one example: 12 | 13 | Reed College at that time offered perhaps the best calligraphy instruction in the country. Throughout the campus every poster, every label on every drawer, was beautifully hand calligraphed. 
Because I had dropped out and didn’t have to take the normal classes, I decided to take a calligraphy class to learn how to do this. I learned about serif and sans serif typefaces, about varying the amount of space between different letter combinations, about what makes great typography great. It was beautiful, historical, artistically subtle in a way that science can’t capture, and I found it fascinating. 14 | 15 | None of this had even a hope of any practical application in my life. But 10 years later, when we were designing the first Macintosh computer, it all came back to me. And we designed it all into the Mac. It was the first computer with beautiful typography. If I had never dropped in on that single course in college, the Mac would have never had multiple typefaces or proportionally spaced fonts. And since Windows just copied the Mac, it’s likely that no personal computer would have them. If I had never dropped out, I would have never dropped in on this calligraphy class, and personal computers might not have the wonderful typography that they do. Of course it was impossible to connect the dots looking forward when I was in college. But it was very, very clear looking backward 10 years later. 16 | 17 | Again, you can’t connect the dots looking forward; you can only connect them looking backward. So you have to trust that the dots will somehow connect in your future. You have to trust in something — your gut, destiny, life, karma, whatever. This approach has never let me down, and it has made all the difference in my life. 18 | 19 | My second story is about love and loss. 20 | 21 | I was lucky — I found what I loved to do early in life. Woz and I started Apple in my parents’ garage when I was 20. We worked hard, and in 10 years Apple had grown from just the two of us in a garage into a $2 billion company with over 4,000 employees. We had just released our finest creation — the Macintosh — a year earlier, and I had just turned 30. And then I got fired. 
How can you get fired from a company you started? Well, as Apple grew we hired someone who I thought was very talented to run the company with me, and for the first year or so things went well. But then our visions of the future began to diverge and eventually we had a falling out. When we did, our Board of Directors sided with him. So at 30 I was out. And very publicly out. What had been the focus of my entire adult life was gone, and it was devastating. 22 | 23 | I really didn’t know what to do for a few months. I felt that I had let the previous generation of entrepreneurs down — that I had dropped the baton as it was being passed to me. I met with David Packard and Bob Noyce and tried to apologize for screwing up so badly. I was a very public failure, and I even thought about running away from the valley. But something slowly began to dawn on me — I still loved what I did. The turn of events at Apple had not changed that one bit. I had been rejected, but I was still in love. And so I decided to start over. 24 | 25 | I didn’t see it then, but it turned out that getting fired from Apple was the best thing that could have ever happened to me. The heaviness of being successful was replaced by the lightness of being a beginner again, less sure about everything. It freed me to enter one of the most creative periods of my life. 26 | 27 | During the next five years, I started a company named NeXT, another company named Pixar, and fell in love with an amazing woman who would become my wife. Pixar went on to create the world’s first computer animated feature film, Toy Story, and is now the most successful animation studio in the world. In a remarkable turn of events, Apple bought NeXT, I returned to Apple, and the technology we developed at NeXT is at the heart of Apple’s current renaissance. And Laurene and I have a wonderful family together. 28 | 29 | I’m pretty sure none of this would have happened if I hadn’t been fired from Apple. 
It was awful tasting medicine, but I guess the patient needed it. Sometimes life hits you in the head with a brick. Don’t lose faith. I’m convinced that the only thing that kept me going was that I loved what I did. You’ve got to find what you love. And that is as true for your work as it is for your lovers. Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. If you haven’t found it yet, keep looking. Don’t settle. As with all matters of the heart, you’ll know when you find it. And, like any great relationship, it just gets better and better as the years roll on. So keep looking until you find it. Don’t settle. 30 | 31 | My third story is about death. 32 | 33 | When I was 17, I read a quote that went something like: “If you live each day as if it was your last, someday you’ll most certainly be right.” It made an impression on me, and since then, for the past 33 years, I have looked in the mirror every morning and asked myself: “If today were the last day of my life, would I want to do what I am about to do today?” And whenever the answer has been “No” for too many days in a row, I know I need to change something. 34 | 35 | Remembering that I’ll be dead soon is the most important tool I’ve ever encountered to help me make the big choices in life. Because almost everything — all external expectations, all pride, all fear of embarrassment or failure — these things just fall away in the face of death, leaving only what is truly important. Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose. You are already naked. There is no reason not to follow your heart. 36 | 37 | About a year ago I was diagnosed with cancer. I had a scan at 7:30 in the morning, and it clearly showed a tumor on my pancreas. I didn’t even know what a pancreas was. 
The doctors told me this was almost certainly a type of cancer that is incurable, and that I should expect to live no longer than three to six months. My doctor advised me to go home and get my affairs in order, which is doctor’s code for prepare to die. It means to try to tell your kids everything you thought you’d have the next 10 years to tell them in just a few months. It means to make sure everything is buttoned up so that it will be as easy as possible for your family. It means to say your goodbyes. 38 | 39 | I lived with that diagnosis all day. Later that evening I had a biopsy, where they stuck an endoscope down my throat, through my stomach and into my intestines, put a needle into my pancreas and got a few cells from the tumor. I was sedated, but my wife, who was there, told me that when they viewed the cells under a microscope the doctors started crying because it turned out to be a very rare form of pancreatic cancer that is curable with surgery. I had the surgery and I’m fine now. 40 | 41 | This was the closest I’ve been to facing death, and I hope it’s the closest I get for a few more decades. Having lived through it, I can now say this to you with a bit more certainty than when death was a useful but purely intellectual concept: 42 | 43 | No one wants to die. Even people who want to go to heaven don’t want to die to get there. And yet death is the destination we all share. No one has ever escaped it. And that is as it should be, because Death is very likely the single best invention of Life. It is Life’s change agent. It clears out the old to make way for the new. Right now the new is you, but someday not too long from now, you will gradually become the old and be cleared away. Sorry to be so dramatic, but it is quite true. 44 | 45 | Your time is limited, so don’t waste it living someone else’s life. Don’t be trapped by dogma — which is living with the results of other people’s thinking. 
Don’t let the noise of others’ opinions drown out your own inner voice. And most important, have the courage to follow your heart and intuition. They somehow already know what you truly want to become. Everything else is secondary. 46 | 47 | When I was young, there was an amazing publication called The Whole Earth Catalog, which was one of the bibles of my generation. It was created by a fellow named Stewart Brand not far from here in Menlo Park, and he brought it to life with his poetic touch. This was in the late 1960s, before personal computers and desktop publishing, so it was all made with typewriters, scissors and Polaroid cameras. It was sort of like Google in paperback form, 35 years before Google came along: It was idealistic, and overflowing with neat tools and great notions. 48 | 49 | Stewart and his team put out several issues of The Whole Earth Catalog, and then when it had run its course, they put out a final issue. It was the mid-1970s, and I was your age. On the back cover of their final issue was a photograph of an early morning country road, the kind you might find yourself hitchhiking on if you were so adventurous. Beneath it were the words: “Stay Hungry. Stay Foolish.” It was their farewell message as they signed off. Stay Hungry. Stay Foolish. And I have always wished that for myself. And now, as you graduate to begin anew, I wish that for you. 50 | 51 | Stay Hungry. Stay Foolish. 52 | 53 | Thank you all very much. 
-------------------------------------------------------------------------------- /wordmesh/utils.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Mon May 28 04:26:25 2018 5 | 6 | @author: mukund 7 | """ 8 | 9 | import numpy as np 10 | import plotly.offline as py 11 | import plotly.graph_objs as go 12 | 13 | PLOTLY_FONTSIZE_BBW = 0.6 14 | PLOTLY_FONTSIZE_BBH = 0.972+0.088 15 | 16 | 17 | class PlotlyVisualizer(): 18 | 19 | def __init__(self, words, fontsizes_norm, height, width, 20 | filename='temp-plot.html', title=None, textcolors='white', 21 | hovertext=None, axis_visible=False, bg_color='black', 22 | title_fontcolor='white', title_fontsize='auto', 23 | title_font_family='Courier New, monospace', bb_padding=0.08, 24 | boundary_padding_factor=1.1): 25 | 26 | """ 27 | Parameters 28 | ---------- 29 | """ 30 | self.words = words 31 | self.fontsizes_norm = fontsizes_norm 32 | self.height = height 33 | self.width = width 34 | self.title = title 35 | self.textcolors = textcolors 36 | self.hovertext = hovertext 37 | self.axis_visible = axis_visible 38 | self.bg_color = bg_color 39 | self.title_fontcolor = title_fontcolor 40 | self.title_fontsize = title_fontsize 41 | self.title_font_family = title_font_family 42 | self.padding = bb_padding 43 | self.boundary_padding = boundary_padding_factor 44 | self.bounding_box_dimensions, self.real_fontsizes = self.get_bb_dimensions() 45 | 46 | # fontsize*FONTSIZE_BBW = Width of the bounding box of each character in a plotly graph 47 | def _get_zoom(self, coordinates): 48 | bbd = self.bounding_box_dimensions 49 | 50 | x_left = np.min((coordinates[:, 0]-bbd[:,0]/2)) 51 | x_right = np.max((coordinates[:, 0]+bbd[:,0]/2)) 52 | y_bottom = np.min((coordinates[:, 1]-bbd[:,1]/2)) 53 | y_top = np.max((coordinates[:,1]+bbd[:,1]/2)) 54 | 55 | zoom = max((x_right-x_left)/self.width, (y_top-y_bottom)/self.height) 56 | return 
zoom*self.boundary_padding 57 | 58 | def get_bb_dimensions(self): 59 | 60 | num_chars = np.array([len(word) for word in self.words]) 61 | square_side_length = self.fontsizes_norm*150*(len(self.words)**(1/2)) 62 | 63 | bb_widths = (PLOTLY_FONTSIZE_BBW+self.padding)*square_side_length*num_chars 64 | bb_heights = (PLOTLY_FONTSIZE_BBH+self.padding*2)*square_side_length 65 | return np.array([bb_widths, bb_heights]).swapaxes(0, 1), square_side_length 66 | 67 | def _get_layout(self, labels=[], zoom=1): 68 | 69 | steps = [] 70 | for label in labels: 71 | step = dict(method = 'animate', 72 | args = [[label]], 73 | label = label 74 | ) 75 | steps.append(step) 76 | 77 | top_padding = 0 if (self.title is None) else self.height/8 78 | self.title_fontsize = self.height/20 if (self.title_fontsize=='auto') else self.title_fontsize 79 | 80 | layout={'height':self.height, 81 | 'width':self.width, 82 | 'titlefont':{'color':self.title_fontcolor, 83 | 'size':self.title_fontsize}, 84 | #'paper_bgcolor':self.bg_color, 85 | 'paper_bgcolor':'white', 86 | 'plot_bgcolor':self.bg_color, 87 | 'xaxis': {'range': [-self.width*zoom/2, self.width*zoom/2], 88 | 'autorange': False, 89 | 'visible':self.axis_visible, 90 | 'autotick':False, 91 | 'dtick':10}, 92 | 'yaxis': {'range': [-self.height*zoom/2, self.height*zoom/2], 93 | 'autorange': False, 94 | 'visible':self.axis_visible, 95 | 'autotick':False, 96 | 'dtick':10}, 97 | 'margin':go.Margin( 98 | l=0, 99 | r=0, 100 | b=0, 101 | t=top_padding, 102 | pad=0 103 | ), 104 | 'hovermode':'closest', 105 | 'title': self.title, 106 | 'sliders': [{'steps':steps}] 107 | } 108 | 109 | return layout 110 | 111 | def _get_trace(self, coordinates, 112 | textfonts="Courier New, monospace", marker_opacity=0, 113 | showlegend=False, legendgroup='default_legend', zoom=1): 114 | 115 | coordinates = np.array(coordinates) 116 | 117 | 118 | trace = go.Scatter( 119 | 120 | #displays hoverinfo when hovering over keyword 121 | #by default, shows all text and colors it the 
color of the keyword 122 | hoverinfo = 'skip' if (self.hovertext is None) else 'text', 123 | hovertext = self.hovertext, 124 | 125 | #Sets the legend group for this trace. 126 | #Traces part of the same legend group hide/show at the 127 | #same time when toggling legend items. 128 | showlegend = showlegend, 129 | legendgroup = legendgroup, 130 | name = legendgroup, 131 | 132 | #'ids' assigns id labels to each datum. These ids can be used 133 | #for object constancy of data points during animation. 134 | #However, the following line of code has the effect of 135 | #not displaying duplicate keywords, which are allowed 136 | #in a LabelledWordmesh object. 137 | #ids = self.words, 138 | 139 | x = coordinates[:,0], 140 | y = coordinates[:,1], 141 | 142 | 143 | mode = 'markers+text', 144 | marker = dict(symbol='square', 145 | opacity=marker_opacity, color = 'white', 146 | size=self.real_fontsizes), 147 | 148 | text = self.words, 149 | textposition = 'middle center', 150 | textfont = dict(family = textfonts, 151 | size = self.real_fontsizes*(1/zoom), 152 | color = self.textcolors) 153 | ) 154 | 155 | return trace 156 | 157 | def generate_figure(self, traces, labels, layout): 158 | frames = [{'data':[traces[i]], 'name':labels[i]} for i in range(len(traces))] 159 | figure={'data': [traces[0]], 160 | 'layout': layout, 161 | 'frames': frames 162 | } 163 | 164 | return figure 165 | 166 | def save_wordmesh_as_html(self, coordinates, filename='temp-plot.html', 167 | animate=False, autozoom=True, notebook_mode=False): 168 | 169 | zoom = 1 170 | labels = ['default label'] 171 | traces = [] 172 | if animate: 173 | for i in range(coordinates.shape[0]): 174 | 175 | traces.append(self._get_trace(coordinates[i])) 176 | labels = list(map(str,range(coordinates.shape[0]))) 177 | 178 | else: 179 | 180 | if autozoom: 181 | zoom = self._get_zoom(coordinates) 182 | traces = [self._get_trace(coordinates, zoom=zoom)] 183 | 184 | layout = self._get_layout(labels, zoom=zoom) 185 | 186 | fig = 
self.generate_figure(traces, labels, layout) 187 | 188 | if notebook_mode: 189 | py.init_notebook_mode(connected=True) 190 | py.iplot(fig, filename=filename, show_link=False) 191 | else: 192 | py.plot(fig, filename=filename, auto_open=False, show_link=False) 193 | 194 | 195 | def _cooccurence_score(text, word1, word2): 196 | #text, word1, word2 = text.lower(), word1.lower(), word2.lower() 197 | l1 = _find_all(text, word1) 198 | l2 = _find_all(text, word2) 199 | 200 | distance =0 201 | for i in l1: 202 | for j in l2: 203 | distance = distance + abs(i-j) 204 | 205 | return distance/(len(l1)*len(l2)+1) 206 | 207 | def _cooccurence_score2(text, word1, word2): 208 | l1 = _find_all(text, word1) 209 | l2 = _find_all(text, word2) 210 | avg = _smallest_cooc_distances(l1, l2) + \ 211 | _smallest_cooc_distances(l2, l1) 212 | return avg 213 | 214 | def _smallest_cooc_distances(list1, list2): 215 | #The method above is equivalent to the following: 216 | 217 | smallest_distance = 10000000 218 | sum_=0 219 | for i in list1: 220 | for j in list2: 221 | smallest_distance = min(smallest_distance, abs(i-j)) 222 | sum_ += smallest_distance 223 | 224 | 225 | return sum_/len(list2) 226 | 227 | def _find_all(text, substring, offset=0): 228 | loc = text.find(substring) 229 | if loc == -1: 230 | return [] 231 | else: 232 | sub_locs = _find_all(text[loc+1:], substring) 233 | return [offset+loc] + [offset+loc+i+1 for i in sub_locs] 234 | 235 | def _find_all_labelled(labelled_text, substring, substring_label): 236 | 237 | labelled_text['offset'] = labelled_text['text'].apply(len) 238 | labelled_text['offset'] = labelled_text['offset'].shift(1).fillna(0).cumsum() 239 | 240 | locations = labelled_text['text'].str.find(substring) 241 | return labelled_text[~(locations==-1) & (labelled_text['label']==substring_label)]['offset'] 242 | 243 | #The code above is equivalent to the following 244 | """ 245 | start = 0 246 | locations = [] 247 | for label, text in labelled_text: 248 | if 
label==substring_label: 249 | loc = [start+i for i in _find_all(text, substring)] 250 | locations += loc 251 | start += len(text) 252 | """ 253 | return locations 254 | 255 | def _cooccurence_score_labelled(labelled_text, word1, word2, label1, label2): 256 | l1 = _find_all_labelled(labelled_text, word1, label1) 257 | l2 = _find_all_labelled(labelled_text, word2, label2) 258 | 259 | avg = _smallest_cooc_distances(l1, l2)+_smallest_cooc_distances(l2, l1) 260 | return avg 261 | 262 | def cooccurence_similarity_matrix(text, wordlist, labelled=False, labels=None): 263 | """ 264 | Finds the cooccurence score of every pair of words. Currently it 265 | uses a heuristic, and is slow, so might change to a more robust 266 | method later on. 267 | """ 268 | if not labelled: 269 | score_func = lambda x,y: _cooccurence_score2(text, wordlist[int(x)], wordlist[int(y)]) 270 | vscore_func = np.vectorize(score_func) 271 | return np.fromfunction(vscore_func, shape=[len(wordlist)]*2) 272 | else: 273 | score_func = lambda x,y: _cooccurence_score_labelled(text, 274 | wordlist[int(x)], 275 | wordlist[int(y)], 276 | labels[int(x)], 277 | labels[int(y)]) 278 | vscore_func = np.vectorize(score_func) 279 | return np.fromfunction(vscore_func, shape=[len(wordlist)]*2) 280 | 281 | def regularize(arr, factor): 282 | arr = np.array(arr) 283 | assert arr.ndim == 1 284 | 285 | #applying regularization 286 | mx = arr.max() 287 | mn = arr.min() 288 | 289 | if (mx==mn): 290 | return arr 291 | 292 | a = mx*(factor-1)/((mx-mn)*factor) 293 | b = mx*(mx-mn*factor)/((mx-mn)*factor) 294 | 295 | return a*arr + b -------------------------------------------------------------------------------- /test/sample_speech.txt: -------------------------------------------------------------------------------- 1 | Thank you. It’s great to be here in Charlotte. I just met with our many amazing employees right up the road at our property. 
2 | 3 | I’d like to take a moment to talk about the heartbreak and devastation in Louisiana, a state that is very special to me. 4 | 5 | We are one nation. When one state hurts, we all hurt – and we must all work together to lift each other up. Working, building, restoring together. 6 | 7 | Our prayers are with the families who have lost loved ones, and we send them our deepest condolences. Though words cannot express the sadness one feels at times like this, I hope everyone in Louisiana knows that our country is praying for them and standing with them to help them in these difficult hours. 8 | 9 | We are one country, one people, and we will have together one great future. 10 | 11 | Tonight, I’d like to talk about the New American Future we are going to create together. 12 | 13 | Last week, I laid out my plan to bring jobs back to our country. 14 | 15 | On Monday, I laid out my plan to defeat Radical Islamic Terrorism. 16 | 17 | On Tuesday, in Wisconsin, I talked about how we are going to restore law and order to this country. 18 | 19 | Let me take this opportunity to extend our thanks and our gratitude to the police and law enforcement officers in this country who have sacrificed so greatly in these difficult times. 20 | 21 | The chaos and violence on our streets, and the assaults on law enforcement, are an attack against all peaceful citizens. If I am elected President, this chaos and violence will end – and it will end very quickly. 22 | 23 | Every single citizen in our land has a right to live in safety. 24 | 25 | To be one united nation, we must protect all of our people. But we must also provide opportunities for all of our people. 26 | 27 | We cannot make America Great Again if we leave any community behind. 28 | 29 | Nearly Four in ten African-American children are living in poverty.I will not rest until children of every color in this country are fully included in the American Dream. 30 | 31 | Jobs, safety, opportunity. Fair and equal representation. 
This is what I promise to African-Americans, Hispanic-Americans, and all Americans. 32 | 33 | But to achieve this New American Future we must break from the failures of the past. 34 | 35 | As you know, I am not a politician. I have worked in business, creating jobs and rebuilding neighborhoods my entire adult life. I’ve never wanted to use the language of the insiders, and I’ve never been politically correct – it takes far too much time, and can often make more difficult. 36 | 37 | Sometimes, in the heat of debate and speaking on a multitude of issues, you don’t choose the right words or you say the wrong thing. I have done that, and I regret it, particularly where it may have caused personal pain. Too much is at stake for us to be consumed with these issues. 38 | 39 | But one thing I can promise you is this: I will always tell you the truth. 40 | 41 | I speak the truth for all of you, and for everyone in this country who doesn’t have a voice. 42 | 43 | I speak the truth on behalf of the factory worker who lost his or her job. 44 | 45 | I speak the truth on behalf of the Veteran who has been denied the medical care they need – and so many are not making it. They are dying. 46 | 47 | I speak the truth on behalf of the family living near the border that deserves to be safe in their own country but is instead living with no security at all. 48 | 49 | Our campaign is about representing the great majority of Americans – Republicans, Democrats, Independents, Conservatives and Liberals – who read the newspaper, or turn on the TV, and don’t hear anyone speaking for them. All they hear are insiders fighting for insiders. 50 | 51 | These are the forgotten men and women in our society, and they are angry at so much on so many levels. The poverty, the unemployment, the failing schools, the jobs moving to other countries. 52 | 53 | I am fighting for these forgotten Americans. 
54 | 55 | Fourteen months ago, I declared my campaign for the Presidency on the promise to give our government back to the people. Every day since then, I’ve worked to repay the loyalty and the faith that you have put in me. 56 | 57 | Every day I think about how much is at stake for this country. This isn’t just the fight of my life, it’s the fight of our lives – together – to save our country. 58 | 59 | I refuse to let another generation of American children be excluded from the American Dream. Our whole country loses when young people of limitless potential are denied the opportunity to contribute their talents because we failed to provide them the opportunities they deserved. Let our children be dreamers too. 60 | 61 | Our whole country loses every time a kid doesn’t graduate from high school, or fails to enter the workforce or, worse still, is lost to the dreadful world of drugs and crime. 62 | 63 | When I look at the failing schools, the terrible trade deals, and the infrastructure crumbling in our inner cities, I know all of this can be fixed - and it can be fixed very quickly. 64 | 65 | In the world I come from, if something is broken, you fix it. 66 | 67 | If something isn’t working, you replace it. 68 | 69 | If a product doesn’t deliver, you make a change. 70 | 71 | I have no patience for injustice, no tolerance for government incompetence, no sympathy for leaders who fail their citizens. 72 | 73 | That’s why I am running: to end the decades of bitter failure and to offer the American people a new future of honesty, justice and opportunity. A future where America, and its people, always – and I mean always – come first. 74 | 75 | Aren’t you tired of a system that gets rich at your expense? 76 | 77 | Aren’t you tired of the same old lies and the same old broken promises? And Hillary Clinton has proven to be one of the greatest liars of all time. 78 | 79 | Aren’t you tired of arrogant leaders who look down on you, instead of serving and protecting you? 
80 | 81 | That is all about to change – and it’s about to change soon. We are going to put the American people first again. 82 | 83 | I’ve travelled all across this country laying out my bold and modern agenda for change. 84 | 85 | In this journey, I will never lie to you. I will never tell you something I do not believe. I will never put anyone’s interests ahead of yours. 86 | 87 | And, I will never, ever stop fighting for you. 88 | 89 | I have no special interest. I am spending millions of dollars on my own campaign – nobody else is. 90 | 91 | My only interest is the American people. 92 | 93 | So while sometimes I can be too honest, Hillary Clinton is the exact opposite: she never tells the truth. One lie after another, and getting worse each passing day. 94 | 95 | The American people are still waiting for Hillary Clinton to apologize for all of the many lies she’s told to them, and the many times she’s betrayed them. 96 | 97 | Tell me, has Hillary Clinton ever apologized for lying about her illegal email server and deleting 33,000 emails? 98 | 99 | Has Hillary Clinton apologized for turning the State Department into a pay-for-play operation where favors are sold to the highest bidder? 100 | 101 | Has she apologized for lying to the families who lost loved ones at Benghazi? 102 | 103 | Has she apologized for putting Iran on the path to nuclear weapons? 104 | 105 | Has she apologized for Iraq? For Libya? For Syria? Has she apologized for unleashing ISIS across the world? 106 | 107 | Has Hillary Clinton apologized for the decisions she made that have led to so much death, destruction and terrorism? 108 | 109 | Speaking of lies, we now know from the State Department announcement that President Obama lied about the $400 million dollars in cash that was flown to Iran. He denied it was for the hostages, but it was. He said we don’t pay ransom, but he did. He lied about the hostages – openly and blatantly – just like he lied about Obamacare. 
110 | 111 | Now the Administration has put every American travelling overseas, including our military personnel, at greater risk of being kidnapped. Hillary Clinton owns President Obama’s Iran policy, one more reason she can never be allowed to be President. 112 | 113 | Let’s talk about the economy. Here, in this beautiful state, so many people have suffered because of NAFTA. Bill Clinton signed the deal, and Hillary Clinton supported it. North Carolina has lost nearly half of its manufacturing jobs since NAFTA went into effect. 114 | 115 | Bill Clinton also put China into the World Trade Organization – another Hillary Clinton-backed deal. Your city of Charlotte has lost 1 in 4 manufacturing jobs since China joined the WTO, and many of these jobs were lost while Hillary Clinton was Secretary of State – our chief diplomat with China. She was a disaster, totally unfit for the job. 116 | 117 | Hillary Clinton owes the State of North Carolina a very big apology, and I think you’ll get that apology around the same time you’ll get to see her 33,000 deleted emails. 118 | 119 | Another major issue in this campaign has been the border. Our open border has allowed drugs and crime and gangs to pour into our communities. So much needless suffering, so much preventable death. I’ve spent time with the families of wonderful Americans whose loved ones were killed by the open borders and Sanctuary Cities that Hillary Clinton supports. 120 | 121 | I’ve embraced the crying parents who’ve lost their children to violence spilling across our border. Parents like Laura Wilkerson and Michelle Root and Sabine Durden and Jamiel Shaw whose children were killed by illegal immigrants. 122 | 123 | My opponent supports Sanctuary Cities. 124 | 125 | But where was the Sanctuary for Kate Steinle? Where was the Sanctuary for the children of Laura, Michelle, Sabine and Jamiel? 126 | 127 | Where was the Sanctuary for every other parent who has suffered so horribly? 
128 | 129 | These moms and dads don’t get a lot of consideration from our politicians. They certainly don’t get apologies. They’ll never even get the time of day from Hillary Clinton. 130 | 131 | But they will always come first to me. 132 | 133 | Listen closely: we will deliver justice for all of these American Families. We will create a system of immigration that makes us all proud. 134 | 135 | Hillary Clinton’s mistakes destroy innocent lives, sacrifice national security, and betray the working families of this country. 136 | 137 | Please remember this: I will never put personal profit before national security. I will never leave our border open to appease donors and special interests. I will never support a trade deal that kills American jobs. I will never put the special interests before the national interest. I will never put a donor before a voter, or a lobbyist before a citizen. 138 | 139 | Instead, I will be a champion for the people. 140 | 141 | The establishment media doesn’t cover what really matters in this country, or what’s really going on in people’s lives. They will take words of mine out of context and spend a week obsessing over every single syllable, and then pretend to discover some hidden meaning in what I said. 142 | 143 | Just imagine for a second if the media spent this energy holding the politicians accountable who got innocent Americans like Kate Steinle killed – she was gunned down by an illegal immigrant who had been deported five times. 144 | 145 | Just imagine if the media spent time and lots of time investigating the poverty and joblessness of the inner cities. Just think about how much different things would be if the media in this country sent their cameras to our border, to our closing factories, or to our failing schools. 146 | 147 | Or if the media focused on what dark streets must be hidden in the 33,000 emails that Hillary Clinton illegally deleted. 148 | 149 | Thank you. 
Instead every story is told from the perspective of the insider. It's the narrative of the people who rig the system, never the voice of the people it's been rigged against. Believe me. So many people suffering for so long in silence. No cameras. No coverage, no outrage from the media class that seems to get outrage over just about everything else. So, again, it's not about me. It's never been about me. It's been about all the people in this country who don't have a voice. I am running to be your voice. 150 | 151 | Thank you. I'm running to be the voice for every forgotten part of this country that has been waiting and hoping for a better future. 152 | 153 | I am glad that I make the powerful, and I mean very powerful a little uncomfortable now and again, including some of the powerful people, frankly, in my own party because it means that I'm fighting for real change, real change. There is a reason hedge fund managers, the financial lobbyists, the Wall Street investors are throwing their money all over Hillary Clinton because they know she will make sure the system stays rigged in their favor. 154 | 155 | It's the powerful protecting the powerful. The insiders fighting for the insiders. I am fighting for you. 156 | 157 | Here is the change I propose. On terrorism, we are going to end the era of nation-building and, instead, focus on destroying, destroying, destroying ISIS and radical Islamic terrorism. 158 | 159 | We will use military, cyber, and financial warfare and work with any partner in the world and the Middle East that shares our goal in defeating terrorism. I have a message for the terrorists trying to kill our citizens. We will find you, we will destroy you and we will absolutely win and we will win soon. 160 | 161 | On immigration, we will temporarily suspend immigration from any place where adequate screening cannot be performed, extreme vetting. Remember, extreme vetting. All applicants for immigration will be vetted for ties to radical ideology. 
And we will screen out anyone who doesn't share our values and love our people. 162 | 163 | Anyone who believes Sharia Law supplants American law will not be given an immigrant visa. 164 | 165 | If you want to join our society, then you must embrace our society. Our values, and our tolerant way of life. Those who believe in oppressing women, guys, Hispanics, African-Americans, and people of different faiths are not welcome to join our great country. 166 | 167 | We will promote our American values, our American way of life, and our American system of government, which are all, all the best in the world. My opponent on the other hand wants a 550 percent increase in Syrian refugees even more than already pouring into our country under President Obama. Her plan would bring in roughly 620,000 refugees from all refugee sending nations in her first term alone on top of all other immigration. Think of that. Think of that. What are we doing? 168 | 169 | Hillary Clinton is running to be America's Angela Merkel and we have seen how much crime and how many problems that's caused the German people and Germany. 170 | 171 | We have enough problems already, we do not need more. On crime we're going to add more police, more investigators, and appoint the best judges and prosecutors in the world. 172 | 173 | We will pursue strong enforcement of federal laws. The gangs and cartels. And criminal syndicates terrorizing our people will be stripped apart one by one and they will be sent out of our country quickly. Their day is over. And it's going to end very, very fast. Our trade -- thank you. On trade, we're going to renegotiate NAFTA to make it better and if they don't agree, we will withdraw. 174 | 175 | And likewise we are going to withdraw from Transpacific Partnership, another disaster. 176 | 177 | Stand up to China on our terrible trade agreements and protect every last American job. 
Hillary Clinton has supported all of the major trade deals that have stripped this country of its jobs and its wealth. We owe $20 trillion. On taxes, we are going to massively cut tax rates for workers and small businesses creating millions of new good paying jobs. 178 | 179 | We're going to get rid of regulations that send jobs overseas and we are going to make it easier for young Americans to get the credit they need to start a small business and pursue their dream. 180 | 181 | On education, so important, we are going to give students choice and allow charter schools to thrive. We are going to end tenure policies that reward bad teachers and hurt our great, good teachers. My opponent wants to deny student choice and opportunity, all to get a little bit more money from the education bureaucracy. She doesn't care how many young dreams are dashed or destroyed and they are destroyed. Young people are destroyed before they even start. We are going to work closely with African-American parents and children. We are going to work with the parents' students. We are going to work with everybody in the African-American community, in the inner cities, and what a big difference that is going to make. It's one of the things I most look forward to doing. 182 | 183 | This means a lot to me and it's going to be a top priority in a Trump administration. On healthcare, we are going to repeal and replace the disaster called ObamaCare. Countless Americans have been forced into part- time jobs, premiums are about to jump by double digits yet again and just this week, ETNA announced it is pulling out of the exchanges all over but also in North Carolina. We are going to replace this disaster with reforms that give you choice and freedom and control in healthcare at a much, much lower cost. You will have much better healthcare at a much lower cost and it will happen quickly. 184 | 185 | On political corruption, we are going to restore honor to our government. 
In my administration, I'm going to enforce all laws concerning the protection of classified information. No one will be above the law. I am going to forbid senior officials from trading favors for cash by preventing them from collecting lavish speaking fees through their spouses when they serve. 186 | 187 | I'm going to ask my senior officials to sign an agreement not to accept speaking fees from corporations with a registered lobbyist for five years after leaving office, or from any entity tied to a foreign government. 188 | 189 | Finally, we are going to bring our country together. It is so divided. We are going to bring it together. We are going to do it by emphasizing what we all have in common as Americans. We're going to reject bigotry and I will tell you the bigotry of Hillary Clinton is amazing. She sees communities of color only as votes and not as human beings. Worthy of a better future. It's only votes. It is only votes that she sees. And she does nothing about it. She has been there forever and look at where you are. If African-Americans voters give Donald Trump a chance by giving me their vote, the result for them will be amazing. 190 | 191 | Look how badly things are going under decades of Democratic leadership. Look at the schools. Look at the poverty. Look at the 58 percent of young African-Americans not working. Fifty eight percent. It is it is time for a change. What do you have to lose by trying something new? I will fix it watch, I will fix it. We have nothing to lose. Nothing to lose. It is so bad. The inner cities are so bad, you have nothing to lose. They have been playing with you for 60, 70, 80 years, many, many decades. You have nothing to lose. I will do a great job. 192 | 193 | This means so much to me. And I will work as hard as I can to bring new opportunity to places in our country which have not known it in a very, very long time. Hillary Clinton and the Democratic Party have taken African-American votes totally for granted. 
Because the votes have been automatically there for them, there has been no reason for Democrats to produce, and they haven't. They haven't produced in decades and decades. It's time to break with the failures of the past and to fight for every last American child in this country to have a better and a much, much brighter future. 194 | 195 | In my administration every American will be treated equally, protected equally and honored equally. We will reject bigotry and hatred and oppression in all of its forms and seek a new future built on our common culture and values as one American people. 196 | 197 | This is the change I am promising to all of you, an honest government, a great economy, and a just society for each and every American. 198 | 199 | But we can never ever fix our problems by relying on the same politicians who created these problems in the first place. Can't do it. Seventy two percent of voters say our country is on the wrong track. I am the change candidate. Hillary Clinton is for the failed status quo to protect her special interests, her donors, her lobbyists, and others. It is time to vote for a new American future. Together, we will make America strong again. We will make America proud again, we will make America safe again. Friends and fellow citizens, come November, we will make America great again. Greater than ever before. Thank you, thank you. And God bless you. Thank you. Thank you. Thank you very much. 
200 | -------------------------------------------------------------------------------- /wordmesh/static_wordmesh.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Sun May 27 02:20:39 2018 5 | 6 | @author: mukund 7 | """ 8 | from .text_processing import extract_terms_by_frequency, extract_terms_by_score 9 | from .text_processing import normalize_text, get_semantic_similarity_matrix 10 | from .utils import cooccurence_similarity_matrix as csm 11 | from .utils import regularize 12 | import numpy as np 13 | from sklearn.manifold import MDS, TSNE 14 | from .utils import PlotlyVisualizer 15 | from .force_directed_model import ForceDirectedModel 16 | import colorlover as cl 17 | import pandas as pd 18 | 19 | FONTSIZE_REG_FACTOR = 3 20 | CLUSTER_REG_FACTOR = 4 21 | NUM_ITERS = 100 22 | SIMILARITY_MEAN = 400 23 | NOTEBOOK_MODE = False 24 | 25 | class Wordmesh(): 26 | """ 27 | Wordmesh object for generating and drawing wordmeshes/wordclouds. 28 | 29 | Attributes 30 | ---------- 31 | text : str 32 | The text used to extract the keywords. 33 | 34 | keywords : list of str 35 | The keywords extracted from the text. 36 | 37 | scores : numpy array 38 | The scores assigned by the keyword extraction algorithm. 39 | 40 | pos_tags : list of str 41 | The pos_tags corresponding to the keywords. 42 | 43 | embeddings : numpy array 44 | An array of shape (num_keywords, 2), giving the locations of the 45 | keywords on the canvas. 46 | 47 | bounding_box_width_height : numpy array 48 | An array of shape (num_keywords, 2) giving the width and height of 49 | each keyword's bounding box. The coordinates of the centre of 50 | the box can be accessed through the 'embeddings' attribute. 51 | 52 | similarity_matrix : numpy array 53 | A matrix of shape (num_keywords, num_keywords) whose (i, j) entry is 54 | proportional to the 'dissimilarity' between the ith and jth keywords. 
55 | The matrix may have been regularized to prevent extreme values. 56 | 57 | fontsizes_norm : numpy array 58 | The normalized fontsizes; the actual fontsizes depend on the 59 | visualization. These may have been regularized to avoid extreme values. 60 | 61 | fontcolors : list of str 62 | The fontcolors as rgb strings. This format was chosen since it is 63 | supported by plotly. 64 | 65 | """ 66 | def __init__(self, text, dimensions=(500, 900), 67 | keyword_extractor='textrank', num_keywords=70, 68 | lemmatize=True, pos_filter=None, 69 | extract_ngrams=False, filter_numbers=True, 70 | filter_stopwords=True): 71 | 72 | """ 73 | Parameters 74 | ---------- 75 | text : string 76 | The string of text that needs to be summarized. 77 | 78 | dimensions : tuple, optional 79 | The desired dimensions (height, width) of the wordcloud in pixels. 80 | 81 | keyword_extractor : {'textrank', 'sgrank', 'bestcoverage', 'tf'}, optional 82 | The algorithm used for keyword extraction. 'tf' refers to simple 83 | term frequency based extraction. 84 | 85 | num_keywords : int, optional 86 | The number of keywords to be extracted from the text. In some cases, 87 | if the text is too short, fewer keywords might 88 | be extracted without a warning being raised. 89 | 90 | lemmatize : bool, optional 91 | Whether the text needs to be lemmatized before keywords are 92 | extracted from it. 93 | 94 | pos_filter : list of str, optional 95 | Filters out all keywords EXCEPT the ones with these pos tags. 96 | Supported pos tags: {'NOUN','PROPN','ADJ','VERB','ADV','SYM','PUNCT'}. 97 | A POS filter can be applied on the keywords ONLY when the 98 | keyword_extractor has been set to 'tf'. 99 | 100 | extract_ngrams : bool, optional 101 | Whether bi or tri-grams should be extracted. 
102 | 103 | filter_numbers : bool, optional 104 | Whether numbers should be filtered out. 105 | 106 | filter_stopwords: bool, optional 107 | Whether stopwords should be filtered out. 108 | Returns 109 | ------- 110 | Wordmesh 111 | A word mesh object 112 | 113 | """ 114 | if (keyword_extractor=='divrank'): 115 | raise NotImplementedError('divrank is currently unstable') 116 | #The pos_filter has only been implemented for 'tf' based extraction 117 | if (keyword_extractor!='tf') and (pos_filter is not None): 118 | 119 | msg = '\'pos_filter\' is only available for \'tf\' ' +\ 120 | 'based keyword extractor. This is an issue with textacy ' +\ 121 | 'and will be fixed in the future' 122 | 123 | raise NotImplementedError(msg) 124 | elif pos_filter is None: 125 | pos_filter = ['NOUN','ADJ','PROPN'] 126 | 127 | #textacy's functions are unstable when the following condition is met 128 | #They just churn out terrible ngrams 129 | if (keyword_extractor!='tf') and extract_ngrams: 130 | msg = 'Currently, extracting ngrams using graph based methods ' +\ 131 | 'is not advisable. This is due to underlying issues ' +\ 132 | 'with textacy which will be fixed in the future. '+\ 133 | 'For now, you can set \'extract_ngrams\' to False.' 
134 | raise NotImplementedError(msg) 135 | 136 | if len(text)<=10: 137 | raise ValueError("The text must have more than 10 characters.") 138 | 139 | self.text = text 140 | self._resolution = dimensions 141 | self._lemmatize = lemmatize 142 | self._keyword_extractor = keyword_extractor 143 | self._pos_filter = pos_filter 144 | self._extract_ngrams = extract_ngrams 145 | self._num_required_keywords = num_keywords 146 | self._filter_numbers = filter_numbers 147 | self._filter_stopwords = filter_stopwords 148 | 149 | #If textacy throws the following error while extracting keywords, 150 | #it means that the text is too short 151 | try: 152 | self._extract_keywords() 153 | except ValueError as e: 154 | if str(e).find('must contain at least 1 term')==-1: 155 | raise 156 | else: 157 | self.keywords = [] 158 | 159 | #Text too short to extract keywords 160 | if len(self.keywords)<2 and self._keyword_extractor!='tf': 161 | msg = 'Text is too short to extract any keywords using ' + \ 162 | '\'{}\'. 
Try switching to \'tf\' based extraction.'.format(self._keyword_extractor) 163 | raise ValueError(msg) 164 | elif len(self.keywords)<2: 165 | raise ValueError('Text is too short to extract any keywords.') 166 | 167 | #Cannot apply delaunay triangulation on less than 4 points 168 | if len(self.keywords)<4: 169 | self._apply_delaunay = False 170 | else: 171 | self._apply_delaunay = True 172 | 173 | self.set_visualization_params() 174 | self.set_fontsize() 175 | self.set_fontcolor() 176 | self.set_clustering_criteria() 177 | 178 | def _extract_keywords(self): 179 | 180 | if self._keyword_extractor == 'tf': 181 | 182 | self.keywords, self.scores, self.pos_tags, n_kw = \ 183 | extract_terms_by_frequency(self.text, self._num_required_keywords, 184 | self._pos_filter, self._filter_numbers, 185 | self._extract_ngrams, 186 | lemmatize=self._lemmatize, 187 | filter_stopwords = self._filter_stopwords) 188 | else: 189 | self.keywords, self.scores, self.pos_tags, n_kw = \ 190 | extract_terms_by_score(self.text, self._keyword_extractor, 191 | self._num_required_keywords, self._extract_ngrams, 192 | lemmatize=self._lemmatize, 193 | filter_stopwords = self._filter_stopwords) 194 | #self._normalized_keywords are all lemmatized if self._lemmatize is True, 195 | #unlike self.keywords which contain capitalized named entities 196 | self._normalized_keywords = n_kw 197 | 198 | 199 | def set_fontsize(self, by='scores', custom_sizes=None, 200 | apply_regularization=True, 201 | regularization_factor=FONTSIZE_REG_FACTOR): 202 | """ 203 | This function can be used to pick a metric which decides the font size 204 | for each extracted keyword. The font size is directly 205 | proportional to the 'scores' assigned by the keyword extractor. 
206 | 207 | Font sizes can be assigned by: 'scores', 'constant', None 208 | 209 | You can also choose custom font sizes by passing in a list 210 | of font sizes (one per keyword) using the argument custom_sizes 211 | 212 | Parameters 213 | ---------- 214 | 215 | by : string or None, optional 216 | The metric used to assign font sizes. Can be None if custom sizes 217 | are being used 218 | custom_sizes : list of float or numpy array or None, optional 219 | A list of font sizes. There should be a one-to-one correspondence 220 | between the numbers in the list and the extracted keywords (that 221 | can be accessed through the keywords attribute). Note that this list 222 | is only used to calculate relative sizes, the actual sizes depend 223 | on the visualization tool used. 224 | apply_regularization : bool, optional 225 | Determines whether font sizes will be regularized to prevent extreme 226 | values which might lead to a poor visualization 227 | regularization_factor : int, optional 228 | Determines the ratio max(fontsizes)/min(fontsizes). Fontsizes are 229 | scaled linearly so as to achieve this ratio. This helps prevent 230 | extreme values. 
231 | 232 | Returns 233 | ------- 234 | 235 | None 236 | """ 237 | 238 | if custom_sizes is not None: 239 | assert len(custom_sizes)==len(self.keywords) 240 | self.fontsizes_norm = np.array(custom_sizes) 241 | elif by=='scores': 242 | self.fontsizes_norm = self.scores/self.scores.sum() 243 | elif by=='constant': 244 | self.fontsizes_norm = np.full(len(self.keywords), 1) 245 | else: 246 | raise ValueError() 247 | 248 | #applying regularization 249 | if apply_regularization: 250 | self.fontsizes_norm = regularize(self.fontsizes_norm, 251 | regularization_factor) 252 | 253 | #normalize 254 | self.fontsizes_norm = self.fontsizes_norm/self.fontsizes_norm.sum() 255 | 256 | #raise flag indicating that the fontsizes have been modified 257 | self._flag_fontsizes = True 258 | 259 | 260 | def set_fontcolor(self, by='scores', colorscale='YlOrRd', 261 | custom_colors=None): 262 | """ 263 | This function can be used to pick a metric which decides the font color 264 | for each extracted keyword. By default, the font color is assigned 265 | based on the score of each keyword. 266 | 267 | Font colors can be assigned by: 'random', 'scores', 'pos_tag', 'clustering_criteria' 268 | 269 | You can also choose custom font colors by passing in a list of 270 | (R,G,B) tuples with values for each component falling in [0,255]. 271 | 272 | Parameters 273 | ---------- 274 | 275 | by : str or None, optional 276 | The metric used to assign font colors. Can be None if custom colors 277 | are being used 278 | colorscale: str or None, optional 279 | One of [Greys, YlGnBu, Greens, YlOrRd, Bluered, RdBu, Reds, Blues]. 280 | When by=='scores', this will be used to determine the colorscale. 281 | custom_colors : list of 3-tuple, optional 282 | A list of RGB tuples. Each tuple corresponds to the color of 283 | a keyword. 
284 | 285 | Returns 286 | ------- 287 | None 288 | """ 289 | if custom_colors is not None: 290 | assert len(custom_colors) == len(self.keywords) 291 | if isinstance(custom_colors[0], str): 292 | self.fontcolors = custom_colors 293 | else: 294 | self.fontcolors = [] 295 | for rgb in custom_colors: 296 | assert len(rgb)==3 297 | self.fontcolors.append('rgb'+str(rgb)) 298 | 299 | elif by=='random': 300 | tone = np.random.choice(list(cl.flipper()['seq']['3'].keys())) 301 | self.fontcolors = np.random.choice(list(cl.flipper()['seq']\ 302 | ['3'][tone]), 303 | len(self.keywords)) 304 | 305 | elif by=='scores': 306 | 307 | scales = {**cl.scales['8']['div'], **cl.scales['8']['seq']} 308 | #Even though, currently all colorscales in 'scales.keys()' can be 309 | #used, only the ones listed in the doc can be used for creating a 310 | #colorbar in the plotly plot 311 | 312 | assert colorscale in ['Greys','YlGnBu', 'Greens', 'YlOrRd', 313 | 'Bluered', 'RdBu', 'Reds', 'Blues'] 314 | colors = scales[colorscale].copy() 315 | colors.reverse() 316 | 317 | #The keywords are binned based on their scores 318 | mn, mx = self.scores.min(), self.scores.max() 319 | bins = np.linspace(mn,mx,8) 320 | indices = np.digitize(self.scores, bins)-1 321 | 322 | self.fontcolors = [colors[i] for i in indices] 323 | 324 | elif by=='pos_tag': 325 | c = cl.scales['5']['qual']['Set2'] + ['rgb(254,254,254)', 'rgb(254,254,254)'] 326 | tags = ['NOUN','PROPN','ADJ','VERB','ADV','SYM','ADP'] 327 | mapping = {tag:c[i] for i,tag in enumerate(tags)} 328 | self.fontcolors = list(map(mapping.get, self.pos_tags)) 329 | 330 | elif by=='clustering_criteria': 331 | mds = MDS(3, dissimilarity='precomputed').\ 332 | fit_transform(self.similarity_matrix) 333 | mds = mds-mds.min() 334 | mds = mds*205/mds.max() + 50 335 | self.fontcolors = ['rgb'+str(tuple(rgb)) for rgb in mds] 336 | 337 | else: 338 | raise ValueError() 339 | 340 | #raise flag to indicate that the fontcolors have been modified 341 | self._flag_fontcolors = 
True 342 | 343 | 344 | def set_clustering_criteria(self, by='scores', 345 | custom_similarity_matrix=None, 346 | apply_regularization=True, 347 | clustering_algorithm = 'MDS', delaunay_factor=None): 348 | """ 349 | This function can be used to define the criteria for clustering of 350 | different keywords in the wordcloud. By default, clustering is done 351 | based on the keywords' scores, with keywords having high scores in the 352 | centre. 353 | 354 | The following pre-defined criteria can be used: 'cooccurence', 355 | 'meaning', 'scores', 'random' 356 | 357 | You can also define a custom_similarity_matrix. 358 | 359 | Parameters 360 | ---------- 361 | 362 | by : string or None, optional 363 | The pre-defined criteria used to cluster keywords 364 | 365 | custom_similarity_matrix : numpy array or None, optional 366 | A 2-dimensional array with shape (num_keywords, num_keywords) 367 | The entry a[i,j] is proportional to the 'dissimilarity' between 368 | keyword[i] and keyword[j]. Words that are similar will be grouped 369 | together on the canvas. 370 | 371 | apply_regularization : bool, optional 372 | Whether to regularize the similarity matrix to prevent extreme 373 | values. 374 | 375 | clustering_algorithm : {'MDS', 'TSNE'}, optional 376 | The algorithm used to find the initial embeddings based on the 377 | similarity matrix. 
378 | Returns 379 | ------- 380 | None 381 | """ 382 | if custom_similarity_matrix is not None: 383 | sm = custom_similarity_matrix 384 | elif by=='cooccurence': 385 | self._normalized_text = normalize_text(self.text, 386 | lemmatize=self._lemmatize) 387 | sm = csm(self._normalized_text, 388 | self._normalized_keywords) 389 | elif by=='random': 390 | num_kw = len(self.keywords) 391 | sm = np.random.normal(400, 90, (num_kw,num_kw)) 392 | sm = 0.5*(sm + sm.T) 393 | self._apply_delaunay = False 394 | 395 | elif by=='scores': 396 | mat = np.outer(self.scores, self.scores.T) 397 | #sm = 1/np.absolute(mat-(mat**(1/16)).mean()) 398 | sm = np.absolute(mat.max()-mat) + 1 399 | 400 | elif by=='meaning': 401 | sm = get_semantic_similarity_matrix(self.keywords) 402 | else: 403 | raise ValueError() 404 | 405 | #apply regularization 406 | if apply_regularization: 407 | shape = sm.shape 408 | temp = regularize(sm.flatten(), CLUSTER_REG_FACTOR) 409 | sm = temp.reshape(shape) 410 | 411 | #standardise 412 | sm = sm*SIMILARITY_MEAN/np.mean(sm) 413 | 414 | self.similarity_matrix= sm 415 | if clustering_algorithm not in ['MDS','TSNE']: 416 | raise ValueError('Only the following clustering algorithms \ 417 | are supported: {}'.format(['MDS','TSNE'])) 418 | self._clustering_algorithm = clustering_algorithm 419 | self._delaunay_factor = delaunay_factor 420 | 421 | #raise a flag indicating that the clustering criteria has been modified 422 | self._flag_clustering_criteria = True 423 | 424 | def set_visualization_params(self, bg_color='black'): 425 | """ 426 | Set other visualization parameters 427 | 428 | Parameters 429 | ---------- 430 | 431 | bg_color: 3-tuple of int, optional 432 | Sets the background color, takes in a tuple of (R,G,B) \ 433 | color components 434 | """ 435 | if isinstance(bg_color, str): 436 | self._bg_color = bg_color 437 | else: 438 | assert(len(bg_color)==3) 439 | self._bg_color = 'rgb'+str(bg_color) 440 | 441 | 442 | self._flag_vis = True 443 | 444 | def 
recreate_wordmesh(self): 445 | """ 446 | Can be used to change the word placement in case the current 447 | one isn't suitable. Since the steps involved in the creation of the 448 | wordmesh are random, the result will come out looking different every 449 | time. 450 | """ 451 | 452 | #raise the clustering flag, so as to run the MDS algorithm again 453 | self._flag_clustering_criteria = True 454 | self._generate_embeddings() 455 | 456 | def _generate_embeddings(self): 457 | 458 | if self._flag_clustering_criteria: 459 | 460 | mds = MDS(2, dissimilarity='precomputed').\ 461 | fit_transform(self.similarity_matrix) 462 | self._initial_embeds = mds 463 | 464 | if self._clustering_algorithm == 'TSNE': 465 | self._initial_embeds = TSNE(metric='precomputed', 466 | perplexity=3, init=mds).\ 467 | fit_transform(self.similarity_matrix) 468 | 469 | if self._flag_fontsizes or self._flag_fontcolors or self._flag_vis: 470 | self._visualizer = PlotlyVisualizer(words = self.keywords, 471 | fontsizes_norm =self.fontsizes_norm, 472 | height = self._resolution[0], 473 | width = self._resolution[1], 474 | textcolors=self.fontcolors, 475 | bg_color = self._bg_color) 476 | self.bounding_box_width_height = self._visualizer.bounding_box_dimensions 477 | 478 | if self._flag_fontsizes or self._flag_clustering_criteria: 479 | bbd = self.bounding_box_width_height 480 | fdm = ForceDirectedModel(self._initial_embeds, bbd, num_iters=NUM_ITERS, 481 | apply_delaunay=self._apply_delaunay, 482 | delaunay_multiplier=self._delaunay_factor) 483 | self._force_directed_model = fdm 484 | self.embeddings = fdm.equilibrium_position() 485 | 486 | #turn off all flags, including the visualization flag 487 | self._flag_clustering_criteria = False 488 | self._flag_fontsizes = False 489 | self._flag_fontcolors = False 490 | self._flag_vis = False 491 | def _get_all_fditerations(self, num_slides=10): 492 | all_pos = self._force_directed_model.all_centered_positions 493 | num_iters = self._force_directed_model.num_iters 494 | 495 | step_size = num_iters//num_slides 
496 | slides = [] 497 | 498 | for i in range(num_iters%step_size, num_iters, step_size): 499 | slides.append(all_pos[i]) 500 | 501 | return np.stack(slides) 502 | 503 | def save_as_html(self, filename='wordmesh.html', 504 | force_directed_animation=False, notebook_mode=NOTEBOOK_MODE): 505 | """ 506 | Save the plot as an html file. 507 | 508 | Parameters 509 | ---------- 510 | 511 | filename: str, (default='wordmesh.html') 512 | The path of the html file 513 | 514 | force_directed_animation: bool, optional 515 | Setting this to True lets you visualize the force directed algorithm 516 | 517 | notebook_mode: bool, optional 518 | Set this to True to view the plot in a jupyter notebook. 519 | The file will NOT be saved when notebook_mode is True. 520 | """ 521 | #generate embeddings if any of the wordmesh parameters have been modified 522 | if self._flag_clustering_criteria or self._flag_fontsizes or self._flag_fontcolors or self._flag_vis: 523 | self._generate_embeddings() 524 | 525 | if force_directed_animation: 526 | all_positions = self._get_all_fditerations() 527 | self._visualizer.save_wordmesh_as_html(all_positions, filename, 528 | animate=True, 529 | notebook_mode=notebook_mode) 530 | else: 531 | self._visualizer.save_wordmesh_as_html(self.embeddings, filename, 532 | notebook_mode=notebook_mode) 533 | 534 | def plot(self, force_directed_animation=False): 535 | """ 536 | Can be used to plot the wordmesh inside a jupyter notebook 537 | 538 | Parameters 539 | ---------- 540 | 541 | force_directed_animation : bool, optional 542 | Setting this to True lets you visualize the force directed algorithm 543 | """ 544 | self.save_as_html(force_directed_animation=force_directed_animation, 545 | notebook_mode=True) 546 | 547 | 548 | class LabelledWordmesh(Wordmesh): 549 | """ 550 | Create a wordmesh from labelled text. This can be used when the text 551 | is composed of several sections, each having a label associated with it. 
552 | It can also be used to compare two different sources of text. 553 | 554 | Attributes 555 | ---------- 556 | text : pandas DataFrame 557 | The 'text' and its corresponding 'label' 558 | 559 | keywords : list of str 560 | The keywords extracted from the text. 561 | 562 | scores : numpy array 563 | The scores assigned by the keyword extraction algorithm. 564 | 565 | pos_tags : list of str 566 | The pos_tags corresponding to the keywords. 567 | 568 | labels : list of int 569 | The labels corresponding to each keyword. 570 | 571 | embeddings : numpy array 572 | An array of shape (num_keywords, 2), giving the locations of the 573 | keywords on the canvas. 574 | 575 | bounding_box_width_height : numpy array 576 | An array of shape (num_keywords, 2) gives the width and height of 577 | each keyword's bounding box. The coordinates of the centre of 578 | the box can be accessed through the 'embeddings' attribute. 579 | 580 | similarity_matrix : numpy array 581 | The similarity matrix with shape (num_keywords, num_keywords), is 582 | proportional to the 'dissimilarity' between the ith and jth keywords. 583 | The matrix may have been regularized to prevent extreme values. 584 | 585 | fontsizes_norm : numpy array 586 | The normalized fontsizes, the actual fontsizes depend on the 587 | visualization. These may have been regularized to avoid extreme values. 588 | 589 | fontcolors : list of str 590 | The fontcolors as rgb strings. This format was chosen since it is 591 | supported by plotly. 
592 | 593 | """ 594 | 595 | def __init__(self, labelled_text, dimensions=(500, 900), 596 | keyword_extractor='textrank', num_keywords=35, 597 | lemmatize=True, pos_filter=None, 598 | extract_ngrams=False, filter_numbers=True, 599 | filter_stopwords=True): 600 | """ 601 | Parameters 602 | ---------- 603 | 604 | labelled_text: list of (int, str) 605 | Here the 'int' is the label associated with the text 606 | 607 | dimensions : tuple, optional 608 | The desired dimensions (height, width) of the wordcloud in pixels. 609 | 610 | keyword_extractor : {'textrank', 'sgrank', 'bestcoverage', 'tf'}, optional 611 | The algorithm used for keyword extraction. 'tf' refers to simple 612 | term frequency based extraction. 613 | 614 | num_keywords : int, optional 615 | The number of keywords to be extracted from the text. In some cases, 616 | if the text length is too short, fewer keywords might 617 | be extracted without a warning being raised. 618 | 619 | lemmatize : bool, optional 620 | Whether the text needs to be lemmatized before keywords are 621 | extracted from it 622 | 623 | pos_filter : list of str, optional 624 | Filters out all keywords EXCEPT the ones with these pos tags. 625 | Supported pos tags-{'NOUN','PROPN','ADJ','VERB','ADV','SYM','PUNCT'}. 626 | A POS filter can be applied on the keywords ONLY when the 627 | keyword_extractor has been set to 'tf'. 628 | 629 | extract_ngrams : bool, optional 630 | Whether bi or tri-grams should be extracted. 
631 | 632 | filter_numbers : bool, optional 633 | Whether numbers should be filtered out 634 | 635 | filter_stopwords: bool, optional 636 | Whether stopwords should be filtered out 637 | Returns 638 | ------- 639 | LabelledWordmesh 640 | A LabelledWordmesh object, which inherits from Wordmesh 641 | """ 642 | if not (isinstance(labelled_text, list) or isinstance(labelled_text, pd.DataFrame)): 643 | raise ValueError('labelled_text can only be a list or a pandas \ 644 | dataframe, not a {}'.format(type(labelled_text))) 645 | if isinstance(labelled_text, list): 646 | assert len(labelled_text)!=0 647 | assert len(labelled_text[0])==2 648 | labelled_text = pd.DataFrame(labelled_text, 649 | columns=['label','text']) 650 | 651 | assert labelled_text.shape[1]==2 652 | assert (labelled_text.columns==['label','text']).all() 653 | 654 | #NOTE: code is not optimised, holds unnecessary copies of the text 655 | #will need to use suitable data structures in the future 656 | 657 | labelled_text['text'] = labelled_text['text'] + ' \n ' 658 | #dictionary of label:text 659 | self.text_dict = labelled_text.groupby('label')['text'].sum().to_dict() 660 | 661 | if len(self.text_dict)>8: 662 | raise ValueError('Only up to 8 unique labels are allowed right now') 663 | 664 | super().__init__(labelled_text, dimensions=dimensions, 665 | keyword_extractor=keyword_extractor, 666 | num_keywords=num_keywords, lemmatize=lemmatize, 667 | pos_filter=pos_filter, extract_ngrams=extract_ngrams, 668 | filter_numbers=filter_numbers, 669 | filter_stopwords=filter_stopwords) 670 | 671 | def _extract_keywords(self): 672 | 673 | self.keywords, self.pos_tags, self.labels = [],[],[] 674 | self._normalized_keywords = [] 675 | self.scores = np.array([]) 676 | 677 | for key in self.text_dict: 678 | if self._keyword_extractor == 'tf': 679 | 680 | kw, sc, pos, n_kw = \ 681 | extract_terms_by_frequency(self.text_dict[key], 682 | self._num_required_keywords, 683 | self._pos_filter, 684 | self._filter_numbers, 685 | 
self._extract_ngrams, 686 | lemmatize=self._lemmatize, 687 | filter_stopwords=self._filter_stopwords) 688 | else: 689 | kw, sc, pos, n_kw = \ 690 | extract_terms_by_score(self.text_dict[key], 691 | self._keyword_extractor, 692 | self._num_required_keywords, 693 | self._extract_ngrams, 694 | lemmatize=self._lemmatize, 695 | filter_stopwords = self._filter_stopwords) 696 | 697 | self.keywords = self.keywords + kw 698 | self.scores = np.concatenate((self.scores, sc)) 699 | self.pos_tags = self.pos_tags + pos 700 | self.labels = self.labels + [key]*len(kw) 701 | 702 | #self._normalized_keywords are all lemmatized if self._lemmatize is True, 703 | #unlike self.keywords which contain capitalized named entities 704 | self._normalized_keywords = self._normalized_keywords + n_kw 705 | 706 | 707 | def set_fontcolor(self, by='label', colorscale='Set3', 708 | custom_colors=None): 709 | """ 710 | This function can be used to pick a metric which decides the font color 711 | for each extracted keyword. By default, the font color is assigned 712 | based on the label of each keyword. 713 | 714 | Font colors can be assigned by: 'label', 'random', 'scores', 'pos_tag', 'clustering_criteria' 715 | 716 | You can also choose custom font colors by passing in a list of 717 | (R,G,B) tuples with values for each component falling in [0,255]. 718 | 719 | Parameters 720 | ---------- 721 | 722 | by : str or None, optional 723 | The metric used to assign font colors. Can be None if custom colors 724 | are being used 725 | colorscale: str or None, optional 726 | When by=='scores', one of [Greys, YlGnBu, Greens, YlOrRd, Bluered, RdBu, Reds, Blues]. 727 | When by=='label', one of [Pastel2, Paired, Pastel1, Set1, Set2, Set3, Dark2, Accent]. 728 | custom_colors : list of 3-tuple, optional 729 | A list of RGB tuples. Each tuple corresponds to the color of 730 | a keyword. 
731 | 732 | Returns 733 | ------- 734 | None 735 | """ 736 | 737 | if by=='label' and (custom_colors is None): 738 | scales = cl.scales['8']['qual'] 739 | #All colorscales in 'scales.keys()' can be used 740 | 741 | assert colorscale in ['Pastel2','Paired','Pastel1', 742 | 'Set1','Set2','Set3','Dark2','Accent'] 743 | colors = scales[colorscale].copy() 744 | colors.reverse() 745 | 746 | color_mapping={key:colors[i] for i,key in enumerate(self.text_dict)} 747 | fontcolors = list(map(color_mapping.get, self.labels)) 748 | 749 | Wordmesh.set_fontcolor(self, custom_colors=fontcolors) 750 | 751 | else: 752 | #change default colorscale to a quantitative one 753 | colorscale = 'YlGnBu' if (colorscale=='Set3') else colorscale 754 | Wordmesh.set_fontcolor(self, by=by, colorscale=colorscale, 755 | custom_colors=custom_colors) 756 | 757 | def set_clustering_criteria(self, by='scores', 758 | custom_similarity_matrix=None, 759 | apply_regularization=True, 760 | clustering_algorithm='MDS'): 761 | """ 762 | This function can be used to define the criteria for clustering of 763 | different keywords in the wordcloud. By default, clustering is done 764 | based on the keywords' scores, with keywords having high scores in the 765 | centre. 766 | 767 | The following pre-defined criteria can be used: 'cooccurence', 768 | 'meaning', 'scores', 'random' 769 | 770 | You can also define a custom_similarity_matrix. 771 | 772 | Parameters 773 | ---------- 774 | by : string or None, optional 775 | The pre-defined criteria used to cluster keywords 776 | 777 | custom_similarity_matrix : numpy array or None, optional 778 | A 2-dimensional array with shape (num_keywords, num_keywords) 779 | The entry a[i,j] is proportional to the 'dissimilarity' between 780 | keyword[i] and keyword[j]. Words that are similar will be grouped 781 | together on the canvas. 782 | 783 | apply_regularization : bool, optional 784 | Whether to regularize the similarity matrix to prevent extreme 785 | values. 
786 | 787 | clustering_algorithm : {'MDS', 'TSNE'}, optional 788 | The algorithm used to find the initial embeddings based on the 789 | similarity matrix. 790 | Returns 791 | ------- 792 | None 793 | """ 794 | 795 | if by=='cooccurence': 796 | 797 | normalized = self.text 798 | normalized['text'] = normalized['text']\ 799 | .apply(lambda x: normalize_text(x, lemmatize=self._lemmatize)) 800 | 801 | #self._normalized_text = list(normalized.itertuples(False, None)) 802 | 803 | sm = csm(self._normalized_text, 804 | self._normalized_keywords, 805 | labelled=True, labels=self.labels) 806 | Wordmesh.set_clustering_criteria(self, custom_similarity_matrix=sm, 807 | apply_regularization=apply_regularization, 808 | clustering_algorithm=clustering_algorithm) 809 | else: 810 | Wordmesh.set_clustering_criteria(self, by=by, 811 | custom_similarity_matrix=custom_similarity_matrix, 812 | apply_regularization=apply_regularization, 813 | clustering_algorithm=clustering_algorithm) 814 | -------------------------------------------------------------------------------- /examples/trump_hillary_debate.txt: -------------------------------------------------------------------------------- 1 | Raddatz: Nominee for President, Donald J. Trump and the Democratic nominee for President, Hillary Clinton. 2 | 3 | [applause] 4 | 5 | Cooper: Thank you very much for being here. We’re gonna begin with a question from one of the members in our town hall. Each of you will have two minutes to respond to this question. Secretary Clinton, you won the coin toss, so you’ll go first. Our first question comes from Patrice Brock. Patrice? 6 | 7 | Patrice Brock: Thank you and good evening. The last Presidential debate could have been rated as MA, mature audiences, per TV parental guidelines. Knowing that educators assign viewing the presidential debates as students’ homework, do you feel you’re modeling appropriate and positive behavior for today’s youth? 8 | 9 | Clinton: Well, thank you. Are you a teacher? 
Yes, I think that that’s a very good question, because I’ve heard from lots of teachers and parents about some of their concerns about some of the things that are being said and done in this campaign. And I think it is very important for us to make clear to our children that our country really is great because we’re good. And, we are going to respect one another, lift each other up, we are going to be looking for ways to celebrate our diversity and we are going to try to reach out to every boy and girl, as well as every adult, to bring them in to working on behalf of our country. I have a very positive and optimistic view about what we can do together. That’s why the slogan of my campaign is “Stronger Together,” because I think if we work together, if we overcome the divisiveness that sometimes sets Americans against one another and instead we make some big goals, and I’ve set forth some big goals, getting the economy to work for everyone, not just those at the top, making sure that we have the best education system from preschool through college, and making it affordable and so much else. 10 | 11 | Clinton: If we set those goals and we go together to try to achieve them, there’s nothing, in my opinion, that America can’t do. So that’s why I hope that we will come together in this campaign. Obviously, I’m hoping to earn your vote, I’m hoping to be elected in November, and I can promise you I will work with every American. I wanna be the President for all Americans, regardless of your political beliefs, where you come from, what you look like, your religion. I want us to heal our country and bring it together because that’s, I think, the best way for us to get the future that our children and our grandchildren deserve. 12 | 13 | Cooper: Secretary Clinton, thank you. Mr. Trump, you have two minutes. 14 | 15 | Trump: Well, I actually agree with that. I agree with everything she said. 
I began this campaign because I was so tired of seeing such foolish things happen to our country. This is a great country. This is a great land. I have gotten to know the people of the country over the last year-and-a-half that I’ve been doing this as a politician. I cannot believe I’m saying that about myself, but I guess I have been a politician, and my whole concept was to make America great again. When I watch the deals being made, when I watch what’s happening with some horrible things like ObamaCare, where your health insurance and healthcare is going up by numbers that are astronomical, 68%, 59%, 71%. When I look at the Iran deal and how bad a deal it is for us, it’s a one-sided transaction, where we’re giving back $150 billion to a terrorist state, really the number one terrorist state. We’ve made them a strong country from really a very weak country just three years ago. 16 | 17 | Trump: When I look at all of the things that I see and all of the potential that our country has, we have such tremendous potential, whether it’s in business and trade where we’re doing so badly. Last year, we had an almost $800 billion trade deficit. In other words, trading with other countries, we had an $800 billion deficit. It’s hard to believe, inconceivable. You say, who’s making these deals? We’re gonna make great trade deals, we’re gonna have a strong border, we’re gonna bring back law and order. Just today, policemen were shot, two killed, and this is happening on a weekly basis. We have to bring back respect to law enforcement. At the same time, we have to take care of people on all sides. We need justice. But I want to do things that haven’t been done, including fixing and making our inner cities better for the African-American citizens that are so great and for the Latinos, Hispanics, and I look forward to doing it. It’s called Make America Great Again. 18 | 19 | Cooper: Thank you, Mr. Trump. 
The question from Patrice was about, are you both modeling positive and appropriate behaviors for today’s youth. We received a lot of questions online, Mr. Trump, about the tape that was released on Friday, as you can imagine. You called what you said “locker room banter”. You described kissing women without consent, grabbing their genitals. That is sexual assault. You bragged that you have sexually assaulted women. Do you understand that? 20 | 21 | Trump: No, I didn’t say that at all. I don’t think you understood what was said. This was locker room talk. I’m not proud of it. I apologize to my family, I apologize to the American people. Certainly I’m not proud of it, but this is locker room talk. You know, when we have a world where you have ISIS chopping off heads, where you have, and frankly, drowning people in steel cages, where you have wars and horrible, horrible sights all over, where you have so many bad things happening. This is like Medieval times. We haven’t seen anything like this, the carnage all over the world and they look and they see. Can you imagine the people that are, frankly, doing so well against us with ISIS? And they look at our country and they see what’s going on. Yes, I’m very embarrassed by it. I hate it. But it’s locker room talk and it’s one of those things. I will knock the hell out of ISIS. We’re gonna defeat ISIS. ISIS happened a number of years ago in a vacuum that was left because of bad judgment. And I will tell you, I will take care of ISIS. 22 | 23 | Cooper: So, Mr. Trump… 24 | 25 | Trump: We should get on to much more important things and much bigger things. 26 | 27 | Cooper: Just for the record, though, are you saying that what you said on that bus 11 years ago, that you did not actually kiss women without consent or grope women without consent? 28 | 29 | Trump: I have great respect for women. Nobody has more respect for women than I do. 30 | 31 | Cooper: So, for the record, you’re saying you never did that? 
32 | 33 | Trump: I said things that frankly… You hear these things I said, and I was embarrassed by it. But I have tremendous respect for women. 34 | 35 | Cooper: Have you ever done those things? 36 | 37 | Trump: And women have respect for me. And I will tell you, no, I have not. And I will tell you that I’m gonna make our country safe, we’re gonna have borders on our country which we don’t have now. People are pouring into our country. And they’re coming in from the Middle East and other places. We’re gonna make America safe again. We’re gonna make America great again, but we’re gonna make America safe again. And we’re gonna make America wealthy again, because if you don’t do that, it sounds harsh to say, but we have to build up the wealth of our nation. 38 | 39 | Cooper: Thank you, Mr. Trump. 40 | 41 | Trump: Right now, other nations are taking our jobs and they’re taking our wealth. 42 | 43 | Cooper: Thank you, Mr. Trump. 44 | 45 | Trump: And that’s what I wanna talk about. 46 | 47 | Cooper: Secretary Clinton, do you wanna respond? 48 | 49 | Clinton: Well, like everyone else, I’ve spent a lot of time thinking over the last 48 hours about what we heard and saw. With prior Republican nominees for President, I disagreed with them on politics, policies, principles, but I never questioned their fitness to serve. Donald Trump is different. I said starting back in June that he was not fit to be President and Commander-in-Chief, and many Republicans and Independents have said the same thing. What we all saw and heard on Friday was Donald talking about women. What he thinks about women, what he does to women. And he has said that the video doesn’t represent who he is, but I think it’s clear to anyone who heard it that it represents exactly who he is. Because we’ve seen this throughout the campaign. We have seen him insult women. We’ve seen him rate women on their appearance, ranking them from 1 to 10. We’ve seen him embarrass women on TV and on Twitter. 
We saw him after the first debate spend nearly a week denigrating a former Miss Universe in the harshest, most personal terms. So, yes, this is who Donald Trump is. 50 | 51 | Clinton: But it’s not only women and it’s not only this video that raises questions about his fitness to be our President. Because he has also targeted immigrants, African-Americans, Latinos, people with disabilities, POWs, Muslims, and so many others. So this is who Donald Trump is. And the question for us, the question our country must answer is that this is not who we are. That’s why, to go back to your question, I wanna send a message, we all should, to every boy and girl, and indeed to the entire world, that America already is great. But we are great because we are good. And we will respect one another and we will work with one another, and we will celebrate our diversity. These are very important values to me, because this is the America that I know and love. And I can pledge to you tonight that this is the America that I will serve if I’m so fortunate enough to become your President. 52 | 53 | Raddatz: And we want to get to some questions from online. 54 | 55 | Trump: Well, am I allowed to respond to that? I assume I am. 56 | 57 | Raddatz: Yes, you can respond to that. 58 | 59 | Trump: It’s just words, folks. It’s just words. Those words, I’ve been hearing them for many years. I heard them when they were running for the Senate in New York, where Hillary was gonna bring back jobs to upstate New York and she failed. I’ve heard them where Hillary’s constantly talking about the inner cities of our country, which are a disaster education-wise, job-wise, safety-wise, in every way possible. I’m gonna help the African-Americans, I’m gonna help the Latinos, Hispanics. I am going to help the inner cities. She’s done a terrible job for the African-Americans. She wants their vote and she does nothing, and then she comes back four years later. 
We saw that first-hand when she was a United States senator. She campaigned, where the primary part of her campaign… 60 | 61 | Raddatz: Mr. Trump, Mr. Trump, I wanna get to audience questions and online questions. 62 | 63 | Trump: So, she’s allowed to do that but I’m not allowed to respond? 64 | 65 | Raddatz: You’re going to get to respond right now. 66 | 67 | Trump: Sounds fair. 68 | 69 | Raddatz: This tape is generating intense interest. In just 48 hours, it’s become the single most talked about story of the entire 2016 election on Facebook, with millions and millions of people discussing it on the social network. As we said a moment ago, we do want to bring in questions from voters around the country via social media, and our first stays on this topic. Jeff, from Ohio, asks on Facebook, “Trump says the campaign has changed him. When did that happen?” So, Mr. Trump, let me add to that. When you walked off that bus at age 59, were you a different man, or did that behavior continue until just recently? 70 | 71 | Trump: That was locker room talk. 72 | 73 | Raddatz: You have two minutes for this. 74 | 75 | Trump: As I told you, that was locker room talk. I’m not proud of it. I am a person who has great respect for people, for my family, for the people of this country, and certainly, I’m not proud of it. But that was something that happened. If you look at Bill Clinton, far worse. Mine are words, and his was action. His was… What he’s done to women… There’s never been anybody in the history of politics in this nation that’s been so abusive to women. So, you can say any way you wanna say it, but Bill Clinton was abusive to women. Hillary Clinton attacked those same women and attacked them viciously. Four of them here tonight. One of the women, who is a wonderful woman, at 12 years old, was raped at 12. Her client she represented, got him off, and she’s seen laughing on two separate occasions, laughing at the girl who was raped. 
Kathy Shelton, that young woman, is here with us tonight. 76 | 77 | Trump: So, don’t tell me about words. I am absolutely… I apologize for those words, but it is things that people say. But what President Clinton did… He was impeached, he lost his license to practice law, he had to pay an $850,000 fine to one of the women, Paula Jones, who’s also here tonight. And I will tell you that when Hillary brings up a point like that, and she talks about words that I said 11 years ago, I think it’s disgraceful and I think she should be ashamed of herself, if you wanna know the truth. 78 | 79 | [applause] 80 | 81 | Raddatz: Can we please hold the applause? Secretary Clinton, you have two minutes. 82 | 83 | Clinton: Well, first, let me start by saying that so much of what he’s just said is not right, but he gets to run his campaign any way he chooses. He gets to decide what he wants to talk about. Instead of answering people’s questions, talking about our agenda, laying out the plans that we have that we think can make a better life and a better country. That’s his choice. When I hear something like that, I am reminded of what my friend, Michelle Obama, advised us all, “When they go low, you go high.” 84 | 85 | [applause] 86 | 87 | Clinton: And… Look, if this were just about one video, maybe what he’s saying tonight would be understandable, but everyone can draw their own conclusions at this point about whether or not the man in the video or the man on the stage respects women. But he never apologizes for anything to anyone. He never apologized to Mr and Mrs Khan, the Gold Star Family, whose son, Captain Khan, died in the line of duty in Iraq. And Donald insulted and attacked them for weeks over their religion. He never apologized to the distinguished federal judge who was born in Indiana, but Donald said he couldn’t be trusted to be a judge because his parents were “Mexican”.
He never apologized to the reporter that he mimicked and mocked on national television, and our children were watching. And he never apologized for the racist lie that President Obama was not born in the United States of America. He owes the President an apology, he owes our country an apology, and he needs to take responsibility for his actions and his words. 88 | 89 | Trump: Well, you owe the President an apology, because, as you know very well, your campaign… Sidney Blumenthal, he’s another real winner that you have. And he’s the one that got this started, along with your campaign manager, and they were on television just two weeks ago, she was saying exactly that, so you really owe him an apology. You’re the one that sent the pictures around your campaign; sent the pictures around with President Obama in a certain garb. That was long before I was ever involved, so you actually owe an apology. Number two, Michelle Obama. I’ve gotten to see the commercials that they did on you, and I’ve gotten to see some of the most vicious commercials I’ve ever seen of Michelle Obama talking about you, Hillary. 90 | 91 | Trump: So, you talk about friend, go back and take a look at those commercials, a race where you lost fair and square, unlike the Bernie Sanders race where you won, but not fair and square, in my opinion. And all you have to do is take a look at WikiLeaks and just see what they said about Bernie Sanders, and see what Deborah Wasserman Schultz had in mind, because Bernie Sanders, between super-delegates and Deborah Wasserman Schultz, he never had a chance. And I was so surprised to see him sign on with the devil. 92 | 93 | Trump: But when you talk about apology, I think the one that you should really be apologizing for and the thing that you should be apologizing for are the 33,000 emails that you deleted and that you acid washed, and then the two boxes of emails and other things last week that were taken from an office and are now missing. 
And I’ll tell you what, I didn’t think I’d say this, but I’m going to say it. And I hate to say it, but if I win, I am going to instruct my Attorney General to get a special prosecutor to look into your situation, because there has never been so many lies, so much deception, there has never been anything like it and we’re gonna have a special prosecutor. 94 | 95 | Trump: When I speak… I go out and speak, the people of this country are furious. In my opinion, the people that have been long-term workers at the FBI are furious. There has never been anything like this where emails and you get a subpoena, you get a subpoena and after getting the subpoena, you delete 33,000 emails and then you acid wash them or bleach them, as you would say, a very expensive process. So, we’re gonna get a special prosecutor and we’re gonna look into it, because you know what? People have been… Their lives have been destroyed for doing one-fifth of what you’ve done and it’s a disgrace. And honestly, you ought to be ashamed of yourself. 96 | 97 | Raddatz: Secretary Clinton, I wanna follow up on that. I’m gonna let you talk about emails. 98 | 99 | Clinton: Let me just quickly say because everything he just said is absolutely false, but I’m not surprised. 100 | 101 | Trump: Oh, really? 102 | 103 | Clinton: In the first debate… 104 | 105 | Raddatz: The audience needs to calm down here. 106 | 107 | Clinton: I told people that it would be impossible to be fact-checking Donald all the time. I’d never get to talk about anything I wanna do and how we’re going to really make lives better for people. So, once again, go to Hillaryclinton.com. We have literally Trump, you can fact-check him in real-time. Last time, at the first debate, we had millions of people fact-checking, so I expect we’ll have millions more fact-checking because it is… It’s just awfully good that someone with the temperament of Donald Trump is not in charge of the law in our country. 108 | 109 | Trump: Because you’d be in jail. 
110 | 111 | Raddatz: Secretary Clinton… 112 | 113 | Cooper: We wanna remind the audience to please not talk out loud. Please do not applaud. You’re just wasting time. 114 | 115 | Raddatz: And Secretary Clinton, I do wanna follow up on emails. You’ve said your handling of your emails was a mistake. You disagreed with FBI Director James Comey calling your handling of classified information “extremely careless.” The FBI said that there were 110 classified emails that were exchanged, eight of which were top secret and that it was possible hostile actors did gain access to those emails. You don’t call that extremely careless? 116 | 117 | Clinton: Well, Martha, first let me say, and I’ve said it before, but I’ll repeat it, because I want everyone to hear it, that was a mistake and I take responsibility for using a personal email account. Obviously, if I were to do it over again, I would not. I’m not making any excuses. It was a mistake, and I am very sorry about that. But I think it’s also important to point out where there are some misleading accusations from critics and others. 118 | 119 | Clinton: After a year-long investigation, there is no evidence that anyone hacked the server I was using and there is no evidence that anyone can point to at all – anyone that says otherwise has no basis – that any classified material ended up in the wrong hands. I take classified materials very seriously and always have. When I was on the Senate Armed Services Committee, I was privy to a lot of classified material. Obviously, as Secretary of State, I had some of the most important secrets that we possess, such as going after Bin Laden, so I am very committed to taking classified information seriously and as I said, there is no evidence that any classified information ended up in the wrong hands. 120 | 121 | Raddatz: Okay, we’re gonna move on. 122 | 123 | Trump: And yet she didn’t know the word, the letter C on a document. Right? 
She didn’t even know what that word, what that letter meant. You know, it’s amazing. I’m watching Hillary go over facts and she’s going after fact after fact, and she’s lying again, because she said she… What she did with the emails was fine. You think it was fine to delete 33,000 emails? I don’t think so. She said the 33,000 emails had to do with her daughter’s wedding, number one, and a yoga class. Well, maybe we’ll give three or three or four or five or something. 33,000 emails deleted, and now she’s saying there wasn’t anything wrong. 124 | 125 | Trump: And more importantly, that was after getting a subpoena. That wasn’t before, that was after. She got it from the United States Congress, and I’ll be honest, I am so disappointed in congressmen, including Republicans, for allowing this to happen. Our Justice Department, where her husband goes onto the back of an aeroplane for 39 minutes, talks to the Attorney General days before a ruling’s gotta be made in her case. But for you to say that there was nothing wrong with you deleting 39,000 emails, again, you should be ashamed of yourself. What you did, and this is after getting a subpoena from the United States Congress. 126 | 127 | Cooper: We have to move on. Secretary Clinton, you can respond and then we gotta move on. 128 | 129 | Raddatz: We wanna give the audience a chance here. 130 | 131 | Trump: If you did that in the private sector, you’d be put in jail, let alone after getting a subpoena from the United States Congress. 132 | 133 | Cooper: Secretary Clinton, you can respond, but we have to move on to an audience question. 134 | 135 | Clinton: Look, it’s just not true and so please go to… 136 | 137 | Trump: Oh, you didn’t delete them? You didn’t delete them? 138 | 139 | Cooper: Allow her to respond, please. 140 | 141 | Clinton: Those were personal emails, not official. 142 | 143 | Raddatz: 33,000? Yeah, right.
144 | 145 | Clinton: Well, we turned over 35,000, so it was… 146 | 147 | Trump: What about the other 15,000? 148 | 149 | Cooper: Please allow her to respond, she didn’t talk while you talked. 150 | 151 | Clinton: Yes, that’s true, I didn’t. 152 | 153 | Trump: Because you have nothing to say. 154 | 155 | Clinton: And I didn’t in the first debate and I’m gonna try not to in this debate, because I’d like to get to the questions that the people have brought here tonight to talk to us about. 156 | 157 | Trump: And get off this question. 158 | 159 | Clinton: Okay, Donald, I know you’re into big diversion tonight, anything to avoid talking about your campaign and the way it’s exploding, and the way Republicans are leaving you. But let’s at least focus… 160 | 161 | Trump: Let’s see what happens… 162 | 163 | Raddatz: Allow her to respond. 164 | 165 | Clinton: On some of the issues that people care about tonight. Let’s get to their questions. 166 | 167 | Cooper: We have a question here from Ken Karpowicz. He has a question about healthcare. Ken? 168 | 169 | Trump: I’d like to know, Anderson, why aren’t you bringing up the emails? I’d like to know. Why aren’t you bringing… 170 | 171 | Cooper: We brought up the e-mails. 172 | 173 | Trump: No, it hasn’t. It hasn’t. And it hasn’t been finished at all. 174 | 175 | Cooper: Ken Karpowicz has a question. 176 | 177 | Trump: It’s nice to… One on three. 178 | 179 | Ken Karpowicz: Thank you. Affordable Care Act, known as ObamaCare, it is not affordable. Premiums have gone up. Deductibles have gone up. Copays have gone up. Prescriptions have gone up. And the coverage has gone down. What will you do to bring the cost down and make coverage better? 180 | 181 | Cooper: That first one goes to Secretary Clinton, because you started out the last one to the audience. 182 | 183 | Clinton: If he wants to start, he can start. 184 | 185 | Trump: Go ahead, Hillary. 186 | 187 | Clinton: No, go ahead, Donald.
188 | 189 | Trump: No, I’m a gentleman, Hillary. Go ahead. 190 | 191 | [laughter] 192 | 193 | Cooper: Secretary Clinton? 194 | 195 | Clinton: Well, I think Donald was about to say he’s gonna solve it by repealing it and getting rid of the Affordable Care Act. And I’m gonna fix it, because I agree with you. Premiums have gotten too high. Copays, deductibles, prescription drug costs, and I’ve laid out a series of actions that we can take to try to get those costs down. But here’s what I don’t want people to forget when we’re talking about reining in the costs, which has to be the highest priority of the next president, when the Affordable Care Act passed, it wasn’t just that 20 million people got insurance who didn’t have it before. But that in and of itself was a good thing. I meet these people all the time, and they tell me what a difference having that insurance meant to them and their families. 196 | 197 | Clinton: But everybody else, the 170 million of us who get health insurance through our employers, got big benefits. Number one, insurance companies can’t deny you coverage because of a pre-existing condition. Number two, no lifetime limits, which is a big deal if you have serious health problems. Number three, women can’t be charged more than men for our health insurance, which is the way it used to be before the Affordable Care Act. Number four, if you’re under 26, and your parents have a policy, you can be on that policy until the age of 26, something that didn’t happen before. So, I want very much to save what works and is good about the Affordable Care Act. But we’ve got to get costs down. 198 | 199 | Clinton: We’ve got to provide some additional help to small businesses so that they can afford to provide health insurance. But if we repeal it, as Donald has proposed, and start over again, all of those benefits I just mentioned are lost to everybody, not just people who get their health insurance on the exchange.
And then we would have to start all over again. Right now, we are at 90% health insurance coverage. That’s the highest we’ve ever been in our country. 200 | 201 | Cooper: Secretary Clinton, your time is up. 202 | 203 | Clinton: So I want us to get to 100%, but get costs down and keep quality up. 204 | 205 | Cooper: Mr. Trump, you have two minutes. 206 | 207 | Trump: It is such a great question and it’s maybe the question I get almost more than anything else, outside of defense. ObamaCare is a disaster. You know it. We all know it. It’s going up at numbers that nobody’s ever seen worldwide. Nobody’s ever seen numbers like this for healthcare. It’s only getting worse. In ’17, it implodes by itself. Their method of fixing it is to go back and ask Congress for more money, more and more money. We have right now almost $20 trillion in debt. 208 | 209 | Trump: ObamaCare will never work. It’s very bad, very bad health insurance. Far too expensive. And not only expensive for the person that has it, unbelievably expensive for our country. It’s going to be one of the biggest line items very shortly. We have to repeal it and replace it with something absolutely much less expensive and something that works, where your plan can actually be tailored. We have to get rid of the lines around the state, artificial lines, where we stop insurance companies from coming in and competing, because they wanted President Obama and whoever was working on it, they wanna leave those lines, because that gives the insurance companies essentially monopolies. We want competition. 210 | 211 | Trump: You will have the finest healthcare plan there is. She wants to go to a single-payer plan, which would be a disaster, somewhat similar to Canada. And have you ever noticed that Canadians, when they need a big operation, when something happens, they come into the United States in many cases, because their system is so slow. It’s catastrophic in certain ways. 
But she wants to go to single-payer, which means the government basically rules everything. Hillary Clinton has been after this for years. ObamaCare was the first step. ObamaCare is a total disaster. And not only are your rates going up by numbers that nobody’s ever believed, but your deductibles are going up, so that unless you get hit by a truck, you’re never gonna be able to use it. 212 | 213 | Cooper: Mr. Trump, your time… 214 | 215 | Trump: It is a disastrous plan, and it has to be repealed and replaced. 216 | 217 | Cooper: Secretary Clinton, let me follow up with you. Your husband called ObamaCare, “the craziest thing in the world,” saying that small business owners are getting killed as premiums double, coverage is cut in half. Was he mistaken or was his mistake simply telling the truth? 218 | 219 | Clinton: No, I mean, he clarified what he meant. And it’s very clear. Look, we are in a situation in our country where if we were to start all over again, we might come up with a different system. But we have an employer-based system. That’s where the vast majority of people get their healthcare. And the Affordable Care Act was meant to try to fill the gap between people who were too poor and couldn’t put together any resources to afford healthcare, namely people on Medicaid. 220 | 221 | Clinton: Obviously, Medicare, which is a single-payer system which takes care of our elderly and does a great job doing it, by the way, and then all the people who were employed, but people who were working but didn’t have the money to afford insurance and didn’t have anybody, an employer or anybody else to help them. That was the slot that the ObamaCare approach was to take. 
And like I say, 20 million people now have health insurance, so if we just rip it up and throw it away, what Donald’s not telling you is we just turn it back to the insurance companies the way it used to be, and that means the insurance companies get to do pretty much whatever they want, including saying, “Look, I’m sorry, you’ve got diabetes, you had cancer, your child has asthma… ” 222 | 223 | Cooper: Your time is up. 224 | 225 | Clinton: “You may not be able to have insurance, ’cause you can’t afford it.” So, let’s fix what’s broken about it but let’s not throw it away and give it all back to the insurance companies… 226 | 227 | Cooper: Let me follow up with you, Mr… 228 | 229 | Clinton: That’s not gonna work. 230 | 231 | Cooper: Mr. Trump, let me follow up on this… 232 | 233 | Trump: I just wanna, just one thing… First of all, Hillary, everything’s broken about it, everything. Number two, Bernie Sanders said that Hillary Clinton has very bad judgment. This is a perfect example of it… 234 | 235 | Cooper: Mr. Trump… 236 | 237 | Trump: Trying to save ObamaCare, which is… 238 | 239 | Cooper: You’ve said you wanna end ObamaCare, you also said you wanna make coverage accessible for people with pre-existing conditions. How do you force insurance companies to do that if you’re no longer mandating that every American get insurance? 240 | 241 | Trump: We’re going to be able to. You’re gonna have plans. 242 | 243 | Cooper: What does that mean? 244 | 245 | Trump: Well, I’ll tell you what it means. You’re gonna have plans that are so good, because we’re gonna have so much competition in the insurance industry once we break out the lines and allow the competition to come. President Obama might… 246 | 247 | Cooper: Are you gonna have a mandate that Americans have to have health insurance? 248 | 249 | Trump: Anderson, excuse me. 
President Obama, by keeping those lines, the boundary lines around each state, and it was almost gone until just very toward the end of the passage of ObamaCare, which, by the way, was a fraud. You know that, because Jonathan Gruber, the architect of ObamaCare, he said it was a great lie, it was a big lie. President Obama said you keep your doctor, you keep your plan, the whole thing was a fraud and it doesn’t work. But when we get rid of those lines, you have competition, and we will be able to keep pre-existing. We’ll also be able to help people that can’t get… Don’t have money, because we are going to have people protected. And Republicans feel this way, believe it or not, and strongly this way. We’re gonna block grant into the states. We’re gonna block grant into Medicaid into the states… 250 | 251 | Cooper: Thank you, Mr. Trump. 252 | 253 | Trump: So that we will be able to take care of people without the necessary funds to take care of themselves. 254 | 255 | Cooper: Thank you, Mr. Trump. 256 | 257 | Raddatz: We now go to Gorbah Hameed with a question for both candidates. 258 | 259 | Gorbah Hameed: Hi. There are 3.3 million Muslims in the United States and I’m one of them. You’ve mentioned working with Muslim nations, but with Islamophobia on the rise, how will you help people like me deal with the consequences of being labelled as a threat to the country after the election is over? 260 | 261 | Raddatz: Mr. Trump, you’re first. 262 | 263 | Trump: Well, you’re right about Islamophobia and that’s a shame. But one thing we have to do, is we have to make sure that… Because there is a problem, I mean, whether we like it or not, and we can be very politically correct, but whether we like it or not, there is a problem. And we have to be sure that Muslims come in and report when they see something going on, when they see hatred going on, they have to report it. 
As an example in San Bernardino, many people saw the bombs all over the apartment of the two people that killed 14 and wounded many, many people, horribly wounded. They’ll never be the same. Muslims have to report the problems when they see them and, you know, there’s always a reason for everything. 264 | 265 | Trump: If they don’t do that, it’s a very difficult situation for our country, because you look at Orlando, and you look at San Bernardino, and you look at the World Trade Center. Go outside, you look at Paris. Look at that horrible… These are radical Islamic terrorists, and she won’t even mention the word, and nor will President Obama. He won’t use the term “radical Islamic terrorist”. Now, to solve a problem, you have to be able to state what the problem is or at least say the name. She won’t say the name, and President Obama won’t say the name, but the name is there. It’s radical Islamic terror, and before you solve it, you have to say the name. 266 | 267 | Raddatz: Secretary Clinton. 268 | 269 | Clinton: Well, thank you for asking your question, and I’ve heard this question from a lot of Muslim Americans across our country, because, unfortunately, there’s been a lot of very divisive, dark things said about Muslims. And even someone like Captain Khan, the young man who sacrificed himself defending our country in the United States Army, has been subject to attack by Donald. I wanna say just a couple of things. First, we’ve had Muslims in America since George Washington. And we’ve had many successful Muslims. We just lost a particularly well-known one with Muhammad Ali. 270 | 271 | Clinton: My vision of America is an America where everyone has a place, if you’re willing to work hard, you do your part, you contribute to the community. That’s what America is, that’s what we want America to be for our children and our grandchildren. It’s also very short-sighted and even dangerous to be engaging in the kind of demagogic rhetoric that Donald has about Muslims. 
We need American Muslims to be part of our eyes and ears on our front lines. I’ve worked with a lot of different Muslim groups around America. I’ve met with a lot of them, and I’ve heard how important it is for them to feel that they are wanted and included and part of our country, part of our homeland security, and that’s what I wanna see. 272 | 273 | Clinton: It’s also important, I intend to defeat ISIS, to do so in a coalition with majority of Muslim nations. Right now, a lot of those nations are hearing what Donald says and wondering, “Why should we cooperate with the Americans?” And this is a gift to ISIS and the terrorists, violent jihadist terrorists. We are not at war with Islam. It is a mistake and it plays into the hands of the terrorists to act as though we are. I want a country where citizens like you and your family are just as welcome as anyone else. 274 | 275 | Raddatz: Thank you, Secretary Clinton. Mr. Trump, in December, you said this, “Donald J. Trump is calling for a total and complete shutdown of Muslims entering the United States until our country’s representatives can figure out what the hell is going on. We have no choice, we have no choice.” Your running mate said this week that the Muslim ban is no longer your position. Is that correct? And if it is, was it a mistake to have a religious test? 276 | 277 | Trump: First of all, Captain Khan is an American hero. If I were President at that time, he would be alive today, because unlike her, who voted for the war without knowing what she was doing, I would not have had our people in Iraq. Iraq was a disaster. So, he would have been alive today. The Muslim ban is something that, in some form, has morphed into a extreme vetting from certain areas of the world. Hillary Clinton wants to allow hundreds of thousands… Excuse me, excuse me. 278 | 279 | Raddatz: And why did it morph into that? No, answer the question. Do you still believe… 280 | 281 | Trump: Why don’t you interrupt her? 
You interrupt me all the time. 282 | 283 | Raddatz: I do. 284 | 285 | Trump: Why don’t you interrupt her? 286 | 287 | Raddatz: Would you please explain whether or not the Muslim ban still stands? 288 | 289 | Trump: It’s called extreme vetting. We are going to areas like Syria, where they’re coming in by the tens of thousands because of Barack Obama, and Hillary Clinton wants to allow a 550% increase over Obama. People are coming into our country, we have no idea who they are, where they are from, what their feelings about our country is and she wants 550% more. This is gonna be the great Trojan horse of all time. We have enough problems in this country. I believe in building safe zones. I believe in having other people pay for them. As an example, the Gulf States who are not carrying their weight but they have nothing but money and take care of people. But I don’t wanna have, with all the problems this country has and all of the problems that you see going on, hundreds of thousands of people coming in from Syria when we know nothing about them. We know nothing about their values and we know nothing about their love for our country. 290 | 291 | Raddatz: Secretary Clinton, let me ask you about that, because you have asked for an increase from 10,000 to 65,000 Syrian refugees. We know you want tougher vetting. That’s not a perfect system. So, why take the risk of having those refugees come into the country? 292 | 293 | Clinton: First of all, I will not let anyone into our country that I think poses a risk to us, but there are a lot of refugees, women and children. Think of that picture we all saw, that four-year-old boy with the blood on his forehead because he’d been bombed by the Russian and Syrian air forces. There are children suffering in this catastrophic war, largely, I believe, because of Russian aggression, and we need to do our part. We, by no means, are carrying anywhere near the load that Europe and others are.
294 | 295 | Clinton: But we will have vetting that is as tough as it needs to be from our professionals, our intelligence experts and others. But it is important for us, as a policy, not to say, as Donald has said, “We’re gonna ban people based on a religion.” How do you do that? We are a country founded on religious freedom and liberty. How do we do what he has advocated without causing great distress within our own country? Are we going to have religious tests when people fly into our country? How do we expect to be able to implement those? 296 | 297 | Clinton: So, I thought that what he said was extremely unwise, and even dangerous. And indeed, you can look at the propaganda on a lot of the terrorist sites and what Donald Trump says about Muslims is used to recruit fighters, because they want to create a war between us. The final thing I would say, this is the 10th or 12th time that he has denied being for the war in Iraq. We have it on tape. The entire press corps has looked at it, it’s been debunked, but it never stops him from saying whatever he wants to say. 298 | 299 | Trump: It has not been debunked, has not been debunked. And I was against… 300 | 301 | Clinton: Please go to Hillaryclinton.com and you can see it. 302 | 303 | Trump: I was against the war in Iraq, has not been debunked. And you voted for it and you shouldn’t have. Well, I just wanna say… 304 | 305 | Clinton: There’s been lots of fact checking on that. I’d like to move on to an online question. 306 | 307 | Trump: Excuse me. She just went about 25 seconds over her time. 308 | 309 | Raddatz: She did not. 310 | 311 | Trump: Could I just respond to this, please? 312 | 313 | Raddatz: Very quickly, please. 314 | 315 | Trump: Hillary Clinton, in terms of having people come into our country, we have many criminal illegal aliens. When we wanna send them back to their country, their country says, “We don’t want them.” In some cases they’re murderers, drug lords, drug problems, and they don’t want them. 
Hillary Clinton, when she was Secretary of State, said, “That’s okay. We can’t force them into their country.” Let me tell you. I’m gonna force them right back into their country. They’re murderers and some very bad people. 316 | 317 | Trump: And I will tell you very strongly, when Bernie Sanders said she had bad judgement, she has really bad judgement, because we are letting people into this country that are gonna cause problems and crime like you’ve never seen. We’re also letting drugs pour through our southern border at a record clip, at a record clip, and it shouldn’t be allowed to happen. ICE just endorsed me. They’ve never endorsed a presidential candidate. The border patrol agents, 16,500, just recently endorsed me, and they endorsed me because I understand the border. She doesn’t. She wants amnesty for everybody, “Come right in, come right over.” It’s a horrible thing she’s doing. She’s got bad judgement. And honestly, so bad that she should never be President of the United States, that I can tell you. 318 | 319 | Raddatz: Thank you, Mr. Trump. I wanna move on. This next question comes from the public through the Bipartisan Open Debate Coalition’s online forum, where Americans submitted questions that generated millions of votes. This question involves WikiLeaks’ release of purported excerpts of Secretary Clinton’s paid speeches, which she has refused to release. And one line in particular, in which you, Secretary Clinton, purportedly say, “You need both a public and private position on certain issues.” So, two from Virginia asks, “Is it okay for politicians to be two-faced? Is it acceptable for a politician to have a private stance on issues?” Secretary Clinton, your two minutes. 320 | 321 | Clinton: As I recall, that was something I said about Abraham Lincoln, after having seen the wonderful Steven Spielberg movie called Lincoln. It was a masterclass watching President Lincoln get the Congress to approve the 13th Amendment. 
It was principled, and it was strategic. And I was making the point that it is hard sometimes to get the Congress to do what you wanna do, and you have to keep working at it. And yes, President Lincoln was trying to convince some people, he used some arguments, convincing other people, he used other arguments. That was a great, I thought, a great display of presidential leadership. 322 | 323 | Clinton: But, let’s talk about what’s really going on here, Martha, because our intelligence community just came out and said in the last few days that the Kremlin, meaning Putin and the Russian government, are directing the attacks, the hacking on American accounts, to influence our election. And WikiLeaks is part of that, as are other sites where the Russians hack information, we don’t even know if it’s accurate information, and then they put it out. 324 | 325 | Clinton: We have never in the history of our country been in a situation where an adversary, a foreign power, is working so hard to influence the outcome of the election. And believe me, they’re not doing it to get me elected. They’re doing it to try to influence the election for Donald Trump. Now, maybe because he has praised Putin, maybe because he says he agrees with a lot of what Putin wants to do, maybe because he wants to do business in Moscow. I don’t know the reasons. But, we deserve answers and we should demand that Donald release all of his tax returns so that people can see what are the entanglements and the financial relationships… 326 | 327 | Raddatz: We’re going to get to that later. Secretary Clinton, you’re out of time. 328 | 329 | Clinton: That he has with Putin and other foreign powers. 330 | 331 | Trump: I think I should respond, because, so ridiculous. Look. Now, she’s blaming… She got caught in a total lie. Her papers went out to all her friends at the banks, Goldman Sachs and everybody else, and she said things, WikiLeaks that just came out and she lied. 
Now, she’s blaming the lie on the late, great Abraham Lincoln. That’s one that I haven’t… 332 | 333 | [laughter] 334 | 335 | Trump: Okay, Honest Abe… Honest Abe never lied. That’s the good thing. That’s the big difference between Abraham Lincoln and you. That’s a big, big difference. We’re talking about some difference. But, as far as other elements of what she was saying, I don’t know Putin. I think it would be great if we got along with Russia, because we could fight ISIS together, as an example. But, I don’t know Putin, but I notice any time anything wrong happens, they like to say, “The Russians are… ” She doesn’t know if it’s the Russians doing the hacking. Maybe there is no hacking, but they always blame Russia. And the reason they blame Russia, because they think they’re trying to tarnish me with Russia. 336 | 337 | Trump: I know nothing about Russia. I know about Russia, but I know nothing about the inner workings of Russia. I don’t deal there, I have no businesses there, I have no loans from Russia. I have a very, very great balance sheet, so great that when I did the Old Post Office on Pennsylvania Avenue, the United States government, because of my balance sheet, which they actually know very well, chose me to do the Old Post Office, between the White House and Congress, chose me to do the Old Post Office. One of the primary things, in fact, perhaps the primary thing, was balance sheet. 338 | 339 | Trump: But, I have no loans with Russia. You could go to the United States government and they would probably tell you that, because they know my sheet very well, in order to get that development I had to have. Now, the taxes are a very simple thing. As soon as I have… First of all, I pay hundred of million of dollars in taxes. Many of her friends took bigger deductions. Warren Buffett took a massive deduction. Soros, who’s a friend of hers, took a massive deduction. 
Many of the people that are giving her all this money that she can do many more commercials than me, gave her, took massive deductions. I pay hundreds of millions of dollars in taxes. But, as soon as my routine audit’s finished, I’ll release my returns. I’ll be very proud to. They’re actually quite good. 340 | 341 | Raddatz: Thank you, Mr. Trump. 342 | 343 | Cooper: We want to turn, actually, to the topic of taxes. We have a question from Spencer Moss. Spencer? 344 | 345 | Spencer Moss: Good evening. My question is, what specific tax provisions will you change to ensure the wealthiest Americans pay their fair share in taxes? 346 | 347 | Cooper: Mr. Trump, you have two minutes. 348 | 349 | Trump: Well, one thing I’d do is get rid of carried interest. One of the greatest provisions for people like me, to be honest with you, I give up a lot when I run ’cause I knock out the tax code. And she could have done this years ago, by the way. She was a United States senator. She complains that Donald Trump took advantage of the tax code. Well, why didn’t she change it? Why didn’t you change it when you were senator? The reason you didn’t is that all your friends take the same advantage that I do, and I do. You have provisions in the tax code that frankly we could change, but you wouldn’t change it because all of these people give you the money so you can take negative ads on Donald Trump. And I say that about a lot of things. I’ve heard Hillary complaining about so many different things over the years. “I wish we would have done this,” but she’s been there for 30 years. She’s been doing this stuff. She never changed, and she never will change. She never will change. 350 | 351 | Trump: We’re getting rid of carried interest provisions. I’m lowering taxes, actually ’cause I think it’s so important for corporations, because we have corporations leaving, massive corporations, and little ones. Little ones can’t form. 
We’re getting rid of regulations which goes hand in hand with the lowering of the taxes, but we’re bringing the tax rate down from 35% to 15%. We’re cutting taxes for the middle class, and I will tell you, we are cutting them big league for the middle class. And I will tell you, Hillary Clinton is raising your taxes, folks. You can look at me. She’s raising your taxes really high, and what that’s going to do is a disaster for the country. But she is raising your taxes, and I’m lowering your taxes. 352 | 353 | Trump: That in itself is a big difference. We are going to be thriving again. We have no growth in this country, there’s no growth. If China has a GDP of 7%, it’s like a national catastrophe. We’re down at 1%, and that’s like no growth, and we’re going lower, in my opinion. And a lot of it has to do with the fact that our taxes are so high, just about the highest in the world. And I’m bringing them down to one of the lower in the world. And I think it’s so important, one of the most important things we can do. But she is raising everybody’s taxes massively. 354 | 355 | Cooper: Secretary Clinton, you have two minutes. The question was, what specific tax provisions will you change to ensure the wealthiest Americans pay their fair share of taxes? 356 | 357 | Clinton: Well, everything you’ve heard just now from Donald is not true. I’m sorry I have to keep saying this, but he lives in an alternative reality. And it is sort of amusing to hear somebody who hasn’t paid federal income taxes in maybe 20 years talking about what he’s going to do. But I’ll tell you what he’s going to do. His plan will give the wealthy and corporations the biggest tax cuts they’ve ever had, more than the Bush tax cuts, by at least a factor of two. Donald always takes care of Donald and people like Donald, and this would be a massive gift. And indeed, the way that he talks about his tax cuts would end up raising taxes on middle class families, millions of middle class families. 
358 | 359 | Clinton: Now, here’s what I wanna do. I have said nobody who makes less than $250,000 a year, and that’s the vast majority of Americans, as you know, will have their taxes raised, because I think we’ve gotta go where the money is, and the money is with people who’ve taken advantage of every single break in the tax code. 360 | 361 | Clinton: And yes, when I was a senator, I did vote to close corporate loopholes. I voted to close, I think, one of the loopholes he took advantage of when he claimed a billion-dollar loss that enabled him to avoid paying taxes. I want to have a tax on people who are making a million dollars. It’s called the Buffett rule. Yes, Warren Buffett is the one who’s gone out and said somebody like him should not be paying a lower tax rate than his secretary. I wanna have a surcharge on incomes above $5 million. We have to make up for lost times, because I wanna invest in you. I want to invest in hardworking families. 362 | 363 | Clinton: And I think it’s been unfortunate, but it’s happened that since the Great Recession, the gains have all gone to the top, and we need to reverse that. People like Donald, who paid zero in taxes, zero for our vets, zero for our military, zero for health and education, that is wrong. And we’re going to make sure that nobody, no corporation, and no individual can get away without paying his fair share to support our country. 364 | 365 | Cooper: Thank you, I wanna give you… Mr. Trump, I wanna give you the chance to respond. I just want to tell our viewers what she’s referring to. In the last month, taxes were the number one issue on Facebook for the first time in the campaign. The New York Times published three pages of your 1995 tax returns. They showed you claimed a $916 million loss, which means you could have avoided paying personal federal income taxes for years. You’ve said you pay state taxes, employee taxes, real estate taxes, property taxes. 
You have not answered, though, a simple question: Did you use that $916 million loss to avoid paying personal federal income taxes? 366 | 367 | Trump: Of course I do. Of course I do. And so do all of her donors, or most of her donors. I know many of her donors. Her donors took massive tax write-offs. 368 | 369 | Cooper: So, have you paid personal federal income tax? 370 | 371 | Trump: Excuse me, Anderson. A lot of my write-off was depreciation, and other things that Hillary as a senator allowed. And she’ll always allow it, because the people that give her all this money, they want it. That’s why. See, I understand the tax code better than anybody that’s ever run for President. And it’s extremely complex. Hillary Clinton has friends that want all of these provisions, including they want the carried interest provision, which is very important to Wall Street people. But they really want the carried interest provision, which I believe Hillary’s leaving. It’s very interesting why she’s leaving carried interest. 372 | 373 | Trump: But I will tell you that, number one, I pay tremendous numbers of taxes. I absolutely used it, and so did Warren Buffett, and so did George Soros, and so did many of the other people that Hillary is getting money from. Now, I won’t mention their names, because they’re rich, but they’re not famous. So, we won’t make them famous. 374 | 375 | Cooper: Can you say how many years you have avoided paying personal federal income taxes? 376 | 377 | Trump: No, but I pay tax. And I pay federal tax, too. But I have a write-off. A lot of it’s depreciation, which is a wonderful charge. I love depreciation. She’s given it to us. Hey, if she had a problem, for 30 years she’s been doing this, Anderson. I say it all the time. She talks about healthcare. Why didn’t she do something about it? She talks about taxes. Why didn’t she do something about it? She doesn’t do anything about anything, other than talk. With her, it’s all talk and no action. 
378 | 379 | Cooper: In the past… 380 | 381 | Trump: And again, Bernie Sanders, it’s really bad judgement. She has made bad judgment, not only on taxes. She’s made bad judgements on Libya, on Syria, on Iraq. Her and Obama, whether you like it or not, the way they got out of Iraq, the vacuum they’ve left, that’s why ISIS formed in the first place. They started from that little area, and now they’re in 32 different nations, Hillary. 382 | 383 | Cooper: Secretary… 384 | 385 | Trump: Congratulations. Great job. 386 | 387 | Cooper: I want you to be able to respond, Secretary Clinton. 388 | 389 | Clinton: Well, here we go again. I’ve been in favor of getting rid of carried interest for years, starting when I was a senator from New York, but that’s not the point here. 390 | 391 | Trump: Why didn’t you do it? Why didn’t you do it? 392 | 393 | Cooper: Allow her to respond. 394 | 395 | Clinton: Because I was a senator with a Republican president. 396 | 397 | Trump: Oh, really? 398 | 399 | Clinton: I will be the President who will get it done. 400 | 401 | Trump: You could’ve done it. If you were an effective senator, you could’ve done it. If you were an effective senator, you could’ve done it. But you were not an effective… 402 | 403 | Cooper: Please allow her to respond. She didn’t interrupt you. 404 | 405 | Clinton: Under our Constitution, presidents have something called veto power. Look, he has now said, repeatedly, “30 years this and 30 years that”. Let me talk about my 30 years in public service. I’m very glad to do so. 8 million kids every year have health insurance because when I was First Lady, I worked with Democrats and Republicans to create the Children’s Health Insurance Program. Hundreds of thousands of kids now have a chance to be adopted, because I worked to change our adoption and foster care system. 
After 9/11, I went to work with Republican mayor, governor, and president to rebuild New York, and to get healthcare for our first responders who were suffering because they had run toward danger and gotten sickened by it. Hundreds of thousands of National Guard and Reserve members have healthcare because of work that I did, and children have safer medicines because I was able to pass a law that required the dosing to be more carefully done. 406 | 407 | Clinton: When I was Secretary of State, I went around the world advocating for our country, but also advocating for women’s rights to make sure that women had a decent chance to have a better life, and negotiated a treaty with Russia to lower nuclear weapons. 400 pieces of legislation have my name on it as a sponsor or a cosponsor when I was a senator for eight years. I worked very hard, and was very proud to be reelected in New York by an even bigger margin than I’d been elected the first time. And as President, I will take that work, that bipartisan work, that finding common ground… 408 | 409 | Cooper: Thank you. 410 | 411 | Clinton: Because you have to be able to get along with people to get things done in Washington. 412 | 413 | Cooper: Thank you, Secretary. 414 | 415 | Clinton: And I’ve proven that I can, and for 30 years, I’ve produced results for people. 416 | 417 | Cooper: Thank you, Secretary. 418 | 419 | Raddatz: We’re gonna move on to Syria. Both of you have mentioned that… 420 | 421 | Trump: Well, she said a lot of things that were false. I think we should be allowed to maybe dispute… 422 | 423 | Raddatz: Mr. Trump, we’re gonna go on. This is about the audience. 424 | 425 | Trump: Because she has been a disaster as a senator. A disaster. 426 | 427 | Raddatz: Mr. Trump, we’re going to move on. 
The heartbreaking video of a five-year-old Syrian boy named Omran sitting in an ambulance after being pulled from the rubble after an air strike in Aleppo, focused the world’s attention on the horrors of the war in Syria, with 136 million views on Facebook alone. But there are much worse images coming out of Aleppo every day now where in the past few weeks alone, 400 people have been killed, at least 100 of them children. Just days ago, the State Department called for a war crimes investigation of the Syrian regime of Bashar al-Assad and its ally Russia for their bombardment of Aleppo. So, this next question comes from social media through Facebook. Diane from Pennsylvania asks, “If you were President, what would you do about Syria and the humanitarian crisis in Aleppo? Isn’t it a lot like the Holocaust, when the US waited too long before we helped?” Secretary Clinton, we will begin with your two minutes. 428 | 429 | Clinton: Well, the situation in Syria is catastrophic. And every day that goes by, we see the results of the regime by Assad, in partnership with the Iranians on the ground, the Russians in the air, bombarding places, in particular Aleppo, where there are hundreds of thousands of people, probably about 250,000 still left. And there is a determined effort by the Russian Air Force to destroy Aleppo in order to eliminate the last of the Syrian rebels who were really holding out against the Assad regime. Russia hasn’t paid any attention to ISIS. They’re interested in keeping Assad in power. So, I, when I was Secretary of State, advocated, and I advocate today, a no-fly zone and safe zones. We need some leverage with the Russians, because they’re not going to come to the negotiating table for a diplomatic resolution unless there is some leverage over them. 430 | 431 | Clinton: And we have to work more closely with our partners and allies on the ground. But I wanna emphasize that what is at stake here is the ambitions and the aggressiveness of Russia. 
Russia has decided that it’s all-in in Syria, and they’ve also decided who they wanna see become President of the United States, too, and it’s not me. I’ve stood up to Russia. I’ve taken on Putin and others. And I would do that as President. I think wherever we can cooperate with Russia, that’s fine, and I did as Secretary of State. That’s how we got a treaty reducing nuclear weapons. It’s how we got the sanctions on Iran that put a lid on the Iranian nuclear program without firing a single shot. I would go to the negotiating table with more leverage than we have now. But I do support the effort to investigate for crimes, war crimes, committed by the Syrians and the Russians, and try to hold them accountable. 432 | 433 | Raddatz: Thank you, Secretary Clinton. Mr. Trump. 434 | 435 | Trump: First of all, she’s there as secretary of state with the so-called line in the sand which… 436 | 437 | Clinton: No, I wasn’t. I was gone. I hate to interrupt you but… 438 | 439 | Trump: Okay. But you were in contact, excuse me… 440 | 441 | Clinton: At some point, we need to do some fact-checking here. 442 | 443 | Trump: You were in total contact with the White House and perhaps, sadly, Obama probably still listened to you. I don’t think he’d be listening to you very much anymore. Obama draws the line in the sand. It was laughed at all over the world what happened. Now, with that being said, she talks tough against Russia. But our nuclear program has fallen way behind and they’ve gone wild with their nuclear program. Not good. Our government shouldn’t have allowed that to happen. Russia is new in terms of nuclear. We are old, we’re tired, we’re exhausted in terms of nuclear. A very bad thing. Now she talks tough. She talks really tough against Putin and against Assad. She talks in favor of the rebels. She doesn’t even know who the rebels are. Every time we take rebels, whether it’s in Iraq or anywhere else, we’re arming people. And you know what happens? 
They end up being worse than the people. Look at what she did in Libya with Gaddafi. Gaddafi is out. It’s a mess. And by the way, ISIS has a good chunk of their oil. 444 | 445 | Trump: I’m sure you probably have heard that. It was a disaster. Because the fact is, almost everything she’s done in foreign policy has been a mistake and it’s been a disaster. But if you look at Russia, just take a look at Russia, and look at what they did this week, where I agree she wasn’t there, but possibly she’s consulted, we sign a peace treaty, everyone’s all excited. But what Russia did with Assad and, by the way, with Iran, who you made very powerful with the dumbest deal perhaps I’ve ever seen in the history of deal-making, the Iran deal with the $150 billion, with the $1.7 billion in cash which is enough cash to fill up this room. But look at that deal. Iran now and Russia are now against us. So, she wants to fight. She wants to fight for rebels. There’s only one problem. You don’t even know who the rebels are. 446 | 447 | Raddatz: Mr. Trump, Mr. Trump, your two minutes is up. 448 | 449 | Trump: So what’s the purpose? And one thing I have to say… 450 | 451 | Raddatz: Your two minutes is up. 452 | 453 | Trump: I don’t like Assad at all, but Assad is killing ISIS. Russia is killing ISIS. And Iran is killing ISIS. And those three have now lined up because of our weak foreign policy. 454 | 455 | Raddatz: Mr. Trump, let me repeat the question. If you were president, what would you do about Syria and the humanitarian crisis in Aleppo? And I want to remind you what your running mate said. He said, “Provocations by Russia need to be met with American strength and that if Russia continues to be involved in air strikes along with the Syrian government forces of Assad, the United States of America should be prepared to use military force to strike the military targets of the Assad regime.” 456 | 457 | Trump: Okay. He and I haven’t spoken, and I disagree. I disagree. 
458 | 459 | Raddatz: You disagree with your running mate? 460 | 461 | Trump: I think you have to knock out ISIS. Right now, Syria is fighting ISIS. We have people that wanna fight both at the same time. But Syria is no longer Syria. Syria is Russia and it’s Iran, who she made strong, and Kerry and Obama made into a very powerful nation and a very rich nation very, very quickly, very, very quickly. I believe we have to get ISIS. We have to worry about ISIS before we can get too much more involved. She had a chance to do something with Syria. They had a chance. And that was the line. 462 | 463 | Raddatz: What do you think will happen if Aleppo falls? 464 | 465 | Trump: I think Aleppo is a disaster humanitarian-wise. 466 | 467 | Raddatz: What do you think will happen if it falls? 468 | 469 | Trump: I think that it basically has fallen. Okay? It basically has fallen. Let me tell you something. You take a look at Mosul. The biggest problem I have with the stupidity of our foreign policy, we have Mosul. They think a lot of the ISIS leaders are in Mosul. So, we have announcements coming out of Washington and coming out of Iraq, “We will be attacking Mosul in three weeks or four weeks.” All of these bad leaders from ISIS are leaving Mosul. Why can’t they do it quietly? Why can’t they do the attack, make it a sneak attack, and after the attack is made, inform the American public that we’ve knocked out the leaders, we’ve had a tremendous success? People leave. Why do they have to say, “We’re going to be attacking Mosul within the next four to six weeks,” which is what they’re saying. How stupid is our country? 470 | 471 | Raddatz: There are sometimes reasons the military does that. Psychological warfare. 472 | 473 | Trump: I can’t think of any. I can’t think of any. 474 | 475 | Raddatz: It might be to help get civilians out. 476 | 477 | Trump: And we have General Flynn and we have… Look, I have 200 generals and admirals who endorse me. 
I have 21 Congressional Medal of Honor recipients who endorse me. We talk about it all the time. They understand. Why can’t they do something secretively, where they go in and they knock out the leadership? Why would these people stay there? I’ve been reading now… 478 | 479 | Raddatz: Tell me what your strategy is. 480 | 481 | Trump: I’ve been reading now for weeks about Mosul, that it’s the harbor of where… Between Raqqa and Mosul, this is where they think the ISIS leaders are. Why would they be saying… They’re not staying there anymore, they’re gone, because everybody’s talking about how Iraq, which is us, with our leadership, goes in to fight Mosul. Now, with these 200 admirals and generals, they can’t believe it. All I say is this, General George Patton, General Douglas MacArthur are spinning in their grave at the stupidity of what we’re doing in the Middle East. 482 | 483 | Raddatz: I’m going to go to Secretary Clinton. Secretary Clinton, you want Assad to go. You advocated arming rebels, but it looks like that may be too late for Aleppo. You talk about diplomatic efforts, those have failed, ceasefires have failed. Would you introduce the threat of US military force beyond a no-fly zone against the Assad regime to back up diplom 484 | 485 | ? 486 | 487 | Clinton: I would not use American ground forces in Syria. I think that would be a very serious mistake. I don’t think American troops should be holding territory, which is what they would have to do as an occupying force. I don’t think that is a smart strategy. I do think the use of special forces, which we’re using, the use of enablers and trainers in Iraq, which has had some positive effects, are very much in our interests, and so I do support what is happening, but… 488 | 489 | Raddatz: What would you do differently than President Obama is doing? 490 | 491 | Clinton: Martha, I hope that by the time I… 492 | 493 | Trump: Everything. 
494 | 495 | Clinton: I hope by the time I am President, that we will have pushed ISIS out of Iraq. I do think that there is a good chance that we can take Mosul. And Donald says he knows more about ISIS than the generals; no, he doesn’t. There are a lot of very important planning going on, and some of it is to signal to the Sunnis in the area, as well as Kurdish Peshmerga fighters, that we all need to be in this, and that takes a lot of planning and preparation. I would go after Baghdadi. I would specifically target Baghdadi because I think our targeting of Al-Qaeda leaders, and I was involved in a lot of those operations, highly classified ones, made a difference. So, I think that could help. I would also consider arming the Kurds. The Kurds have been our best partners in Syria as well as Iraq. And I know there’s a lot of concern about that in some circles, but I think they should have the equipment they need so that Kurdish and Arab fighters on the ground are the principal way that we take Raqqa after pushing ISIS out of Iraq. 496 | 497 | Raddatz: Thank you very much. We’re gonna move on… 498 | 499 | Trump: You know it’s funny, she went over a minute over and you don’t stop her. When I go one second over, it’s like… 500 | 501 | Raddatz: You had many answers. 502 | 503 | Trump: It’s really very interesting. 504 | 505 | Cooper: We’ve got a question over here from James Carter. Mr. Carter? 506 | 507 | James Carter: My question is, do you believe you can be a devoted President to all the people in the United States? 508 | 509 | Cooper: That question begins for Mr. Trump. 510 | 511 | Trump: Absolutely. She calls our people deplorable, a large group, and irredeemable. 
I will be a president for all of our people, and I’ll be a president that will turn our inner cities around and will give strength to people and will give economics to people and will bring jobs back, because NAFTA, signed by her husband, is perhaps the greatest disaster trade deal in the history of the world. Not in this country. It’s stripped us of manufacturing jobs. We lost our jobs, we lost our money, we lost our plants. It is a disaster. And now, she wants to sign TPP, even though she says now she’s for it. She called it the gold standard. And by the way, at the last debate, she lied, because it turned out that she did say the gold standard, and she said she didn’t say it. They actually said that she lied. And she lied, but she’s lied about a lot of things. 512 | 513 | Trump: I would be a president for all of the people, African-Americans, the inner cities; devastating what’s happening to our inner cities. She’s been talking about it for years. As usual, she talks about it, nothing happens; she doesn’t get it done. Same with the Latino Americans, the Hispanic Americans, the same exact thing. They talk, they don’t get it done. You go into the inner cities and you see it’s 45% poverty; African-Americans now, 45% poverty in the inner cities. The education is a disaster. Jobs are essentially non-existent. And I’ve been saying at big speeches where I have 20,000 and 30,000 people, what do you have to lose? It can’t get any worse. And she’s been talking about the inner cities for 25 years. Nothing’s gonna ever happen. Let me tell you, if she’s president of the United States, nothing’s gonna happen. It’s just gonna be talk. And all of her friends, the taxes we were talking about, and I would just get it by osmosis. She’s not doing me favors. But by doing all the others favors, she’s doing me favors. 514 | 515 | Cooper: Mr. Trump, thank you. 516 | 517 | Trump: But I will tell you, she’s all talk, it doesn’t get done. 
All you have to do is take a look at her Senate run, take a look at upstate New York. 518 | 519 | Cooper: Your two minutes is up. Secretary Clinton? 520 | 521 | Trump: It turned out to be a disaster. 522 | 523 | Cooper: You have two minutes, Secretary Clinton. 524 | 525 | Clinton: Well, 67% of the people voted to re-elect me when I ran for my second term, and I was very proud and very humbled by that. Mr. Carter, I have tried my entire life to do what I can to support children and families. Right out of law school, I went to work for the Children’s Defense Fund and Donald talks a lot about the 30 years I’ve been in public service. I’m proud of that. I started off as a young lawyer working against discrimination against African-American children in schools and in the criminal justice system. I worked to make sure that kids with disabilities could get a public education, something that I care very much about. I have worked with Latinos. One of my first jobs in politics was down in South Texas, registering Latino citizens to be able to vote. 526 | 527 | Clinton: So I have a deep devotion, to use your absolutely correct word, to making sure that every American feels like he or she has a place in our country. And I think when you look at the letters that I get, a lot of people are worried that maybe they wouldn’t have a place in Donald Trump’s America. They write me, and one woman wrote me about her son, Felix. She adopted him from Ethiopia when he was a toddler. He’s 10 years old now, this is the only country he’s ever known. And he listens to Donald on TV and he said to his mother one day, “Will he send me back to Ethiopia if he gets elected?” 528 | 529 | Clinton: Children listen to what is being said, to go back to the very, very first question. And there’s a lot of fear that, in fact, teachers and parents are calling it ‘The Trump Effect’. Bullying is up, a lot of people are feeling uneasy, a lot of kids are expressing their concerns. 
So first and foremost, I will do everything I can to reach out to everybody. 530 | 531 | Cooper: Your time, Secretary Clinton. 532 | 533 | Clinton: Democrats, Republicans, Independents, people across our country. If you don’t vote for me, I still wanna be your president. 534 | 535 | Cooper: Your two minutes is up. 536 | 537 | Clinton: I wanna be the best president I can be for every American. 538 | 539 | Cooper: Secretary Clinton, your two minutes is up. I wanna follow up on something that Donald Trump actually said to you, a comment you made last month. You said that half of Donald Trump supporters are “deplorables”, racists, sexists, homophobic, xenophobic, Islamophobic. Later, you said you regretted saying half. You didn’t express regret for using the term “deplorables”. To Mr. Carter’s question, how can you unite a country if you’ve written off tens of millions of Americans? 540 | 541 | Clinton: Well, within hours, I said that I was sorry about the way I talked about that, because my argument is not with his supporters, it’s with him, and with the hateful and divisive campaign that he has run, and the inciting of violence at his rallies, and the very brutal kinds of comments about not just women, but all Americans, all kinds of Americans. And what he has said about African-Americans and Latinos, about Muslims, about POWs, about immigrants, about people with disabilities, he’s never apologized for. And so, I do think that a lot of the tone and tenor that he has said, I’m proud of the campaign that Bernie Sanders and I ran. We ran a campaign based on issues, not insults, and he is supporting me 100%. 542 | 543 | Cooper: Thank you. 544 | 545 | Clinton: Because we talked about what we wanted to do. We might’ve had some differences and we had a lot of debates. 546 | 547 | Cooper: Thank you, Secretary. 548 | 549 | Clinton: But we believed that we could make the country better, and I was proud of that. 
550 | 551 | Cooper: I’m gonna give you a minute to… 552 | 553 | Trump: We have a divided nation. We have a very divided nation. You look at Charlotte, you look at Baltimore, you look at the violence that’s taking place in the inner cities, Chicago. You take a look at Washington DC. We have a increase in murder within our cities, the biggest in 45 years. We have a divided nation because people like her, and believe me, she has tremendous hate in her heart. And when she said “deplorables”, she meant it. And when she said “irredeemable”, “They’re irredeemable.” You didn’t mention that, but when she said “they’re irredeemable”, to me, that might’ve been even worse. 554 | 555 | Cooper: She said some of them are… 556 | 557 | Trump: She’s got tremendous hatred, and this country cannot take another four years of Barack Obama, and that’s what you’re getting with her. 558 | 559 | Cooper: Mr. Trump, let me follow up with you. In 2008, you wrote in one of your books that the most important characteristic of a good leader is discipline. You said, if a leader doesn’t have it, “He or she won’t be one for very long.” In the days after the first debate, you sent out a series of tweets from 3:00 to 5:00 AM, including one that told people to check out a sex tape. Is that the discipline of a good leader? 560 | 561 | Trump: No, there wasn’t, “Check out a sex tape.” It was just, take a look at the person that she built up to be this wonderful girl scout, who is no girl scout. 562 | 563 | Cooper: You mentioned a sex tape. 564 | 565 | Trump: By the way, just so you understand, when she said 3:00 in the morning, take a look at Benghazi. She said, “Who’s gonna answer the call at 3:00 in the morning?” Guess what, she didn’t answer because when Ambassador Stevens… 566 | 567 | Cooper: The question is, “Is that the discipline of a good leader?” 568 | 569 | Trump: Wait a minute, Anderson. 600 times, where she said she was awake at 3:00 in the morning.
And she also sent a tweet out at 3:00 in the morning, but I won’t even mention that. But she said she’ll be awake. The famous thing, “We’re gonna answer our call at 3:00 in the morning.” Guess what happened? Ambassador Stevens sent 600 requests for help and the only one she talked to was Sidney Blumenthal, who’s her friend, and not a good guy, by the way. So, she shouldn’t be talking about that. Now, tweeting happens to be a modern day form of communication. You can like it or not like it. Between Facebook and Twitter, I have almost 25 million people. It’s a very effective way of communication, so you can put it down, but it is a very effective form of communication. I’m not un-proud of it, to be honest with you. 570 | 571 | Cooper: Secretary Clinton, does Mr. Trump have the discipline to be a good leader? 572 | 573 | Clinton: No. 574 | 575 | Trump: I’m shocked to hear that. 576 | 577 | [laughter] 578 | 579 | Clinton: Well, it’s not only my opinion. It’s the opinion of many others, national security experts, Republicans, former Republican members of Congress, but it’s in part because those of us who have had the great privilege of seeing this job up close, and know how difficult it is, and it’s not just because I watched my husband take a $300 billion deficit and turn it into a $200 billion surplus, and 23 million new jobs were created and incomes went up for everybody, everybody. African-American incomes went up 33%. And it’s not just because I worked with George W. Bush after 9/11. And I was very proud that when I told him what the city needed, what we needed to recover, he said, “You’ve got it,” and he never wavered. He stuck with me. And I have worked and I admire President Obama. He inherited the worst financial crisis since the Great Depression. That was a terrible time for our country. 580 | 581 | Cooper: We have to move along. 
582 | 583 | Clinton: Nine million people lost their jobs… 584 | 585 | Raddatz: Secretary Clinton, we have to… 586 | 587 | Clinton: Five million homes were lost and $13 trillion in family wealth was wiped out. We are back on the right track. He would send us back into recession with his tax plans that will benefit the wealthiest of Americans. 588 | 589 | Raddatz: Secretary Clinton, we are moving to an audience question. We’re almost out of time. 590 | 591 | Trump: We have the slowest growth since 1929. 592 | 593 | Raddatz: Mr. Trump, we’re moving to an audience question. Mr. Trump, Secretary Clinton, we wanna get to the audience. Thank you very much, both of you. 594 | 595 | [laughter] 596 | 597 | Raddatz: We have another audience question. Beth Miller has a question for both candidates. 598 | 599 | Beth Miller: Good evening. Perhaps the most important aspect of this election is the Supreme Court Justice. What would you prioritize as the most important aspect of selecting a Supreme Court Justice? 600 | 601 | Raddatz: We begin with your two minutes, Secretary Clinton. 602 | 603 | Clinton: Thank you. Well, you’re right. This is one of the most important issues in this election. I want to appoint Supreme Court Justices who understand the way the world really works, who have real life experience, who have not just been in a big law firm and maybe, clerked for a judge and then gotten on the bench. But maybe, they’d tried some more cases. They actually understand what people are up against, because I think the current Court has gone in the wrong direction. 604 | 605 | Clinton: And so, I would want to see the Supreme Court reverse Citizens United and get dark, unaccountable money out of our politics. Donald doesn’t agree with that. 
I would like the Supreme Court to understand that voting rights are still a big problem in many parts of our country, that we don’t always do everything we can to make it possible for people of color and older people and young people to be able to exercise their franchise. I want a Supreme Court that will stick with Roe v. Wade and a woman’s right to choose and I want a Supreme Court that will stick with marriage equality. Now, Donald has put forth the names of some people that he would consider, and among the ones that he has suggested are people who would reverse Roe v. Wade and reverse marriage equality. I think that would be a terrible mistake and would take us backwards. 606 | 607 | Clinton: I want a Supreme Court that doesn’t always side with corporate interests. I want a Supreme Court that understands because you’re wealthy and you can give more money to something doesn’t mean you have any more rights or should have any more rights than anybody else. So, I have very clear views about what I wanna see to change the balance on the Supreme Court. And I regret deeply that the Senate has not done its job and they have not permitted a vote on the person that President Obama, a highly qualified person, they’ve not given him a vote to be able to have the full complement of nine Supreme Court Justices. I think that was a dereliction of duty. I hope that they will see their way to doing it, but if I am so fortunate enough as to be President, I will immediately move to make sure that we fill that, we have nine justices that will get to work on behalf of our people. 608 | 609 | Raddatz: Thank you. You’re out of time. Mr. Trump. 610 | 611 | Trump: Justice Scalia, a great judge, died recently and we have a vacancy. I am looking to appoint judges very much in the mold of Justice Scalia. 
I’m looking for judges, and I’ve actually picked 20 of them, so that people would say, highly respected, highly thought of and actually, very beautifully reviewed by just about everybody, but people that will respect the Constitution of the United States. And I think that this is so important, also the Second Amendment, which is totally under siege by people like Hillary Clinton. They’ll respect the Second Amendment and what it stands for, what it represents, so important to me. 612 | 613 | Trump: Hillary mentioned something about contributions, just so you understand. I will have, in my race, more than $100 million put in, of my money, meaning I’m not taking all of this big money from all of these different corporations like she’s done. What I ask is this, I’m putting in more than… By the time it’s finished, I’ll have more than $100 million invested. Pretty much self-funding mine. We’re raising money for the Republican Party and we’re doing tremendously on the small donations, $61 average or so. 614 | 615 | Trump: I ask Hillary, why doesn’t she make $250 million by being in office? She used the power of her office to make a lot of money. Why isn’t she funding, not for $100 million, but why don’t you put $10 million or $20 million or $25 million or $30 million into your own campaign? It’s $30 million less for special interests that will tell you exactly what to do and it would really, I think, be a nice sign to the American public. Why aren’t you putting some money in? You have a lot of it. You’ve made a lot of it because of the fact that you’ve been in office. You made a lot of it while you were Secretary of State, actually. So, why aren’t you putting money into your own campaign? I’m just curious. 616 | 617 | Raddatz: Thank you very much. We’re gonna get to one more question. 618 | 619 | Clinton: But the question was about the Supreme Court, and I just wanna quickly say… 620 | 621 | Raddatz: Very quickly. 
622 | 623 | Clinton: I respect the Second Amendment, but I believe there should be comprehensive background checks, and we should close the gun show loophole, and close the online loophole. 624 | 625 | Cooper: Thank you. 626 | 627 | Raddatz: We have one more question. Mrs. Clinton. 628 | 629 | Clinton: And we should try to save as many lives as we possibly can. 630 | 631 | Cooper: We have one more question from Ken Bone about energy policy. Ken? 632 | 633 | Ken Bone: What steps will your energy policy take to meet our energy needs, while at the same time remaining environmentally friendly and minimizing job loss for fossil power plant workers? 634 | 635 | Cooper: Mr. Trump, two minutes. 636 | 637 | Trump: Absolutely. I think it’s such a great question, because energy is under siege by the Obama administration, under absolute siege. The EPA, Environmental Protection Agency, is killing these energy companies. And foreign companies are now coming in, buying so many of our different plants and then re-jiggering the plant, so that they can take care of their oil. We are killing, absolutely killing our energy business in this country. Now, I’m all for alternative forms of energy, including wind, including solar, etcetera. But we need much more than wind and solar. And you look at our miners, Hillary Clinton wants to put all the miners out of business. 638 | 639 | Trump: There is a thing called clean coal. Coal will last for 1,000 years in this country. Now, we have natural gas, and so many other things, because of technology. We have unbelievable… Over the last seven years, we have found tremendous wealth right under our feet. So good, especially when you have $20 trillion in debt. I will bring our energy companies back. They’ll be able to compete. They’ll make money. They’ll pay off our national debt. They’ll pay off our tremendous budget deficits, which are tremendous. But we are putting our energy companies out of business.
640 | 641 | Trump: We have to bring back our workers. You take a look at what’s happening to steel, and the cost of steel, and China dumping vast amounts of steel all over the United States, which essentially is killing our steel workers and our steel companies. We have to guard our energy companies. We have to make it possible. The EPA is so restrictive, that they are putting our energy companies out of business. And all you have to do is go to a great place like West Virginia, or places like Ohio, which is phenomenal, or places like Pennsylvania, and you see what they’re doing to the people, miners and others in the energy business. It’s a disgrace. 642 | 643 | Cooper: Your time is up. Thank you. 644 | 645 | Trump: It’s an absolute disgrace. 646 | 647 | Cooper: Secretary Clinton, two minutes. 648 | 649 | Clinton: Well, that was very interesting. First of all, China is illegally dumping steel in the United States and Donald Trump is buying it to build his buildings, putting steel workers and American steel plants out of business. That’s something that I fought against as a senator, and that I would have a trade prosecutor to make sure that we don’t get taken advantage of by China, on steel or anything else. You know, because it sounds like you’re in the business, or you’re aware of people in the business. You know that we are now, for the first time ever, energy independent. We are not dependent upon the Middle East, but the Middle East still controls a lot of the prices. So the price of oil has been way down, and that has had a damaging effect on a lot of the oil companies. We are, however, producing a lot of natural gas, which serves as a bridge to more renewable fuels, and I think that’s an important transition. 650 | 651 | Clinton: We’ve got to remain energy independent. It gives us much more power and freedom than to be worried about what goes on in the Middle East. We have enough worries over there without having to worry about that. 
I have a comprehensive energy policy, but it really does include fighting climate change, because I think that is a serious problem. And I support moving toward more clean, renewable energy as quickly as we can, because I think we can be the 21st century clean energy superpower, and create millions of new jobs and businesses. But I also wanna be sure that we don’t leave people behind. That’s why I’m the only candidate, from the very beginning of this campaign, who had a plan to help us revitalize coal country. Because those coal miners, and their fathers, and their grandfathers, they dug that coal out, a lot of them lost their lives. They were injured. But they turned the lights on and they powered our factories. I don’t wanna walk away from them. So we’ve gotta do something for them. 652 | 653 | Cooper: Secretary Clinton. 654 | 655 | Clinton: But the price of coal is down worldwide. So we have to look at this comprehensively. 656 | 657 | Cooper: Your time is up. 658 | 659 | Clinton: That’s exactly what I have proposed. I hope you’ll go to Hillaryclinton.com and look at my entire policy. 660 | 661 | Cooper: One more audience question. 662 | 663 | Raddatz: We’ve sneaked in one more question and it comes from Karl Becker. 664 | 665 | Karl Becker: Good evening. My question to both of you is, regardless of the current rhetoric, would either of you name one positive thing that you respect in one another. 666 | 667 | [applause] 668 | 669 | Raddatz: Mr. Trump, would you like to go first? 670 | 671 | Clinton: Well, I certainly will. Because I think that’s a very fair and important question. Look, I respect his children. His children are incredibly able and devoted, and I think that says a lot about Donald. I don’t agree with nearly anything else he says or does, but I do respect that. And I think that is something that as a mother and a grandmother is very important to me. 
672 | 673 | Clinton: So, I believe that this election has become in part, so conflict-oriented, so intense, because there’s a lot at stake. This is not an ordinary time, and this is not an ordinary election. We are going to be choosing a president who will set policy, not just for eight years, but because of some of the important decisions we have to make here at home, and around the world, from the Supreme Court to energy, and so much else, and so there is a lot at stake. It’s one of the most consequential elections that we’ve had. And that’s why I’ve tried to put forth specific policies and plans. Trying to get it off of the personal, and put it onto what it is I wanna do as president. And that’s why I hope people will check on that for themselves, so that they can see that, yes, I’ve spent 30 years, actually maybe a little more, working to help kids and families, and I wanna take all that experience to the White House and do that every single day. 674 | 675 | Raddatz: Mr. Trump. 676 | 677 | Trump: Well, I consider her statement about my children to be a very nice compliment. I don’t know if it was meant to be a compliment, but it is a great… I’m very proud of my children, and they’ve done a wonderful job, and they’ve been wonderful, wonderful kids, so I consider that a compliment. I will say this about Hillary. She doesn’t quit, she doesn’t give up. I respect that. I tell it like it is. She’s a fighter. I disagree with much of what she’s fighting for, I do disagree with her judgement in many cases. But she does fight hard, and she doesn’t quit, and she doesn’t give up, and I consider that to be a very good trait. 678 | 679 | Raddatz: Thanks to both of you. 680 | 681 | [applause] 682 | 683 | Cooper: I wanna thank both the candidates, wanna thank the university here. This concludes the town hall meeting. Our thanks to the candidates, the commission, Washington University, and to everybody who watched. 
684 | 685 | Raddatz: Please tune in on October 19th for the final Presidential Debate that will take place at the University of Nevada, Las Vegas. Goodnight, everyone. 686 | --------------------------------------------------------------------------------