├── wc.png
├── cloud.png
├── README.md
└── mywc.py


/wc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilkumarsingh/wordcloud-example/HEAD/wc.png


--------------------------------------------------------------------------------
/cloud.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilkumarsingh/wordcloud-example/HEAD/cloud.png


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# wordcloud-example
An example program for creating a wordcloud. For a walkthrough, refer to this [video](https://www.youtube.com/watch?v=95p3cVkqYHQ).

## What is a wordcloud?

An image composed of words used in a particular text or subject, in which the size of each word indicates its frequency or importance.

## Installation

Install [wordcloud](https://github.com/amueller/word_cloud) using pip:

```
pip install wordcloud
```

The **wikipedia** library is used for extracting Wikipedia articles on any given topic.
Install it using this pip command:
```
pip install wikipedia
```
## Usage

Run the Python script with a topic as its argument:

```
python mywc.py <query>
```

For example:

```
python mywc.py india
```

will create a wordcloud for the topic 'india', saved as `wc.png` in the script's directory, which looks like this:

![](https://raw.githubusercontent.com/nikhilkumarsingh/wordcloud-example/master/wc.png)

--------------------------------------------------------------------------------
/mywc.py:
--------------------------------------------------------------------------------
import sys
from os import path

import numpy as np
from PIL import Image
import wikipedia
from wordcloud import WordCloud, STOPWORDS

# get absolute path to script's directory
currdir = path.dirname(path.abspath(__file__))


def get_wiki(query):
    # get best matching title for given query
    results = wikipedia.search(query)
    if not results:
        raise ValueError("No Wikipedia article found for '%s'" % query)
    title = results[0]

    # get wikipedia page for selected title
    page = wikipedia.page(title)
    return page.content


def create_wordcloud(text):
    # create numpy array for wordcloud mask image
    mask = np.array(Image.open(path.join(currdir, "cloud.png")))

    # create set of stopwords (common words left out of the cloud)
    stopwords = set(STOPWORDS)

    # create wordcloud object
    wc = WordCloud(background_color="white",
                   max_words=200,
                   mask=mask,
                   stopwords=stopwords)

    # generate wordcloud
    wc.generate(text)

    # save wordcloud next to the script
    wc.to_file(path.join(currdir, "wc.png"))


if __name__ == "__main__":
    # get query; exit with a usage message if it is missing
    if len(sys.argv) < 2:
        sys.exit("Usage: python mywc.py <query>")
    query = sys.argv[1]

    # get text for given query
    text = get_wiki(query)

    # generate wordcloud
    create_wordcloud(text)

--------------------------------------------------------------------------------
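
The README describes a wordcloud as an image in which the size of each word indicates its frequency. The counting step behind that idea can be sketched with the standard library alone. This is only an illustration under stated assumptions, not how the wordcloud library itself works internally: `word_frequencies` and the small `STOP` set are hypothetical names introduced here, while `mywc.py` relies on `wordcloud.STOPWORDS` and `WordCloud.generate`.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword set for illustration only;
# mywc.py uses the much larger wordcloud.STOPWORDS instead.
STOP = {"the", "a", "an", "of", "and", "is", "in", "to"}


def word_frequencies(text, stopwords=STOP, max_words=200):
    """Return up to max_words (word, relative_frequency) pairs, most frequent first.

    The relative frequency is the word's count divided by the count of the
    most frequent word, so the top word always gets 1.0.
    """
    # lowercase the text and split it into runs of letters/apostrophes
    words = re.findall(r"[a-z']+", text.lower())

    # count every word that is not a stopword
    counts = Counter(w for w in words if w not in stopwords)
    if not counts:
        return []

    # keep the most frequent words and normalise by the largest count
    top = counts.most_common(max_words)
    peak = top[0][1]
    return [(word, count / peak) for word, count in top]


if __name__ == "__main__":
    sample = "India is a country. India is in Asia and India is large."
    for word, weight in word_frequencies(sample):
        print(word, round(weight, 2))
```

A renderer could then scale each word's font size by its relative frequency, which is why the most common non-stopword in an article dominates the resulting image.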