├── A_few_words_linux.txt ├── LICENSE ├── README.md ├── dot.1.pdf ├── info.txt ├── test.py ├── top5000.pdf ├── top5000chaos.pdf ├── top5000radial.pdf ├── top5000rank.pdf ├── top5000scale.pdf ├── trie_cmd.py ├── trie_dot.gv.pdf └── trie_oop.py /A_few_words_linux.txt: -------------------------------------------------------------------------------- 1 | Acacia 2 | acacia 3 | Acacian 4 | acacias 5 | acaciin 6 | acacin 7 | acacine 8 | acad 9 | academe 10 | academes 11 | academia 12 | academial 13 | academian 14 | academias 15 | Academic 16 | academic 17 | academical 18 | academically 19 | academicals 20 | academician 21 | academicians 22 | academicianship 23 | academicism 24 | academics 25 | academie 26 | academies 27 | academise 28 | academised 29 | academising 30 | academism 31 | academist 32 | academite 33 | academization 34 | academize 35 | academized 36 | academizing 37 | Academus 38 | Academy 39 | academy 40 | Acadia 41 | acadia 42 | acadialite 43 | Acadian 44 | acadian 45 | Acadie 46 | Acaena 47 | acaena 48 | acajou 49 | acajous 50 | -acal 51 | acalculia 52 | acale 53 | acaleph 54 | Acalepha 55 | acalepha 56 | Acalephae 57 | acalephae 58 | acalephan 59 | acalephe 60 | acalephes 61 | acalephoid 62 | acalephs 63 | Acalia 64 | acalycal 65 | acalycine 66 | acalycinous 67 | acalyculate 68 | Acalypha 69 | Acalypterae 70 | Acalyptrata 71 | Acalyptratae 72 | acalyptrate 73 | Acamar 74 | Acamas 75 | Acampo 76 | acampsia 77 | acana 78 | acanaceous 79 | acanonical 80 | acanth 81 | acanth- 82 | acantha 83 | Acanthaceae 84 | acanthaceous 85 | acanthad 86 | Acantharia 87 | acanthi 88 | Acanthia 89 | acanthial 90 | acanthin 91 | acanthine 92 | acanthion 93 | acanthite 94 | acantho- 95 | acanthocarpous 96 | Acanthocephala 97 | acanthocephalan 98 | Acanthocephali 99 | acanthocephalous 100 | Acanthocereus -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The 
MIT License (MIT) 2 | 3 | Copyright (c) 2015 Sergiu Mo 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # trie-python-graphviz 2 | This is an Object Oriented implementation of a Trie in python. The class contains setter and getter methods, and implements several useful functionalities. In addition, the class can generate graphviz dot code, which can be used to visualize the Trie. 
3 | 4 | Instructions: 5 | - Use Sublime Text (or your favorite Python IDE) to edit / execute the code 6 | - Install the graphviz Python package (https://pypi.python.org/pypi/graphviz) 7 | - Install Graphviz (graphviz.org) and add it to PATH 8 | - Use the light and capable Sumatra PDF reader to view your graphs (http://www.sumatrapdfreader.org/free-pdf-reader.html) 9 | 10 | Credits: 11 | - I thank www.wordfrequency.info for providing the list with the top 5000 words of the English language -------------------------------------------------------------------------------- /dot.1.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/smosanu/trie-python-graphviz/025363dd661c7123d745996cd26423207f9a57e0/dot.1.pdf -------------------------------------------------------------------------------- /info.txt: -------------------------------------------------------------------------------- 1 | Files: 2 | trie_oop.py - description and implementation of the classes Trie and Node 3 | trie_cmd.py - execution example 4 | test.py - a short test 5 | A_few_words_linux.txt - example input file with a set of words 6 | 7 | To get the graphviz visualization working, follow the instructions described here: 8 | https://pypi.python.org/pypi/graphviz 9 | Compiling the dot code from Python requires the dot executable to be on the system's PATH. 10 | 11 | You can also compile the dot code from the command line, for example: 12 | dot -Tpdf -o output.pdf top5000.gv 13 | 14 | Several customisations will result in a more advanced visualization. Please see the graphviz documentation (file dot.1.pdf or online) for instructions. 
Here are some configurations I used: 15 | 16 | //layout=twopi 17 | //ranksep="10 20 15 10 5 5 4 4 3 3 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1" 18 | //nodesep=10 19 | //splines=ortho 20 | //root=0 21 | //overlap=false -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | from trie_oop import Trie 2 | 3 | trie = Trie() 4 | 5 | file_in = 'dict.txt' 6 | f = open(file_in) 7 | for line in f: 8 | trie.addWord(line.strip()) 9 | f.close() 10 | 11 | trie.addWord('doughnut') 12 | trie.addWord('donut') 13 | trie.addWord('donald') 14 | trie.addWord('domino') 15 | trie.addWord('dominion') 16 | 17 | 18 | searched_word = 'second' 19 | print('found "', searched_word, '":', trie.hasWord(searched_word)) 20 | searched_word = 'time' 21 | print('found "', searched_word, '":', trie.hasWord(searched_word)) 22 | searched_word = 'domino' 23 | print('found "', searched_word, '":', trie.hasWord(searched_word)) 24 | searched_word = 'dominique' 25 | print('found "', searched_word, '":', trie.hasWord(searched_word)) 26 | 27 | print() 28 | print('#words:', trie.countWords()) 29 | print('size:', trie.getSize()) 30 | print('#letters:', trie.countLetters()) 31 | print('compression:', trie.compression()) 32 | 33 | 34 | print() 35 | trie.display() 36 | trie.diagram('test', True) 37 | -------------------------------------------------------------------------------- /top5000.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/smosanu/trie-python-graphviz/025363dd661c7123d745996cd26423207f9a57e0/top5000.pdf -------------------------------------------------------------------------------- /top5000chaos.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/smosanu/trie-python-graphviz/025363dd661c7123d745996cd26423207f9a57e0/top5000chaos.pdf 
-------------------------------------------------------------------------------- /top5000radial.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/smosanu/trie-python-graphviz/025363dd661c7123d745996cd26423207f9a57e0/top5000radial.pdf -------------------------------------------------------------------------------- /top5000rank.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/smosanu/trie-python-graphviz/025363dd661c7123d745996cd26423207f9a57e0/top5000rank.pdf -------------------------------------------------------------------------------- /top5000scale.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/smosanu/trie-python-graphviz/025363dd661c7123d745996cd26423207f9a57e0/top5000scale.pdf -------------------------------------------------------------------------------- /trie_cmd.py: -------------------------------------------------------------------------------- 1 | from trie_oop import Trie 2 | 3 | trie = Trie() 4 | 5 | file_in = 'top5000.txt' 6 | f = open(file_in) 7 | for line in f: 8 | trie.addWord(line.strip()) 9 | f.close() 10 | 11 | searched_word = 'nationwide' 12 | print('found "', searched_word, '":', trie.hasWord(searched_word)) 13 | searched_word = 'statistical' 14 | print('found "', searched_word, '":', trie.hasWord(searched_word)) 15 | searched_word = 'domino' 16 | print('found "', searched_word, '":', trie.hasWord(searched_word)) 17 | searched_word = 'dominique' 18 | print('found "', searched_word, '":', trie.hasWord(searched_word)) 19 | 20 | print() 21 | print('#words:', trie.countWords()) 22 | print('size:', trie.getSize()) 23 | print('#letters:', trie.countLetters()) 24 | print('compression:', trie.compression()) 25 | 26 | 27 | print() 28 | trie.display() 29 | trie.diagram('top5000', True) 30 | 
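Note on the statistics printed by trie_cmd.py: the listing of trie_oop.py defines compression as nrLetters / size, that is, the total number of letters passed to addWord divided by the number of non-root nodes created. The sketch below reproduces those two numbers without building a trie, using the fact that every distinct non-empty prefix corresponds to exactly one node. The function name trie_stats is hypothetical and is not part of the repository.

```python
def trie_stats(words):
    """Return (letters, nodes, ratio) for a trie that would hold `words`.

    Hypothetical helper for illustration only; not part of trie_oop.py.
    """
    # Total letters fed in, counting repeated words each time they appear.
    letters = sum(len(w) for w in words)
    # Each distinct non-empty prefix maps to exactly one non-root trie node.
    prefixes = {w[:i] for w in words for i in range(1, len(w) + 1)}
    nodes = len(prefixes)
    # Guard against an empty word list to avoid dividing by zero.
    ratio = letters / nodes if nodes else 0.0
    return letters, nodes, ratio


letters, nodes, ratio = trie_stats(['domino', 'dominion', 'donut'])
print('#letters:', letters)
print('size:', nodes)
print('compression:', ratio)
```

A ratio above 1 means that shared prefixes are saving nodes; a ratio of exactly 1 means no two words share a first letter.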
-------------------------------------------------------------------------------- /trie_dot.gv.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/smosanu/trie-python-graphviz/025363dd661c7123d745996cd26423207f9a57e0/trie_dot.gv.pdf -------------------------------------------------------------------------------- /trie_oop.py: -------------------------------------------------------------------------------- 1 | class Trie(object): 2 | """ root node """ 3 | 4 | # empty constructor 5 | def __init__(self): 6 | self.root = Node('*') 7 | self.nrWords = 0 8 | self.nrLetters = 0 9 | self.size = 0 10 | 11 | # delegate to Node.addWord and update the counters 12 | def addWord(self, word): 13 | self.nrWords = self.nrWords + 1 14 | self.nrLetters = self.nrLetters + len(word) 15 | self.size = self.size + self.root.addWord(word) 16 | return True 17 | 18 | # delegate to Node.hasWord 19 | def hasWord(self, word): 20 | return self.root.hasWord(word) 21 | 22 | # print all information method 23 | def display(self): 24 | # print(self.size) 25 | self.root.display() 26 | 27 | # getter method for nrWords 28 | def countWords(self): 29 | return self.nrWords 30 | 31 | # getter method for nrLetters 32 | def countLetters(self): 33 | return self.nrLetters 34 | 35 | # getter method for size 36 | def getSize(self): 37 | return self.size 38 | 39 | # compute compression (letters added per node); 0.0 for an empty trie 40 | def compression(self): 41 | return self.nrLetters / self.size if self.size else 0.0 42 | 43 | # create graphviz Digraph 44 | def diagram(self, filename, render): 45 | from graphviz import Digraph 46 | from queue import Queue 47 | diagram = Digraph(comment='The Trie') 48 | 49 | i = 0 50 | 51 | diagram.attr('node', fontsize='4') 52 | diagram.attr('node', height='0.1') 53 | diagram.attr('node', width='0.1') 54 | diagram.attr('node', fixedsize='true') 55 | diagram.attr('node', shape='circle') 56 | diagram.attr('edge', arrowsize='0.3') 57 | diagram.node(str(i), self.root.getValue()) 58 
| 59 | q = Queue() 60 | q.put((self.root, i)) 61 | 62 | while not q.empty(): 63 | 64 | node, parent_index = q.get() 65 | 66 | for child in node.getChildren(): 67 | #print('current parent: ', node.getValue(), parent_index) 68 | i += 1 69 | #print('current child: ', child.getValue(), i) 70 | if child.getEnding(): 71 | diagram.attr('node', shape='diamond') 72 | diagram.node(str(i), child.getValue()) 73 | diagram.attr('node', shape='circle') 74 | else: 75 | diagram.node(str(i), child.getValue()) 76 | diagram.edge(str(parent_index), str(i)) 77 | q.put((child, i)) 78 | 79 | o = open(filename + '.gv', 'w') 80 | o.write(diagram.source) 81 | # print(diagram.source) 82 | if render: 83 | diagram.render(filename + '.gv', view=False) 84 | o.close() 85 | 86 | 87 | class Node(object): 88 | """ any node """ 89 | 90 | # empty constructor 91 | def __init__(self, val): 92 | self.value = val 93 | self.children = [] 94 | self.ending = False 95 | 96 | def __str__(self): 97 | return str(self.value) 98 | 99 | # setter method for value 100 | def setValue(self, val): 101 | self.value = val 102 | 103 | # adds a child 104 | def add_child(self, child): 105 | self.children.append(child) 106 | 107 | # remove a child 108 | def rem_child(self, child): 109 | if self.contains_child(child): 110 | self.children.remove(child) 111 | 112 | # setter method for ending 113 | def setEnding(self, T_or_F): 114 | self.ending = T_or_F 115 | 116 | # check if a letter of interest is already a child 117 | def contains_child(self, letter): 118 | if letter in self.children: 119 | return True 120 | return False 121 | 122 | # getter method for value 123 | def getValue(self): 124 | return self.value 125 | 126 | # getter method for children 127 | def getChildren(self): 128 | return self.children 129 | 130 | # getter method for ending 131 | def getEnding(self): 132 | return self.ending 133 | 134 | def addWord(self, word): 135 | size_increment = 0 136 | if word == '': 137 | self.setEnding(True) 138 | return size_increment 
# stops the recursion 139 | result = None 140 | for child in self.children: 141 | if child.value == word[0]: 142 | result = child 143 | return size_increment + result.addWord(word[1:]) 144 | if result is None: 145 | new_child = Node(word[0]) 146 | self.add_child(new_child) 147 | size_increment = size_increment + 1 148 | return size_increment + new_child.addWord(word[1:]) 149 | return size_increment 150 | 151 | # checks if word is in the Trie 152 | def hasWord(self, word): 153 | if word == '': 154 | if self.getEnding(): 155 | return True 156 | else: 157 | return False 158 | for child in self.children: 159 | # print(child.value) 160 | if child.value == word[0]: 161 | return child.hasWord(word[1:]) 162 | return False 163 | 164 | # prints object 165 | def display(self): 166 | if self.getEnding(): 167 | ending = '_' 168 | else: 169 | ending = ' ' 170 | print(ending, self.value, ending) 171 | 172 | child_nodes = self.getChildren() 173 | for node in child_nodes: 174 | node.display() 175 | 176 | return True 177 | --------------------------------------------------------------------------------
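Note on a possible variation: Node.addWord and Node.hasWord above scan the children list for each letter and recurse on word[1:], which copies the rest of the word at every step. The sketch below is one alternative, not the repository's implementation. It keeps children in a dict keyed by letter and walks the word iteratively. The names DictNode, DictTrie, add_word and has_word are hypothetical and chosen only for this illustration.

```python
class DictNode:
    """Hypothetical trie node that indexes its children by letter."""

    def __init__(self):
        self.children = {}   # letter -> DictNode
        self.ending = False  # True if a word ends at this node


class DictTrie:
    """Hypothetical iterative trie; a sketch, not the repository's Trie."""

    def __init__(self):
        self.root = DictNode()
        self.size = 0  # number of non-root nodes created so far

    def add_word(self, word):
        node = self.root
        for letter in word:
            # Create the child only if this prefix has not been seen yet.
            if letter not in node.children:
                node.children[letter] = DictNode()
                self.size += 1
            node = node.children[letter]
        node.ending = True

    def has_word(self, word):
        node = self.root
        for letter in word:
            node = node.children.get(letter)
            if node is None:
                return False
        # A prefix of a stored word only counts if a word ends here.
        return node.ending


trie = DictTrie()
for w in ['domino', 'dominion', 'donut']:
    trie.add_word(w)
print(trie.has_word('domino'), trie.has_word('dom'), trie.size)
```

The dict lookup avoids the linear scan over siblings, and the loop avoids both the recursion depth limit on very long words and the repeated string slicing.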