├── .gitignore
├── README.md
├── Trie.ipynb
├── __pycache__
│   └── trie.cpython-36.pyc
├── data
│   ├── for later
│   │   ├── DFD.txt
│   │   ├── HC.txt
│   │   ├── KHOV.txt
│   │   ├── KHSV.txt
│   │   └── TD.txt
│   ├── names.txt
│   ├── places.txt
│   ├── sea.txt
│   ├── seafreq.tsv
│   ├── villages.tsv
│   └── villages.txt
├── result
│   ├── segment_not_word.txt
│   └── segment_word.txt
└── sample_code
    ├── README.md
    ├── freqdict.py
    ├── parse_tsv_word.py
    ├── read_ch_in_word.py
    ├── test_model.py
    ├── train_model.py
    ├── trie.py
    ├── trie.pyc
    ├── word_segmentation.py
    └── word_segmentation_v2.py

/.gitignore:
--------------------------------------------------------------------------------
*.pkl
*.json
.ipynb_checkpoints/
.DS_Store
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/README.md
--------------------------------------------------------------------------------
/Trie.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/Trie.ipynb
--------------------------------------------------------------------------------
/__pycache__/trie.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/__pycache__/trie.cpython-36.pyc
--------------------------------------------------------------------------------
/data/for later/DFD.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/for later/DFD.txt
--------------------------------------------------------------------------------
/data/for later/HC.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/for later/HC.txt
--------------------------------------------------------------------------------
/data/for later/KHOV.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/for later/KHOV.txt
--------------------------------------------------------------------------------
/data/for later/KHSV.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/for later/KHSV.txt
--------------------------------------------------------------------------------
/data/for later/TD.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/for later/TD.txt
--------------------------------------------------------------------------------
/data/names.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/names.txt
--------------------------------------------------------------------------------
/data/places.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/places.txt
--------------------------------------------------------------------------------
/data/sea.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/sea.txt
--------------------------------------------------------------------------------
/data/seafreq.tsv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/seafreq.tsv
--------------------------------------------------------------------------------
/data/villages.tsv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/villages.tsv
--------------------------------------------------------------------------------
/data/villages.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/data/villages.txt
--------------------------------------------------------------------------------
/result/segment_not_word.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/result/segment_not_word.txt
--------------------------------------------------------------------------------
/result/segment_word.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/result/segment_word.txt
--------------------------------------------------------------------------------
/sample_code/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/README.md
--------------------------------------------------------------------------------
/sample_code/freqdict.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/freqdict.py
--------------------------------------------------------------------------------
/sample_code/parse_tsv_word.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/parse_tsv_word.py
--------------------------------------------------------------------------------
/sample_code/read_ch_in_word.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/read_ch_in_word.py
--------------------------------------------------------------------------------
/sample_code/test_model.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/test_model.py
--------------------------------------------------------------------------------
/sample_code/train_model.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/train_model.py
--------------------------------------------------------------------------------
/sample_code/trie.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/trie.py
--------------------------------------------------------------------------------
/sample_code/trie.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/trie.pyc
--------------------------------------------------------------------------------
/sample_code/word_segmentation.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/word_segmentation.py
--------------------------------------------------------------------------------
/sample_code/word_segmentation_v2.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/RathanakSreang/KhmerWordSegmentation/HEAD/sample_code/word_segmentation_v2.py
--------------------------------------------------------------------------------