├── .gitignore
├── CODE_OF_CONDUCT.md
├── LICENSE
├── README.md
├── SECURITY.md
├── Train_Custom_LID.md
├── classifiers
    └── HiEn.classifier
├── config.ini
├── dictionaries
    ├── dict1bigr.txt
    ├── dict1coca.txt
    ├── dict1goog10k.txt
    ├── dict1hi.txt
    ├── dict1hinmov.txt
    └── dict1text.txt
├── getLanguage.py
├── images
    ├── dictionary_structure.PNG
    ├── info_flow_new_lid.PNG
    ├── langIdentify_input.PNG
    └── langIdentify_output.PNG
├── sampleinp.txt
├── sampleinp.txt_tagged
├── sampleoutp.txt
├── tests
    ├── Adversarial_FIRE_2015_Sentiment_Analysis_25.txt
    ├── FIRE_2015_Sentiment_Analysis_25.txt
    └── test_sample_20.txt
├── tmp
    ├── temp_testFile.txt
    ├── temp_testFile.txt.features
    └── temp_testFile.txt.out
└── utils
    ├── __init__.py
    ├── extractFeatures.py
    └── generateLanguageTags.py


/.gitignore:
--------------------------------------------------------------------------------
*.pyc
*Zone.Identifier
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/CODE_OF_CONDUCT.md
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/README.md
--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/SECURITY.md
--------------------------------------------------------------------------------
/Train_Custom_LID.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/Train_Custom_LID.md
--------------------------------------------------------------------------------
/classifiers/HiEn.classifier:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/classifiers/HiEn.classifier
--------------------------------------------------------------------------------
/config.ini:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/config.ini
--------------------------------------------------------------------------------
/dictionaries/dict1bigr.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/dictionaries/dict1bigr.txt
--------------------------------------------------------------------------------
/dictionaries/dict1coca.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/dictionaries/dict1coca.txt
--------------------------------------------------------------------------------
/dictionaries/dict1goog10k.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/dictionaries/dict1goog10k.txt
--------------------------------------------------------------------------------
/dictionaries/dict1hi.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/dictionaries/dict1hi.txt
--------------------------------------------------------------------------------
/dictionaries/dict1hinmov.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/dictionaries/dict1hinmov.txt
--------------------------------------------------------------------------------
/dictionaries/dict1text.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/dictionaries/dict1text.txt
--------------------------------------------------------------------------------
/getLanguage.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/getLanguage.py
--------------------------------------------------------------------------------
/images/dictionary_structure.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/images/dictionary_structure.PNG
--------------------------------------------------------------------------------
/images/info_flow_new_lid.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/images/info_flow_new_lid.PNG
--------------------------------------------------------------------------------
/images/langIdentify_input.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/images/langIdentify_input.PNG
--------------------------------------------------------------------------------
/images/langIdentify_output.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/images/langIdentify_output.PNG
--------------------------------------------------------------------------------
/sampleinp.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/sampleinp.txt
--------------------------------------------------------------------------------
/sampleinp.txt_tagged:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/sampleinp.txt_tagged
--------------------------------------------------------------------------------
/sampleoutp.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/sampleoutp.txt
--------------------------------------------------------------------------------
/tests/Adversarial_FIRE_2015_Sentiment_Analysis_25.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/tests/Adversarial_FIRE_2015_Sentiment_Analysis_25.txt
--------------------------------------------------------------------------------
/tests/FIRE_2015_Sentiment_Analysis_25.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/tests/FIRE_2015_Sentiment_Analysis_25.txt
--------------------------------------------------------------------------------
/tests/test_sample_20.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/tests/test_sample_20.txt
--------------------------------------------------------------------------------
/tmp/temp_testFile.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/tmp/temp_testFile.txt
--------------------------------------------------------------------------------
/tmp/temp_testFile.txt.features:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/tmp/temp_testFile.txt.features
--------------------------------------------------------------------------------
/tmp/temp_testFile.txt.out:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/tmp/temp_testFile.txt.out
--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/utils/extractFeatures.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/utils/extractFeatures.py
--------------------------------------------------------------------------------
/utils/generateLanguageTags.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LID-tool/HEAD/utils/generateLanguageTags.py
--------------------------------------------------------------------------------