├── .github ├── ISSUE_TEMPLATE │ ├── bug_report.md │ └── feature_request.md └── workflows │ └── pythonpackage.yml ├── .gitignore ├── .travis.yml ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── ISSUE_TEMPLATE.md ├── LICENSE ├── MANIFEST.in ├── PULL_REQUEST_TEMPLATE.md ├── README.md ├── __init__.py ├── emot ├── __init__.py ├── core.py ├── emo_unicode.py ├── pattern_generator.py └── test.py ├── index.rst ├── output.html ├── pyproject.toml ├── requirements.txt ├── setup.cfg ├── setup.py └── test ├── __init__.py └── test.py /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | 5 | --- 6 | 7 | **Describe the bug** 8 | A clear and concise description of what the bug is. 9 | 10 | **To Reproduce** 11 | Steps to reproduce the behavior: 12 | 1. Go to '...' 13 | 2. Click on '....' 14 | 3. Scroll down to '....' 15 | 4. See error 16 | 17 | **Expected behavior** 18 | A clear and concise description of what you expected to happen. 19 | 20 | **Screenshots** 21 | If applicable, add screenshots to help explain your problem. 22 | 23 | **Desktop (please complete the following information):** 24 | - OS: [e.g. iOS] 25 | - Browser [e.g. chrome, safari] 26 | - Version [e.g. 22] 27 | 28 | **Smartphone (please complete the following information):** 29 | - Device: [e.g. iPhone6] 30 | - OS: [e.g. iOS8.1] 31 | - Browser [e.g. stock browser, safari] 32 | - Version [e.g. 22] 33 | 34 | **Additional context** 35 | Add any other context about the problem here. 36 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | 5 | --- 6 | 7 | **Is your feature request related to a problem? 
Please describe.** 8 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 9 | 10 | **Describe the solution you'd like** 11 | A clear and concise description of what you want to happen. 12 | 13 | **Describe alternatives you've considered** 14 | A clear and concise description of any alternative solutions or features you've considered. 15 | 16 | **Additional context** 17 | Add any other context or screenshots about the feature request here. 18 | -------------------------------------------------------------------------------- /.github/workflows/pythonpackage.yml: -------------------------------------------------------------------------------- 1 | name: Python package 2 | 3 | on: [push] 4 | 5 | jobs: 6 | build: 7 | 8 | runs-on: ubuntu-latest 9 | strategy: 10 | max-parallel: 4 11 | matrix: 12 | python-version: [3.5, 3.6, 3.7] 13 | 14 | steps: 15 | - uses: actions/checkout@v1 16 | - name: Set up Python ${{ matrix.python-version }} 17 | uses: actions/setup-python@v1 18 | with: 19 | python-version: ${{ matrix.python-version }} 20 | - name: Install dependencies 21 | run: | 22 | python -m pip install --upgrade pip 23 | pip install -r requirements.txt 24 | - name: Lint with flake8 25 | run: | 26 | pip install flake8 27 | # stop the build if there are Python syntax errors or undefined names 28 | flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics 29 | # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide 30 | flake8 . 
--count --exit-zero --max-complexity=10 --max-line-length=127 --statistics 31 | - name: Test with pytest 32 | run: | 33 | pip install pytest 34 | pytest 35 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | env/ 12 | build/ 13 | develop-eggs/ 14 | dist/ 15 | downloads/ 16 | eggs/ 17 | .eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | 49 | # Translations 50 | *.mo 51 | *.pot 52 | 53 | # Django stuff: 54 | *.log 55 | local_settings.py 56 | 57 | # Flask stuff: 58 | instance/ 59 | .webassets-cache 60 | 61 | # Scrapy stuff: 62 | .scrapy 63 | 64 | # Sphinx documentation 65 | docs/_build/ 66 | 67 | # PyBuilder 68 | target/ 69 | 70 | # Jupyter Notebook 71 | .ipynb_checkpoints 72 | 73 | # pyenv 74 | .python-version 75 | 76 | # celery beat schedule file 77 | celerybeat-schedule 78 | 79 | # SageMath parsed files 80 | *.sage.py 81 | 82 | # dotenv 83 | .env 84 | 85 | # virtualenv 86 | .venv 87 | venv/ 88 | ENV/ 89 | 90 | # Spyder project settings 91 | .spyderproject 92 | .spyproject 93 | 94 | # Rope project settings 95 | .ropeproject 96 | 97 | # mkdocs documentation 98 | /site 99 | 100 | # mypy 101 | .mypy_cache/ 102 | 
-------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation. 6 | 7 | ## Our Standards 8 | 9 | Examples of behavior that contributes to creating a positive environment include: 10 | 11 | * Using welcoming and inclusive language 12 | * Being respectful of differing viewpoints and experiences 13 | * Gracefully accepting constructive criticism 14 | * Focusing on what is best for the community 15 | * Showing empathy towards other community members 16 | 17 | Examples of unacceptable behavior by participants include: 18 | 19 | * The use of sexualized language or imagery and unwelcome sexual attention or advances 20 | * Trolling, insulting/derogatory comments, and personal or political attacks 21 | * Public or private harassment 22 | * Publishing others' private information, such as a physical or electronic address, without explicit permission 23 | * Other conduct which could reasonably be considered inappropriate in a professional setting 24 | 25 | ## Our Responsibilities 26 | 27 | Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. 
28 | 29 | Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. 30 | 31 | ## Scope 32 | 33 | This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. 34 | 35 | ## Enforcement 36 | 37 | Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at neelknightme@gmail.com. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. 38 | 39 | Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. 
40 | 41 | ## Attribution 42 | 43 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version] 44 | 45 | [homepage]: http://contributor-covenant.org 46 | [version]: http://contributor-covenant.org/version/1/4/ 47 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | 1) Open an issue. 2 | 2) The issue stays open for community support and possible solutions. 3 | 3) Once the problem is solved, the issue will be closed. 4 | -------------------------------------------------------------------------------- /ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | Please read the contributing guidelines carefully. 2 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright © 2022 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 6 | 7 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 8 | 9 | THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 10 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include LICENSE 2 | include MANIFEST.in 3 | include README.md 4 | include setup.py 5 | include setup.cfg 6 | -------------------------------------------------------------------------------- /PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | # If you want to add emoji or emoticons: 2 | 3 | ## Title: Adding/Modifying - emoji/emoticons 4 | - Add new emoji with their Unicode values, following the same standard the library currently uses. 5 | - Add new emoticons in the same format as the existing ones. 6 | - Run the tests locally and show that they work as expected. 7 | - Provide the Python version and OS version. 8 | - Open a pull request. 9 | - Assign the "enhancement" tag. 10 | 11 | # If you want to submit an issue: 12 | 13 | ## Title: Error name 14 | 15 | ### Description: 16 | 17 | - Explain error: Details about the error 18 | - Code: If the code can reproduce the error, it will be easier for us to track it down. 19 | - Error output: What the error is 20 | - OS: Which operating system is used. 21 | - Python version: Python version 22 | - Suggestion: Any suggestion you have about the code or the error 23 | - Extra info: If you want to add that feature or give extra information. 
24 | 25 | 26 | 27 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![Downloads](http://pepy.tech/badge/emot)](http://pepy.tech/project/emot) [![GitHub issues](https://img.shields.io/github/issues/NeelShah18/emot)](https://github.com/NeelShah18/emot/issues) [![GitHub forks](https://img.shields.io/github/forks/NeelShah18/emot)](https://github.com/NeelShah18/emot/network) [![GitHub stars](https://img.shields.io/github/stars/NeelShah18/emot)](https://github.com/NeelShah18/emot/stargazers) [![GitHub license](https://img.shields.io/github/license/NeelShah18/emot)](https://github.com/NeelShah18/emot/blob/master/LICENSE) 2 | 3 | Description of the emot:3.1 library 4 | =============================== 5 | 6 | Emot is a Python library that extracts emojis and emoticons from a 7 | text (string). All the emojis and emoticons are taken from reliable 8 | sources; details are listed in the Sources section. 9 | 10 | The motto of the Emot 3.1 release is: a high-performance detection library for data science, especially for large-scale text datasets. 11 | 12 | Emot uses advanced dynamic pattern generation: every time you create an object, it generates the patterns 13 | from the database (emo_unicode.py). You can add to, delete from, or modify that file in the library to produce different patterns. 14 | 15 | Version 3.0 provides more options, such as bulk processing. It is useful when you have a long list of sentences or words 16 | and want to use multiprocessing to speed up the process. 17 | 18 | This means you can dynamically create patterns for emojis or emoticons and run detection across multiple processes to get the most 19 | out of multiple cores. 20 | 21 | Again, I am thankful for all support and help from the community around the world. I hope this will help and make your 22 | life easier. 
23 | 24 | Compatibility 25 | ------------- 26 | 27 | Version 3.0 supports only Python 3.x. 
28 | 29 | Python 2.X is no longer supported. 30 | 31 | Working 32 | ------- 33 | 34 | The Emot library takes a string or a list of strings as input and returns a dictionary or a list of dictionaries. 35 | 36 | There is one class, named emot, containing four methods. 37 | 38 | emot.emoji: 39 | 40 | - Input: It takes one input: string 41 | - Output: It returns a dictionary with 4 keys: dict 42 | - value = list of emojis 43 | - location = list of [start, end] positions of the emojis 44 | - mean = list of meanings 45 | - flag = True/False. False means the library didn't find anything; True means it found something. 46 | 47 | emot.emoticons 48 | 49 | - Input: It takes one input: string 50 | - Output: It returns a dictionary with 4 keys: dict 51 | - value = list of emoticons 52 | - location = list of [start, end] positions of the emoticons 53 | - mean = list of meanings 54 | - flag = True/False. False means the library didn't find anything; True means it found something. 55 | 56 | emot.bulk_emoji 57 | 58 | - Input: Two inputs: a list of strings and the CPU cores pool size: list[], int 59 | - By default the CPU cores pool size is half of the total available cores: int(multiprocessing.cpu_count()/2) 60 | - Output: It returns a **list of dictionaries**, each with 4 keys: list of dict 61 | - value = list of emojis 62 | - location = list of [start, end] positions of the emojis 63 | - mean = list of meanings 64 | - flag = True/False. False means the library didn't find anything; True means it found something. 65 | 66 | emot.bulk_emoticons 67 | 68 | - Input: Two inputs: a list of strings and the CPU cores pool size: list[], int 69 | - By default the CPU cores pool size is half of the total available cores: int(multiprocessing.cpu_count()/2) 70 | - Output: It returns a **list of dictionaries**, each with 4 keys: list of dict 71 | - value = list of emoticons 72 | - location = list of [start, end] positions of the emoticons 73 | - mean = list of meanings 74 | - flag = True/False. 
False means the library didn't find anything; True means it found something. 
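The `value`, `location` and `mean` lists in a result are parallel, so they can be paired up index by index. The following is a minimal sketch that assumes only the dictionary layout described in this section; `pair_matches` is a hypothetical helper, not part of emot, and the sample dictionary is hand-written in the shape of the `emoticons()` output rather than produced by calling the library.

```python
def pair_matches(result):
    """Zip the parallel 'value', 'location' and 'mean' lists into records.

    Hypothetical helper, not part of the emot API. `result` is assumed
    to have the four keys documented for emot.emoji / emot.emoticons.
    """
    # 'flag' is False when nothing was found, so return early.
    if not result.get("flag"):
        return []
    return [
        {"value": value, "start": start, "end": end, "mean": mean}
        for value, (start, end), mean in zip(
            result["value"], result["location"], result["mean"]
        )
    ]


# Hand-written sample shaped like the emoticons() output (an assumption,
# not a captured run of the library).
sample = {
    "value": [":-)", ":-("],
    "location": [[20, 23], [24, 27]],
    "mean": ["Happy face smiley", "Frown, sad, angry or pouting"],
    "flag": True,
}

for match in pair_matches(sample):
    print(match["value"], match["start"], match["end"], match["mean"])
```

For the bulk methods, the same helper could be applied to each dictionary in the returned list.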
75 | 76 | 77 | Example 78 | ------- 79 | 80 | >>> import emot 81 | >>> emot_obj = emot.core.emot() 82 | >>> text = "I love python ☮ 🙂 ❤ :-) :-( :-)))" 83 | >>> emot_obj.emoji(text) 84 | >>> {'value': ['☮', '🙂', '❤'], 'location': [[14, 15], [16, 17], [18, 19]], 'mean': [':peace_symbol:', 85 | ':slightly_smiling_face:', ':red_heart:'], 'flag': True} 86 | >>> emot_obj.emoticons(text) >>> {'value': [':-)', ':-(', ':-)))'], 'location': [[20, 23], [24, 27], [28, 33]], 87 | 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 'flag': True} 88 | 89 | Running bulk emoji and emoticon detection on a list of strings, for users with access to multiple processing cores. 90 | 91 | >>> import emot 92 | >>> emot_obj = emot.core.emot() 93 | >>> bulk_test = ["I love python ☮ 🙂 ❤ :-) :-( :-)))", "I love python 94 | 🙂 ❤ :-) :-( :-)))", "I love python ☮ ❤ :-) :-( :-)))", "I love python ☮ 🙂 :-( :-)))"] 95 | >>> 96 | >>> emot_obj.bulk_emoji(bulk_test) 97 | >>> [{'value': ['☮', '🙂', '❤'], 'location': [[14, 15], [16, 17], [18, 19]], 98 | 'mean': [':peace_symbol:', ':slightly_smiling_face:', ':red_heart:'], 'flag': True}, {'value': ['🙂', '❤'], 99 | 'location': [[14, 15], [16, 17]], 'mean': [':slightly_smiling_face:', ':red_heart:'], 'flag': True}, {'value': [ 100 | '☮', '❤'], 'location': [[14, 15], [16, 17]], 'mean': [':peace_symbol:', ':red_heart:'], 'flag': True}, 101 | {'value': ['☮', '🙂'], 'location': [[14, 15], [16, 17]], 'mean': [':peace_symbol:', ':slightly_smiling_face:'], 102 | 'flag': True}] 103 | >>> 104 | >>> emot_obj.bulk_emoticons(bulk_test) 105 | >>> [{'value': [':-)', ':-(', ':-)))'], 'location': [[20, 23], [24, 27], [28, 33]], 'mean': ['Happy face smiley', 106 | 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 'flag': True}, {'value': [':-)', ':-(', ':-)))'], 107 | 'location': [[18, 21], [22, 25], [26, 31]], 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very 108 | very Happy face or 
smiley'], 'flag': True}, 
{'value': [':-)', ':-(', ':-)))'], 'location': [[18, 21], [22, 25], 109 | [26, 31]], 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 110 | 'flag': True}, {'value': [':-(', ':-)))'], 'location': [[18, 21], [22, 27]], 'mean': ['Frown, sad, angry or 111 | pouting', 'Very very Happy face or smiley'], 'flag': True}] 112 | 113 | Installation 114 | ------------ 115 | 116 | Via pip: 117 | 118 | $ pip install emot --upgrade 119 | 120 | From master branch: 121 | 122 | $ git clone https://github.com/NeelShah18/emot.git 123 | $ cd emot 124 | $ python setup.py install 125 | 126 | Developing 127 | ---------- 128 | 129 | $ git clone https://github.com/NeelShah18/emot.git 130 | $ cd emot 131 | 132 | Sources 133 | ----- 134 | 135 | [Emoji Cheat Sheet] 136 | 137 | [Official unicode list] 138 | 139 | [official emoticons list] 140 | 141 | Authors 142 | ------- 143 | 144 | Neel Shah / [@NeelShah18] 145 | 146 | Shubham Rohilla / [@kakashubham] 147 | 148 | [Emoji Cheat Sheet]: http://www.emoji-cheat-sheet.com/ 149 | [Official unicode list]: http://www.unicode.org/Public/emoji/1.0/full-emoji-list.html 150 | [official emoticons list]: https://en.wikipedia.org/wiki/List_of_emoticons 151 | [@NeelShah18]: https://github.com/NeelShah18 152 | [@kakashubham]: https://github.com/kakashubham 153 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- 1 | import emot 2 | -------------------------------------------------------------------------------- /emot/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | 3 | 4 | """ 5 | emot package for Python 6 | ~~~~~~~~~~~~~~~~ 7 | 8 | emot terminal output for Python. 
9 | 10 | >>> import emot 11 | 12 | """ 13 | 14 | from emot.core import emot 15 | from emot.emo_unicode import EMOJI_UNICODE 16 | from emot.emo_unicode import UNICODE_EMOJI 17 | from emot.emo_unicode import EMOTICONS_EMO 18 | from emot.emo_unicode import EMOJI_ALIAS_UNICODE 19 | from emot.emo_unicode import UNICODE_EMOJI_ALIAS 20 | 21 | __version__ = '3.0.0' 22 | __author__ = 'Neel Shah' 23 | __email__ = 'neelknightme@gmail.com' 24 | __source__ = 'https://github.com/NeelShah18/emot' 25 | __license__ = ''' 26 | 27 | GNU GENERAL PUBLIC LICENSE 28 | Version 3, 29 June 2007 29 | 30 | Copyright (C) 2007 Free Software Foundation, Inc. 31 | Everyone is permitted to copy and distribute verbatim copies 32 | of this license document, but changing it is not allowed. 33 | 34 | Preamble 35 | 36 | The GNU General Public License is a free, copyleft license for 37 | software and other kinds of works. 38 | 39 | The licenses for most software and other practical works are designed 40 | to take away your freedom to share and change the works. By contrast, 41 | the GNU General Public License is intended to guarantee your freedom to 42 | share and change all versions of a program--to make sure it remains free 43 | software for all its users. We, the Free Software Foundation, use the 44 | GNU General Public License for most of our software; it applies also to 45 | any other work released this way by its authors. You can apply it to 46 | your programs, too. 47 | ''' 48 | -------------------------------------------------------------------------------- /emot/core.py: -------------------------------------------------------------------------------- 1 | import multiprocessing as mp 2 | 3 | try: 4 | from . import emo_unicode, pattern_generator 5 | import re 6 | except Exception as E: 7 | print("Issue with loading of the library: " + str(E)) 8 | 9 | '''emot library to detect emoji and emoticons. 
10 | 11 | >>> import emot 12 | >>> emot_obj = emot.emot() 13 | >>> text = "I love python ☮ 🙂 ❤ :-) :-( :-)))" 14 | >>> emot_obj.emoji(test) 15 | >>> {'value': ['☮', '🙂', '❤'], 'location': [[14, 15], [16, 17], [18, 19]], 'mean': [':peace_symbol:', 16 | ':slightly_smiling_face:', ':red_heart:'], 'flag': True} 17 | >>> emot_obj.emoticons(test) >>> {'value': [':-)', ':-(', ':-)))'], 'location': [[20, 23], [24, 27], [28, 33]], 18 | 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 'flag': True} 19 | 20 | Running bulk string emoji and emoticons detection. When user has access multiple processing cores. 21 | 22 | >>> import emot 23 | >>> emot_obj = emot.emot() 24 | >>> bulk_test = ["I love python ☮ 🙂 ❤ :-) :-( :-)))", "I love python 25 | 🙂 ❤ :-) :-( :-)))", "I love python ☮ ❤ :-) :-( :-)))", "I love python ☮ 🙂 :-( :-)))"] 26 | >>> 27 | >>> emot_obj.bulk_emoji(bulk_test) 28 | >>> [{'value': ['☮', '🙂', '❤'], 'location': [[14, 15], [16, 17], [18, 19]], 29 | 'mean': [':peace_symbol:', ':slightly_smiling_face:', ':red_heart:'], 'flag': True}, {'value': ['🙂', '❤'], 30 | 'location': [[14, 15], [16, 17]], 'mean': [':slightly_smiling_face:', ':red_heart:'], 'flag': True}, {'value': [ 31 | '☮', '❤'], 'location': [[14, 15], [16, 17]], 'mean': [':peace_symbol:', ':red_heart:'], 'flag': True}, 32 | {'value': ['☮', '🙂'], 'location': [[14, 15], [16, 17]], 'mean': [':peace_symbol:', ':slightly_smiling_face:'], 33 | 'flag': True}] 34 | >>> 35 | >>> emot_obj.bulk_emoticons(bulk_test) 36 | >>> [{'value': [':-)', ':-(', ':-)))'], 'location': [[20, 23], [24, 27], [28, 33]], 'mean': ['Happy face smiley', 37 | 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 'flag': True}, {'value': [':-)', ':-(', ':-)))'], 38 | 'location': [[18, 21], [22, 25], [26, 31]], 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very 39 | very Happy face or smiley'], 'flag': True}, {'value': [':-)', ':-(', ':-)))'], 'location': [[18, 21], 
[22, 25], 40 | [26, 31]], 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 41 | 'flag': True}, {'value': [':-(', ':-)))'], 'location': [[18, 21], [22, 27]], 'mean': ['Frown, sad, angry or 42 | pouting', 'Very very Happy face or smiley'], 'flag': True}] 43 | 44 | ''' 45 | 46 | 47 | class emot: 48 | def __init__(self): 49 | global pattern_generator_emoticons_obj, pattern_generator_emoji_obj 50 | try: 51 | pattern_generator_emoji_obj = pattern_generator.pattern_generator() 52 | except Exception as E: 53 | print("Issue with creating pattern generator object emoji: " + str(E)) 54 | 55 | try: 56 | for keys, values in emo_unicode.EMOJI_UNICODE.items(): 57 | pattern_generator_emoji_obj.add(values) 58 | 59 | for keys, values in emo_unicode.EMOJI_ALIAS_UNICODE.items(): 60 | pattern_generator_emoji_obj.add(values) 61 | 62 | self.compiled_pattern_emoji = re.compile(pattern_generator_emoji_obj.pattern()) 63 | except Exception as E: 64 | print("Issue with Generating pattern: " + str(E)) 65 | 66 | try: 67 | pattern_generator_emoticons_obj = pattern_generator.pattern_generator() 68 | except Exception as E: 69 | print("Issue with creating pattern generator object emoticons: " + str(E)) 70 | 71 | try: 72 | for keys, values in emo_unicode.EMOTICONS_EMO.items(): 73 | pattern_generator_emoticons_obj.add(keys) 74 | 75 | self.compiled_pattern_emoticons = re.compile(pattern_generator_emoticons_obj.pattern()) 76 | except Exception as E: 77 | print("Issue with Generating pattern: " + str(E)) 78 | 79 | def emoji(self, string): 80 | """emot.emoji is use to detect emoji from text 81 | 82 | >>> import emot 83 | >>> emot_obj = emot.emot() 84 | >>> text = "I love python ☮ 🙂 ❤ :-) :-( :-)))" 85 | >>> emot_obj.emoji(text) 86 | >>> {'value': ['☮', '🙂', '❤'], 'location': [[14, 15], [16, 17], [18, 19]], 'mean': [':peace_symbol:', 87 | ':slightly_smiling_face:', ':red_heart:'], 'flag': True} 88 | """ 89 | __entities = {} 90 | __value = [] 91 | __location 
= [] 92 | __mean = [] 93 | __flag = False 94 | try: 95 | processed_string = str(string) 96 | matches = self.compiled_pattern_emoji.finditer(str(processed_string)) 97 | for et in matches: 98 | __value.append(et.group().strip()) 99 | __location.append([et.start(), et.end()]) 100 | __mean.append(emo_unicode.UNICODE_EMOJI[et.group().strip()]) 101 | except Exception as E: 102 | print("Issue with internal pattern finding emoji: " + str(E)) 103 | 104 | if len(__value) > 0: 105 | __flag = True 106 | 107 | __entities = { 108 | 'value': __value, 109 | 'location': __location, 110 | 'mean': __mean, 111 | 'flag': __flag 112 | } 113 | 114 | return __entities 115 | 116 | def emoticons(self, string): 117 | """emot.emoticons is use to detect emoticon from text 118 | 119 | >>> import emot 120 | >>> emot_obj = emot.emot() 121 | >>> text = "I love python ☮ 🙂 ❤ :-) :-( :-)))" 122 | >>> emot_obj.emoji(text) 123 | >>> emot_obj.emoticons(text) >>> {'value': [':-)', ':-(', ':-)))'], 'location': [[20, 23], [24, 27], [28, 33]], 124 | 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 'flag': True} 125 | """ 126 | __entities = {} 127 | __value = [] 128 | __location = [] 129 | __mean = [] 130 | __flag = False 131 | try: 132 | processed_string = str(string) 133 | matches = self.compiled_pattern_emoticons.finditer(str(processed_string)) 134 | for et in matches: 135 | __value.append(et.group().strip()) 136 | __location.append([et.start(), et.end()]) 137 | __mean.append(emo_unicode.EMOTICONS_EMO[et.group().strip()]) 138 | except Exception as E: 139 | print("Issue with internal pattern finding emoticons: " + str(E)) 140 | 141 | if len(__value) > 0: 142 | __flag = True 143 | 144 | __entities = { 145 | 'value': __value, 146 | 'location': __location, 147 | 'mean': __mean, 148 | 'flag': __flag 149 | } 150 | 151 | return __entities 152 | 153 | def bulk_emoji(self, string_list, multiprocessing_pool_capacity=int(mp.cpu_count()/2)): 154 | """emot.bult_emoji is 
use to detect emoticon from list of string 155 | 156 | Here, we have two input for bulk_emoticons: list of string and multiprocessing_pool_capacity. 157 | list of string = [string1, string2, ...] 158 | multiprocessing_pool_capacity = multi processing pool user want to provide for the computation. 159 | By defaul the pool capacity is half of the total cores in the system. 160 | 161 | >>> import emot 162 | >>> emot_obj = emot.emot() 163 | >>> bulk_test = ["I love python ☮ 🙂 ❤ :-) :-( :-)))", "I love python 🙂 ❤ :-) :-( :-)))", "I love 164 | python ☮ ❤ :-) :-( :-)))", "I love python ☮ 🙂 :-( :-)))"] 165 | >>> emot_obj.bulk_emoji(bulk_test, multiprocessing_pool_capacity=2) 166 | >>> [{'value': ['☮', '🙂', '❤'], 'location': [[14, 15], [16, 17], [18, 19]], 167 | 'mean': [':peace_symbol:', ':slightly_smiling_face:', ':red_heart:'], 'flag': True}, {'value': ['🙂', '❤'], 168 | 'location': [[14, 15], [16, 17]], 'mean': [':slightly_smiling_face:', ':red_heart:'], 'flag': True}, {'value': [ 169 | '☮', '❤'], 'location': [[14, 15], [16, 17]], 'mean': [':peace_symbol:', ':red_heart:'], 'flag': True}, 170 | {'value': ['☮', '🙂'], 'location': [[14, 15], [16, 17]], 'mean': [':peace_symbol:', ':slightly_smiling_face:'], 171 | 'flag': True}] 172 | """ 173 | processor_pool = mp.Pool(multiprocessing_pool_capacity) 174 | __entities = processor_pool.map(self.emoji, string_list) 175 | return __entities 176 | 177 | def bulk_emoticons(self, string_list, multiprocessing_pool_capacity=int(mp.cpu_count()/2)): 178 | """emot.bult_emoticons is use to detect emoticon from list of string 179 | 180 | Here, we have two input for bulk_emoticons: list of string and multiprocessing_pool_capacity. 181 | list of string = [string1, string2, ...] 182 | multiprocessing_pool_capacity = multi processing pool user want to provide for the computation. 183 | By defaul the pool capacity is half of the total cores in the system. 
184 | 185 | >>> import emot 186 | >>> emot_obj = emot.emot() 187 | >>> bulk_test = ["I love python ☮ 🙂 ❤ :-) :-( :-)))", "I love python 🙂 ❤ :-) :-( :-)))", "I love python 188 | ☮ ❤ :-) :-( :-)))", "I love python ☮ 🙂 :-( :-)))"] 189 | >>> emot_obj.bulk_emoticons(bulk_test, multiprocessing_pool_capacity=2) 190 | >>> [{'value': [':-)', ':-(', ':-)))'], 'location': [[20, 23], [24, 27], [28, 33]], 'mean': ['Happy face smiley', 191 | 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 'flag': True}, {'value': [':-)', ':-(', ':-)))'], 192 | 'location': [[18, 21], [22, 25], [26, 31]], 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very 193 | very Happy face or smiley'], 'flag': True}, {'value': [':-)', ':-(', ':-)))'], 'location': [[18, 21], [22, 25], 194 | [26, 31]], 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 195 | 'flag': True}, {'value': [':-(', ':-)))'], 'location': [[18, 21], [22, 27]], 'mean': ['Frown, sad, angry or 196 | pouting', 'Very very Happy face or smiley'], 'flag': True}] 197 | """ 198 | processor_pool = mp.Pool(multiprocessing_pool_capacity) 199 | __entities = processor_pool.map(self.emoticons, string_list) 200 | return __entities 201 | 202 | 203 | def test_emo(): 204 | emot_obj = emot() 205 | test = "I love python ☮ 🙂 ❤ :-) :-( :-)))" 206 | bulk_test = ["I love python ☮ 🙂 ❤ :-) :-( :-)))", "I love python 🙂 ❤ :-) :-( :-)))", "I love python ☮ ❤ :-) :-( :-)))", "I love python ☮ 🙂 :-( :-)))"] 207 | print(emot_obj.emoticons(test)) 208 | print(emot_obj.emoji(test)) 209 | print(emot_obj.bulk_emoji(bulk_test)) 210 | print(emot_obj.bulk_emoticons(bulk_test)) 211 | return None 212 | 213 | 214 | def about(): 215 | text = "emot library: emoji and emoticons library for python. It return emoji or emoticons from string with " \ 216 | "location of it. 
\nAuthors: \n Neel Shah: neelknightme@gmail.com or https://github.com/NeelShah18 " 217 | print(text) 218 | return None 219 | 220 | 221 | if __name__ == '__main__': 222 | test_emo() 223 | about() 224 | -------------------------------------------------------------------------------- /emot/pattern_generator.py: -------------------------------------------------------------------------------- 1 | import re 2 | 3 | class pattern_generator(): 4 | """Regexp::Pattern Generator in python. Creates a Trie out of a list of words. The trie can be exported to a Regexp pattern. 5 | The corresponding Regexp should match much faster than a simple Regexp union.""" 6 | 7 | def __init__(self): 8 | self.data = {} 9 | 10 | 11 | def add(self, word): 12 | ref = self.data 13 | for char in word: 14 | ref[char] = char in ref and ref[char] or {} 15 | ref = ref[char] 16 | ref[''] = 1 17 | 18 | def dump(self): 19 | return self.data 20 | 21 | def quote(self, char): 22 | return re.escape(char) 23 | 24 | def _pattern(self, pData): 25 | data = pData 26 | if "" in data and len(data.keys()) == 1: 27 | return None 28 | 29 | alt = [] 30 | cc = [] 31 | q = 0 32 | for char in sorted(data.keys()): 33 | if isinstance(data[char], dict): 34 | try: 35 | recurse = self._pattern(data[char]) 36 | alt.append(self.quote(char) + recurse) 37 | except: 38 | cc.append(self.quote(char)) 39 | else: 40 | q = 1 41 | cconly = not len(alt) > 0 42 | 43 | if len(cc) > 0: 44 | if len(cc) == 1: 45 | alt.append(cc[0]) 46 | else: 47 | alt.append('[' + ''.join(cc) + ']') 48 | 49 | if len(alt) == 1: 50 | result = alt[0] 51 | else: 52 | result = "(?:" + "|".join(alt) + ")" 53 | 54 | if q: 55 | if cconly: 56 | result += "?" 57 | else: 58 | result = "(?:%s)?" 
% result 59 | return result 60 | 61 | def pattern(self): 62 | return self._pattern(self.dump()) 63 | -------------------------------------------------------------------------------- /emot/test.py: -------------------------------------------------------------------------------- 1 | from core import emot 2 | import cProfile 3 | 4 | 5 | def test_emo(): 6 | emot_obj = emot() 7 | test = "I love it, 👨 :-) 🏁:-) :-)🏁 :-) 🏁 <3 oo oO " 8 | print(emot_obj.emoji(test)) 9 | print(emot_obj.emoticons(test)) 10 | print(test[27:30]) 11 | print(test[17]) 12 | return None 13 | 14 | 15 | def main(): 16 | cProfile.run('test_emo()', sort='time') 17 | return None 18 | 19 | 20 | if __name__ == "__main__": 21 | main() 22 | -------------------------------------------------------------------------------- /index.rst: -------------------------------------------------------------------------------- 1 | |Downloads| 2 | 3 | |Documentation Status| 4 | 5 | Description of the emot:3.0 library 6 | =================================== 7 | 8 | Emot is a Python library that extracts emojis and emoticons from a 9 | text (string). All emojis and emoticons are taken from reliable 10 | sources; details are listed in the Sources section. 11 | 12 | The motto of the Emot 3.0 release is: a high-performance detection library for 13 | data science, especially for large-scale text datasets. 14 | 15 | Emot uses advanced dynamic pattern generation: every time you 16 | create an object, it generates a pattern based on the 17 | database (emo\_unicode.py). You can add, delete, or modify entries in that file under 18 | the library to generate a different pattern. 19 | 20 | Version 3.0 adds more options, such as bulk processing. It is useful 21 | when you have a long list of sentences or words and want to use 22 | multiprocessing to speed up the process. 23 | 24 | This means you can dynamically create the pattern for emojis or emoticons 25 | and run it with multiprocessing to get maximum performance from multiple 26 | cores. 
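The idea of regenerating the pattern from a table can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the library builds a trie-based pattern with its pattern_generator class, whereas this sketch swaps in a plain regex alternation. The `EMOTICONS` table and the `build_pattern` and `find_all` helpers are hypothetical names introduced only for this example.

```python
import re

# Hypothetical three-entry table standing in for the data in emo_unicode.py.
EMOTICONS = {
    ":-)": "Happy face smiley",
    ":-(": "Frown, sad, angry or pouting",
    ":-)))": "Very very Happy face or smiley",
}


def build_pattern(table):
    """Compile one alternation regex from the keys of the table."""
    # Longest keys first, so ":-)))" is tried before its prefix ":-)".
    keys = sorted(table, key=len, reverse=True)
    return re.compile("|".join(re.escape(key) for key in keys))


def find_all(text, table):
    """Return matches as a dictionary with value, location, mean and flag keys."""
    pattern = build_pattern(table)  # rebuilt from the table on every call
    found = list(pattern.finditer(text))
    return {
        "value": [m.group() for m in found],
        "location": [[m.start(), m.end()] for m in found],
        "mean": [table[m.group()] for m in found],
        "flag": bool(found),
    }


print(find_all("I love python :-) :-( :-)))", EMOTICONS))
```

Because the pattern is derived from the table, adding or removing an entry changes what the next `build_pattern` call matches, with no other code changes.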
27 | 28 | Again, I am thankful for all the support and help from the community around 29 | the world. I hope this library helps and makes your life easier. 30 | 31 | Compatibility 32 | ------------- 33 | 34 | Version 3.0 supports only Python 3.x. 35 | 36 | Python 2.x is no longer supported. 37 | 38 | Working 39 | ------- 40 | 41 | The Emot library takes a string or a list of strings as input and returns a 42 | dictionary (or a list of dictionaries for the bulk functions). 43 | 44 | There is one class, named emot, containing four functions. 45 | 46 | emot.emoji 47 | 48 | - Input: one argument, a string 49 | - Output: a dictionary with four keys: dict 50 | - value = list of emojis 51 | - location = list of [start, end] locations of the emojis 52 | - mean = list of meanings 53 | - flag = True/False. False means the library found nothing; True 54 | means it found at least one match. 55 | 56 | emot.emoticons 57 | 58 | - Input: one argument, a string 59 | - Output: a dictionary with four keys: dict 60 | - value = list of emoticons 61 | - location = list of [start, end] locations of the emoticons 62 | - mean = list of meanings 63 | - flag = True/False. False means the library found nothing; True 64 | means it found at least one match. 65 | 66 | emot.bulk\_emoji 67 | 68 | - Input: two arguments, a list of strings and the CPU core pool size: list[], int 69 | - By default, the pool size is half of the available cores: 70 | multiprocessing.cpu\_count()/2 71 | - Output: a **list of dictionaries**, each with four keys: 72 | list of dict 73 | - value = list of emojis 74 | - location = list of [start, end] locations of the emojis 75 | - mean = list of meanings 76 | - flag = True/False. False means the library found nothing; True 77 | means it found at least one match. 
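The result dictionaries described above hold parallel lists, so a match is easiest to read by walking the lists together. The sketch below assumes only the four documented keys; `iter_matches` and `iter_bulk_matches` are hypothetical helper names, not part of the library, and the sample result is copied from the emoticons example in this document rather than produced by calling the library.

```python
def iter_matches(result):
    """Yield (value, [start, end], meaning) for each match in one result dictionary."""
    if not result["flag"]:
        return  # nothing was found, so there is nothing to yield
    yield from zip(result["value"], result["location"], result["mean"])


def iter_bulk_matches(results):
    """Yield (string_index, value, [start, end], meaning) across a list of results."""
    for index, result in enumerate(results):
        for value, location, meaning in iter_matches(result):
            yield index, value, location, meaning


# The escapes spell out the three emojis of the documented test string.
text = "I love python \u262e \U0001F642 \u2764 :-) :-( :-)))"
result = {
    "value": [":-)", ":-(", ":-)))"],
    "location": [[20, 23], [24, 27], [28, 33]],
    "mean": [
        "Happy face smiley",
        "Frown, sad, angry or pouting",
        "Very very Happy face or smiley",
    ],
    "flag": True,
}

for value, (start, end), meaning in iter_matches(result):
    # In this example each location pair slices the string back to the value.
    print(value, text[start:end] == value, meaning)
```

Checking `flag` first means the helper does not need to know whether an empty result still carries the other three keys.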
78 | 79 | emot.bulk\_emoticons 80 | 81 | - Input: two arguments, a list of strings and the CPU core pool size: list[], int 82 | - By default, the pool size is half of the available cores: 83 | multiprocessing.cpu\_count()/2 84 | - Output: a **list of dictionaries**, each with four keys: 85 | list of dict 86 | - value = list of emoticons 87 | - location = list of [start, end] locations of the emoticons 88 | - mean = list of meanings 89 | - flag = True/False. False means the library found nothing; True 90 | means it found at least one match. 91 | 92 | Example 93 | ------- 94 | 95 | :: 96 | 97 | >>> import emot 98 | >>> emot_obj = emot.emot() 99 | >>> test = "I love python ☮ 🙂 ❤ :-) :-( :-)))" 100 | >>> emot_obj.emoji(test) 101 | >>> {'value': ['☮', '🙂', '❤'], 'location': [[14, 15], [16, 17], [18, 19]], 'mean': [':peace_symbol:', 102 | ':slightly_smiling_face:', ':red_heart:'], 'flag': True} 103 | >>> emot_obj.emoticons(test) >>> {'value': [':-)', ':-(', ':-)))'], 'location': [[20, 23], [24, 27], [28, 33]], 104 | 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 'flag': True} 105 | 106 | Run bulk emoji and emoticon detection on a list of strings when multiple processing cores are available. 
107 | 108 | >>> import emot 109 | >>> emot_obj = emot.emot() 110 | >>> bulk_test = ["I love python ☮ 🙂 ❤ :-) :-( :-)))", "I love python 111 | 🙂 ❤ :-) :-( :-)))", "I love python ☮ ❤ :-) :-( :-)))", "I love python ☮ 🙂 :-( :-)))"] 112 | >>> 113 | >>> emot_obj.bulk_emoji(bulk_test) 114 | >>> [{'value': ['☮', '🙂', '❤'], 'location': [[14, 15], [16, 17], [18, 19]], 115 | 'mean': [':peace_symbol:', ':slightly_smiling_face:', ':red_heart:'], 'flag': True}, {'value': ['🙂', '❤'], 116 | 'location': [[14, 15], [16, 17]], 'mean': [':slightly_smiling_face:', ':red_heart:'], 'flag': True}, {'value': [ 117 | '☮', '❤'], 'location': [[14, 15], [16, 17]], 'mean': [':peace_symbol:', ':red_heart:'], 'flag': True}, 118 | {'value': ['☮', '🙂'], 'location': [[14, 15], [16, 17]], 'mean': [':peace_symbol:', ':slightly_smiling_face:'], 119 | 'flag': True}] 120 | >>> 121 | >>> emot_obj.bulk_emoticons(bulk_test) 122 | >>> [{'value': [':-)', ':-(', ':-)))'], 'location': [[20, 23], [24, 27], [28, 33]], 'mean': ['Happy face smiley', 123 | 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 'flag': True}, {'value': [':-)', ':-(', ':-)))'], 124 | 'location': [[18, 21], [22, 25], [26, 31]], 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very 125 | very Happy face or smiley'], 'flag': True}, {'value': [':-)', ':-(', ':-)))'], 'location': [[18, 21], [22, 25], 126 | [26, 31]], 'mean': ['Happy face smiley', 'Frown, sad, angry or pouting', 'Very very Happy face or smiley'], 127 | 'flag': True}, {'value': [':-(', ':-)))'], 'location': [[18, 21], [22, 27]], 'mean': ['Frown, sad, angry or 128 | pouting', 'Very very Happy face or smiley'], 'flag': True}] 129 | 130 | Installation 131 | ------------ 132 | 133 | Via pip: 134 | 135 | :: 136 | 137 | $ pip install emot --upgrade 138 | 139 | From master branch: 140 | 141 | :: 142 | 143 | $ git clone https://github.com/NeelShah18/emot.git 144 | $ cd emot 145 | $ python setup.py install 146 | 147 | Developing 148 | ---------- 149 | 150 | 
:: 151 | 152 | $ git clone https://github.com/NeelShah18/emot.git 153 | $ cd emot 154 | 155 | Sources 156 | ------- 157 | 158 | `Emoji Cheat Sheet `__ 159 | 160 | `Official unicode 161 | list `__ 162 | 163 | `official emoticons 164 | list `__ 165 | 166 | Authors 167 | ------- 168 | 169 | Neel Shah / [@NeelShah18] 170 | 171 | Shubham Rohilla / [@kakashubham] 172 | 173 | .. |Downloads| image:: http://pepy.tech/badge/emot 174 | :target: http://pepy.tech/project/emot 175 | .. |Documentation Status| image:: http://readthedocs.org/projects/emot/badge/?version=latest 176 | :target: https://emot.readthedocs.io/en/latest/?badge=latest 177 | -------------------------------------------------------------------------------- /output.html: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeelShah18/emot/431067e996de1e429404a9506dbaa4dfeaf7da60/output.html -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = [ 3 | "setuptools>=57.0.0", 4 | "wheel" 5 | ] 6 | build-backend = "setuptools.build_meta" -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | setuptools>=57.0.0 -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [metadata] 2 | description-file = README.md 3 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | """ 4 | Setup script for emot 5 | """ 6 | from setuptools import setup 7 | 8 | setup( 9 | name='emot', 10 | version="3.1", 11 | author='Neel 
Shah', 12 | author_email='neelknightme@gmail.com', 13 | description="Emoji and Emoticons detection package for Python", 14 | keywords=['emoji', 'emoticons'], 15 | include_package_data=True, 16 | license="GNU GENERAL PUBLIC LICENSE", 17 | packages=['emot'], 18 | url="https://github.com/NeelShah18/emot", 19 | zip_safe=True, 20 | download_url="https://github.com/NeelShah18/emot/archive/refs/tags/v3.0.tar.gz", 21 | ) 22 | -------------------------------------------------------------------------------- /test/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Unittests for emo 3 | ~~~~~~~~~~~~~~~~ 4 | By Neel Shah 5 | """ 6 | 7 | from . import * 8 | -------------------------------------------------------------------------------- /test/test.py: -------------------------------------------------------------------------------- 1 | from emot.core import emot 2 | import cProfile 3 | 4 | 5 | def test_emo(): 6 | emot_obj = emot() 7 | test = "I love it, 👨 :-) 🏁:-) :-)🏁 :-) 🏁 <3 oo oO " 8 | print(emot_obj.emoji(test)) 9 | print(emot_obj.emoticons(test)) 10 | print(test[27:30]) 11 | print(test[17]) 12 | return None 13 | 14 | 15 | def main(): 16 | cProfile.run('test_emo()', sort='time') 17 | return None 18 | 19 | 20 | if __name__ == "__main__": 21 | main() 22 | --------------------------------------------------------------------------------