├── .github ├── FUNDING.yml ├── ISSUE_TEMPLATE │ └── bug_report.md └── PULL_REQUEST_TEMPLATE.md ├── .whitesource ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE.md ├── README.md ├── TODO.md ├── _config.yml ├── optracker ├── __init__.py ├── data │ ├── __init__.py │ └── face_models │ │ ├── dlib_face_recognition_resnet_model_v1.dat │ │ ├── mmod_human_face_detector.dat │ │ └── shape_predictor_5_face_landmarks.dat ├── facerec │ ├── __init__.py │ ├── api.py │ └── facerec.py ├── functions │ ├── __init__.py │ ├── core_func.py │ ├── db_func.py │ ├── instagram_func.py │ └── side_func.py ├── igramscraper │ ├── __init__.py │ ├── endpoints.py │ ├── exception │ │ ├── __init__.py │ │ ├── instagram_auth_exception.py │ │ ├── instagram_exception.py │ │ └── instagram_not_found_exception.py │ ├── helper.py │ ├── instagram.py │ ├── model │ │ ├── __init__.py │ │ ├── account.py │ │ ├── carousel_media.py │ │ ├── comment.py │ │ ├── initializer_model.py │ │ ├── location.py │ │ ├── media.py │ │ ├── story.py │ │ ├── tag.py │ │ └── user_stories.py │ ├── session_manager.py │ └── two_step_verification │ │ ├── __init__.py │ │ ├── console_verification.py │ │ └── two_step_verification_abstract_class.py ├── optracker.py └── zerodata.py ├── requirements.txt ├── run_tracker.py └── setup.py /.github/FUNDING.yml: -------------------------------------------------------------------------------- 1 | # These are supported funding model platforms 2 | open_collective: # Replace with a single Open Collective username 3 | ko_fi: mknoph 4 | tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel 5 | issuehunt: # Replace with a single IssueHunt username 6 | otechie: # Replace with a single Otechie username 7 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us 
improve 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Screenshots** 24 | If applicable, add screenshots to help explain your problem. 25 | 26 | **Desktop (please complete the following information):** 27 | - OS: [e.g. iOS] 28 | - Browser [e.g. chrome, safari] 29 | - Version [e.g. 22] 30 | 31 | **Additional context** 32 | Add any other context about the problem here. 33 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | # Description 2 | 3 | Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. 4 | 5 | Fixes # (issue) 6 | 7 | ## Type of change 8 | 9 | Please delete options that are not relevant. 10 | 11 | - [ ] Bug fix (non-breaking change which fixes an issue) 12 | - [ ] New feature (non-breaking change which adds functionality) 13 | - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) 14 | - [ ] This change requires a documentation update 15 | 16 | # How Has This Been Tested? 17 | 18 | Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. 
Please also list any relevant details for your test configuration 19 | 20 | 21 | # Checklist: 22 | 23 | - [ ] My code follows the style guidelines of this project 24 | - [ ] I have performed a self-review of my own code 25 | - [ ] I have commented my code, particularly in hard-to-understand areas 26 | - [ ] I have made corresponding changes to the documentation 27 | - [ ] My changes generate no new warnings 28 | - [ ] I have added tests that prove my fix is effective or that my feature works 29 | - [ ] New and existing unit tests pass locally with my changes 30 | - [ ] Any dependent changes have been merged and published in downstream modules -------------------------------------------------------------------------------- /.whitesource: -------------------------------------------------------------------------------- 1 | { 2 | "checkRunSettings": { 3 | "vulnerableCheckRunConclusionLevel": "failure" 4 | }, 5 | "issueSettings": { 6 | "minSeverityLevel": "LOW" 7 | } 8 | } -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 |


2 | 3 | # Contributor Covenant Code of Conduct 4 | 5 | ## Our Pledge 6 | 7 | In the interest of fostering an open and welcoming environment, we as 8 | contributors and maintainers pledge to making participation in our project and 9 | our community a harassment-free experience for everyone, regardless of age, body 10 | size, disability, ethnicity, sex characteristics, gender identity and expression, 11 | level of experience, education, socio-economic status, nationality, personal 12 | appearance, race, religion, or sexual identity and orientation. 13 | 14 | ## Our Standards 15 | 16 | Examples of behavior that contributes to creating a positive environment 17 | include: 18 | 19 | * Using welcoming and inclusive language 20 | * Being respectful of differing viewpoints and experiences 21 | * Gracefully accepting constructive criticism 22 | * Focusing on what is best for the community 23 | * Showing empathy towards other community members 24 | 25 | Examples of unacceptable behavior by participants include: 26 | 27 | * The use of sexualized language or imagery and unwelcome sexual attention or 28 | advances 29 | * Trolling, insulting/derogatory comments, and personal or political attacks 30 | * Public or private harassment 31 | * Publishing others' private information, such as a physical or electronic 32 | address, without explicit permission 33 | * Other conduct which could reasonably be considered inappropriate in a 34 | professional setting 35 | 36 | ## Our Responsibilities 37 | 38 | Project maintainers are responsible for clarifying the standards of acceptable 39 | behavior and are expected to take appropriate and fair corrective action in 40 | response to any instances of unacceptable behavior. 
41 | 42 | Project maintainers have the right and responsibility to remove, edit, or 43 | reject comments, commits, code, wiki edits, issues, and other contributions 44 | that are not aligned to this Code of Conduct, or to ban temporarily or 45 | permanently any contributor for other behaviors that they deem inappropriate, 46 | threatening, offensive, or harmful. 47 | 48 | ## Scope 49 | 50 | This Code of Conduct applies both within project spaces and in public spaces 51 | when an individual is representing the project or its community. Examples of 52 | representing a project or community include using an official project e-mail 53 | address, posting via an official social media account, or acting as an appointed 54 | representative at an online or offline event. Representation of a project may be 55 | further defined and clarified by project maintainers. 56 | 57 | ## Enforcement 58 | 59 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 60 | reported by contacting the project team at marcuscrazy@gmail.com. All 61 | complaints will be reviewed and investigated and will result in a response that 62 | is deemed necessary and appropriate to the circumstances. The project team is 63 | obligated to maintain confidentiality with regard to the reporter of an incident. 64 | Further details of specific enforcement policies may be posted separately. 65 | 66 | Project maintainers who do not follow or enforce the Code of Conduct in good 67 | faith may face temporary or permanent repercussions as determined by other 68 | members of the project's leadership. 
69 | 70 | ## Attribution 71 | 72 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, 73 | available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html 74 | 75 | [homepage]: https://www.contributor-covenant.org 76 | 77 | For answers to common questions about this code of conduct, see 78 | https://www.contributor-covenant.org/faq 79 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 |


2 | 3 | # Contributing 4 | 5 | When contributing to this repository, please first discuss the change you wish to make via issue, 6 | email, or any other method with the owners of this repository before making a change. 7 | 8 | Please note we have a code of conduct, please follow it in all your interactions with the project. 9 | 10 | ## Pull Request Process 11 | 12 | 1. Ensure any install or build dependencies are removed before the end of the layer when doing a 13 | build. 14 | 2. Update the README.md with details of changes to the interface, this includes new environment 15 | variables, exposed ports, useful file locations and container parameters. 16 | 3. Increase the version numbers in any examples files and the README.md to the new version that this 17 | Pull Request would represent. The versioning scheme we use is [SemVer](http://semver.org/). 18 | 4. You may merge the Pull Request in once you have the sign-off of two other developers, or if you 19 | do not have permission to do that, you may request the second reviewer to merge it for you. 20 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Marcus Knoph 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | ![Imgur](https://i.imgur.com/rGPKaNb.png) 3 | ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/optracker) 4 | ![PyPI](https://img.shields.io/pypi/v/optracker) 5 | ![PyPI - Status](https://img.shields.io/pypi/status/optracker) 6 | ![PyPI - License](https://img.shields.io/pypi/l/optracker) 7 | ![PyPI - Downloads](https://img.shields.io/pypi/dm/optracker) 8 | ![Discord](https://img.shields.io/discord/633751704868749322) 9 | 10 | 11 | # openSource Tracker 12 | An easy-to-use program for scraping open sources. It saves the data and lets you analyze it in your favorite graph visualization tool. I based the project on instagram-scraper, which allows you to get data from Instagram without the API. The goal of this project is to make it easy for everyone to gather openSource content and analyze it. 13 | 14 | 15 | 16 | ## How to install 17 | ***Simply run:*** 18 | ```cmd 19 | pip install optracker 20 | ``` 21 | 22 | ***Or download the project via git clone and run the following:*** 23 | ```cmd 24 | pip install -r requirements.txt 25 | python .\run_tracker.py 26 | ``` 27 | 28 | ## Getting Started 29 | The projects found here are for my own study, for confirming and testing theories in social network analysis. They can be used and altered as you see fit. 
To use it you need to install the required Python libraries; see How to install. 30 | 31 | ### 1. Running the program 32 | To run it, simply type **optracker** in the console if you installed it with pip. If you downloaded it from GitHub, run **python .\run_tracker.py** from the optracker directory.
33 |
34 | ***NB! You will need to run the script as administrator if you are using Windows*** 35 | 36 | ### 2. Userlist 37 | The program needs working accounts to function. You can either add them manually when you run it for the first time, or create a local file with usernames and passwords. The accounts in that file are then added to the database automatically on startup. In my experience you need more than one user account to scan a large list of users, so that your account doesn't get blocked because of too many requests. 38 | 39 | >**The following user lists can be created:** 40 | >- inst_user.txt 41 | >- face_user.txt 42 | >- user_list.txt 43 | 44 | You don't need a separate list for Facebook and Instagram, but some people prefer it. You can put all the user data in the same file by using user_list.txt. All the files use the same layout anyway. 45 | 46 | ```python 47 | #Setup for user_list file 48 | {USERNAME}, {PASSWORD}, {EMAIL}, {FULLNAME}, {ACCOUNT} 49 | ``` 50 | The account type can be **facebook** or **instagram**. It tells the program which account to use where. 51 | 52 | ```python 53 | #Example of user_list.txt 54 | my_username, my_password, my_email, my_fullname, instagram 55 | my_username2, my_password2, my_email2, my_fullname2, facebook 56 | my_username3, my_password3, my_email3, my_fullname3, instagram 57 | ``` 58 | 59 | The **user list** is re-read each time you start the program, so new users can be added directly to the .txt document, or you can add them manually in the program at startup. 60 | 61 | The **user list location** is the root directory. Usually it is ***c:\optracker***, or ***\optracker*** on Linux. 62 | ```cmd 63 | optracker/ 64 | userlist.txt 65 | db/ 66 | openSource-tracker.db 67 | export/ 68 | node.csv 69 | egdes.csv 70 | ``` 71 | 72 | 73 | ### 3. How to use 74 | When you run the program it will first try to connect to Instagram. If you don't have a user file you will be asked to enter a username and password.
After that you will get a menu to choose from. Start by running a single scan of one account. After that you can run more single scans to grow your node database, or use the follow-by scan options. There is also a help menu that gives you all the information you need.
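The user file mentioned above follows the comma-separated layout from section 2. As a rough illustration only, here is a minimal sketch of how such a file could be read. The names `UserEntry` and `parse_user_list` are hypothetical and are not taken from the optracker code base, and skipping blank lines and `#` lines is an assumption. Note that this simple split would break on passwords that contain a comma.

```python
from typing import List, NamedTuple


class UserEntry(NamedTuple):
    """One row of a user list file (hypothetical helper, not part of optracker)."""
    username: str
    password: str
    email: str
    fullname: str
    account: str  # "instagram" or "facebook"


def parse_user_list(text: str) -> List[UserEntry]:
    """Parse '{USERNAME}, {PASSWORD}, {EMAIL}, {FULLNAME}, {ACCOUNT}' rows."""
    entries = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 5:
            raise ValueError(f"line {number}: expected 5 fields, got {len(fields)}")
        account = fields[4].lower()
        if account not in ("instagram", "facebook"):
            raise ValueError(f"line {number}: unknown account type {fields[4]!r}")
        entries.append(UserEntry(fields[0], fields[1], fields[2], fields[3], account))
    return entries


sample = (
    "my_username, my_password, my_email, my_fullname, instagram\n"
    "my_username2, my_password2, my_email2, my_fullname2, facebook\n"
)
for user in parse_user_list(sample):
    print(user.username, user.account)
```

Rejecting malformed rows early, instead of silently skipping them, makes it easier to spot a typo in the file before the program tries to log in with a broken account.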
75 | 76 | > ### Root Folder 77 | > The root folder for the program is its base directory. Usually it is ***c:\optracker***, or ***\optracker*** on Linux. 78 | 79 | ### 4. First time scraping 80 | The first time you scrape, all the users will be saved as nodes. This takes some time, since we also want to save all the info we can get for each node. During this a lot of requests are sent to the target server, and as a result some of your user accounts may be blocked because of too many requests in a short time. Later, when you scrape Instagram for example, the program checks whether the node already exists in your database. If so, it only adds the connections it finds, and the number of requests to the server drops. The conclusion is that the bigger your node base is, the faster you can scrape and the fewer requests are made. 81 | 82 | ### 5. Scan all follower 83 | You will be presented with a list of users that you have finished adding to your database. The program will then scan all of their connections that are not private, add the nodes to the DB and the connections to edges. 84 | 85 | ### 6. Max Follows and Max Followed by 86 | During **Scan all follower**, where you scan the profile of a user that has completed the single search, you can set a limit on how many followers a user can have or how many accounts it is following. This prevents scanning uninteresting profiles like public organizations and so on, as they can have up to 10K. The default is 2000, which is considered a normal number of follows/followed by. 87 | 88 | ### 7. Deepscan and Surfacescan 89 | By turning on surfacescan you only extract the username and Instagram id when scraping. This saves requests to the server so you can use one user for a longer period of time, and makes the scan go quicker if you are scraping a big network. You can later add specific users found in the graph to a text file, scan only the ones that are interesting and get all their data. 90 | 91 | ### 8. 
Deepscan from list 92 | Lets you run a deep scan on a selected list of users. It scrapes all the data from Instagram for the selected users and updates DB Node. You need to create a file in the **ROOT FOLDER** called **user_scan_insta.txt** 93 | ```cmd 94 | optracker/ 95 | userlist.txt 96 | user_scan_insta.txt 97 | db/ 98 | openSource-tracker.db 99 | export/ 100 | node.csv 101 | egdes.csv 102 | ``` 103 | The list needs to contain one username per line: 104 | ```python 105 | {USER 1} 106 | {USER 2} 107 | {USER 3} 108 | {USER 4} 109 | ``` 110 | ### 9. Detail Print 111 | By default it is turned **OFF**, and you only get the minimum of info needed to see that the program is working properly. If you turn it **ON** you are presented with all the output the scraper has. 112 | 113 | ### 10. Download Profile Image 114 | The program downloads every Instagram profile image it scans, for face recognition. It saves them to **profile_pic_insta**. You can turn this off from the default value menu. 115 | 116 | ```cmd 117 | optracker/ 118 | userlist.txt 119 | user_scan_insta.txt 120 | db/ 121 | openSource-tracker.db 122 | export/ 123 | node.csv 124 | egdes.csv 125 | instadata/ 126 | profile_pic_insta/ 127 | /**FIRST TWO IN ID** 128 | /**SECOND TWO IN ID** 129 | /**INSTA USER**-**INT INC**.jpg 130 | post/ 131 | ``` 132 | 133 | ### 11. Update Profile Image 134 | Running this checks the DB against the profile image folder and downloads all the images that are missing. 135 | 136 | ### 12. Change default value 137 | From the menu you can change default values such as surfacescan, max follow, MySQL or SQLite, and more. To change them select yes and fill in the new value. If you don't want to change a value, leave it blank. 138 | 139 | ### 13. Face reco- 140 | 141 | ## Database Information 142 | By default the scraper uses **SQLite**, and all the data is stored in **optracker/db/openSource-tracker.db**. 143 | 144 | > **MySQL** is also available.
The version currently tested and found OK is **MySQL 8.0.18**. You can change the database settings in the menu. You need to download and install the latest version of MySQL and create a database called **openSource-tracker**, unless you have an online instance you want to use instead of a local one. Also remember to use **utf8mb4**. The following are the defaults: 145 | > * DB_MYSQL = "localhost" 146 | > * DB_MYSQL_USER = "optracker" 147 | > * DB_MYSQL_PASSWORD = "localpassword" 148 | > * DB_MYSQL_DATABASE = "openSource_tracker" 149 | > * DB_MYSQL_PORT = "3306" 150 | > * DB_MYSQL_ON = 0 151 | > * DB_MYSQL_COLLATION = "utf8mb4_general_ci" 152 | > * DB_MYSQL_CHARSET = "utf8mb4" 153 | > 154 | >***Scraping large amounts of data can be really slow if you use SQLite, so MySQL is an option if you plan on collecting huge amounts.*** 155 | 156 | **The database consists of the following tables:** 157 | - accounts 158 | - edges_insta 159 | - nodes 160 | - options 161 | - new_insta 162 | 163 | > **Note!** All SQL settings are saved in **optracker.config**, located in the root folder. The format is JSON and you can change it as you like to match your current DB, but I recommend keeping the standard settings. 164 | 165 | 166 | ### 1. Accounts 167 | Stores all your usernames and passwords for the different openSource sites. 168 | 169 | ### 2. Edges_insta 170 | Holds a list of all the connections. The columns are target, source, weight and type. This is all made to be used with gephi for visualizing the data in graph form. The numbers refer to the ID in nodes. It shows who is following or connected to whom. 171 | 172 | ### 3. nodes 173 | List of all the nodes created. They all have their own ID. It also contains all the information scraped about a single user, like username, email, bio and so on, found on the different scraped sites. 174 | 175 | ### 4. Options 176 | Temporary table that stores information like the follow list, last search and so on for the program to use. 177 | 178 | ### 5. 
New_insta 179 | This table holds a list of all Instagram accounts that have been found during scraping. The program uses it to see which accounts have not yet been fully scraped. When an account is finished it is set to DONE. If you don't want an account to be scraped, set its WAIT value to True. 0 = False, 1 = True. ***This can also be used when a user has too many followers, or none at all, so you don't want to scan it. When the user comes up, the scanner skips it.*** 180 | 181 | ### 6. Export 182 | To export the data you can connect to the DB file under the db/ folder, or you can export it from the program. From the main menu choose export. It will then generate two files, **nodes.csv** and **egdes.csv**. You can then import these into your favorite graph visualization tool. 183 | 184 | ## Common Error 185 | ### 1. F String 186 | ```python 187 | Traceback (most recent call last): 188 | File "/usr/local/bin/optracker", line 6, in <module> 189 | from optracker.optracker import run 190 | File "/usr/local/lib/python3.5/dist-packages/optracker/optracker.py", line 14, in <module> 191 | from igramscraper.instagram import Instagram 192 | File "/usr/local/lib/python3.5/dist-packages/igramscraper/instagram.py", line 153 193 | cookies += f"{key}={session[key]}; " 194 | ^ 195 | SyntaxError: invalid syntax 196 | ``` 197 | To fix this, update Python. You are using an old version that doesn't support **f""** strings; you need **Python 3.6** or newer. 198 | 199 | ### 2. Instagram useragent 200 | ``` 201 | ERROR: {"message": "useragent mismatch", "status": "fail"} 202 | ``` 203 | Igramscraper is using a useragent that is not up to date. You need to update **self.user_agent** in **igramscraper/instagram.py**. 
Locate this file and look for something that looks like this: 204 | ```python 205 | self.user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) ' \ 206 | 'AppleWebKit/537.36 (KHTML, like Gecko) ' \ 207 | 'Chrome/66.0.3359.139 Safari/537.36' 208 | ``` 209 | Then change it to a newer useragent that is accepted by Instagram. This is one example that worked in October 2019. 210 | ```python 211 | self.user_agent = 'Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) ' \ 212 | 'AppleWebKit/605.1.15 (KHTML, like Gecko) ' \ 213 | 'Mobile/15E148 Instagram 105.0.0.11.118 (iPhone11,8; iOS 12_3_1; en_US; en-US; scale=2.00; 828x1792; 165586599)' 214 | ``` 215 | 216 | ### 3. Private Instagram 217 | ```python 218 | File "\optracker\functions\instagram_func.py", line 20, in get_insta_following 219 | following = self.instagram.get_following(insta_id, totalFollow, self.page_size_check(totalFollow), delayed=True) 220 | File "\optracker\igramscraper\instagram.py", line 963, in get_following 221 | Instagram.HTTP_FORBIDDEN) 222 | optracker.igramscraper.exception.instagram_exception.InstagramException: Failed to get follows of account id ******. The account is private., Code:403 223 | ``` 224 | When scanning profiles, a user has sometimes set the profile to private after the first scrape. When extracting data after this, the program stops and gives an error saying that the profile is private. Just run it once more. The program has automatically updated the profile to private, so it won't happen on the next scan. 225 | 226 | ### 4. Two step verification. 
Please report issue., Code:20 227 | ``` 228 | Traceback (most recent call last): 229 | File "python37-32\lib\runpy.py", line 193, in _run_module_as_main 230 | "__main__", mod_spec) 231 | File "python37-32\lib\runpy.py", line 85, in _run_code 232 | exec(code, run_globals) 233 | File "Python37-32\Scripts\optracker.exe\__main__.py", line 7, in <module> 234 | File "python37-32\lib\site-packages\optracker\optracker.py", line 174, in run 235 | myOptracker = Optracker() 236 | File "python37-32\lib\site-packages\optracker\optracker.py", line 56, in __init__ 237 | self.autoSelectAndLogin() 238 | File "python37-32\lib\site-packages\optracker\optracker.py", line 97, in autoSelectAndLogin 239 | self.loginInstagram(self.instagram) 240 | File "python37-32\lib\site-packages\optracker\optracker.py", line 138, in loginInstagram 241 | self.instagram.login(force=False,two_step_verificator=True) 242 | File "python37-32\lib\site-packages\optracker\igramscraper\instagram.py", line 1324, in login 243 | two_step_verificator) 244 | File "python37-32\lib\site-packages\optracker\igramscraper\instagram.py", line 1414, in __verify_two_step 245 | response.status_code) 246 | optracker.igramscraper.exception.instagram_auth_exception.InstagramAuthException: Something went wrong when try two step verification. Please report issue., Code:20 247 | ``` 248 | Something went wrong with the Instagram login. The username and password could not be used to log in. Change the user value or add a new user, then try once more and it should work. 249 | 250 | ## What to do with the data? 251 | When you have gathered enough data it's time to put it to good use. You have plenty of options. First things first: you can export the standard values from the program itself. It will generate two files: nodes.csv and egdes.csv
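For orientation, here is a minimal sketch of what node and edge tables in a Gephi-friendly shape can look like. It is not the export code used by optracker. The function name `write_gephi_csv` and the sample data are made up for illustration, and the header names `Id`, `Label`, `Source`, `Target`, `Weight` and `Type` are assumed from the column names Gephi's spreadsheet import conventionally recognizes, so the real export files may use a different layout.

```python
import csv
import io


def write_gephi_csv(nodes, edges, node_file, edge_file):
    """Write a node table and an edge table to two open text files.

    nodes: iterable of (node_id, label) pairs
    edges: iterable of (source_id, target_id, weight) triples
    Hypothetical helper for illustration, not part of optracker.
    """
    node_writer = csv.writer(node_file)
    node_writer.writerow(["Id", "Label"])
    for node_id, label in nodes:
        node_writer.writerow([node_id, label])

    edge_writer = csv.writer(edge_file)
    edge_writer.writerow(["Source", "Target", "Weight", "Type"])
    for source, target, weight in edges:
        # A "follows" relation has a direction, so the edge is marked Directed.
        edge_writer.writerow([source, target, weight, "Directed"])


# In-memory buffers stand in for the two export files.
node_buf, edge_buf = io.StringIO(), io.StringIO()
write_gephi_csv([(1, "alice"), (2, "bob")], [(1, 2, 1)], node_buf, edge_buf)
print(node_buf.getvalue())
print(edge_buf.getvalue())
```

Keeping nodes and edges in separate tables that share numeric IDs mirrors the `nodes` and `edges_insta` tables described under Database Information.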
252 |
253 | These files are made to be used with [gephi](https://gephi.org). Import them into gephi and start analyzing. There are plenty of good tutorials out there on how to process the data. Some tips along the way: 254 | - Import nodes first, then egdes 255 | - Filter out extra nodes: **Filter -> Topology -> Degree Range** set to 2 is a good start. 256 | - Run statistics: **Network Diameter, Average Degree, Modularity** 257 | - Set node size by attribute: **Betweenness Centrality** 258 | - Set node color by: **Modularity Class** 259 | 260 | This is an example of how it can look when finished, making the pattern easy to see. You can also turn on labels to see the names of the nodes. 261 | 262 | ![Imgur](https://i.imgur.com/cSYySMu.png) 263 | 264 | ## Common Information 265 | - Look at TODO if you want to help: [TODO](https://github.com/suxSx/opensource-tracker/blob/master/TODO.md)
266 | - Read the CODE of Conduct before you edit: [Code of Conduct](https://github.com/suxSx/opensource-tracker/blob/master/CODE_OF_CONDUCT.md)
267 | - We use MIT License: [MIT](https://github.com/suxSx/opensource-tracker/blob/master/LICENSE.md) 268 | 269 | ### Worth mentioning 270 | - instagram-php-scraper [here](https://github.com/postaddictme/instagram-php-scraper/)
271 | - instagram-scraper [here](https://github.com/realsirjoe/instagram-scraper)
272 | - logo-design [here](http://freepik.com) 273 | - face-recognition [here](https://github.com/ageitgey/face_recognition) 274 | -------------------------------------------------------------------------------- /TODO.md: -------------------------------------------------------------------------------- 1 |


2 | 3 | # Todo: 4 | - [x] 27-10-2019 (U) Add update Node data when you run a check of node DB. 5 | - [ ] Make the code smaller. Repeating steps can be shortened 6 | - [x] 27-10-2019 (U) Make a stop function for if profile is private 7 | - [ ] Add try and catch in get user info. To enable error handling. 8 | - [ ] Make database for followers, and follower for easy rollback on error (delete when current user are done, and keypoint for insta user.) 9 | - [ ] Add functions scan keywords. (Look for specific keywords in user profiles (node) and then use a full single scan) 10 | - [ ] Add other platforms for data gathering 11 | - [x] 18-10-2019 (U) Added surface/deep scan. 12 | - [x] 01-10-2019 (U) Check up on finish status message in DB_TABLE_NEW_INSTA 13 | - [x] 07-10-2019 (U) Add max follower criteria in search options. 14 | - [x] 11-10-2019 (U) Root directory, PIP install, class updates. 15 | - [ ] Add user creation options 16 | - [x] 18-10-2019 (U) User DB NODE are updated in setCurrentUser(). And in userselect when scanFollowBy(). 17 | - [x] 27-10-2019 (U) Scan user from DB and text that have Deep = 0 18 | - [x] 18-10-2019 (U) Added scan options for users in txt document. 19 | - [x] 19-10-2019 (P) updateNodesUser() ERROR fix. 20 | - [x] 28-10-2019 (U) Detail print added show minimum text or all. 21 | - [x] 11-11-2019 (N) Face recognition added. 22 | - [x] 11-11-2019 (U) Scan profile image for a face and add it to collection for later use. 23 | - [ ] Add node-type to node, is it person, page with more. 24 | - [ ] Add download post, and scan faces to create relationship status between users for a more detailed map scan. 25 | 26 | ## Rules 27 | When something is done, mark it as finished and add the date of completion and what kind of edit was made. 
28 | - (U) = UPDATE 29 | - (P) = PATCH 30 | - (N) = NEW 31 | 32 | Example: 33 | - `07-10-2019 (P) Fix bug on line 127 in core.py` 34 | - `07-10-2019 (U) Added better search options` 35 | - `07-10-2019 (N) New functions added able to export into xml` 36 | -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | theme: jekyll-theme-slate -------------------------------------------------------------------------------- /optracker/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/__init__.py -------------------------------------------------------------------------------- /optracker/data/__init__.py: -------------------------------------------------------------------------------- 1 | from pkg_resources import resource_filename 2 | 3 | def pose_predictor_five_point_model_location(): 4 | return resource_filename(__name__, "face_models/shape_predictor_5_face_landmarks.dat") 5 | 6 | def face_recognition_model_location(): 7 | return resource_filename(__name__, "face_models/dlib_face_recognition_resnet_model_v1.dat") 8 | 9 | def cnn_face_detector_model_location(): 10 | return resource_filename(__name__, "face_models/mmod_human_face_detector.dat") -------------------------------------------------------------------------------- /optracker/data/face_models/dlib_face_recognition_resnet_model_v1.dat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/data/face_models/dlib_face_recognition_resnet_model_v1.dat -------------------------------------------------------------------------------- /optracker/data/face_models/mmod_human_face_detector.dat: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/data/face_models/mmod_human_face_detector.dat -------------------------------------------------------------------------------- /optracker/data/face_models/shape_predictor_5_face_landmarks.dat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/data/face_models/shape_predictor_5_face_landmarks.dat -------------------------------------------------------------------------------- /optracker/facerec/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/facerec/__init__.py -------------------------------------------------------------------------------- /optracker/facerec/api.py: -------------------------------------------------------------------------------- 1 | import PIL.Image 2 | import dlib 3 | import numpy as np 4 | from PIL import ImageFile 5 | from ..data import * 6 | 7 | class face_recognition(): 8 | def __init__(self): 9 | ImageFile.LOAD_TRUNCATED_IMAGES = True 10 | self.face_detector = dlib.get_frontal_face_detector() 11 | 12 | self.predictor_5_point_model = pose_predictor_five_point_model_location() 13 | self.pose_predictor_5_point = dlib.shape_predictor(self.predictor_5_point_model) 14 | 15 | self.cnn_face_detection_model = cnn_face_detector_model_location() 16 | self.cnn_face_detector = dlib.cnn_face_detection_model_v1(self.cnn_face_detection_model) 17 | 18 | self.face_recognition_model = face_recognition_model_location() 19 | self.face_encoder = dlib.face_recognition_model_v1(self.face_recognition_model) 20 | 21 | 22 | def _rect_to_css(self, rect): 23 | """ 24 | Convert a 
dlib 'rect' object to a plain tuple in (top, right, bottom, left) order 25 | 26 | :param rect: a dlib 'rect' object 27 | :return: a plain tuple representation of the rect in (top, right, bottom, left) order 28 | """ 29 | return rect.top(), rect.right(), rect.bottom(), rect.left() 30 | 31 | 32 | def _css_to_rect(self, css): 33 | """ 34 | Convert a tuple in (top, right, bottom, left) order to a dlib `rect` object 35 | 36 | :param css: plain tuple representation of the rect in (top, right, bottom, left) order 37 | :return: a dlib `rect` object 38 | """ 39 | return dlib.rectangle(css[3], css[0], css[1], css[2]) 40 | 41 | 42 | def _trim_css_to_bounds(self, css, image_shape): 43 | """ 44 | Make sure a tuple in (top, right, bottom, left) order is within the bounds of the image. 45 | 46 | :param css: plain tuple representation of the rect in (top, right, bottom, left) order 47 | :param image_shape: numpy shape of the image array 48 | :return: a trimmed plain tuple representation of the rect in (top, right, bottom, left) order 49 | """ 50 | return max(css[0], 0), min(css[1], image_shape[1]), min(css[2], image_shape[0]), max(css[3], 0) 51 | 52 | 53 | def face_distance(self, face_encodings, face_to_compare): 54 | """ 55 | Given a list of face encodings, compare them to a known face encoding and get a euclidean distance 56 | for each comparison face. The distance tells you how similar the faces are. 
57 | 58 | :param faces: List of face encodings to compare 59 | :param face_to_compare: A face encoding to compare against 60 | :return: A numpy ndarray with the distance for each face in the same order as the 'faces' array 61 | """ 62 | if len(face_encodings) == 0: 63 | return np.empty((0)) 64 | 65 | return np.linalg.norm(face_encodings - face_to_compare, axis=1) 66 | 67 | 68 | def load_image_file(self, file, mode='RGB'): 69 | """ 70 | Loads an image file (.jpg, .png, etc) into a numpy array 71 | 72 | :param file: image file name or file object to load 73 | :param mode: format to convert the image to. Only 'RGB' (8-bit RGB, 3 channels) and 'L' (black and white) are supported. 74 | :return: image contents as numpy array 75 | """ 76 | im = PIL.Image.open(file) 77 | if mode: 78 | im = im.convert(mode) 79 | return np.array(im) 80 | 81 | 82 | def _raw_face_locations(self, img, number_of_times_to_upsample=1, model="hog"): 83 | """ 84 | Returns an array of bounding boxes of human faces in a image 85 | 86 | :param img: An image (as a numpy array) 87 | :param number_of_times_to_upsample: How many times to upsample the image looking for faces. Higher numbers find smaller faces. 88 | :param model: Which face detection model to use. "hog" is less accurate but faster on CPUs. "cnn" is a more accurate 89 | deep-learning model which is GPU/CUDA accelerated (if available). The default is "hog". 90 | :return: A list of dlib 'rect' objects of found face locations 91 | """ 92 | if model == "cnn": 93 | return self.cnn_face_detector(img, number_of_times_to_upsample) 94 | else: 95 | return self.face_detector(img, number_of_times_to_upsample) 96 | 97 | 98 | def face_locations(self, img, number_of_times_to_upsample=1, model="hog"): 99 | """ 100 | Returns an array of bounding boxes of human faces in a image 101 | 102 | :param img: An image (as a numpy array) 103 | :param number_of_times_to_upsample: How many times to upsample the image looking for faces. Higher numbers find smaller faces. 
104 | :param model: Which face detection model to use. "hog" is less accurate but faster on CPUs. "cnn" is a more accurate 105 | deep-learning model which is GPU/CUDA accelerated (if available). The default is "hog". 106 | :return: A list of tuples of found face locations in css (top, right, bottom, left) order 107 | """ 108 | if model == "cnn": 109 | return [self._trim_css_to_bounds(self._rect_to_css(face.rect), img.shape) for face in self._raw_face_locations(img, number_of_times_to_upsample, "cnn")] 110 | else: 111 | return [self._trim_css_to_bounds(self._rect_to_css(face), img.shape) for face in self._raw_face_locations(img, number_of_times_to_upsample, model)] 112 | 113 | 114 | def _raw_face_locations_batched(self, images, number_of_times_to_upsample=1, batch_size=128): 115 | """ 116 | Returns a 2D array of dlib rects of human faces in a list of images using the cnn face detector 117 | 118 | :param images: A list of images (each as a numpy array) 119 | :param number_of_times_to_upsample: How many times to upsample the image looking for faces. Higher numbers find smaller faces. 120 | :return: A list of dlib 'rect' objects of found face locations 121 | """ 122 | return self.cnn_face_detector(images, number_of_times_to_upsample, batch_size=batch_size) 123 | 124 | 125 | def batch_face_locations(self, images, number_of_times_to_upsample=1, batch_size=128): 126 | """ 127 | Returns a 2D array of bounding boxes of human faces in a list of images using the cnn face detector 128 | If you are using a GPU, this can give you much faster results since the GPU 129 | can process batches of images at once. If you aren't using a GPU, you don't need this function. 130 | 131 | :param images: A list of images (each as a numpy array) 132 | :param number_of_times_to_upsample: How many times to upsample the image looking for faces. Higher numbers find smaller faces. 133 | :param batch_size: How many images to include in each GPU processing batch. 
134 | :return: A list of tuples of found face locations in css (top, right, bottom, left) order 135 | """ 136 | def convert_cnn_detections_to_css(detections): 137 | return [self._trim_css_to_bounds(self._rect_to_css(face.rect), images[0].shape) for face in detections] 138 | 139 | raw_detections_batched = self._raw_face_locations_batched(images, number_of_times_to_upsample, batch_size) 140 | 141 | return list(map(convert_cnn_detections_to_css, raw_detections_batched)) 142 | 143 | 144 | def _raw_face_landmarks(self, face_image, face_locations=None, model="large"): 145 | if face_locations is None: 146 | face_locations = self._raw_face_locations(face_image) 147 | else: 148 | face_locations = [self._css_to_rect(face_location) for face_location in face_locations] 149 | 150 | 151 | pose_predictor = self.pose_predictor_5_point # only the 5-point predictor is loaded in __init__, whatever `model` is 152 | 153 | return [pose_predictor(face_image, face_location) for face_location in face_locations] 154 | 155 | 156 | def face_landmarks(self, face_image, face_locations=None, model="large"): 157 | """ 158 | Given an image, returns a dict of face feature locations (eyes, nose, etc) for each face in the image 159 | 160 | :param face_image: image to search 161 | :param face_locations: Optionally provide a list of face locations to check. 162 | :param model: Optional - which model to use. "large" (default) or "small" which only returns 5 points but is faster. 
163 | :return: A list of dicts of face feature locations (eyes, nose, etc) 164 | """ 165 | landmarks = self._raw_face_landmarks(face_image, face_locations, model) 166 | landmarks_as_tuples = [[(p.x, p.y) for p in landmark.parts()] for landmark in landmarks] 167 | 168 | # For a definition of each point index, see https://cdn-images-1.medium.com/max/1600/1*AbEg31EgkbXSQehuNJBlWg.png 169 | if model == 'large': 170 | return [{ 171 | "chin": points[0:17], 172 | "left_eyebrow": points[17:22], 173 | "right_eyebrow": points[22:27], 174 | "nose_bridge": points[27:31], 175 | "nose_tip": points[31:36], 176 | "left_eye": points[36:42], 177 | "right_eye": points[42:48], 178 | "top_lip": points[48:55] + [points[64]] + [points[63]] + [points[62]] + [points[61]] + [points[60]], 179 | "bottom_lip": points[54:60] + [points[48]] + [points[60]] + [points[67]] + [points[66]] + [points[65]] + [points[64]] 180 | } for points in landmarks_as_tuples] 181 | elif model == 'small': 182 | return [{ 183 | "nose_tip": [points[4]], 184 | "left_eye": points[2:4], 185 | "right_eye": points[0:2], 186 | } for points in landmarks_as_tuples] 187 | else: 188 | raise ValueError("Invalid landmarks model type. Supported models are ['small', 'large'].") 189 | 190 | 191 | def face_encodings(self, face_image, known_face_locations=None, num_jitters=1): 192 | """ 193 | Given an image, return the 128-dimension face encoding for each face in the image. 194 | 195 | :param face_image: The image that contains one or more faces 196 | :param known_face_locations: Optional - the bounding boxes of each face if you already know them. 197 | :param num_jitters: How many times to re-sample the face when calculating encoding. Higher is more accurate, but slower (i.e. 
100 is 100x slower) 198 | :return: A list of 128-dimensional face encodings (one for each face in the image) 199 | """ 200 | raw_landmarks = self._raw_face_landmarks(face_image, known_face_locations, model="small") 201 | return [np.array(self.face_encoder.compute_face_descriptor(face_image, raw_landmark_set, num_jitters)) for raw_landmark_set in raw_landmarks] 202 | 203 | 204 | def compare_faces(self, known_face_encodings, face_encoding_to_check, tolerance=0.6): 205 | """ 206 | Compare a list of face encodings against a candidate encoding to see if they match. 207 | 208 | :param known_face_encodings: A list of known face encodings 209 | :param face_encoding_to_check: A single face encoding to compare against the list 210 | :param tolerance: How much distance between faces to consider it a match. Lower is more strict. 0.6 is typical best performance. 211 | :return: A list of True/False values indicating which known_face_encodings match the face encoding to check 212 | """ 213 | return list(self.face_distance(known_face_encodings, face_encoding_to_check) <= tolerance) 214 | -------------------------------------------------------------------------------- /optracker/facerec/facerec.py: -------------------------------------------------------------------------------- 1 | from PIL import Image 2 | from .api import face_recognition 3 | 4 | class facerec(): 5 | def __init__(self, Zero): 6 | self.zero = Zero 7 | 8 | self.zero.printText("+ Loading Face Recognition", True) 9 | self.face = face_recognition() 10 | 11 | def findFaceinImgCNN(self, img): 12 | print("Finding Face CNN") 13 | image = self.face.load_image_file(img) 14 | face_locations = self.face.face_locations(image, number_of_times_to_upsample=0, model="cnn") 15 | print("I found {} face(s) in this photograph.".format(len(face_locations))) 16 | 17 | for face_location in face_locations: 18 | # Print the location of each face in this image 19 | top, right, bottom, left = face_location 20 | print("A face is located at pixel location 
Top: {}, Left: {}, Bottom: {}, Right: {}".format(top, left, bottom, right)) 21 | 22 | # You can access the actual face itself like this: 23 | face_image = image[top:bottom, left:right] 24 | pil_image = Image.fromarray(face_image) 25 | pil_image.show() 26 | 27 | def findFaceinImg(self, img): 28 | self.zero.printText("+ Finding Face HOG log", False) 29 | image = self.face.load_image_file(img) 30 | face_locations = self.face.face_locations(image) 31 | 32 | #Return Image Value of all faces in array. Use LEN to see how many 33 | return face_locations, image 34 | 35 | def readSource(self, file): 36 | print("Read source") 37 | -------------------------------------------------------------------------------- /optracker/functions/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /optracker/functions/core_func.py: -------------------------------------------------------------------------------- 1 | import os 2 | import requests 3 | import shutil 4 | import filecmp 5 | import imghdr 6 | from PIL import Image 7 | from ..functions.instagram_func import * 8 | 9 | class coreFunc(): 10 | def __init__(self, dbTool, dbConn, instagram, Zero, Facerec): 11 | self.dbTool = dbTool 12 | self.dbConn = dbConn 13 | self.instagram = instagram 14 | self.zero = Zero 15 | self.instaTool = InstagramFunc(self.instagram) 16 | self.curPrivate = 0 17 | self.facerec = Facerec 18 | 19 | def find_between_r(self, s, first, last,): 20 | try: 21 | start = s.rindex( first ) + len( first ) 22 | end = s.rindex( last, start ) 23 | return s[start:end] 24 | except ValueError: 25 | return "" 26 | 27 | def is_similar(self, image1, image2): 28 | return filecmp.cmp(image1, image2) 29 | 30 | def createFolderIf(self, folder): 31 | if not os.path.exists(folder): 32 | os.mkdir(folder) 33 | self.zero.printText("+ Folder created: {}".format(folder), False) 34 | else: 35 | 
self.zero.printText("+ Folder located: {}".format(folder), False) 36 | 37 | def setImageName(self, folder, username, type): 38 | c = 1 39 | d = True 40 | currentFile = "" 41 | while d == True: 42 | currentFile = folder + username + "-" + str(c) + type 43 | if os.path.isfile(currentFile) == False: 44 | d = False 45 | else: 46 | self.zero.printText("+ File: {} exists, trying next.".format(currentFile), False) 47 | c += 1 48 | 49 | self.zero.printText("+ Setting filename: {}".format(currentFile), False) 50 | return currentFile 51 | 52 | def compareImage(self, path, currentFile): 53 | #Get files in directory # r=root, d=directories, f = files 54 | files = [] 55 | for r, d, f in os.walk(path): 56 | for file in f: 57 | files.append(os.path.join(r, file)) 58 | 59 | #Check whether any other file is identical to the new one 60 | for f in files: 61 | if f != currentFile: 62 | self.zero.printText("+ Checking if {} and {} are the same.".format(currentFile, f), False) 63 | if self.is_similar(currentFile, f) == True: 64 | self.zero.printText("+ Image is a duplicate, deleting: {}".format(currentFile), False) 65 | os.remove(currentFile) 66 | return True 67 | return False 68 | 69 | def createInstaProfileFolder(self, ID): 70 | curr = self.zero.OP_INSTA_PROFILEFOLDER_NAME_VALUE 71 | counter = 1 72 | 73 | for i in range(0, len(ID), 2): 74 | if counter <= 2: 75 | curr = curr + ID[i:i + 2] + '\\' 76 | self.createFolderIf(curr) 77 | counter += 1 78 | 79 | curr = curr + ID + "\\" 80 | self.createFolderIf(curr) 81 | return curr 82 | 83 | def downloadProfileImage(self, name, username, type, url): 84 | instaFolder = self.createInstaProfileFolder(name) 85 | file = self.setImageName(instaFolder, username, type) 86 | 87 | self.zero.printText("+ Downloading Image: {}".format(file), True) 88 | downloadok = True 89 | 90 | #Write Image 91 | resp = requests.get(url, stream=True) 92 | local_file = open(file, 'wb') 93 | resp.raw.decode_content = True 94 | shutil.copyfileobj(resp.raw, local_file) 95 | local_file.close() 96 | del resp 97 | 98 | #Read 
Image to verify 99 | type = imghdr.what(file) 100 | 101 | if str(type) != "None": 102 | self.zero.printText("+ Download Complete", True) 103 | else: 104 | downloadok = False 105 | self.zero.printText("+ File dont contain image, deleting file", True) 106 | os.remove(file) 107 | 108 | if downloadok == True: 109 | if self.compareImage(instaFolder, file) == False: 110 | if int(self.zero.FACEREC_ON_VALUE) == int(1): 111 | self.zero.printText("+ Face scan active, scanning image", True) 112 | face, image = self.facerec.findFaceinImg(file) 113 | if len(face) == 0: 114 | self.zero.printText("+ Found no face in image, deleting file", True) 115 | os.remove(file) 116 | else: 117 | self.zero.printText("+ Found {} faces in image".format(len(face)), True) 118 | 119 | def exportDBData(self): 120 | self.zero.printText("\n- Loading current data from DB", True) 121 | totalNodes = self.dbTool.getValueSQLnoinput(self.dbConn, self.zero.DB_SELECT_COUNT_NODES)[0][0] 122 | totalEdgesInsta = self.dbTool.getValueSQLnoinput(self.dbConn, self.zero.DB_SELECT_COUNT_EDES_INSTA)[0][0] 123 | self.zero.printText("+ Total nodes: {}\n+ Total egdes from instagram:{}".format(totalNodes, totalEdgesInsta), True) 124 | exportyes = input("+ Do you want to export? 
[Y/n] ") 125 | 126 | if exportyes.lower().strip() != "n": 127 | self.zero.printText("+ Exporting NODES", True) 128 | self.dbTool.exportNode(self.dbConn, self.zero.DB_SELECT_EXPORT_ID_USER, self.zero.DB_DATABASE_EXPORT_FOLDER + self.zero.DB_DATABASE_EXPORT_NODES) 129 | self.zero.printText("+ NODES exported to: {}".format(self.zero.DB_DATABASE_EXPORT_NODES), True) 130 | 131 | self.zero.printText("+ Exporting EDGES", True) 132 | self.dbTool.exportNode(self.dbConn, self.zero.DB_SELECT_ALL_INSTA_EDGES, self.zero.DB_DATABASE_EXPORT_FOLDER + self.zero.DB_DATABASE_EXPORT_INSTA_EGDE) 133 | self.zero.printText("+ EDGES exported to: {}".format(self.zero.DB_DATABASE_EXPORT_INSTA_EGDE), True) 134 | 135 | def getDoneUserIDFromInsta(self): 136 | self.zero.printText("\n- Loading done user from instagram", True) 137 | userList = self.dbTool.getValueSQLnoinput(self.dbConn, self.zero.DB_SELECT_ALL_DONE_NEW_INSTA) 138 | 139 | if userList == 0: 140 | self.zero.printText("+ No users in database that have been scannet 100%", True) 141 | return 0 142 | 143 | else: 144 | self.zero.printText("+ User list imported", True) 145 | count = 0 146 | for i in userList: 147 | count += 1 148 | self.zero.printText("[{}] {} ({})".format(count, i[0], str(i[1]).strip()), True) 149 | selectUser = input("+ Select user (1-{}): ".format(count)) 150 | 151 | if not selectUser.isnumeric(): 152 | self.zero.printText("+ Invalid input, #1 selected", True) 153 | selectUser = 1 154 | 155 | if int(selectUser) > count: 156 | self.zero.printText("+ Invalid input, #1 selected", True) 157 | selectUser = 1 158 | 159 | newNumber = int(selectUser) - 1 160 | return userList[newNumber] 161 | 162 | def updateNodesUser(self, instaID): 163 | self.zero.printText("+ Updating user data for: {}".format(instaID), False) 164 | newDataUser = self.instaTool.get_insta_account_info_id(instaID) 165 | self.zero.printText("+ User data loaded.", False) 166 | label = self.getLabelforUser(newDataUser) 167 | 168 | #Download profile Image 169 | 
if int(self.zero.DOWNLOAD_PROFILE_INSTA_VALUE) == 1: 170 | self.downloadProfileImage(newDataUser.identifier, newDataUser.username, self.zero.INSTA_FILE_EXT, newDataUser.get_profile_picture_url()) 171 | 172 | UPDATE_DATA = (self.zero.sanTuple(newDataUser.full_name), self.zero.sanTuple(label), newDataUser.get_profile_picture_url(), newDataUser.follows_count, newDataUser.followed_by_count, self.zero.sanTuple(newDataUser.biography), newDataUser.username, newDataUser.is_private, newDataUser.is_verified, newDataUser.media_count, newDataUser.external_url, 1, newDataUser.identifier) 173 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_NODES, UPDATE_DATA) 174 | self.zero.printText("+ Update of DB NODE complete.", False) 175 | return newDataUser 176 | 177 | def updateProfileImg(self): 178 | self.zero.printText("\n- Starting Profile Img Update", True) 179 | user_img_list = self.dbTool.getValueSQLnoinput(self.dbConn, self.zero.DB_SELECT_IMG) 180 | lengList = len(user_img_list) 181 | counter = 1 182 | for u in user_img_list: 183 | self.zero.printText("\n+ {} of {}: {}".format(counter, lengList, u[0]), True) 184 | counter += 1 185 | #Download profile Image 186 | self.downloadProfileImage(str(u[1]), str(u[0]), self.zero.INSTA_FILE_EXT, str(u[2])) 187 | 188 | def updateNodesUserLoaded(self, newDataUser): 189 | self.zero.printText("+ Updating user data for: {} ({})".format(newDataUser.username, newDataUser.identifier), False) 190 | label = self.getLabelforUser(newDataUser) 191 | UPDATE_DATA = (self.zero.sanTuple(newDataUser.full_name), self.zero.sanTuple(label), newDataUser.get_profile_picture_url(), newDataUser.follows_count, newDataUser.followed_by_count, self.zero.sanTuple(newDataUser.biography), newDataUser.username, newDataUser.is_private, newDataUser.is_verified, newDataUser.media_count, newDataUser.external_url, 1, newDataUser.identifier) 192 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_NODES, UPDATE_DATA) 193 | self.zero.printText("+ Update of DB 
NODE complete.", False) 194 | 195 | def updateNodeFromList(self): 196 | self.zero.printText("\n- Updating users from list", True) 197 | fullpath = self.zero.OP_ROOT_FOLDER_PATH_VALUE + self.zero.USER_FILE_SCAN_NODE_INSTA 198 | if os.path.isfile(fullpath): 199 | self.zero.printText("+ Found: {}, extracting data".format(fullpath), True) 200 | with open(fullpath) as fp: 201 | line = fp.readline() 202 | while line: 203 | if line != 0: 204 | user = line.strip() 205 | self.zero.printText("+ Getting user info for {}:".format(user), True) 206 | updatenode = self.instaTool.get_insta_account_info(user) 207 | 208 | #Download profile Image 209 | if int(self.zero.DOWNLOAD_PROFILE_INSTA_VALUE) == 1: 210 | self.downloadProfileImage(updatenode.identifier, updatenode.username, self.zero.INSTA_FILE_EXT, updatenode.get_profile_picture_url()) 211 | 212 | self.updateNodesUserLoaded(updatenode) 213 | line = fp.readline() 214 | else: 215 | self.zero.printText("+ File not found.", True) 216 | self.zero.printText("+ Create {} to continue.".format(fullpath), True) 217 | 218 | def deepScanAll(self): 219 | self.zero.printText("\n- Getting users from DB", True) 220 | allDeep = self.dbTool.getValueSQLnoinput(self.dbConn, self.zero.DB_SELECT_DEEPSCAN_NEED) 221 | lengDeep = len(allDeep) 222 | counter = 1 223 | for u in allDeep: 224 | user = u[0] 225 | self.zero.printText("+ {} of {} - Getting user info for {}:".format(counter, lengDeep, user), True) 226 | updatenode = self.instaTool.get_insta_account_info(user) 227 | 228 | #Download profile Image 229 | if int(self.zero.DOWNLOAD_PROFILE_INSTA_VALUE) == 1: 230 | self.downloadProfileImage(updatenode.identifier, updatenode.username, self.zero.INSTA_FILE_EXT, updatenode.get_profile_picture_url()) 231 | 232 | self.updateNodesUserLoaded(updatenode) 233 | counter += 1 234 | 235 | def scanFollowToInstaID(self): 236 | currentInstaID = self.getDoneUserIDFromInsta() 237 | 238 | self.zero.printText("\n- Starting scan by follow", True) 239 | if currentInstaID == 0: 
240 | self.zero.printText("+ No users could be selected.\n+ Run a full single scan of a user to continue.", True) 241 | 242 | else: 243 | currentUser = currentInstaID[1] 244 | currentID = currentInstaID[0] 245 | 246 | getMaxValueFOLLOW = int(self.zero.INSTA_MAX_FOLLOW_SCAN_VALUE) 247 | getMaxValueFOLLOWBY = int(self.zero.INSTA_MAX_FOLLOW_BY_SCAN_VALUE) 248 | 249 | self.zero.printText("+ Current insta id: {} ({})".format(currentID, currentUser), True) 250 | self.zero.printText("+ Looking up NODE ID.", True) 251 | currentNode = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_ID_NODE, (currentID, ))[0][0] 252 | self.zero.printText("+ Node ID found: {}".format(currentNode), True) 253 | self.zero.printText("+ Loading followed by list where PRIVATE = 0", True) 254 | followList = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_FOLLOW_OF, (currentNode, )) 255 | if followList == 0: 256 | self.zero.printText("+ Are followed nobody that have PUBLIC profile.", True) 257 | else: 258 | lenFollowList = len(followList) 259 | counter = 0 260 | self.zero.printText("+ Loaded {} users from: {} where private = 0".format(lenFollowList, currentUser), True) 261 | 262 | #TODO: ADD SORTING OF USER BASED ON KEY WORD FROM BIO 263 | for i in followList: 264 | counter += 1 265 | self.zero.printText("\n- {} of {} :: {}".format(counter, lenFollowList, i[8]), True), 266 | self.zero.printText("+ Checking search status for: {}".format(i[3]), False) 267 | moveON = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_DONE_NEW_INSTA, (i[3],)) 268 | if moveON[0][0] == 1: 269 | self.zero.printText("+ User SCAN are allready DONE.", True) 270 | else: 271 | if moveON[0][1] == 1: 272 | self.zero.printText("+ User NOT scanned but set on WAIT", True) 273 | else: 274 | self.zero.printText("+ User VALID for singel scan.", True) 275 | 276 | scan_insta_followed_by = int(i[6]) 277 | scan_insta_follow = int(i[5]) 278 | 279 | #TODO: SPEEDUP SCAN - LESS SQL REQUEST 280 | deepScan = int(i[13]) 281 
| if deepScan == 0: 282 | #User have not been deepscanned scan and update 283 | self.zero.printText("+ User missing deepScan, getting info.", False) 284 | newDataUser = self.updateNodesUser(i[3]) 285 | scan_insta_followed_by = int(newDataUser.followed_by_count) 286 | scan_insta_follow = int(newDataUser.follows_count) 287 | 288 | 289 | self.zero.printText("+ User are following: {}\n+ User are followed by: {}".format(scan_insta_follow, scan_insta_followed_by), False) 290 | 291 | #Search sorting firt step follows_count 292 | if scan_insta_follow <= getMaxValueFOLLOW: 293 | if scan_insta_followed_by <= getMaxValueFOLLOWBY: 294 | #Search critera for allowed OK Start scan. 295 | self.setCurrentUser(i[8].strip()) 296 | 297 | #Check if private 298 | if self.curPrivate == 0: 299 | #Extract info from following list 300 | if scan_insta_follow != 0: 301 | if self.loadFollowlist(False) == True: 302 | self.add_egde_from_list_insta(False) 303 | else: 304 | self.zero.printText("+ Follow list is empty", False) 305 | 306 | #Extract followed by 307 | if scan_insta_followed_by != 0: 308 | if self.loadFollowlist(True) == True: 309 | self.add_egde_from_list_insta(True) 310 | else: 311 | self.zero.printText("+ Follow by list is empty", False) 312 | 313 | #Update new_Insta 314 | self.zero.printText("\n- Scan complete", False) 315 | self.zero.printText("+ Setting {} ({}) to complete.".format(i[8], i[3]), True) 316 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_NEW_INSTA_DONE_TRUE, (i[3],)) 317 | else: 318 | self.zero.printText("+ User profile are private after update.", True) 319 | self.zero.printText("+ Setting {} ({}) to complete.".format(i[8], i[3]), True) 320 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_NEW_INSTA_DONE_TRUE, (i[3],)) 321 | else: 322 | self.zero.printText("+ User are followed by to many, increese allowed follow to continue", True) 323 | else: 324 | self.zero.printText("+ User are following to many, increese allowed follow to continue", True) 
325 | 326 | 327 | def loadFollowlist(self, inOut): #False: load following, True: load followers 328 | continueScan = True 329 | 330 | if inOut == False: 331 | #Getting following 332 | self.zero.printText("\n- Loading follows list for: {}".format(self.currentUser.username), True) 333 | self.followNumber = self.currentUser.follows_count 334 | if self.followNumber != 0: 335 | self.zero.printText("+ {} is following {}, starting info extract.".format(self.currentUser.full_name, self.followNumber), False) 336 | self.imported_follow = self.instaTool.get_insta_following(self.followNumber, self.currentUser.identifier) 337 | self.lenImpF = len(self.imported_follow['accounts']) 338 | self.zero.printText("+ Total loaded: {}".format(self.lenImpF), False) 339 | continueScan = True 340 | else: 341 | print("+ {} is following NOBODY, skipping this stage".format(self.currentUser.username)) 342 | continueScan = False 343 | else: 344 | #Getting followers 345 | self.zero.printText("\n- Loading followed by list for: {}".format(self.currentUser.username), True) 346 | self.followNumber = self.currentUser.followed_by_count 347 | if self.followNumber != 0: 348 | self.zero.printText("+ {} is followed by {}, starting info extract".format(self.currentUser.full_name, self.followNumber), False) 349 | self.imported_follow = self.instaTool.get_insta_follow_by(self.followNumber, self.currentUser.identifier) 350 | self.lenImpF = len(self.imported_follow['accounts']) 351 | self.zero.printText("+ Total loaded: {}".format(self.lenImpF), False) 352 | continueScan = True 353 | else: 354 | print("+ {} is followed by NOBODY, skipping this stage".format(self.currentUser.username)) 355 | continueScan = False 356 | 357 | return continueScan 358 | 359 | def setCurrentUser(self, user): 360 | #Get information 361 | self.zero.printText("\n- Setting current user to: {}".format(user), True) 362 | 363 | #Check if zeroPoint is in DB if not add. 
364 | self.zero.printText("+ Getting user information from Instagram", True) 365 | self.currentUser = self.instaTool.get_insta_account_info(user) 366 | self.curPrivate = self.currentUser.is_private 367 | self.check_user_db_node(self.currentUser, False) 368 | 369 | #Download profile Image 370 | if int(self.zero.DOWNLOAD_PROFILE_INSTA_VALUE) == 1: 371 | self.downloadProfileImage(self.currentUser.identifier, self.currentUser.username, self.zero.INSTA_FILE_EXT, self.currentUser.get_profile_picture_url()) 372 | 373 | #Update User information 374 | self.updateNodesUserLoaded(self.currentUser) 375 | 376 | #Check if in new_Insta 377 | self.check_new_insta(self.currentUser.identifier, self.currentUser.username) 378 | 379 | #Getting current NODE ID for source 380 | self.sourceID = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_ID_NODE, (self.currentUser.identifier, ))[0][0] 381 | self.zero.printText("+ Recived node ID: {} for zeroPoint: {}".format(self.sourceID, self.currentUser.username), True) 382 | 383 | #Setting global INSTA # IDEA 384 | self.zero.printText("+ Global insta ID set to {}".format(self.currentUser.identifier), True) 385 | self.zero.INSTA_USER_ID = self.currentUser.identifier 386 | 387 | def check_new_insta(self, instaID, insert_username): 388 | self.zero.printText("+ Checking new_insta DB for: {}".format(instaID), False) 389 | getNewinsta = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_DONE_NEW_INSTA, (instaID, )) 390 | if getNewinsta == 0: 391 | self.zero.printText("+ NOT found in new_insta adding user_id: {} ({})".format(instaID, insert_username), False) 392 | self.zero.INSERT_DATA = (instaID, insert_username) 393 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_INSERT_NEW_INSTA, self.zero.INSERT_DATA) 394 | else: 395 | self.zero.printText("+ FOUND in new_insta", False) 396 | if getNewinsta[0][0] == 1: 397 | self.zero.printText("+ STATUS = FINNISH", False) 398 | else: 399 | if getNewinsta[0][1] == 1: 400 | self.zero.printText("+ 
STATUS = WAIT", False) 401 | else: 402 | self.zero.printText("+ STATUS = IN LINE", False) 403 | 404 | def getLabelforUser(self, user): 405 | self.zero.printText("+ Are full_name empty?", False) 406 | if user.full_name: 407 | self.zero.printText("+ NO", False) 408 | self.zero.printText("+ Using: {} for label.".format(user.full_name), False) 409 | return user.full_name 410 | 411 | else: 412 | self.zero.printText("+ YES", False) 413 | self.zero.printText("+ Using: {} for label.".format(user.username), False) 414 | return user.username 415 | 416 | def check_user_db_node(self, user, getInfo): 417 | #Check if we do a full scan 418 | getSurfaceScan = int(self.zero.SURFACE_SCAN_VALUE) 419 | 420 | #Get node id 421 | userNodeID = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_ID_NODE, (user.identifier, )) 422 | 423 | self.zero.printText("+ Checking NODE DB for id: {} ({})".format(user.identifier, user.username), False) 424 | if userNodeID == 0: 425 | self.zero.printText("+ NOT found in node", False) 426 | tempID = user.identifier; 427 | 428 | if getSurfaceScan == 0: 429 | if getInfo == True: 430 | self.zero.printText("+ Getting user data for: {}".format(user.username), False) 431 | user = self.instaTool.get_insta_account_info_id(tempID) 432 | 433 | #Download profile 434 | if int(self.zero.DOWNLOAD_PROFILE_INSTA_VALUE) == 1: 435 | self.downloadProfileImage(user.identifier, user.username, self.zero.INSTA_FILE_EXT, user.get_profile_picture_url()) 436 | 437 | label = self.getLabelforUser(user) 438 | self.zero.INSERT_DATA = (self.zero.sanTuple(user.full_name), self.zero.sanTuple(label), user.identifier, user.get_profile_picture_url(), user.follows_count, user.followed_by_count, self.zero.sanTuple(user.biography), user.username, user.is_private, user.is_verified, user.media_count, user.external_url, 1, user.identifier) 439 | 440 | else: 441 | #TODO: Add image download to surface`? 
442 | self.zero.printText("+ Surfacescan are ON", False) 443 | 444 | if user.is_private == False: 445 | user.is_private = 0 446 | else: 447 | user.is_private = 1 448 | 449 | if user.is_verified == False: 450 | user.is_verified = 0 451 | else: 452 | user.is_verified = 1 453 | 454 | label = self.getLabelforUser(user) 455 | self.zero.INSERT_DATA = (self.zero.sanTuple(user.full_name), self.zero.sanTuple(label), user.identifier, user.get_profile_picture_url(), user.follows_count, user.followed_by_count, self.zero.sanTuple(user.biography), user.username, user.is_private, user.is_verified, user.media_count, user.external_url, 0, user.identifier) 456 | 457 | self.zero.printText("+ ADDING to NODE db", False) 458 | userNodeID = self.dbTool.inserttoTabelMulti(self.dbConn, self.zero.DB_INSERT_NODE, self.zero.INSERT_DATA)[0][0] 459 | else: 460 | userNodeID = userNodeID[0][0] 461 | self.zero.printText("+ FOUND in NODE list ({}) moving on".format(userNodeID), False) 462 | 463 | return userNodeID 464 | 465 | def add_egde_from_list_insta(self, inOut): 466 | counterF = 0 467 | for following in self.imported_follow['accounts']: 468 | counterF += 1 469 | self.zero.printText("\n", False) 470 | self.zero.printText("- {} of {} :: Username: {} - ID: {}".format(counterF, self.lenImpF, following.username, following.identifier), True) 471 | 472 | #Add in Node DB 473 | tempID = self.check_user_db_node(following, True) 474 | 475 | #Check if this is a new node that havent been search 476 | self.check_new_insta(following.identifier, following.username) 477 | 478 | #Get node ID 479 | self.zero.printText("+ Recived node ID: {} ({})".format(tempID, following.username), False) 480 | 481 | #Add in egdes_insta 482 | if inOut == True: 483 | self.zero.printText("+ Checking insta_edges DB. 
Source: {} ({}), Target: {} ({})".format(tempID, following.username, self.sourceID, self.currentUser.username), False) 484 | if self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_TARGET_EDGE, (tempID, self.sourceID, )) == 0: 485 | self.zero.printText("+ NOT found in insta_edges adding data", False) 486 | self.zero.INSERT_DATA = (tempID, self.sourceID) 487 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_INSERT_INSTA_EGDE, self.zero.INSERT_DATA) 488 | else: 489 | self.zero.printText("+ FOUND in insta_edges list moving on", False) 490 | else: 491 | self.zero.printText("+ Checking insta_edges DB. Source: {} ({}), Target: {} ({})".format(self.sourceID, self.currentUser.full_name, tempID, following.username), False) 492 | if self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_TARGET_EDGE, (self.sourceID, tempID, )) == 0: 493 | self.zero.printText("+ NOT found in insta_edges adding data.", False) 494 | self.zero.INSERT_DATA = (self.sourceID, tempID) 495 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_INSERT_INSTA_EGDE, self.zero.INSERT_DATA) 496 | else: 497 | self.zero.printText("+ FOUND in insta_edges list moving on", False) 498 | -------------------------------------------------------------------------------- /optracker/functions/db_func.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sqlite3 3 | import unicodecsv as csv 4 | import sys 5 | import mysql.connector 6 | from sqlite3 import Error 7 | 8 | 9 | 10 | class dbFunc(): 11 | def __init__(self, dbname, Zero): 12 | self.zero = Zero 13 | self.dbname = self.zero.DB_DATABASE_FOLDER + dbname 14 | self.createDBfolder() 15 | 16 | def create_connection(self): 17 | conn = None 18 | if self.zero.DB_MYSQL_ON == 0: 19 | try: 20 | self.zero.printText("+ Connection to: SQLite", True) 21 | conn = sqlite3.connect(self.dbname) 22 | except Error as e: 23 | print(e) 24 | else: 25 | try: 26 | self.zero.printText("+ Connection to: MySQL", True) 27 | conn 
= mysql.connector.connect(host= self.zero.DB_MYSQL, database=self.zero.DB_MYSQL_DATABASE, user=self.zero.DB_MYSQL_USER, passwd=self.zero.DB_MYSQL_PASSWORD, charset=self.zero.DB_MYSQL_CHARSET, collation=self.zero.DB_MYSQL_COLLATION) 28 | except Exception as e: 29 | print(e) 30 | 31 | self.zero.printText("+ Connection to DB done.", False) 32 | return conn 33 | 34 | def createTabels(self, conn, create_table_sql): 35 | try: 36 | c = conn.cursor() 37 | c.execute(create_table_sql) 38 | except Error as e: 39 | print(e) 40 | 41 | def inserttoTabel(self, conn, sql, task): 42 | cur = conn.cursor() 43 | cur.execute(sql, task) 44 | conn.commit() 45 | return cur.lastrowid 46 | 47 | def inserttoTabelMulti(self, conn, sql, task): 48 | if self.zero.DB_MYSQL_ON == 1: 49 | c = conn.cursor(buffered=True) 50 | results = c.execute(sql, task, multi=True) 51 | for cur in results: 52 | if cur.with_rows: 53 | data = cur.fetchall() 54 | conn.commit() 55 | 56 | else: 57 | newSQL = sql.replace("?", "'%s'") 58 | exeSQL = (newSQL % task) 59 | multi = exeSQL.split(";") 60 | lenM = len(multi) 61 | counter = 0 62 | 63 | cur = conn.cursor() 64 | for i in multi: 65 | counter += 1 66 | if counter != lenM: 67 | cur.execute(i) 68 | 69 | data = cur.fetchall() 70 | conn.commit() 71 | 72 | if data: 73 | return data 74 | else: 75 | return 0 76 | 77 | def getValueSQL(self, conn, sql, task): 78 | cur = conn.cursor() 79 | cur.execute(sql, task) 80 | rows = cur.fetchall() 81 | 82 | if rows: 83 | return rows 84 | else: 85 | return 0 86 | 87 | def getValueSQLnoinput(self, conn, sql): 88 | cur = conn.cursor() 89 | cur.execute(sql) 90 | rows = cur.fetchall() 91 | 92 | if rows: 93 | return rows 94 | else: 95 | return 0 96 | 97 | def exportNode(self, conn, sql, filename): 98 | if os.path.exists(filename): 99 | print("+ File: {} exist, deleting it.".format(filename)) 100 | os.remove(filename) 101 | 102 | cur = conn.cursor() 103 | cur.execute(sql) 104 | with open(filename, 'wb') as csvfile: 105 | print("+ Creating and 
writing to: {}".format(filename)) 106 | writer = csv.writer(csvfile, encoding=self.zero.WRITE_ENCODING) 107 | writer.writerow([ i[0] for i in cur.description ]) 108 | writer.writerows(cur.fetchall()) 109 | 110 | def createDBfolder(self): 111 | if not os.path.exists(self.zero.DB_DATABASE_FOLDER): 112 | os.mkdir(self.zero.DB_DATABASE_FOLDER) 113 | 114 | if not os.path.exists(self.zero.DB_DATABASE_EXPORT_FOLDER): 115 | os.mkdir(self.zero.DB_DATABASE_EXPORT_FOLDER) 116 | 117 | def setDefaultValue(self, conn, text, value): 118 | getValue = self.getValueSQL(conn, self.zero.DB_SELECT_OPTIONS, (text, )) 119 | if getValue == 0: 120 | self.zero.printText("+ {} is NOT in database".format(text), True) 121 | self.inserttoTabel(conn, self.zero.DB_INSERT_OPTIONS_LASTINSTA, (value, text, )) 122 | self.zero.printText("+ {} set to: {}".format(text, value), True) 123 | else: 124 | value = getValue[0][1] 125 | self.zero.printText("+ {} in database, value set to: {}".format(text, value), False) 126 | 127 | if text == self.zero.INSTA_MAX_FOLLOW_SCAN_TEXT: 128 | self.zero.INSTA_MAX_FOLLOW_SCAN_VALUE = value 129 | 130 | if text == self.zero.INSTA_MAX_FOLLOW_BY_SCAN_TEXT: 131 | self.zero.INSTA_MAX_FOLLOW_BY_SCAN_VALUE = value 132 | 133 | if text == self.zero.SURFACE_SCAN_TEXT: 134 | self.zero.SURFACE_SCAN_VALUE = value 135 | 136 | if text == self.zero.DETAIL_PRINT_TEXT: 137 | self.zero.DETAIL_PRINT_VALUE = value 138 | 139 | if text == self.zero.DOWNLOAD_PROFILE_INSTA_TEXT: 140 | self.zero.DOWNLOAD_PROFILE_INSTA_VALUE = value 141 | 142 | if text == self.zero.FACEREC_ON_TEXT: 143 | self.zero.FACEREC_ON_VALUE = value 144 | 145 | 146 | def setDefaultValueOptions(self, conn): 147 | #Set max value for scan 148 | print("+ Setup of default values") 149 | self.setDefaultValue(conn, self.zero.INSTA_MAX_FOLLOW_SCAN_TEXT, self.zero.INSTA_MAX_FOLLOW_SCAN_VALUE) 150 | self.setDefaultValue(conn, self.zero.INSTA_MAX_FOLLOW_BY_SCAN_TEXT, self.zero.INSTA_MAX_FOLLOW_BY_SCAN_VALUE) 151 | 
self.setDefaultValue(conn, self.zero.SURFACE_SCAN_TEXT, self.zero.SURFACE_SCAN_VALUE) 152 | self.setDefaultValue(conn, self.zero.DOWNLOAD_PROFILE_INSTA_TEXT, self.zero.DOWNLOAD_PROFILE_INSTA_VALUE) 153 | self.setDefaultValue(conn, self.zero.FACEREC_ON_TEXT, self.zero.FACEREC_ON_VALUE) 154 | self.zero.printText("+ Setup of default DONE", False) 155 | -------------------------------------------------------------------------------- /optracker/functions/instagram_func.py: -------------------------------------------------------------------------------- 1 | from time import sleep 2 | 3 | class InstagramFunc(): 4 | def __init__(self, instagram): 5 | self.instagram = instagram 6 | 7 | def page_size_check(self, totalFollow): 8 | page_size = 100 9 | if totalFollow < page_size: 10 | page_size = totalFollow 11 | return page_size 12 | 13 | def get_insta_follow_by(self, totalFollow, insta_id): 14 | followers = [] 15 | followers = self.instagram.get_followers(insta_id, totalFollow, self.page_size_check(totalFollow), delayed=True) 16 | return followers 17 | 18 | def get_insta_following(self, totalFollow, insta_id): 19 | following = [] 20 | following = self.instagram.get_following(insta_id, totalFollow, self.page_size_check(totalFollow), delayed=True) 21 | return following 22 | 23 | def get_insta_media(self, user): 24 | medias = self.instagram.get_medias(user, 25) 25 | media = medias[6] 26 | print(media) 27 | account = media.owner 28 | 29 | def get_insta_account_info(self, currentUser): 30 | newInfo = self.instagram.get_account(currentUser) 31 | sleep(3) #mimic user 32 | return newInfo 33 | 34 | def get_insta_account_info_id(self, currentUser): 35 | newInfo = self.instagram.get_account_by_id(currentUser) 36 | sleep(3) #mimix user 37 | return newInfo 38 | -------------------------------------------------------------------------------- /optracker/functions/side_func.py: -------------------------------------------------------------------------------- 1 | import os 2 | from datetime 
import datetime, timedelta 3 | 4 | class sideFunc(): 5 | def __init__(self, dbTool, dbConn, Zero): 6 | self.zero = Zero 7 | self.dbTool = dbTool 8 | self.dbConn = dbConn 9 | 10 | def setCurrentUserUpdate(self, user, password): 11 | self.zero.LOGIN_PASSWORD_INSTA = password 12 | self.zero.LOGIN_USERNAME_INSTA = user 13 | self.zero.printText("+ Setting user to: {} and password to: {}".format(self.zero.LOGIN_USERNAME_INSTA, self.zero.LOGIN_PASSWORD_INSTA), True) 14 | 15 | #Update time in account 16 | currentTime = datetime.today() 17 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_ACCOUNT_LAST_USED, (currentTime, user,)) 18 | self.zero.printText("+ Updating last time for: {} to: {}".format(user, currentTime), False) 19 | 20 | def autoSelectLogin(self): 21 | userList = self.runUserCheck() 22 | self.zero.printText("\n- Auto selecting login user", True) 23 | if userList != True: 24 | count = 0 25 | currentSelect = 0 26 | oldestTime = datetime.strptime(str(datetime.today()), self.zero.DATETIME_MASK) #Setting current time 27 | for i in userList: 28 | lastTime = i[6] 29 | #Print function to list time and date, not needed. 
30 | #self.zero.printText("+ User: {}, last used: {}".format(i[0], lastTime)) 31 | datetimelasttime = datetime.strptime(str(datetime.today()), self.zero.DATETIME_MASK) 32 | 33 | if not lastTime: 34 | self.zero.printText("+ {} oldest so far.".format(i[0]), False) 35 | oldestTime = datetimelasttime 36 | currentSelect = count 37 | break 38 | else: 39 | datetimelasttime = datetime.strptime(lastTime, self.zero.DATETIME_MASK) 40 | 41 | if oldestTime >= datetimelasttime: 42 | #oldestTime is newer, so provisionally treat this one as the oldest 43 | self.zero.printText("+ {} oldest so far.".format(i[0]), False) 44 | oldestTime = datetimelasttime 45 | currentSelect = count 46 | 47 | count += 1 48 | 49 | self.setCurrentUserUpdate(userList[currentSelect][0].strip(), userList[currentSelect][1].strip()) 50 | 51 | def runUserCheck(self): 52 | currentTime = datetime.today() 53 | self.zero.printText("\n- Loading INSTAGRAM user list from DB", True) 54 | userList = self.dbTool.getValueSQLnoinput(self.dbConn, self.zero.DB_SELECT_LOGIN_INSTA) 55 | if userList == 0: 56 | self.zero.printText("+ No user for Instagram found, please add one", True) 57 | user = input("+ Username: ") 58 | password = input("+ Password: ") 59 | email = input("+ Email: ") 60 | fullname = input("+ Fullname: ") 61 | 62 | self.zero.printText("+ Adding {} to DB".format(user), True) 63 | INSERT_DATA = (user, password, email, fullname, "instagram", currentTime) 64 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_INSERT_LOGIN_INSTA, INSERT_DATA) 65 | password = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_LOGIN_PASSWORD_INSTA , (user,)) 66 | 67 | if password == 0: 68 | #Add loop for user 69 | self.zero.printText("+ Not able to add user", True) 70 | 71 | else: 72 | self.zero.printText("+ User add OK", True) 73 | self.setCurrentUserUpdate(user, password[0][0]) 74 | return True 75 | else: 76 | if len(userList) == 1: 77 | self.zero.printText("+ One user found using: {}".format(userList[0][0]), True) 78 | 
self.setCurrentUserUpdate(userList[0][0].strip(), userList[0][1].strip()) 79 | return True 80 | else: 81 | self.zero.printText("+ User list loaded.", True) 82 | return userList 83 | 84 | def setDefValue(self, newValue, text, value_text, oneup, json): 85 | change = False 86 | if newValue.isdigit(): 87 | if oneup == False: 88 | if int(newValue) < 1: 89 | self.zero.printText("+ Invalid input {} not changed".format(text), False) 90 | else: 91 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_OPTIONS, (newValue, text)) 92 | self.changeValue(text, newValue) 93 | self.zero.printText("+ {} set to: {}".format(text, value_text), True) 94 | if oneup == True: 95 | if int(newValue) > 1: 96 | self.zero.printText("+ Invalid input {} not changed".format(text), False) 97 | else: 98 | if int(newValue) < 0: 99 | self.zero.printText("+ Invalid input {} not changed".format(text), False) 100 | else: 101 | self.changeValue(text, newValue) 102 | if json == False: 103 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_OPTIONS, (newValue, text)) 104 | self.zero.printText("+ {} set to: {}".format(text, value_text), True) 105 | else: 106 | self.zero.setupJSON(True) 107 | self.zero.printText("+ {} set to: {}".format(text, newValue), True) 108 | self.zero.printText("+ IF SQL CHANGE RESTART PROGRAM", True) 109 | 110 | else: 111 | self.zero.printText("+ Invalid input {} not changed".format(text), False) 112 | 113 | def changeValue(self, value_text, newValue): 114 | if str(value_text).strip() == str(self.zero.INSTA_MAX_FOLLOW_SCAN_TEXT).strip(): 115 | self.zero.INSTA_MAX_FOLLOW_SCAN_VALUE = newValue 116 | if str(value_text).strip() == str(self.zero.INSTA_MAX_FOLLOW_BY_SCAN_TEXT).strip(): 117 | self.zero.INSTA_MAX_FOLLOW_BY_SCAN_VALUE = newValue 118 | if str(value_text).strip() == str(self.zero.SURFACE_SCAN_TEXT).strip(): 119 | self.zero.SURFACE_SCAN_VALUE = newValue 120 | if str(value_text).strip() == str(self.zero.DETAIL_PRINT_TEXT).strip(): 121 | self.zero.DETAIL_PRINT_VALUE 
= newValue 122 | if str(value_text).strip() == str(self.zero.DOWNLOAD_PROFILE_INSTA_TEXT).strip(): 123 | self.zero.DOWNLOAD_PROFILE_INSTA_VALUE = newValue 124 | if str(value_text).strip() == str(self.zero.FACEREC_ON_TEXT).strip(): 125 | self.zero.FACEREC_ON_VALUE = newValue 126 | 127 | def editDefaultValue(self): 128 | getMaxValueFOLLOW = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_OPTIONS, (self.zero.INSTA_MAX_FOLLOW_SCAN_TEXT, ))[0][1] 129 | getMaxValueFOLLOWBY = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_OPTIONS, (self.zero.INSTA_MAX_FOLLOW_BY_SCAN_TEXT, ))[0][1] 130 | getSurfaceScan = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_OPTIONS, (self.zero.SURFACE_SCAN_TEXT, ))[0][1] 131 | getDownload = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_OPTIONS, (self.zero.DOWNLOAD_PROFILE_INSTA_TEXT, ))[0][1] 132 | 133 | self.zero.printText("\n- Loading default values:", True) 134 | self.zero.printText("+ Max allowed follow: {}".format(getMaxValueFOLLOW), True) 135 | self.zero.printText("+ Max allowed follow by: {}".format(getMaxValueFOLLOWBY), True) 136 | self.zero.printText("+ Surface scan set to: {} (0 = OFF, 1 = ON)".format(getSurfaceScan), True) 137 | self.zero.printText("+ Download profile set to: {} (0 = OFF, 1 = ON)".format(getDownload), True) 138 | self.zero.printText("+ Detail print set to: {} (0 = OFF, 1 = ON)".format(self.zero.DETAIL_PRINT_VALUE), True) 139 | self.zero.printText("+ Face Recognition on download: {} (0 = OFF, 1 = ON)".format(self.zero.FACEREC_ON_VALUE), True) 140 | self.zero.printText("+ Mysql(1) - Sqlite(0): {}".format(self.zero.DB_MYSQL_ON), True) 141 | 142 | change = input("+ Change value? 
[y/N] ") 143 | 144 | if change.lower().strip() == "y": 145 | newMaxFollow = input("+ Max allowed follow: ") 146 | newMaxFollowBy = input("+ Max allowed followed by: ") 147 | newSurfaceScan = input("+ Surface scan on[1]/off[0]: ") 148 | newDetailPrint = input("+ Detail print on[1]/off[0]: ") 149 | newSavePhoto = input("+ Download profile on[1]/off[0]: ") 150 | newFace = input("+ Face recognition on download on[1]/off[0]: ") 151 | newMysql = input("+ Mysql[1] - Sqlite[0]: ") 152 | 153 | self.setDefValue(newMaxFollow, self.zero.INSTA_MAX_FOLLOW_SCAN_TEXT, self.zero.INSTA_MAX_FOLLOW_SCAN_VALUE, False, False) 154 | self.setDefValue(newMaxFollowBy, self.zero.INSTA_MAX_FOLLOW_BY_SCAN_TEXT, self.zero.INSTA_MAX_FOLLOW_BY_SCAN_VALUE, False, False) 155 | self.setDefValue(newSurfaceScan, self.zero.SURFACE_SCAN_TEXT, self.zero.SURFACE_SCAN_VALUE, True, False) 156 | self.setDefValue(newDetailPrint, self.zero.DETAIL_PRINT_TEXT, self.zero.DETAIL_PRINT_VALUE, True, True) 157 | self.setDefValue(newSavePhoto, self.zero.DOWNLOAD_PROFILE_INSTA_TEXT, self.zero.DOWNLOAD_PROFILE_INSTA_VALUE, True, False) 158 | self.setDefValue(newFace, self.zero.FACEREC_ON_TEXT, self.zero.FACEREC_ON_VALUE, True, False) 159 | else: 160 | self.zero.printText("+ Nothing changed.", True) 161 | 162 | def addLastInsta(self, update): 163 | lastInsta = input("+ Enter account to scrape: ") 164 | self.zero.printText("+ Adding {} to DB".format(lastInsta), True) 165 | if update == False: 166 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_INSERT_OPTIONS_LASTINSTA, (lastInsta, self.zero.LAST_INSTA_TEXT)) 167 | else: 168 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_LAST_INSTA, (lastInsta,)) 169 | 170 | if lastInsta == 0: 171 | #Add loop for user 172 | self.zero.printText("+ Not able to add to scraper", True) 173 | 174 | else: 175 | self.zero.INSTA_USER = lastInsta 176 | self.zero.printText("+ Scraper enabled for: {}".format(self.zero.INSTA_USER), True) 177 | 178 | def lastSearch(self): 179 | 
self.zero.printText("\n- Loading last scraper for Instagram from DB", True) 180 | lastInsta = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_OPTIONS, (self.zero.LAST_INSTA_TEXT, )) 181 | if lastInsta == 0: 182 | self.zero.printText("+ No last search for Instagram found", True) 183 | self.addLastInsta(False) 184 | 185 | else: 186 | self.zero.printText("+ Last user scraped: {}".format(lastInsta[0][1]), True) 187 | goon = input("+ Continue with user? [Y/n] ") 188 | 189 | if goon.lower().strip() == "n": 190 | self.addLastInsta(True) 191 | else: 192 | self.zero.INSTA_USER = lastInsta[0][1] 193 | 194 | def setupLogin(self): 195 | userList = self.runUserCheck() 196 | if userList != True: 197 | self.zero.printText("+ User list imported", True) 198 | count = 0 199 | for i in userList: 200 | count += 1 201 | self.zero.printText("[{}] {} ({}) (Last used: {})".format(count, i[0], i[3].strip(), i[6]), True) 202 | selectUser = input("+ Select user (1-{}): ".format(count)) 203 | 204 | if not selectUser.isnumeric(): 205 | self.zero.printText("+ Invalid input, #1 selected", True) 206 | selectUser = 1 207 | 208 | if int(selectUser) > count: 209 | self.zero.printText("+ Invalid input, #1 selected", True) 210 | selectUser = 1 211 | 212 | newNumber = int(selectUser) - 1 213 | self.setCurrentUserUpdate(userList[newNumber][0].strip(), userList[newNumber][1].strip()) 214 | 215 | def countCurrentUser(self): 216 | userList = self.dbTool.getValueSQLnoinput(self.dbConn, self.zero.DB_SELECT_LOGIN_INSTA) 217 | count = 0 218 | 219 | if userList != 0: 220 | for i in userList: 221 | count += 1 222 | 223 | self.zero.TOTAL_USER_COUNT = count 224 | 225 | def loadLoginText(self): 226 | currentTime = datetime.today() 227 | self.zero.printText("\n- Loading user and password from file", True) 228 | for file in self.zero.USER_FILES: 229 | fullpath = self.zero.OP_ROOT_FOLDER_PATH_VALUE + file[0] 230 | if os.path.isfile(fullpath): 231 | self.zero.printText("+ Found: {}, extracting 
data".format(fullpath), True) 232 | with open(fullpath) as fp: 233 | line = fp.readline() 234 | while line: 235 | newUser = line.strip().split(",") 236 | 237 | if len(newUser[0]) != 0: 238 | #Check if exists 239 | password = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_LOGIN_PASSWORD_INSTA , (newUser[0],)) 240 | 241 | if password != 0: 242 | self.zero.printText("+ User already exists: {}".format(newUser[0]), False) 243 | 244 | else: 245 | self.zero.printText("+ User NOT found adding user: {}. ".format(newUser[0]), False) 246 | INSERT_DATA = (newUser[0].strip(), newUser[1].strip(), newUser[2].strip(), newUser[3].strip(), newUser[4].strip(), currentTime) 247 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_INSERT_LOGIN_INSTA, INSERT_DATA) 248 | password = self.dbTool.getValueSQL(self.dbConn, self.zero.DB_SELECT_LOGIN_PASSWORD_INSTA , (newUser[0].strip(),)) 249 | 250 | if password == 0: 251 | self.zero.printText("+ Not able to add user", False) 252 | else: 253 | self.zero.printText("+ User add OK", False) 254 | line = fp.readline() 255 | else: 256 | self.zero.printText("+ File: {} does not exist".format(fullpath), False) 257 | -------------------------------------------------------------------------------- /optracker/igramscraper/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/igramscraper/__init__.py -------------------------------------------------------------------------------- /optracker/igramscraper/endpoints.py: -------------------------------------------------------------------------------- 1 | import urllib.parse 2 | import json 3 | 4 | USER_MEDIAS = '17880160963012870' 5 | USER_STORIES = '17890626976041463' 6 | STORIES = '17873473675158481' 7 | 8 | BASE_URL = 'https://www.instagram.com' 9 | LOGIN_URL = 'https://www.instagram.com/accounts/login/ajax/' 10 | ACCOUNT_PAGE = 
'https://www.instagram.com/%s' 11 | MEDIA_LINK = 'https://www.instagram.com/p/%s' 12 | ACCOUNT_MEDIAS = 'https://www.instagram.com/graphql/query/?query_hash=42323d64886122307be10013ad2dcc44&variables=%s' 13 | ACCOUNT_JSON_INFO = 'https://www.instagram.com/%s/?__a=1' 14 | MEDIA_JSON_INFO = 'https://www.instagram.com/p/%s/?__a=1' 15 | MEDIA_JSON_BY_LOCATION_ID = 'https://www.instagram.com/explore/locations/%s/?__a=1&max_id=%s' 16 | MEDIA_JSON_BY_TAG = 'https://www.instagram.com/explore/tags/%s/?__a=1&max_id=%s' 17 | GENERAL_SEARCH = 'https://www.instagram.com/web/search/topsearch/?query=%s' 18 | COMMENTS_BEFORE_COMMENT_ID_BY_CODE = 'https://www.instagram.com/graphql/query/?query_hash=97b41c52301f77ce508f55e66d17620e&variables=%s' 19 | LIKES_BY_SHORTCODE_OLD = 'https://www.instagram.com/graphql/query/?query_id=17864450716183058&variables={"shortcode":"%s","first":%s,"after":"%s"}' 20 | LIKES_BY_SHORTCODE = 'https://www.instagram.com/graphql/query/?query_hash=d5d763b1e2acf209d62d22d184488e57&variables=%s' 21 | FOLLOWING_URL_OLD = 'https://www.instagram.com/graphql/query/?query_id=17874545323001329&id={{accountId}}&first={{count}}&after={{after}}' 22 | FOLLOWING_URL = 'https://www.instagram.com/graphql/query/?query_hash=d04b0a864b4b54837c0d870b0e77e076&variables=%s' 23 | FOLLOWERS_URL_OLD = 'https://www.instagram.com/graphql/query/?query_id=17851374694183129&id={{accountId}}&first={{count}}&after={{after}}' 24 | FOLLOWERS_URL = 'https://www.instagram.com/graphql/query/?query_hash=c76146de99bb02f6415203be841dd25a&variables=%s' 25 | FOLLOW_URL = 'https://www.instagram.com/web/friendships/%s/follow/' 26 | UNFOLLOW_URL = 'https://www.instagram.com/web/friendships/%s/unfollow/' 27 | INSTAGRAM_CDN_URL = 'https://scontent.cdninstagram.com/' 28 | ACCOUNT_JSON_PRIVATE_INFO_BY_ID = 'https://i.instagram.com/api/v1/users/%s/info/' 29 | LIKE_URL = 'https://www.instagram.com/web/likes/%s/like/' 30 | UNLIKE_URL = 'https://www.instagram.com/web/likes/%s/unlike/' 31 | ADD_COMMENT_URL = 
'https://www.instagram.com/web/comments/%s/add/' 32 | DELETE_COMMENT_URL = 'https://www.instagram.com/web/comments/%s/delete/%s/' 33 | 34 | ACCOUNT_MEDIAS2 = 'https://www.instagram.com/graphql/query/?query_id=17880160963012870&id={{accountId}}&first=10&after=' 35 | 36 | GRAPH_QL_QUERY_URL = 'https://www.instagram.com/graphql/query/?query_id=%s' 37 | 38 | request_media_count = 30 39 | 40 | 41 | def get_account_page_link(username): 42 | return ACCOUNT_PAGE % urllib.parse.quote_plus(username) 43 | 44 | 45 | def get_account_json_link(username): 46 | return ACCOUNT_JSON_INFO % urllib.parse.quote_plus(username) 47 | 48 | 49 | def get_account_json_private_info_link_by_account_id(account_id): 50 | return ACCOUNT_JSON_PRIVATE_INFO_BY_ID % urllib.parse.quote_plus(str(account_id)) 51 | 52 | 53 | def get_account_medias_json_link(variables): 54 | return ACCOUNT_MEDIAS % urllib.parse.quote_plus(json.dumps(variables, separators=(',', ':'))) 55 | 56 | 57 | def get_media_page_link(code): 58 | return MEDIA_LINK % urllib.parse.quote_plus(code) 59 | 60 | 61 | def get_media_json_link(code): 62 | return MEDIA_JSON_INFO % urllib.parse.quote_plus(code) 63 | 64 | 65 | def get_medias_json_by_location_id_link(facebook_location_id, max_id=''): 66 | return MEDIA_JSON_BY_LOCATION_ID % (urllib.parse.quote_plus(str(facebook_location_id)), urllib.parse.quote_plus(max_id)) 67 | 68 | 69 | def get_medias_json_by_tag_link(tag, max_id=''): 70 | return MEDIA_JSON_BY_TAG % (urllib.parse.quote_plus(str(tag)), urllib.parse.quote_plus(str(max_id))) 71 | 72 | 73 | def get_general_search_json_link(query): 74 | return GENERAL_SEARCH % urllib.parse.quote_plus(query) 75 | 76 | 77 | def get_comments_before_comments_id_by_code(variables): 78 | return COMMENTS_BEFORE_COMMENT_ID_BY_CODE % urllib.parse.quote_plus(json.dumps(variables, separators=(',', ':'))) 79 | 80 | 81 | def get_last_likes_by_code_old(code, count, last_like_id): 82 | return LIKES_BY_SHORTCODE_OLD % (urllib.parse.quote_plus(code), 
urllib.parse.quote_plus(str(count)), urllib.parse.quote_plus(str(last_like_id))) 83 | 84 | 85 | def get_last_likes_by_code(variables): 86 | return LIKES_BY_SHORTCODE % urllib.parse.quote_plus(json.dumps(variables, separators=(',', ':'))) 87 | 88 | 89 | def get_follow_url(account_id): 90 | return FOLLOW_URL % urllib.parse.quote_plus(account_id) 91 | 92 | 93 | def get_unfollow_url(account_id): 94 | return UNFOLLOW_URL % urllib.parse.quote_plus(account_id) 95 | 96 | 97 | def get_followers_json_link_old(account_id, count, after=''): 98 | url = FOLLOWERS_URL_OLD.replace( 99 | '{{accountId}}', urllib.parse.quote_plus(account_id)) 100 | url = url.replace('{{count}}', urllib.parse.quote_plus(str(count))) 101 | 102 | if after == '': 103 | url = url.replace('&after={{after}}', '') 104 | else: 105 | url = url.replace('{{after}}', urllib.parse.quote_plus(str(after))) 106 | 107 | return url 108 | 109 | def get_followers_json_link(variables): 110 | return FOLLOWERS_URL % urllib.parse.quote_plus(json.dumps(variables, separators=(',', ':'))) 111 | 112 | 113 | def get_following_json_link_old(account_id, count, after=''): 114 | url = FOLLOWING_URL_OLD.replace( 115 | '{{accountId}}', urllib.parse.quote_plus(account_id)) 116 | url = url.replace('{{count}}', urllib.parse.quote_plus(count)) 117 | 118 | if after == '': 119 | url = url.replace('&after={{after}}', '') 120 | else: 121 | url = url.replace('{{after}}', urllib.parse.quote_plus(after)) 122 | 123 | return url 124 | 125 | def get_following_json_link(variables): 126 | return FOLLOWING_URL % urllib.parse.quote_plus(json.dumps(variables, separators=(',', ':'))) 127 | 128 | def get_user_stories_link(): 129 | return get_graph_ql_url(USER_STORIES, {'variables': json.dumps([], separators=(',', ':'))}) 130 | 131 | 132 | def get_graph_ql_url(query_id, parameters): 133 | url = GRAPH_QL_QUERY_URL % urllib.parse.quote_plus(query_id) 134 | 135 | if len(parameters) > 0: 136 | query_string = urllib.parse.urlencode(parameters) 137 | url += '&' + 
query_string 138 | 139 | return url 140 | 141 | 142 | def get_stories_link(variables): 143 | return get_graph_ql_url(STORIES, {'variables': json.dumps(variables, separators=(',', ':'))}) 144 | 145 | 146 | def get_like_url(media_id): 147 | return LIKE_URL % urllib.parse.quote_plus(str(media_id)) 148 | 149 | 150 | def get_unlike_url(media_id): 151 | return UNLIKE_URL % urllib.parse.quote_plus(str(media_id)) 152 | 153 | 154 | def get_add_comment_url(media_id): 155 | return ADD_COMMENT_URL % urllib.parse.quote_plus(str(media_id)) 156 | 157 | 158 | def get_delete_comment_url(media_id, comment_id): 159 | return DELETE_COMMENT_URL % (urllib.parse.quote_plus(str(media_id)), urllib.parse.quote_plus(str(comment_id))) 160 | 161 | -------------------------------------------------------------------------------- /optracker/igramscraper/exception/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/igramscraper/exception/__init__.py -------------------------------------------------------------------------------- /optracker/igramscraper/exception/instagram_auth_exception.py: -------------------------------------------------------------------------------- 1 | class InstagramAuthException(Exception): 2 | def __init__(self, message = "", code = 401): 3 | super().__init__(f'{message}, Code:{code}') -------------------------------------------------------------------------------- /optracker/igramscraper/exception/instagram_exception.py: -------------------------------------------------------------------------------- 1 | class InstagramException(Exception): 2 | StatusCode = -1 3 | 4 | def __init__(self, message="", code=500): 5 | super().__init__(f'{message}, Code:{code}') 6 | 7 | @staticmethod 8 | def default(response_text, status_code): 9 | StatusCode = status_code 10 | return InstagramException( 11 | 'Response code is {status_code}. 
Body: {response_text} ' 12 | 'Something went wrong. Please report issue.'.format( 13 | response_text=response_text, status_code=status_code), 14 | status_code) 15 | -------------------------------------------------------------------------------- /optracker/igramscraper/exception/instagram_not_found_exception.py: -------------------------------------------------------------------------------- 1 | class InstagramNotFoundException(Exception): 2 | def __init__(self, message="", code=404): 3 | super().__init__(f'{message}, Code:{code}') 4 | -------------------------------------------------------------------------------- /optracker/igramscraper/helper.py: -------------------------------------------------------------------------------- 1 | # -*- coding:utf-8 -*- 2 | 3 | from functools import reduce 4 | import signal 5 | 6 | 7 | def get_from_dict(data_dict, map_list, default=None): 8 | def getitem(source, key): 9 | try: 10 | if isinstance(source, list): 11 | return source[int(key)] 12 | if isinstance(source, dict) and key not in source.keys(): 13 | return default 14 | if not source: 15 | return default 16 | except IndexError: 17 | return default 18 | 19 | return source[key] 20 | 21 | if isinstance(map_list, str): 22 | map_list = map_list.split('.') 23 | 24 | return reduce(getitem, map_list, data_dict) 25 | 26 | 27 | def set_timeout(num, callback): 28 | """ 29 | A decorator to limit the method run time. 
30 | example: 31 | def after_timeout(): # callback function 32 | print("Time out!") 33 | @set_timeout(2, after_timeout) # 2s limited 34 | def connect(): 35 | time.sleep(3) 36 | print('Finished without timeout.') 37 | :param num: time limit in seconds, passed to signal.alarm 38 | :param callback: function called with no arguments when the time limit is exceeded 39 | :return: a decorator that wraps the target function 40 | """ 41 | def wrap(func): 42 | def handle(signum, frame): 43 | raise RuntimeError 44 | 45 | def to_do(*args, **kwargs): 46 | try: 47 | signal.signal(signal.SIGALRM, handle) 48 | signal.alarm(num) 49 | r = func(*args, **kwargs) 50 | signal.alarm(0) 51 | return r 52 | except RuntimeError: 53 | callback() 54 | return to_do 55 | return wrap -------------------------------------------------------------------------------- /optracker/igramscraper/instagram.py: -------------------------------------------------------------------------------- 1 | import time 2 | import requests 3 | import re 4 | import json 5 | import hashlib 6 | import os 7 | from slugify import slugify 8 | import random 9 | from .session_manager import CookieSessionManager 10 | from .exception.instagram_auth_exception import InstagramAuthException 11 | from .exception.instagram_exception import InstagramException 12 | from .exception.instagram_not_found_exception import InstagramNotFoundException 13 | from .model.account import Account 14 | from .model.comment import Comment 15 | from .model.location import Location 16 | from .model.media import Media 17 | from .model.story import Story 18 | from .model.user_stories import UserStories 19 | from .model.tag import Tag 20 | from . 
import endpoints 21 | from .two_step_verification.console_verification import ConsoleVerification 22 | 23 | class Instagram: 24 | HTTP_NOT_FOUND = 404 25 | HTTP_OK = 200 26 | HTTP_FORBIDDEN = 403 27 | HTTP_BAD_REQUEST = 400 28 | 29 | MAX_COMMENTS_PER_REQUEST = 300 30 | MAX_LIKES_PER_REQUEST = 50 31 | # 30 mins time limit on operations that require multiple self.__req 32 | PAGING_TIME_LIMIT_SEC = 1800 33 | PAGING_DELAY_MINIMUM_MICROSEC = 1000000 # 1 sec min delay to simulate browser 34 | PAGING_DELAY_MAXIMUM_MICROSEC = 3000000 # 3 sec max delay to simulate browser 35 | 36 | instance_cache = None 37 | 38 | def __init__(self, sleep_between_requests=0): 39 | self.__req = requests.session() 40 | self.paging_time_limit_sec = Instagram.PAGING_TIME_LIMIT_SEC 41 | self.paging_delay_minimum_microsec = Instagram.PAGING_DELAY_MINIMUM_MICROSEC 42 | self.paging_delay_maximum_microsec = Instagram.PAGING_DELAY_MAXIMUM_MICROSEC 43 | 44 | self.session_username = None 45 | self.session_password = None 46 | self.user_session = None 47 | self.rhx_gis = None 48 | self.sleep_between_requests = sleep_between_requests 49 | self.user_agent = 'Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X)' \ 50 | 'AppleWebKit/605.1.15 (KHTML, like Gecko)' \ 51 | 'Mobile/15E148 Instagram 105.0.0.11.118 (iPhone11,8; iOS 12_3_1; en_US; en-US; scale=2.00; 828x1792; 165586599)' 52 | 53 | def with_credentials(self, username, password, session_folder=None): 54 | """ 55 | param string username 56 | param string password 57 | param null sessionFolder 58 | 59 | return Instagram 60 | """ 61 | Instagram.instance_cache = None 62 | 63 | if not session_folder: 64 | cwd = os.getcwd() 65 | session_folder = cwd + os.path.sep + 'sessions' + os.path.sep 66 | 67 | if isinstance(session_folder, str): 68 | 69 | Instagram.instance_cache = CookieSessionManager( 70 | session_folder, slugify(username) + '.txt') 71 | 72 | else: 73 | Instagram.instance_cache = session_folder 74 | 75 | 
Instagram.instance_cache.empty_saved_cookies() 76 | 77 | 78 | self.session_username = username 79 | self.session_password = password 80 | 81 | def set_proxies(self, proxy): 82 | if proxy and isinstance(proxy, dict): 83 | self.__req.proxies = proxy 84 | 85 | def disable_verify(self): 86 | self.__req.verify = False 87 | 88 | def disable_proxies(self): 89 | self.__req.proxies = {} 90 | 91 | def get_user_agent(self): 92 | return self.user_agent 93 | 94 | def set_user_agent(self, user_agent): 95 | self.user_agent = user_agent 96 | 97 | @staticmethod 98 | def set_account_medias_request_count(count): 99 | """ 100 | Set how many media objects should be retrieved in a single request 101 | param int count 102 | """ 103 | endpoints.request_media_count = count 104 | 105 | def get_account_by_id(self, id): 106 | """ 107 | :param id: account id 108 | :return: Account 109 | """ 110 | username = self.get_username_by_id(id) 111 | return self.get_account(username) 112 | 113 | def get_username_by_id(self, id): 114 | """ 115 | :param id: account id 116 | :return: username string from response 117 | """ 118 | time.sleep(self.sleep_between_requests) 119 | response = self.__req.get( 120 | endpoints.get_account_json_private_info_link_by_account_id( 121 | id), headers=self.generate_headers(self.user_session)) 122 | 123 | if Instagram.HTTP_NOT_FOUND == response.status_code: 124 | raise InstagramNotFoundException( 125 | 'Failed to fetch account with given id') 126 | 127 | if Instagram.HTTP_OK != response.status_code: 128 | raise InstagramException.default(response.text, 129 | response.status_code) 130 | 131 | json_response = response.json() 132 | if not json_response: 133 | raise InstagramException('Response is not JSON') 134 | 135 | if json_response['status'] != 'ok': 136 | message = json_response['message'] if ( 137 | 'message' in json_response.keys()) else 'Unknown Error' 138 | raise InstagramException(message) 139 | 140 | return json_response['user']['username'] 141 | 142 | def 
generate_headers(self, session, gis_token=None): 143 | """ 144 | :param session: user session dict 145 | :param gis_token: a token used to be verified by instagram in header 146 | :return: header dict 147 | """ 148 | headers = {} 149 | if session is not None: 150 | cookies = '' 151 | 152 | for key in session.keys(): 153 | cookies += f"{key}={session[key]}; " 154 | 155 | csrf = session['x-csrftoken'] if session['csrftoken'] is None else \ 156 | session['csrftoken'] 157 | 158 | headers = { 159 | 'cookie': cookies, 160 | 'referer': endpoints.BASE_URL + '/', 161 | 'x-csrftoken': csrf 162 | } 163 | 164 | if self.user_agent is not None: 165 | headers['user-agent'] = self.user_agent 166 | 167 | if gis_token is not None: 168 | headers['x-instagram-gis'] = gis_token 169 | 170 | return headers 171 | 172 | def __generate_gis_token(self, variables): 173 | """ 174 | :param variables: a dict used to generate_gis_token 175 | :return: a token used to be verified by instagram 176 | """ 177 | rhx_gis = self.__get_rhx_gis() if self.__get_rhx_gis() is not None else 'NULL' 178 | string_to_hash = ':'.join([rhx_gis, json.dumps(variables, separators=(',', ':')) if isinstance(variables, dict) else variables]) 179 | return hashlib.md5(string_to_hash.encode('utf-8')).hexdigest() 180 | 181 | def __get_rhx_gis(self): 182 | """ 183 | :return: a string to generate gis_token 184 | """ 185 | if self.rhx_gis is None: 186 | try: 187 | shared_data = self.__get_shared_data_from_page() 188 | except Exception as _: 189 | raise InstagramException('Could not extract gis from page') 190 | 191 | if 'rhx_gis' in shared_data.keys(): 192 | self.rhx_gis = shared_data['rhx_gis'] 193 | else: 194 | self.rhx_gis = None 195 | 196 | return self.rhx_gis 197 | 198 | def __get_mid(self): 199 | """manually fetches the machine id from graphQL""" 200 | time.sleep(self.sleep_between_requests) 201 | response = self.__req.get('https://www.instagram.com/web/__mid/') 202 | 203 | if response.status_code != Instagram.HTTP_OK: 204 
| raise InstagramException.default(response.text, 205 | response.status_code) 206 | 207 | return response.text 208 | 209 | def __get_shared_data_from_page(self, url=endpoints.BASE_URL): 210 | """ 211 | :param url: the requested url 212 | :return: a dict extract from page 213 | """ 214 | url = url.rstrip('/') + '/' 215 | time.sleep(self.sleep_between_requests) 216 | response = self.__req.get(url, headers=self.generate_headers( 217 | self.user_session)) 218 | 219 | if Instagram.HTTP_NOT_FOUND == response.status_code: 220 | raise InstagramNotFoundException(f"Page {url} not found") 221 | 222 | if not Instagram.HTTP_OK == response.status_code: 223 | raise InstagramException.default(response.text, 224 | response.status_code) 225 | 226 | return Instagram.extract_shared_data_from_body(response.text) 227 | 228 | @staticmethod 229 | def extract_shared_data_from_body(body): 230 | """ 231 | :param body: html string from a page 232 | :return: a dict extract from page 233 | """ 234 | array = re.findall(r'_sharedData = .*?;', body) 235 | if len(array) > 0: 236 | raw_json = array[0][len("_sharedData ="):-len(";")] 237 | 238 | return json.loads(raw_json) 239 | 240 | return None 241 | 242 | def search_tags_by_tag_name(self, tag): 243 | """ 244 | :param tag: tag string 245 | :return: list of Tag 246 | """ 247 | # TODO: Add tests and auth 248 | time.sleep(self.sleep_between_requests) 249 | response = self.__req.get(endpoints.get_general_search_json_link(tag)) 250 | 251 | if Instagram.HTTP_NOT_FOUND == response.status_code: 252 | raise InstagramNotFoundException( 253 | 'Account with given username does not exist.') 254 | 255 | if not Instagram.HTTP_OK == response.status_code: 256 | raise InstagramException.default(response.text, 257 | response.status_code) 258 | 259 | json_response = response.json() 260 | 261 | try: 262 | status = json_response['status'] 263 | if status != 'ok': 264 | raise InstagramException( 265 | 'Response code is not equal 200. ' 266 | 'Something went wrong. 
Please report issue.') 267 | except KeyError: 268 | raise InstagramException('Response code is not equal 200. Something went wrong. Please report issue.') 269 | 270 | try: 271 | hashtags_raw = json_response['hashtags'] 272 | if len(hashtags_raw) == 0: 273 | return [] 274 | except KeyError: 275 | return [] 276 | 277 | hashtags = [] 278 | for json_hashtag in hashtags_raw: 279 | hashtags.append(Tag(json_hashtag['hashtag'])) 280 | 281 | return hashtags 282 | 283 | def get_medias(self, username, count=20, maxId=''): 284 | """ 285 | :param username: instagram username 286 | :param count: the number of how many media you want to get 287 | :param maxId: used to paginate 288 | :return: list of Media 289 | """ 290 | account = self.get_account(username) 291 | return self.get_medias_by_user_id(account.identifier, count, maxId) 292 | 293 | def get_medias_by_code(self, media_code): 294 | """ 295 | :param media_code: media code 296 | :return: Media 297 | """ 298 | url = endpoints.get_media_page_link(media_code) 299 | return self.get_media_by_url(url) 300 | 301 | def get_medias_by_user_id(self, id, count=12, max_id=''): 302 | """ 303 | :param id: instagram account id 304 | :param count: the number of how many media you want to get 305 | :param max_id: used to paginate 306 | :return: list of Media 307 | """ 308 | index = 0 309 | medias = [] 310 | is_more_available = True 311 | 312 | while index < count and is_more_available: 313 | 314 | variables = { 315 | 'id': str(id), 316 | 'first': str(count), 317 | 'after': str(max_id) 318 | } 319 | 320 | headers = self.generate_headers(self.user_session, 321 | self.__generate_gis_token( 322 | variables)) 323 | 324 | time.sleep(self.sleep_between_requests) 325 | response = self.__req.get( 326 | endpoints.get_account_medias_json_link(variables), 327 | headers=headers) 328 | 329 | if not Instagram.HTTP_OK == response.status_code: 330 | raise InstagramException.default(response.text, 331 | response.status_code) 332 | 333 | arr = 
json.loads(response.text) 334 | 335 | try: 336 | nodes = arr['data']['user']['edge_owner_to_timeline_media'][ 337 | 'edges'] 338 | except KeyError: 339 | return medias # keep the documented return type (list of Media) 340 | 341 | for mediaArray in nodes: 342 | if index == count: 343 | return medias 344 | 345 | media = Media(mediaArray['node']) 346 | medias.append(media) 347 | index += 1 348 | 349 | if not nodes: 350 | return medias 351 | 352 | max_id = \ 353 | arr['data']['user']['edge_owner_to_timeline_media'][ 354 | 'page_info'][ 355 | 'end_cursor'] 356 | is_more_available = \ 357 | arr['data']['user']['edge_owner_to_timeline_media'][ 358 | 'page_info'][ 359 | 'has_next_page'] 360 | 361 | return medias 362 | 363 | def get_media_by_id(self, media_id): 364 | """ 365 | :param media_id: media id 366 | :return: Media 367 | """ 368 | media_link = Media.get_link_from_id(media_id) 369 | return self.get_media_by_url(media_link) 370 | 371 | def get_media_by_url(self, media_url): 372 | """ 373 | :param media_url: media url 374 | :return: Media 375 | """ 376 | url_regex = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+' 377 | 378 | if len(re.findall(url_regex, media_url)) <= 0: 379 | raise ValueError('Malformed media url') 380 | 381 | url = media_url.rstrip('/') + '/?__a=1' 382 | time.sleep(self.sleep_between_requests) 383 | response = self.__req.get(url, headers=self.generate_headers( 384 | self.user_session)) 385 | 386 | if Instagram.HTTP_NOT_FOUND == response.status_code: 387 | raise InstagramNotFoundException( 388 | 'Media with given code does not exist or account is private.') 389 | 390 | if Instagram.HTTP_OK != response.status_code: 391 | raise InstagramException.default(response.text, 392 | response.status_code) 393 | 394 | media_array = response.json() 395 | try: 396 | media_in_json = media_array['graphql']['shortcode_media'] 397 | except KeyError: 398 | raise InstagramException('Media with this code does not exist') 399 | 400 | return Media(media_in_json) 401 | 
402 | def get_medias_from_feed(self, username, count=20): 403 | """ 404 | :param username: instagram username 405 | :param count: the number of how many media you want to get 406 | :return: list of Media 407 | """ 408 | medias = [] 409 | index = 0 410 | time.sleep(self.sleep_between_requests) 411 | response = self.__req.get(endpoints.get_account_json_link(username), 412 | headers=self.generate_headers( 413 | self.user_session)) 414 | 415 | if Instagram.HTTP_NOT_FOUND == response.status_code: 416 | raise InstagramNotFoundException( 417 | 'Account with given username does not exist.') 418 | 419 | if Instagram.HTTP_OK != response.status_code: 420 | raise InstagramException.default(response.text, 421 | response.status_code) 422 | 423 | user_array = response.json() 424 | 425 | try: 426 | user = user_array['graphql']['user'] 427 | except KeyError: 428 | raise InstagramNotFoundException( 429 | 'Account with this username does not exist') 430 | 431 | try: 432 | nodes = user['edge_owner_to_timeline_media']['edges'] 433 | if len(nodes) == 0: 434 | return [] 435 | except Exception: 436 | return [] 437 | 438 | for media_array in nodes: 439 | if index == count: 440 | return medias 441 | medias.append(Media(media_array['node'])) 442 | index += 1 443 | 444 | return medias 445 | 446 | def get_medias_by_tag(self, tag, count=12, max_id='', min_timestamp=None): 447 | """ 448 | :param tag: tag string 449 | :param count: the number of how many media you want to get 450 | :param max_id: used to paginate 451 | :param min_timestamp: limit the time you want to start from 452 | :return: list of Media 453 | """ 454 | index = 0 455 | medias = [] 456 | media_ids = [] 457 | has_next_page = True 458 | while index < count and has_next_page: 459 | 460 | time.sleep(self.sleep_between_requests) 461 | response = self.__req.get( 462 | endpoints.get_medias_json_by_tag_link(tag, max_id), 463 | headers=self.generate_headers(self.user_session)) 464 | 465 | if response.status_code != Instagram.HTTP_OK: 466 
| raise InstagramException.default(response.text, 467 | response.status_code) 468 | 469 | arr = response.json() 470 | 471 | try: 472 | arr['graphql']['hashtag']['edge_hashtag_to_media']['count'] 473 | except KeyError: 474 | return [] 475 | 476 | nodes = arr['graphql']['hashtag']['edge_hashtag_to_media']['edges'] 477 | for media_array in nodes: 478 | if index == count: 479 | return medias 480 | media = Media(media_array['node']) 481 | if media.identifier in media_ids: 482 | return medias 483 | 484 | if min_timestamp is not None \ 485 | and media.created_time < min_timestamp: 486 | return medias 487 | 488 | media_ids.append(media.identifier) 489 | medias.append(media) 490 | index += 1 491 | 492 | if len(nodes) == 0: 493 | return medias 494 | 495 | max_id = \ 496 | arr['graphql']['hashtag']['edge_hashtag_to_media']['page_info'][ 497 | 'end_cursor'] 498 | has_next_page = \ 499 | arr['graphql']['hashtag']['edge_hashtag_to_media']['page_info'][ 500 | 'has_next_page'] 501 | 502 | return medias 503 | 504 | def get_medias_by_location_id(self, facebook_location_id, count=24, 505 | max_id=''): 506 | """ 507 | :param facebook_location_id: facebook location id 508 | :param count: the number of how many media you want to get 509 | :param max_id: used to paginate 510 | :return: list of Media 511 | """ 512 | index = 0 513 | medias = [] 514 | has_next_page = True 515 | 516 | while index < count and has_next_page: 517 | 518 | time.sleep(self.sleep_between_requests) 519 | response = self.__req.get( 520 | endpoints.get_medias_json_by_location_id_link( 521 | facebook_location_id, max_id), 522 | headers=self.generate_headers(self.user_session)) 523 | 524 | if response.status_code != Instagram.HTTP_OK: 525 | raise InstagramException.default(response.text, 526 | response.status_code) 527 | 528 | arr = response.json() 529 | 530 | nodes = arr['graphql']['location']['edge_location_to_media'][ 531 | 'edges'] 532 | for media_array in nodes: 533 | if index == count: 534 | return medias 535 | 
536 | medias.append(Media(media_array['node'])) 537 | index += 1 538 | 539 | if len(nodes) == 0: 540 | return medias 541 | 542 | has_next_page = \ 543 | arr['graphql']['location']['edge_location_to_media'][ 544 | 'page_info'][ 545 | 'has_next_page'] 546 | max_id = \ 547 | arr['graphql']['location']['edge_location_to_media'][ 548 | 'page_info'][ 549 | 'end_cursor'] 550 | 551 | return medias 552 | 553 | def get_current_top_medias_by_tag_name(self, tag_name): 554 | """ 555 | :param tag_name: tag string 556 | :return: list of the top Media 557 | """ 558 | time.sleep(self.sleep_between_requests) 559 | response = self.__req.get( 560 | endpoints.get_medias_json_by_tag_link(tag_name, ''), 561 | headers=self.generate_headers(self.user_session)) 562 | 563 | if response.status_code == Instagram.HTTP_NOT_FOUND: 564 | raise InstagramNotFoundException( 565 | 'Tag with given name does not exist.') 566 | 567 | if response.status_code != Instagram.HTTP_OK: 568 | raise InstagramException.default(response.text, 569 | response.status_code) 570 | 571 | json_response = response.json() 572 | medias = [] 573 | 574 | nodes = \ 575 | json_response['graphql']['hashtag']['edge_hashtag_to_top_posts'][ 576 | 'edges'] 577 | 578 | for media_array in nodes: 579 | medias.append(Media(media_array['node'])) 580 | 581 | return medias 582 | 583 | def get_current_top_medias_by_location_id(self, facebook_location_id): 584 | """ 585 | :param facebook_location_id: facebook location id 586 | :return: list of the top Media 587 | """ 588 | time.sleep(self.sleep_between_requests) 589 | response = self.__req.get( 590 | endpoints.get_medias_json_by_location_id_link(facebook_location_id), 591 | headers=self.generate_headers(self.user_session)) 592 | if response.status_code == Instagram.HTTP_NOT_FOUND: 593 | raise InstagramNotFoundException( 594 | "Location with this id doesn't exist") 595 | 596 | if response.status_code != Instagram.HTTP_OK: 597 | raise InstagramException.default(response.text, 598 | 
response.status_code) 599 | 600 | json_response = response.json() 601 | 602 | nodes = \ 603 | json_response['graphql']['location']['edge_location_to_top_posts'][ 604 | 'edges'] 605 | medias = [] 606 | 607 | for media_array in nodes: 608 | medias.append(Media(media_array['node'])) 609 | 610 | return medias 611 | 612 | def get_paginate_medias(self, username, max_id=''): 613 | """ 614 | :param username: instagram user name 615 | :param max_id: used to paginate next time 616 | :return: dict that contains Media list, maxId, hasNextPage 617 | """ 618 | account = self.get_account(username) 619 | has_next_page = True 620 | medias = [] 621 | 622 | to_return = { 623 | 'medias': medias, 624 | 'maxId': max_id, 625 | 'hasNextPage': has_next_page, 626 | } 627 | 628 | variables = json.dumps({ 629 | 'id': str(account.identifier), 630 | 'first': str(endpoints.request_media_count), 631 | 'after': str(max_id) 632 | }, separators=(',', ':')) 633 | 634 | time.sleep(self.sleep_between_requests) 635 | response = self.__req.get( 636 | endpoints.get_account_medias_json_link(variables), 637 | headers=self.generate_headers(self.user_session, 638 | self.__generate_gis_token(variables)) 639 | ) 640 | 641 | if not Instagram.HTTP_OK == response.status_code: 642 | raise InstagramException.default(response.text, 643 | response.status_code) 644 | 645 | arr = response.json() 646 | 647 | try: 648 | nodes = arr['data']['user']['edge_owner_to_timeline_media']['edges'] 649 | except KeyError: 650 | return to_return 651 | 652 | for mediaArray in nodes: 653 | medias.append(Media(mediaArray['node'])) 654 | 655 | max_id = \ 656 | arr['data']['user']['edge_owner_to_timeline_media']['page_info'][ 657 | 'end_cursor'] 658 | has_next_page = \ 659 | arr['data']['user']['edge_owner_to_timeline_media']['page_info'][ 660 | 'has_next_page'] 661 | 662 | to_return = { 663 | 'medias': medias, 664 | 'maxId': max_id, 665 | 'hasNextPage': has_next_page, 666 | } 667 | 668 | return to_return 669 | 670 | def 
get_paginate_medias_by_tag(self, tag, max_id=''): 671 | """ 672 | :param tag: tag name 673 | :param max_id: used to paginate next time 674 | :return: dict that contains Media list, maxId, hasNextPage 675 | """ 676 | has_next_page = True 677 | medias = [] 678 | 679 | to_return = { 680 | 'medias': medias, 681 | 'maxId': max_id, 682 | 'hasNextPage': has_next_page, 683 | } 684 | 685 | time.sleep(self.sleep_between_requests) 686 | response = self.__req.get( 687 | endpoints.get_medias_json_by_tag_link(tag, max_id), 688 | headers=self.generate_headers(self.user_session)) 689 | 690 | if response.status_code != Instagram.HTTP_OK: 691 | raise InstagramException.default(response.text, 692 | response.status_code) 693 | 694 | arr = response.json() 695 | 696 | try: 697 | nodes = arr['graphql']['hashtag']['edge_hashtag_to_media']['edges'] 698 | except KeyError: 699 | return to_return 700 | 701 | for media_array in nodes: 702 | medias.append(Media(media_array['node'])) 703 | 704 | max_id = \ 705 | arr['graphql']['hashtag']['edge_hashtag_to_media']['page_info'][ 706 | 'end_cursor'] 707 | has_next_page = \ 708 | arr['graphql']['hashtag']['edge_hashtag_to_media']['page_info'][ 709 | 'has_next_page'] 710 | try: 711 | media_count = arr['graphql']['hashtag']['edge_hashtag_to_media'][ 712 | 'count'] 713 | except KeyError: 714 | return to_return 715 | 716 | to_return = { 717 | 'medias': medias, 718 | 'count': media_count, 719 | 'maxId': max_id, 720 | 'hasNextPage': has_next_page, 721 | } 722 | 723 | return to_return 724 | 725 | def get_location_by_id(self, facebook_location_id): 726 | """ 727 | :param facebook_location_id: facebook location id 728 | :return: Location 729 | """ 730 | time.sleep(self.sleep_between_requests) 731 | response = self.__req.get( 732 | endpoints.get_medias_json_by_location_id_link(facebook_location_id), 733 | headers=self.generate_headers(self.user_session)) 734 | 735 | if response.status_code == Instagram.HTTP_NOT_FOUND: 736 | raise InstagramNotFoundException( 
737 | 'Location with this id doesn\'t exist') 738 | 739 | if response.status_code != Instagram.HTTP_OK: 740 | raise InstagramException.default(response.text, 741 | response.status_code) 742 | 743 | json_response = response.json() 744 | 745 | return Location(json_response['graphql']['location']) 746 | 747 | def get_media_likes_by_code(self, code, count=10, max_id=None): 748 | """ 749 | :param code: media short code 750 | :param count: the number of likes you want to get 751 | :param max_id: used to paginate 752 | :return: dict with the 'next_page' cursor and the 'accounts' list 753 | """ 754 | 755 | remain = count 756 | likes = [] 757 | index = 0 758 | has_previous = True 759 | 760 | # TODO: index < count (bug: index gets too high because Instagram sometimes changes the max likes per request) 761 | 762 | while (has_previous and index < count): 763 | if (remain > self.MAX_LIKES_PER_REQUEST): 764 | number_of_likes_to_receive = self.MAX_LIKES_PER_REQUEST 765 | remain -= self.MAX_LIKES_PER_REQUEST 766 | index += self.MAX_LIKES_PER_REQUEST 767 | else: 768 | number_of_likes_to_receive = remain 769 | index += remain 770 | remain = 0 771 | 772 | if max_id is None: # only default a missing cursor; never reset a real one 773 | max_id = '' 774 | 775 | variables = { 776 | "shortcode": str(code), 777 | "first": str(number_of_likes_to_receive), 778 | "after": '' if not max_id else max_id 779 | } 780 | 781 | time.sleep(self.sleep_between_requests) 782 | 783 | response = self.__req.get( 784 | endpoints.get_last_likes_by_code(variables), 785 | headers=self.generate_headers(self.user_session)) 786 | 787 | if not response.status_code == Instagram.HTTP_OK: 788 | raise InstagramException.default(response.text,response.status_code) 789 | 790 | jsonResponse = response.json() 791 | 792 | nodes = jsonResponse['data']['shortcode_media']['edge_liked_by']['edges'] 793 | 794 | for likesArray in nodes: 795 | 796 | like = Account(likesArray['node']) 797 | likes.append(like) 798 | 799 | 800 | has_previous = jsonResponse['data']['shortcode_media']['edge_liked_by']['page_info']['has_next_page'] 801 | number_of_likes = 
jsonResponse['data']['shortcode_media']['edge_liked_by']['count'] 802 | if count > number_of_likes: 803 | count = number_of_likes 804 | 805 | if len(nodes) == 0: 806 | data = {} 807 | data['next_page'] = max_id 808 | data['accounts'] = likes 809 | 810 | return data 811 | 812 | max_id = jsonResponse['data']['shortcode_media']['edge_liked_by']['page_info']['end_cursor'] 813 | 814 | data = {} 815 | data['next_page'] = max_id 816 | data['accounts'] = likes 817 | 818 | return data 819 | 820 | def get_followers(self, account_id, count=20, page_size=20, end_cursor='', 821 | delayed=True): 822 | 823 | """ 824 | :param account_id: instagram account id 825 | :param count: the number of followers you want to get 826 | :param page_size: page size; count must not be smaller than this 827 | :param end_cursor: cursor to start paginating from 828 | :param delayed: whether to wait a random interval between requests 829 | :return: dict with the 'next_page' cursor and the 'accounts' list 830 | """ 831 | # TODO set time limit 832 | # if ($delayed) { 833 | # set_time_limit($this->pagingTimeLimitSec); 834 | # } 835 | 836 | index = 0 837 | accounts = [] 838 | 839 | next_page = end_cursor 840 | 841 | if count < page_size: 842 | raise InstagramException( 843 | 'Count must be greater than or equal to page size.') 844 | 845 | while True: 846 | time.sleep(self.sleep_between_requests) 847 | 848 | variables = { 849 | 'id': str(account_id), 850 | 'first': str(count), 851 | 'after': next_page 852 | } 853 | 854 | headers = self.generate_headers(self.user_session) 855 | 856 | response = self.__req.get( 857 | endpoints.get_followers_json_link(variables), 858 | headers=headers) 859 | 860 | if not response.status_code == Instagram.HTTP_OK: 861 | raise InstagramException.default(response.text, 862 | response.status_code) 863 | 864 | jsonResponse = response.json() 865 | 866 | if jsonResponse['data']['user']['edge_followed_by']['count'] == 0: 867 | return accounts 868 | 869 | edgesArray = jsonResponse['data']['user']['edge_followed_by'][ 870 | 'edges'] 871 | if len(edgesArray) == 0: 872 | raise InstagramException( 873 | f'Failed to get followers of account id {account_id}.' 
874 | f' The account is private.', 875 | Instagram.HTTP_FORBIDDEN) 876 | 877 | pageInfo = jsonResponse['data']['user']['edge_followed_by'][ 878 | 'page_info'] 879 | if pageInfo['has_next_page']: 880 | next_page = pageInfo['end_cursor'] 881 | 882 | for edge in edgesArray: 883 | 884 | accounts.append(Account(edge['node'])) 885 | index += 1 886 | 887 | if index >= count: 888 | # Python has no "break 2"; return from inside the loop (duplicates the code below) 889 | data = {} 890 | data['next_page'] = next_page 891 | data['accounts'] = accounts 892 | 893 | return data 894 | 895 | # must stay below the loop above 896 | if not pageInfo['has_next_page']: 897 | break 898 | 899 | if delayed: 900 | # Random wait between 1 and 3 sec to mimic browser 901 | delay_sec = random.uniform(1.0, 3.0) 902 | time.sleep(delay_sec) 903 | 904 | data = {} 905 | data['next_page'] = next_page 906 | data['accounts'] = accounts 907 | 908 | return data 909 | 910 | def get_following(self, account_id, count=20, page_size=20, end_cursor='', 911 | delayed=True): 912 | """ 913 | :param account_id: instagram account id 914 | :param count: the number of followed accounts you want to get 915 | :param page_size: page size; count must not be smaller than this 916 | :param end_cursor: cursor to start paginating from 917 | :param delayed: whether to wait a random interval between requests 918 | :return: dict with the 'next_page' cursor and the 'accounts' list 919 | """ 920 | 921 | # TODO set time limit 922 | # if ($delayed) { 923 | # set_time_limit($this->pagingTimeLimitSec); 924 | # } 925 | 926 | index = 0 927 | accounts = [] 928 | 929 | next_page = end_cursor 930 | 931 | if count < page_size: 932 | raise InstagramException('Count must be greater than or equal to page size.') 933 | 934 | while True: 935 | 936 | variables = { 937 | 'id': str(account_id), 938 | 'first': str(count), 939 | 'after': next_page 940 | } 941 | 942 | headers = self.generate_headers(self.user_session) 943 | 944 | 945 | response = self.__req.get( 946 | endpoints.get_following_json_link(variables), 947 | headers=headers) 948 | 949 | if not response.status_code == Instagram.HTTP_OK: 950 | raise InstagramException.default(response.text,response.status_code) 951 | 952 | jsonResponse = response.json() 953 | if 
jsonResponse['data']['user']['edge_follow']['count'] == 0: 954 | return accounts 955 | 956 | edgesArray = jsonResponse['data']['user']['edge_follow'][ 957 | 'edges'] 958 | 959 | if len(edgesArray) == 0: 960 | raise InstagramException( 961 | f'Failed to get follows of account id {account_id}.' 962 | f' The account is private.', 963 | Instagram.HTTP_FORBIDDEN) 964 | 965 | pageInfo = jsonResponse['data']['user']['edge_follow']['page_info'] 966 | if pageInfo['has_next_page']: 967 | next_page = pageInfo['end_cursor'] 968 | 969 | for edge in edgesArray: 970 | accounts.append(Account(edge['node'])) 971 | index += 1 972 | if index >= count: 973 | # Python has no "break 2"; return from inside the loop (duplicates the code below) 974 | data = {} 975 | data['next_page'] = next_page 976 | data['accounts'] = accounts 977 | 978 | return data 979 | 980 | # must stay below the loop above 981 | if not pageInfo['has_next_page']: 982 | break 983 | 984 | if delayed: 985 | # Random wait between 1 and 3 sec to mimic browser 986 | delay_sec = random.uniform(1.0, 3.0) 987 | time.sleep(delay_sec) 988 | 989 | data = {} 990 | data['next_page'] = next_page 991 | data['accounts'] = accounts 992 | 993 | return data 994 | 995 | def get_media_comments_by_id(self, media_id, count=10, max_id=None): 996 | """ 997 | :param media_id: media id 998 | :param count: the number of comments you want to get 999 | :param max_id: used to paginate 1000 | :return: dict with the 'next_page' cursor and the 'comments' list 1001 | """ 1002 | code = Media.get_code_from_id(media_id) 1003 | return self.get_media_comments_by_code(code, count, max_id) 1004 | 1005 | def get_media_comments_by_code(self, code, count=10, max_id=''): 1006 | 1007 | """ 1008 | :param code: media code 1009 | :param count: the number of comments you want to get 1010 | :param max_id: used to paginate 1011 | :return: dict with the 'next_page' cursor and the 'comments' list 1012 | """ 1013 | 1014 | comments = [] 1015 | index = 0 1016 | has_previous = True 1017 | 1018 | while has_previous and index < count: 1019 | 
number_of_comments_to_receive = 0 1020 | if count - index > Instagram.MAX_COMMENTS_PER_REQUEST: 1021 | number_of_comments_to_receive = Instagram.MAX_COMMENTS_PER_REQUEST 1022 | else: 1023 | number_of_comments_to_receive = count - index 1024 | 1025 | variables = { 1026 | "shortcode": str(code), 1027 | "first": str(number_of_comments_to_receive), 1028 | "after": '' if not max_id else max_id 1029 | } 1030 | 1031 | comments_url = endpoints.get_comments_before_comments_id_by_code( 1032 | variables) 1033 | 1034 | time.sleep(self.sleep_between_requests) 1035 | response = self.__req.get(comments_url, 1036 | headers=self.generate_headers( 1037 | self.user_session, 1038 | self.__generate_gis_token(variables))) 1039 | 1040 | if not response.status_code == Instagram.HTTP_OK: 1041 | raise InstagramException.default(response.text, 1042 | response.status_code) 1043 | 1044 | jsonResponse = response.json() 1045 | 1046 | nodes = jsonResponse['data']['shortcode_media']['edge_media_to_parent_comment']['edges'] 1047 | 1048 | for commentArray in nodes: 1049 | comment = Comment(commentArray['node']) 1050 | comments.append(comment) 1051 | index += 1 1052 | 1053 | has_previous = jsonResponse['data']['shortcode_media']['edge_media_to_parent_comment']['page_info']['has_next_page'] 1054 | 1055 | number_of_comments = jsonResponse['data']['shortcode_media']['edge_media_to_parent_comment']['count'] 1056 | if count > number_of_comments: 1057 | count = number_of_comments 1058 | 1059 | max_id = jsonResponse['data']['shortcode_media']['edge_media_to_parent_comment']['page_info']['end_cursor'] 1060 | 1061 | if len(nodes) == 0: 1062 | break 1063 | 1064 | 1065 | data = {} 1066 | data['next_page'] = max_id 1067 | data['comments'] = comments 1068 | return data 1069 | 1070 | def get_account(self, username): 1071 | """ 1072 | :param username: username 1073 | :return: Account 1074 | """ 1075 | time.sleep(self.sleep_between_requests) 1076 | response = self.__req.get(endpoints.get_account_page_link( 1077 | 
username), headers=self.generate_headers(self.user_session)) 1078 | 1079 | if Instagram.HTTP_NOT_FOUND == response.status_code: 1080 | raise InstagramNotFoundException( 1081 | 'Account with given username does not exist.') 1082 | 1083 | if Instagram.HTTP_OK != response.status_code: 1084 | raise InstagramException.default(response.text, 1085 | response.status_code) 1086 | 1087 | user_array = Instagram.extract_shared_data_from_body(response.text) 1088 | 1089 | if user_array['entry_data']['ProfilePage'][0]['graphql']['user'] is None: 1090 | raise InstagramNotFoundException( 1091 | 'Account with this username does not exist') 1092 | 1093 | return Account( 1094 | user_array['entry_data']['ProfilePage'][0]['graphql']['user']) 1095 | 1096 | def get_stories(self, reel_ids=None): 1097 | """ 1098 | :param reel_ids: reel ids 1099 | :return: UserStories List 1100 | """ 1101 | variables = {'precomposed_overlay': False, 'reel_ids': []} 1102 | 1103 | if reel_ids is None or len(reel_ids) == 0: 1104 | time.sleep(self.sleep_between_requests) 1105 | response = self.__req.get(endpoints.get_user_stories_link(), 1106 | headers=self.generate_headers( 1107 | self.user_session)) 1108 | 1109 | if not Instagram.HTTP_OK == response.status_code: 1110 | raise InstagramException.default(response.text, 1111 | response.status_code) 1112 | 1113 | json_response = response.json() 1114 | 1115 | try: 1116 | edges = json_response['data']['user']['feed_reels_tray'][ 1117 | 'edge_reels_tray_to_reel']['edges'] 1118 | except KeyError: 1119 | return [] 1120 | 1121 | for edge in edges: 1122 | variables['reel_ids'].append(edge['node']['id']) 1123 | 1124 | else: 1125 | variables['reel_ids'] = reel_ids 1126 | 1127 | time.sleep(self.sleep_between_requests) 1128 | response = self.__req.get(endpoints.get_stories_link(variables), 1129 | headers=self.generate_headers( 1130 | self.user_session)) 1131 | 1132 | if not Instagram.HTTP_OK == response.status_code: 1133 | raise InstagramException.default(response.text, 1134 
| response.status_code) 1135 | 1136 | json_response = response.json() 1137 | 1138 | try: 1139 | reels_media = json_response['data']['reels_media'] 1140 | if len(reels_media) == 0: 1141 | return [] 1142 | except KeyError: 1143 | return [] 1144 | 1145 | stories = [] 1146 | for user in reels_media: 1147 | user_stories = UserStories() 1148 | 1149 | user_stories.owner = Account(user['user']) 1150 | for item in user['items']: 1151 | story = Story(item) 1152 | user_stories.stories.append(story) 1153 | 1154 | stories.append(user_stories) 1155 | return stories 1156 | 1157 | def search_accounts_by_username(self, username): 1158 | """ 1159 | :param username: user name 1160 | :return: Account List 1161 | """ 1162 | time.sleep(self.sleep_between_requests) 1163 | response = self.__req.get( 1164 | endpoints.get_general_search_json_link(username), 1165 | headers=self.generate_headers(self.user_session)) 1166 | 1167 | if Instagram.HTTP_NOT_FOUND == response.status_code: 1168 | raise InstagramNotFoundException( 1169 | 'Account with given username does not exist.') 1170 | 1171 | if not Instagram.HTTP_OK == response.status_code: 1172 | raise InstagramException.default(response.text, 1173 | response.status_code) 1174 | 1175 | json_response = response.json() 1176 | 1177 | try: 1178 | status = json_response['status'] 1179 | if not status == 'ok': 1180 | raise InstagramException( 1181 | 'Response code is not equal 200.' 1182 | ' Something went wrong. Please report issue.') 1183 | except KeyError: 1184 | raise InstagramException( 1185 | 'Response code is not equal 200.' 1186 | ' Something went wrong. 
Please report issue.') 1187 | 1188 | try: 1189 | users = json_response['users'] 1190 | if len(users) == 0: 1191 | return [] 1192 | except KeyError: 1193 | return [] 1194 | 1195 | accounts = [] 1196 | for json_account in json_response['users']: 1197 | accounts.append(Account(json_account['user'])) 1198 | 1199 | return accounts 1200 | 1201 | # TODO not optimal separate http call after getMedia 1202 | def get_media_tagged_users_by_code(self, code): 1203 | """ 1204 | :param code: media short code 1205 | :return: list contains tagged_users dict 1206 | """ 1207 | url = endpoints.get_media_json_link(code) 1208 | 1209 | time.sleep(self.sleep_between_requests) 1210 | response = self.__req.get(url, headers=self.generate_headers( 1211 | self.user_session)) 1212 | 1213 | if not Instagram.HTTP_OK == response.status_code: 1214 | raise InstagramException.default(response.text, 1215 | response.status_code) 1216 | 1217 | json_response = response.json() 1218 | 1219 | try: 1220 | tag_data = json_response['graphql']['shortcode_media'][ 1221 | 'edge_media_to_tagged_user']['edges'] 1222 | except KeyError: 1223 | return [] 1224 | 1225 | tagged_users = [] 1226 | 1227 | for tag in tag_data: 1228 | x_pos = tag['node']['x'] 1229 | y_pos = tag['node']['y'] 1230 | user = tag['node']['user'] 1231 | # TODO: add Model and add Data to it instead of Dict 1232 | tagged_user = dict() 1233 | tagged_user['x_pos'] = x_pos 1234 | tagged_user['y_pos'] = y_pos 1235 | tagged_user['user'] = user 1236 | 1237 | tagged_users.append(tagged_user) 1238 | 1239 | return tagged_users 1240 | 1241 | def is_logged_in(self, session): 1242 | """ 1243 | :param session: session dict 1244 | :return: bool 1245 | """ 1246 | if session is None or 'sessionid' not in session.keys(): 1247 | return False 1248 | 1249 | session_id = session['sessionid'] 1250 | csrf_token = session['csrftoken'] 1251 | 1252 | headers = { 1253 | 'cookie': f"ig_cb=1; csrftoken={csrf_token}; sessionid={session_id};", 1254 | 'referer': endpoints.BASE_URL + 
'/', 1255 | 'x-csrftoken': csrf_token, 1256 | 'X-CSRFToken': csrf_token, 1257 | 'user-agent': self.user_agent, 1258 | } 1259 | 1260 | time.sleep(self.sleep_between_requests) 1261 | response = self.__req.get(endpoints.BASE_URL, headers=headers) 1262 | 1263 | if not response.status_code == Instagram.HTTP_OK: 1264 | return False 1265 | 1266 | cookies = response.cookies.get_dict() 1267 | 1268 | if cookies is None or not 'ds_user_id' in cookies.keys(): 1269 | return False 1270 | 1271 | return True 1272 | 1273 | def login(self, force=False, two_step_verificator=None): 1274 | """support_two_step_verification true works only in cli mode - just run login in cli mode - save cookie to file and use in any mode 1275 | :param force: true will refresh the session 1276 | :param two_step_verificator: true will need to do verification when an account goes wrong 1277 | :return: headers dict 1278 | """ 1279 | if self.session_username is None or self.session_password is None: 1280 | raise InstagramAuthException("User credentials not provided") 1281 | 1282 | if two_step_verificator: 1283 | two_step_verificator = ConsoleVerification() 1284 | 1285 | session = json.loads( 1286 | Instagram.instance_cache.get_saved_cookies()) if Instagram.instance_cache.get_saved_cookies() != None else None 1287 | 1288 | if force or not self.is_logged_in(session): 1289 | time.sleep(self.sleep_between_requests) 1290 | response = self.__req.get(endpoints.BASE_URL) 1291 | if not response.status_code == Instagram.HTTP_OK: 1292 | raise InstagramException.default(response.text, 1293 | response.status_code) 1294 | 1295 | match = re.findall(r'"csrf_token":"(.*?)"', response.text) 1296 | 1297 | if len(match) > 0: 1298 | csrfToken = match[0] 1299 | 1300 | cookies = response.cookies.get_dict() 1301 | 1302 | # cookies['mid'] doesnt work at the moment so fetch it with function 1303 | mid = self.__get_mid() 1304 | 1305 | headers = { 1306 | 'cookie': f"ig_cb=1; csrftoken={csrfToken}; mid={mid};", 1307 | 'referer': 
endpoints.BASE_URL + '/', 1308 | 'x-csrftoken': csrfToken, 1309 | 'X-CSRFToken': csrfToken, 1310 | 'user-agent': self.user_agent, 1311 | } 1312 | payload = {'username': self.session_username, 1313 | 'password': self.session_password} 1314 | response = self.__req.post(endpoints.LOGIN_URL, data=payload, 1315 | headers=headers) 1316 | 1317 | if not response.status_code == Instagram.HTTP_OK: 1318 | if ( 1319 | response.status_code == Instagram.HTTP_BAD_REQUEST 1320 | and response.text is not None 1321 | and response.json()['message'] == 'checkpoint_required' 1322 | and two_step_verificator is not None): 1323 | response = self.__verify_two_step(response, cookies, 1324 | two_step_verificator) 1325 | print('checkpoint required') 1326 | 1327 | elif response.status_code is not None and response.text is not None: 1328 | raise InstagramAuthException( 1329 | f'Response code is {response.status_code}. Body: {response.text} Something went wrong. Please report issue.', 1330 | response.status_code) 1331 | else: 1332 | raise InstagramAuthException( 1333 | 'Something went wrong. 
Please report issue.', 1334 | response.status_code) 1335 | 1336 | if not response.json()['authenticated']: 1337 | raise InstagramAuthException('User credentials are wrong.') 1338 | 1339 | cookies = response.cookies.get_dict() 1340 | 1341 | cookies['mid'] = mid 1342 | Instagram.instance_cache.set_saved_cookies(json.dumps(cookies, separators=(',', ':'))) 1343 | 1344 | self.user_session = cookies 1345 | 1346 | else: 1347 | self.user_session = session 1348 | 1349 | return self.generate_headers(self.user_session) 1350 | 1351 | def __verify_two_step(self, response, cookies, two_step_verificator): 1352 | """ 1353 | :param response: Response object returned by Request 1354 | :param cookies: user cookies 1355 | :param two_step_verificator: two_step_verification instance 1356 | :return: Response 1357 | """ 1358 | new_cookies = response.cookies.get_dict() 1359 | cookies = {**cookies, **new_cookies} 1360 | 1361 | cookie_string = '' 1362 | for key in cookies.keys(): 1363 | cookie_string += f'{key}={cookies[key]};' 1364 | 1365 | headers = { 1366 | 'cookie': cookie_string, 1367 | 'referer': endpoints.LOGIN_URL, 1368 | 'x-csrftoken': cookies['csrftoken'], 1369 | 'user-agent': self.user_agent, 1370 | } 1371 | 1372 | url = endpoints.BASE_URL + response.json()['checkpoint_url'] 1373 | 1374 | time.sleep(self.sleep_between_requests) 1375 | response = self.__req.get(url, headers=headers) 1376 | data = Instagram.extract_shared_data_from_body(response.text) 1377 | 1378 | if data is not None: 1379 | try: 1380 | choices = \ 1381 | data['entry_data']['Challenge'][0]['extraData']['content'][ 1382 | 3][ 1383 | 'fields'][0]['values'] 1384 | except KeyError: 1385 | choices = dict() 1386 | try: 1387 | fields = data['entry_data']['Challenge'][0]['fields'] 1388 | try: 1389 | choices.update({'label': f"Email: {fields['email']}", 1390 | 'value': 1}) 1391 | except KeyError: 1392 | pass 1393 | try: 1394 | choices.update( 1395 | {'label': f"Phone: {fields['phone_number']}", 1396 | 'value': 0}) 1397 | 
except KeyError: 1398 | pass 1399 | 1400 | except KeyError: 1401 | pass 1402 | 1403 | if len(choices) > 0: 1404 | selected_choice = two_step_verificator.get_verification_type( 1405 | choices) 1406 | response = self.__req.post(url, 1407 | data={'choice': selected_choice}, 1408 | headers=headers) 1409 | 1410 | if len(re.findall('name="security_code"', response.text)) <= 0: 1411 | raise InstagramAuthException( 1412 | 'Something went wrong when try ' 1413 | 'two step verification. Please report issue.', 1414 | response.status_code) 1415 | 1416 | security_code = two_step_verificator.get_security_code() 1417 | 1418 | post_data = { 1419 | 'csrfmiddlewaretoken': cookies['csrftoken'], 1420 | 'verify': 'Verify Account', 1421 | 'security_code': security_code, 1422 | } 1423 | response = self.__req.post(url, data=post_data, headers=headers) 1424 | if not response.status_code == Instagram.HTTP_OK \ 1425 | or 'Please check the code we sent you and try again' in response.text: 1426 | raise InstagramAuthException( 1427 | 'Something went wrong when try two step' 1428 | ' verification and enter security code. 
Please report issue.', 1429 | response.status_code) 1430 | 1431 | return response 1432 | 1433 | def add_comment(self, media_id, text, replied_to_comment_id=None): 1434 | """ 1435 | :param media_id: media id 1436 | :param text: the content you want to post 1437 | :param replied_to_comment_id: the id of the comment you want to reply 1438 | :return: Comment 1439 | """ 1440 | media_id = media_id.identifier if isinstance(media_id, Media) else media_id 1441 | 1442 | replied_to_comment_id = replied_to_comment_id._data['id'] if isinstance(replied_to_comment_id, Comment) else replied_to_comment_id 1443 | 1444 | body = {'comment_text': text, 1445 | 'replied_to_comment_id': replied_to_comment_id 1446 | if replied_to_comment_id is not None else ''} 1447 | 1448 | response = self.__req.post(endpoints.get_add_comment_url(media_id), 1449 | data=body, headers=self.generate_headers( 1450 | self.user_session)) 1451 | 1452 | if not Instagram.HTTP_OK == response.status_code: 1453 | raise InstagramException.default(response.text, 1454 | response.status_code) 1455 | 1456 | json_response = response.json() 1457 | 1458 | if json_response['status'] != 'ok': 1459 | status = json_response['status'] 1460 | raise InstagramException( 1461 | f'Response status is {status}. ' 1462 | f'Body: {response.text} Something went wrong.' 
1463 | f' Please report issue.', 1464 | response.status_code) 1465 | 1466 | return Comment(json_response) 1467 | 1468 | def delete_comment(self, media_id, comment_id): 1469 | """ 1470 | :param media_id: media id 1471 | :param comment_id: the id of the comment you want to delete 1472 | """ 1473 | media_id = media_id.identifier if isinstance(media_id, 1474 | Media) else media_id 1475 | 1476 | comment_id = comment_id._data['id'] if isinstance(comment_id, 1477 | Comment) else comment_id 1478 | 1479 | response = self.__req.post( 1480 | endpoints.get_delete_comment_url(media_id, comment_id), 1481 | headers=self.generate_headers(self.user_session)) 1482 | 1483 | if not Instagram.HTTP_OK == response.status_code: 1484 | raise InstagramException.default(response.text, 1485 | response.status_code) 1486 | 1487 | json_response = response.json() 1488 | 1489 | if json_response['status'] != 'ok': 1490 | status = json_response['status'] 1491 | raise InstagramException( 1492 | f'Response status is {status}. ' 1493 | f'Body: {response.text} Something went wrong.' 1494 | f' Please report issue.', 1495 | response.status_code) 1496 | 1497 | def like(self, media_id): 1498 | """ 1499 | :param media_id: media id 1500 | """ 1501 | media_id = media_id.identifier if isinstance(media_id, 1502 | Media) else media_id 1503 | response = self.__req.post(endpoints.get_like_url(media_id), 1504 | headers=self.generate_headers( 1505 | self.user_session)) 1506 | 1507 | if not Instagram.HTTP_OK == response.status_code: 1508 | raise InstagramException.default(response.text, 1509 | response.status_code) 1510 | 1511 | json_response = response.json() 1512 | 1513 | if json_response['status'] != 'ok': 1514 | status = json_response['status'] 1515 | raise InstagramException( 1516 | f'Response status is {status}. ' 1517 | f'Body: {response.text} Something went wrong.' 
1518 | f' Please report issue.', 1519 | response.status_code) 1520 | 1521 | def unlike(self, media_id): 1522 | """ 1523 | :param media_id: media id 1524 | """ 1525 | media_id = media_id.identifier if isinstance(media_id, 1526 | Media) else media_id 1527 | response = self.__req.post(endpoints.get_unlike_url(media_id), 1528 | headers=self.generate_headers( 1529 | self.user_session)) 1530 | 1531 | if not Instagram.HTTP_OK == response.status_code: 1532 | raise InstagramException.default(response.text, 1533 | response.status_code) 1534 | 1535 | json_response = response.json() 1536 | 1537 | if json_response['status'] != 'ok': 1538 | status = json_response['status'] 1539 | raise InstagramException( 1540 | f'Response status is {status}. ' 1541 | f'Body: {response.text} Something went wrong.' 1542 | f' Please report issue.', 1543 | response.status_code) 1544 | 1545 | def follow(self, user_id): 1546 | """ 1547 | :param user_id: user id 1548 | :return: bool 1549 | """ 1550 | if self.is_logged_in(self.user_session): 1551 | url = endpoints.get_follow_url(user_id) 1552 | 1553 | try: 1554 | follow = self.__req.post(url, 1555 | headers=self.generate_headers( 1556 | self.user_session)) 1557 | if follow.status_code == Instagram.HTTP_OK: 1558 | return True 1559 | except Exception: 1560 | raise InstagramException("Exception on follow!") 1561 | return False 1562 | 1563 | def unfollow(self, user_id): 1564 | """ 1565 | :param user_id: user id 1566 | :return: bool 1567 | """ 1568 | if self.is_logged_in(self.user_session): 1569 | url_unfollow = endpoints.get_unfollow_url(user_id) 1570 | try: 1571 | unfollow = self.__req.post(url_unfollow) 1572 | if unfollow.status_code == Instagram.HTTP_OK: 1573 | return True 1574 | except Exception: 1575 | raise InstagramException("Exception on unfollow!") 1576 | return False 1577 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/__init__.py: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/igramscraper/model/__init__.py -------------------------------------------------------------------------------- /optracker/igramscraper/model/account.py: -------------------------------------------------------------------------------- 1 | from .initializer_model import InitializerModel 2 | from .media import Media 3 | import textwrap 4 | 5 | 6 | class Account(InitializerModel): 7 | 8 | def __init__(self, props=None): 9 | self.identifier = None 10 | self.username = None 11 | self.full_name = None 12 | self.profile_pic_url = None 13 | self.profile_pic_url_hd = None 14 | self.biography = None 15 | self.external_url = None 16 | self.follows_count = 0 17 | self.followed_by_count = 0 18 | self.media_count = 0 19 | self.is_private = False 20 | self.is_verified = False 21 | self.medias = [] 22 | self.blocked_by_viewer = False 23 | self.country_block = False 24 | self.followed_by_viewer = False 25 | self.follows_viewer = False 26 | self.has_channel = False 27 | self.has_blocked_viewer = False 28 | self.highlight_reel_count = 0 29 | self.has_requested_viewer = False 30 | self.is_business_account = False 31 | self.is_joined_recently = False 32 | self.business_category_name = None 33 | self.business_email = None 34 | self.business_phone_number = None 35 | self.business_address_json = None 36 | self.requested_by_viewer = False 37 | self.connected_fb_page = None 38 | 39 | super(Account, self).__init__(props) 40 | 41 | def get_profile_picture_url(self): 42 | try: 43 | if not self.profile_pic_url_hd == '': 44 | return self.profile_pic_url_hd 45 | except AttributeError: 46 | try: 47 | return self.profile_pic_url 48 | except AttributeError: 49 | return '' 50 | 51 | def __str__(self): 52 | string = f""" 53 | Account info: 54 | Id: {self.identifier} 55 | Username: {self.username if 
hasattr(self, 'username') else '-'} 56 | Full Name: {self.full_name if hasattr(self, 'full_name') else '-'} 57 | Bio: {self.biography if hasattr(self, 'biography') else '-'} 58 | Profile Pic Url: {self.get_profile_picture_url()} 59 | External url: {self.external_url if hasattr(self, 'external_url') else '-'} 60 | Number of published posts: {self.media_count if hasattr(self, 'media_count') else '-'} 61 | Number of followers: {self.followed_by_count if hasattr(self, 'followed_by_count') else '-'} 62 | Number of follows: {self.follows_count if hasattr(self, 'follows_count') else '-'} 63 | Is private: {self.is_private if hasattr(self, 'is_private') else '-'} 64 | Is verified: {self.is_verified if hasattr(self, 'is_verified') else '-'} 65 | """ 66 | return textwrap.dedent(string) 67 | 68 | """ 69 | * @param Media $media 70 | * @return Account 71 | """ 72 | def add_media(self, media): 73 | try: 74 | self.medias.append(media) 75 | except AttributeError: 76 | raise AttributeError 77 | 78 | def _init_properties_custom(self, value, prop, array): 79 | 80 | if prop == 'id': 81 | self.identifier = value 82 | 83 | standard_properties = [ 84 | 'username', 85 | 'full_name', 86 | 'profile_pic_url', 87 | 'profile_pic_url_hd', 88 | 'biography', 89 | 'external_url', 90 | 'is_private', 91 | 'is_verified', 92 | 'blocked_by_viewer', 93 | 'country_block', 94 | 'followed_by_viewer', 95 | 'follows_viewer', 96 | 'has_channel', 97 | 'has_blocked_viewer', 98 | 'highlight_reel_count', 99 | 'has_requested_viewer', 100 | 'is_business_account', 101 | 'is_joined_recently', 102 | 'business_category_name', 103 | 'business_email', 104 | 'business_phone_number', 105 | 'business_address_json', 106 | 'requested_by_viewer', 107 | 'connected_fb_page' 108 | ] 109 | if prop in standard_properties: 110 | self.__setattr__(prop, value) 111 | 112 | if prop == 'edge_follow': 113 | self.follows_count = array[prop]['count'] \ 114 | if array[prop]['count'] is not None else 0 115 | 116 | if prop == 'edge_followed_by': 
117 | self.followed_by_count = array[prop]['count'] \ 118 | if array[prop]['count'] is not None else 0 119 | 120 | if prop == 'edge_owner_to_timeline_media': 121 | self._init_media(array[prop]) 122 | 123 | def _init_media(self, array): 124 | self.media_count = array['count'] if 'count' in array.keys() else 0 125 | 126 | try: 127 | nodes = array['edges'] 128 | except KeyError: 129 | return 130 | 131 | if not self.media_count or not isinstance(nodes, list): 132 | return 133 | 134 | for media_array in nodes: 135 | media = Media(media_array['node']) 136 | if isinstance(media, Media): 137 | self.add_media(media) 138 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/carousel_media.py: -------------------------------------------------------------------------------- 1 | class CarouselMedia: 2 | 3 | def __init__(self): 4 | self.__type = '' 5 | self.__image_low_resolution_url = '' 6 | self.__image_thumbnail_url = '' 7 | self.__image_standard_resolution_url = '' 8 | self.__image_high_resolution_url = '' 9 | self.__video_low_resolution_url = '' 10 | self.__video_standard_resolution_url = '' 11 | self.__video_low_bandwidth_url = '' 12 | self.__video_views = '' 13 | 14 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/comment.py: -------------------------------------------------------------------------------- 1 | from .initializer_model import InitializerModel 2 | 3 | 4 | class Comment(InitializerModel): 5 | """ 6 | * @param $value 7 | * @param $prop 8 | """ 9 | 10 | def __init__(self, props=None): 11 | self.identifier = None 12 | self.text = None 13 | self.created_at = None 14 | # Account object 15 | self.owner = None 16 | 17 | super(Comment, self).__init__(props) 18 | 19 | def _init_properties_custom(self, value, prop, array): 20 | 21 | if prop == 'id': 22 | self.identifier = value 23 | 24 | standart_properties = [ 25 | 'created_at', 26 | 'text', 27 | ] 28 | 29 | if 
prop in standart_properties: 30 | self.__setattr__(prop, value) 31 | 32 | if prop == 'owner': 33 | from .account import Account 34 | self.owner = Account(value) 35 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/initializer_model.py: -------------------------------------------------------------------------------- 1 | import time 2 | 3 | 4 | class InitializerModel: 5 | 6 | def __init__(self, props=None): 7 | 8 | self._is_new = True 9 | self._is_loaded = False 10 | """init data was empty""" 11 | self._is_load_empty = True 12 | self._is_fake = False 13 | self._modified = None 14 | 15 | """Array of initialization data""" 16 | self._data = {} 17 | 18 | self.modified = time.time() 19 | 20 | if props is not None and len(props) > 0: 21 | self._init(props) 22 | 23 | def _init(self, props): 24 | """ 25 | 26 | :param props: props array 27 | :return: None 28 | """ 29 | for key in props.keys(): 30 | try: 31 | self._init_properties_custom(props[key], key, props) 32 | except AttributeError: 33 | # if function does not exist fill help data array 34 | self._data[key] = props[key] 35 | 36 | self._is_new = False 37 | self._is_loaded = True 38 | self._is_load_empty = False 39 | 40 | # ''' 41 | # * @return $this 42 | # ''' 43 | # @staticmethod 44 | # def fake(): 45 | # return static::create()->setFake(true); 46 | 47 | # ''' 48 | # * @param bool $value 49 | # * 50 | # * @return $this 51 | # ''' 52 | # def _setFake(self, value = True): 53 | # self._isFake = (bool)value 54 | 55 | # ''' 56 | # * @return bool 57 | # ''' 58 | # public function isNotEmpty() 59 | # { 60 | # return !$this->isLoadEmpty; 61 | # } 62 | 63 | # ''' 64 | # * @return bool 65 | # ''' 66 | # public function isFake() 67 | # { 68 | # return $this->isFake; 69 | # } 70 | 71 | # ''' 72 | # * @return array 73 | # ''' 74 | # public function toArray() 75 | # { 76 | # ret = [] 77 | # map = static::$initPropertiesMap; 78 | # foreach ($map as $key => $init) { 79 | # 
if (\property_exists($this, $key)) { 80 | # //if there is property then it just assign value 81 | # $ret[$key] = $this->{$key}; 82 | # } elseif (isset($this[$key])) { 83 | # //probably array access 84 | # $ret[$key] = $this[$key]; 85 | # } else { 86 | # $ret[$key] = null; 87 | # } 88 | # } 89 | 90 | # return $ret; 91 | # } 92 | 93 | # ''' 94 | # * @param $datetime 95 | # * 96 | # * @return $this 97 | # ''' 98 | # protected function initModified($datetime) 99 | # { 100 | # $this->modified = \strtotime($datetime); 101 | 102 | # return $this; 103 | # } 104 | 105 | # ''' 106 | # * @param string $date 107 | # * @param string $key 108 | # * 109 | # * @return $this 110 | # ''' 111 | # protected function initDatetime($date, $key) 112 | # { 113 | # return $this->initProperty(\strtotime($date), $key); 114 | # } 115 | 116 | # ''' 117 | # * @param $value 118 | # * @param $key 119 | # * 120 | # * @return $this 121 | # ''' 122 | # protected function initProperty($value, $key) 123 | # { 124 | # $keys = \func_get_args(); 125 | # unset($keys[0]); //remove value 126 | # if (\count($keys) > 1) { 127 | # foreach ($keys as $key) { 128 | # if (\property_exists($this, $key)) { //first found set 129 | # $this->{$key} = $value; 130 | 131 | # return $this; 132 | # } 133 | # } 134 | # } elseif (\property_exists($this, $key)) { 135 | # $this->{$key} = $value; 136 | # } 137 | 138 | # return $this; 139 | # } 140 | 141 | # ''' 142 | # * @param mixed $value 143 | # * @param string $key 144 | # * 145 | # * @return $this 146 | # ''' 147 | # protected function initBool($value, $key) 148 | # { 149 | # return $this->initProperty(!empty($value), "is{$key}", $key); 150 | # } 151 | 152 | # ''' 153 | # * @param mixed $value 154 | # * @param string $key 155 | # * 156 | # * @return $this 157 | # ''' 158 | # protected function initInt($value, $key) 159 | # { 160 | # return $this->initProperty((int)$value, $key); 161 | # } 162 | 163 | # ''' 164 | # * @param mixed $value 165 | # * @param string $key 166 | # * 
167 | # * @return $this 168 | # ''' 169 | # protected function initFloat($value, $key) 170 | # { 171 | # return $this->initProperty((float)$value, $key); 172 | # } 173 | 174 | # ''' 175 | # * @param string $rawData 176 | # * @param string $key 177 | # * 178 | # * @return $this 179 | # ''' 180 | # def _initJsonArray(rawData, key): 181 | 182 | # value = json.loads(rawData) 183 | # if value == None or len(value) == 0: 184 | # if ('null' == rawData or '' == rawData or 'None' == rawData): 185 | # value = [] 186 | # else: 187 | # value = (array)rawData; 188 | # else 189 | # value = (array)$value; 190 | 191 | # return $this->initProperty($value, $key); 192 | 193 | # ''' 194 | # * @param mixed $value 195 | # * @param string $key 196 | # * 197 | # * @return $this 198 | # ''' 199 | # protected function initExplode($value, $key) 200 | # { 201 | # return $this->initProperty(\explode(',', $value), "is{$key}", $key); 202 | # } 203 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/location.py: -------------------------------------------------------------------------------- 1 | from .initializer_model import InitializerModel 2 | import textwrap 3 | 4 | 5 | class Location(InitializerModel): 6 | 7 | def __init__(self, props=None): 8 | self.identifier = None 9 | self.has_public_page = None 10 | self.name = None 11 | self.slug = None 12 | self.lat = None 13 | self.lng = None 14 | self.modified = None 15 | super(Location, self).__init__(props) 16 | 17 | def __str__(self): 18 | string = f""" 19 | Location info: 20 | Id: {self.identifier} 21 | Name: {self.name} 22 | Latitude: {self.lat} 23 | Longitude: {self.lng} 24 | Slug: {self.slug} 25 | Is public page available: {self.has_public_page} 26 | """ 27 | 28 | return textwrap.dedent(string) 29 | 30 | def _init_properties_custom(self, value, prop, arr): 31 | 32 | if prop == 'id': 33 | self.identifier = value 34 | 35 | standart_properties = [ 36 | 'has_public_page', 37 | 'name', 
38 | 'slug', 39 | 'lat', 40 | 'lng', 41 | 'modified', 42 | ] 43 | 44 | if prop in standart_properties: 45 | self.__setattr__(prop, value) 46 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/media.py: -------------------------------------------------------------------------------- 1 | import urllib.parse 2 | import textwrap 3 | 4 | from .initializer_model import InitializerModel 5 | from .comment import Comment 6 | from .carousel_media import CarouselMedia 7 | from .. import endpoints 8 | 9 | 10 | class Media(InitializerModel): 11 | TYPE_IMAGE = 'image' 12 | TYPE_VIDEO = 'video' 13 | TYPE_SIDECAR = 'sidecar' 14 | TYPE_CAROUSEL = 'carousel' 15 | 16 | def __init__(self, props=None): 17 | self.identifier = None 18 | self.short_code = None 19 | self.created_time = 0 20 | self.type = None 21 | self.link = None 22 | self.image_low_resolution_url = None 23 | self.image_thumbnail_url = None 24 | self.image_standard_resolution_url = None 25 | self.image_high_resolution_url = None 26 | self.square_images = [] 27 | self.carousel_media = [] 28 | self.caption = None 29 | self.is_ad = False 30 | self.video_low_resolution_url = None 31 | self.video_standard_resolution_url = None 32 | self.video_low_bandwidth_url = None 33 | self.video_views = 0 34 | self.video_url = None 35 | # account object 36 | self.owner = None 37 | self.likes_count = 0 38 | self.location_id = None 39 | self.location_name = None 40 | self.comments_count = 0 41 | self.comments = [] 42 | self.has_more_comments = False 43 | self.comments_next_page = None 44 | self.location_slug = None 45 | 46 | super(Media, self).__init__(props) 47 | 48 | @staticmethod 49 | def get_id_from_code(code): 50 | alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_' 51 | id = 0 52 | 53 | for i in range(len(code)): 54 | c = code[i] 55 | id = id * 64 + alphabet.index(c) 56 | 57 | return id 58 | 59 | @staticmethod 60 | def get_link_from_id(id): 61 | 
code = Media.get_code_from_id(id) 62 | return endpoints.get_media_page_link(code) 63 | 64 | @staticmethod 65 | def get_code_from_id(id): 66 | parts = str(id).partition('_') 67 | id = int(parts[0]) 68 | alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_' 69 | code = '' 70 | 71 | while id > 0: 72 | remainder = int(id) % 64 73 | id = (id - remainder) // 64 74 | code = alphabet[remainder] + code 75 | 76 | return code 77 | 78 | def __str__(self): 79 | string = f""" 80 | Media Info: 81 | Id: {self.identifier} 82 | Shortcode: {self.short_code} 83 | Created at: {self.created_time} 84 | Caption: {self.caption} 85 | Number of comments: {self.comments_count if hasattr(self, 86 | 'comments_count') else 0} 87 | Number of likes: {self.likes_count} 88 | Link: {self.link} 89 | High res image: {self.image_high_resolution_url} 90 | Media type: {self.type} 91 | """ 92 | 93 | return textwrap.dedent(string) 94 | 95 | def _init_properties_custom(self, value, prop, arr): 96 | 97 | if prop == 'id': 98 | self.identifier = value 99 | 100 | standard_properties = [ 101 | 'type', 102 | 'link', 103 | 'thumbnail_src', 104 | 'caption', 105 | 'video_view_count', 106 | 'caption_is_edited', 107 | 'is_ad' 108 | ] 109 | 110 | if prop in standard_properties: 111 | self.__setattr__(prop, value) 112 | 113 | elif prop == 'created_time' or prop == 'taken_at_timestamp' or prop == 'date': 114 | self.created_time = int(value) 115 | 116 | elif prop == 'code': 117 | self.short_code = value 118 | self.link = endpoints.get_media_page_link(self.short_code) 119 | 120 | elif prop == 'comments': 121 | self.comments_count = arr[prop]['count'] 122 | elif prop == 'likes': 123 | self.likes_count = arr[prop]['count'] 124 | 125 | elif prop == 'display_resources': 126 | medias_url = [] 127 | for media in value: 128 | medias_url.append(media['src']) 129 | 130 | if media['config_width'] == 640: 131 | self.image_thumbnail_url = media['src'] 132 | elif media['config_width'] == 750: 133 | 
self.image_low_resolution_url = media['src'] 134 | elif media['config_width'] == 1080: 135 | self.image_standard_resolution_url = media['src'] 136 | 137 | elif prop == 'display_src' or prop == 'display_url': 138 | self.image_high_resolution_url = value 139 | if self.type is None: 140 | self.type = Media.TYPE_IMAGE 141 | 142 | elif prop == 'thumbnail_resources': 143 | square_images_url = [] 144 | for square_image in value: 145 | square_images_url.append(square_image['src']) 146 | self.square_images = square_images_url 147 | 148 | elif prop == 'carousel_media': 149 | self.type = Media.TYPE_CAROUSEL 150 | self.carousel_media = [] 151 | for carousel_array in arr["carousel_media"]: 152 | self.set_carousel_media(arr, carousel_array) 153 | 154 | elif prop == 'video_views': 155 | self.video_views = value 156 | self.type = Media.TYPE_VIDEO 157 | 158 | elif prop == 'videos': 159 | self.video_low_resolution_url = arr[prop]['low_resolution']['url'] 160 | self.video_standard_resolution_url = \ 161 | arr[prop]['standard_resolution']['url'] 162 | self.video_low_bandwidth_url = arr[prop]['low_bandwidth']['url'] 163 | 164 | elif prop == 'video_resources': 165 | for video in value: 166 | if video['profile'] == 'MAIN': 167 | self.video_standard_resolution_url = video['src'] 168 | elif video['profile'] == 'BASELINE': 169 | self.video_low_resolution_url = video['src'] 170 | self.video_low_bandwidth_url = video['src'] 171 | 172 | elif prop == 'location' and value is not None: 173 | self.location_id = arr[prop]['id'] 174 | self.location_name = arr[prop]['name'] 175 | self.location_slug = arr[prop]['slug'] 176 | 177 | elif prop == 'user' or prop == 'owner': 178 | from .account import Account 179 | self.owner = Account(arr[prop]) 180 | 181 | elif prop == 'is_video': 182 | if bool(value): 183 | self.type = Media.TYPE_VIDEO 184 | 185 | elif prop == 'video_url': 186 | self.video_standard_resolution_url = value 187 | 188 | elif prop == 'shortcode': 189 | self.short_code = value 190 | self.link = 
endpoints.get_media_page_link(self.short_code) 191 | 192 | elif prop == 'edge_media_to_comment': 193 | try: 194 | self.comments_count = int(arr[prop]['count']) 195 | except KeyError: 196 | pass 197 | try: 198 | edges = arr[prop]['edges'] 199 | 200 | for comment_data in edges: 201 | self.comments.append(Comment(comment_data['node'])) 202 | except KeyError: 203 | pass 204 | try: 205 | self.has_more_comments = bool( 206 | arr[prop]['page_info']['has_next_page']) 207 | except KeyError: 208 | pass 209 | try: 210 | self.comments_next_page = str( 211 | arr[prop]['page_info']['end_cursor']) 212 | except KeyError: 213 | pass 214 | 215 | elif prop == 'edge_media_preview_like': 216 | self.likes_count = arr[prop]['count'] 217 | elif prop == 'edge_liked_by': 218 | self.likes_count = arr[prop]['count'] 219 | 220 | elif prop == 'edge_media_to_caption': 221 | try: 222 | self.caption = arr[prop]['edges'][0]['node']['text'] 223 | except (KeyError, IndexError): 224 | pass 225 | 226 | elif prop == 'edge_sidecar_to_children': 227 | pass 228 | # #TODO implement 229 | # if (!is_array($arr[$prop]['edges'])) { 230 | # break; 231 | # } 232 | # foreach ($arr[$prop]['edges'] as $edge) { 233 | # if (!isset($edge['node'])) { 234 | # continue; 235 | # } 236 | 237 | # $this->sidecarMedias[] = static::create($edge['node']); 238 | # } 239 | elif prop == '__typename': 240 | if value == 'GraphImage': 241 | self.type = Media.TYPE_IMAGE 242 | elif value == 'GraphVideo': 243 | self.type = Media.TYPE_VIDEO 244 | elif value == 'GraphSidecar': 245 | self.type = Media.TYPE_SIDECAR 246 | 247 | # if self.ownerId and self.owner != None: 248 | # self.ownerId = self.getOwner().getId() 249 | 250 | @staticmethod 251 | def set_carousel_media(media_array, carousel_array): 252 | 253 | print(carousel_array) 254 | # TODO implement 255 | pass 256 | """ 257 | param mediaArray 258 | param carouselArray 259 | param instance 260 | return mixed 261 | """ 262 | # carousel_media = CarouselMedia() 263 | # 
carousel_media.type(carousel_array['type']) 264 | 265 | # try: 266 | # images = carousel_array['images'] 267 | # except KeyError: 268 | # pass 269 | 270 | # carousel_images = Media.__get_image_urls( 271 | # carousel_array['images']['standard_resolution']['url']) 272 | # carousel_media.imageLowResolutionUrl = carousel_images['low'] 273 | # carousel_media.imageThumbnailUrl = carousel_images['thumbnail'] 274 | # carousel_media.imageStandardResolutionUrl = carousel_images['standard'] 275 | # carousel_media.imageHighResolutionUrl = carousel_images['high'] 276 | 277 | # if carousel_media.type == Media.TYPE_VIDEO: 278 | # try: 279 | # carousel_media.video_views = carousel_array['video_views'] 280 | # except KeyError: 281 | # pass 282 | 283 | # if 'videos' in carousel_array.keys(): 284 | # carousel_media.videoLowResolutionUrl( 285 | # carousel_array['videos']['low_resolution']['url']) 286 | # carousel_media.videoStandardResolutionUrl( 287 | # carousel_array['videos']['standard_resolution']['url']) 288 | # carousel_media.videoLowBandwidthUrl( 289 | # carousel_array['videos']['low_bandwidth']['url']) 290 | 291 | # media_array.append(carousel_media) 292 | # # array_push($instance->carouselMedia, $carouselMedia); 293 | # return media_array 294 | 295 | @staticmethod 296 | def __getImageUrls(image_url): 297 | parts = urllib.parse.urlparse(image_url).path.split('/') 298 | imageName = parts[-1] 299 | urls = { 300 | 'thumbnail': endpoints.INSTAGRAM_CDN_URL + 't/s150x150/' + imageName, 301 | 'low': endpoints.INSTAGRAM_CDN_URL + 't/s320x320/' + imageName, 302 | 'standard': endpoints.INSTAGRAM_CDN_URL + 't/s640x640/' + imageName, 303 | 'high': endpoints.INSTAGRAM_CDN_URL + 't/' + imageName, 304 | } 305 | return urls 306 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/story.py: -------------------------------------------------------------------------------- 1 | from .media import Media 2 | import textwrap
3 | 4 | 5 | class Story(Media): 6 | 7 | skip_prop = [ 8 | 'owner' 9 | ] 10 | 11 | # Some values are not needed for Story, so they are not parsed; 12 | # for example, the owner object inside a story is skipped 13 | 14 | # param value 15 | # param prop 16 | # param arr 17 | 18 | def _init_properties_custom(self, value, prop, arr): 19 | if prop in Story.skip_prop: 20 | return 21 | 22 | super()._init_properties_custom(value, prop, arr) 23 | 24 | def __str__(self): 25 | string = f""" 26 | Story Info: 27 | Id: {self.identifier} 28 | High res image: {self.image_high_resolution_url} 29 | Media type: {self.type if hasattr(self, 'type') else ''} 30 | """ 31 | 32 | return textwrap.dedent(string) 33 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/tag.py: -------------------------------------------------------------------------------- 1 | from .initializer_model import InitializerModel 2 | 3 | 4 | class Tag(InitializerModel): 5 | 6 | def __init__(self, props=None): 7 | self._media_count = 0 8 | self._name = None 9 | self._id = None 10 | super(Tag, self).__init__(props) 11 | 12 | def _init_properties_custom(self, value, prop, arr): 13 | 14 | if prop == 'id': 15 | self.identifier = value 16 | 17 | standard_properties = [ 18 | 'media_count', 19 | 'name', 20 | ] 21 | 22 | if prop in standard_properties: 23 | self.__setattr__(prop, value) 24 | -------------------------------------------------------------------------------- /optracker/igramscraper/model/user_stories.py: -------------------------------------------------------------------------------- 1 | from .initializer_model import InitializerModel 2 | 3 | class UserStories(InitializerModel): 4 | 5 | def __init__(self, stories=None, owner=None): 6 | if stories is None: 7 | stories = [] 8 | self.owner = owner 9 | self.stories = stories 10 | super().__init__() 11 | 12 | --------------------------------------------------------------------------------
/optracker/igramscraper/session_manager.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | 4 | class CookieSessionManager: 5 | def __init__(self, session_folder, filename): 6 | self.session_folder = session_folder 7 | self.filename = filename 8 | 9 | def get_saved_cookies(self): 10 | try: 11 | with open(self.session_folder + self.filename, 'r') as f: 12 | return f.read() 13 | except FileNotFoundError: 14 | return None 15 | 16 | def set_saved_cookies(self, cookie_string): 17 | if not os.path.exists(self.session_folder): 18 | os.makedirs(self.session_folder) 19 | 20 | with open(self.session_folder + self.filename, "w+") as f: 21 | f.write(cookie_string) 22 | 23 | def empty_saved_cookies(self): 24 | try: 25 | os.remove(self.session_folder + self.filename) 26 | except FileNotFoundError: 27 | pass 28 | -------------------------------------------------------------------------------- /optracker/igramscraper/two_step_verification/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NorseByte/opensource-tracker/5388d845ba57bf6ca0aa80575608f4b903e1b8dc/optracker/igramscraper/two_step_verification/__init__.py -------------------------------------------------------------------------------- /optracker/igramscraper/two_step_verification/console_verification.py: -------------------------------------------------------------------------------- 1 | from .two_step_verification_abstract_class import TwoStepVerificationAbstractClass 2 | 3 | 4 | class ConsoleVerification(TwoStepVerificationAbstractClass): 5 | 6 | def get_verification_type(self, choices): 7 | if (len(choices) > 1): 8 | possible_values = {} 9 | print('Select where to send security code') 10 | 11 | for choice in choices: 12 | print(choice['label'] + ' - ' + str(choice['value'])) 13 | possible_values[str(choice['value'])] = True 14 | 15 | selected_choice = None 16 | 17 | while (not selected_choice in
possible_values.keys()): 18 | if (selected_choice): 19 | print('Wrong choice. Try again') 20 | 21 | selected_choice = input('Your choice: ').strip() 22 | else: 23 | print('Message with security code sent to: ' + choices[0]['label']) 24 | selected_choice = choices[0]['value'] 25 | 26 | return selected_choice 27 | 28 | def get_security_code(self): 29 | """ 30 | Prompt until a six-digit numeric code is entered. 31 | :return: string 32 | """ 33 | security_code = '' 34 | while (len(security_code) != 6 or not security_code.isdigit()): 35 | if (security_code): 36 | print('Wrong security code') 37 | 38 | security_code = input('Enter security code: ').strip() 39 | 40 | return security_code 41 | -------------------------------------------------------------------------------- /optracker/igramscraper/two_step_verification/two_step_verification_abstract_class.py: -------------------------------------------------------------------------------- 1 | from abc import ABC, abstractmethod 2 | 3 | 4 | class TwoStepVerificationAbstractClass(ABC): 5 | 6 | @abstractmethod 7 | def get_verification_type(self, possible_values): 8 | """ 9 | :param possible_values: array of possible values 10 | :return: string 11 | """ 12 | pass 13 | 14 | @abstractmethod 15 | def get_security_code(self): 16 | """ 17 | 18 | :return: string 19 | """ 20 | pass 21 | -------------------------------------------------------------------------------- /optracker/optracker.py: -------------------------------------------------------------------------------- 1 | import os 2 | from .zerodata import zerodata 3 | from .functions.db_func import * 4 | from .functions.side_func import * 5 | from .functions.core_func import * 6 | from .igramscraper.instagram import Instagram 7 | from .facerec.facerec import facerec 8 | from time import sleep 9 | 10 | class Optracker(): 11 | def __init__(self): 12 | #Adding Text source 13 | self.zero = zerodata() 14 | 15 | #Setting up OP_ROOT_FOLDER 16 | self.createRootfolder() 17 | 18 | #Load Config 19 | self.zero.setupJSON(False) 20 | 21 | 
#Load face_recognition 22 | self.myFace = facerec(self.zero) 23 | 24 | #Initialize DB_DATABASE 25 | print("+ Setting up DB") 26 | self.dbTool = dbFunc(self.zero.DB_DATABASE, self.zero) 27 | self.dbConn = self.dbTool.create_connection() 28 | 29 | #Create tables 30 | if self.zero.DB_MYSQL_ON == 0: 31 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_NODES) 32 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_EGDES) 33 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_NEW_INSTA) 34 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_LOGIN_INSTA) 35 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_OPTIONS) 36 | else: 37 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_MYSQL_NODES) 38 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_MYSQL_EGDES) 39 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_MYSQL_NEW_INSTA) 40 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_MYSQL_LOGIN_INSTA) 41 | self.dbTool.createTabels(self.dbConn, self.zero.DB_TABLE_MYSQL_OPTIONS) 42 | 43 | self.dbTool.setDefaultValueOptions(self.dbConn) 44 | self.zero.printText("+ DB setup complete", False) 45 | 46 | #Get usernames 47 | self.sideTool = sideFunc(self.dbTool, self.dbConn, self.zero) 48 | self.sideTool.loadLoginText() 49 | self.sideTool.countCurrentUser() 50 | 51 | #Init INSTAGRAM 52 | self.instagram = Instagram() 53 | 54 | #User Select and Login 55 | #selectUserAndLogin() 56 | self.autoSelectAndLogin() 57 | 58 | #Setup coreFunc 59 | print("+ Setting up core functions") 60 | self.mainFunc = coreFunc(self.dbTool, self.dbConn, self.instagram, self.zero, self.myFace) 61 | 62 | self.MENU_ITEMS = [ 63 | { self.zero.HELP_TEXT_DISP: self.dispHelp }, 64 | { self.zero.RUN_CURRENT_DISP: self.runSingelScan }, 65 | { self.zero.RUN_FOLLOW_DISP: self.runFollowScan }, 66 | { self.zero.RUN_CHANGE_USER: self.selectUserAndLogin }, 67 | { self.zero.RUN_LOAD_SCAN: self.runLoadUserNodeScan }, 68 | { self.zero.RUN_EXPORT_DATA: 
self.dispExport}, 69 | { self.zero.RUN_EDIT_OPTIONS: self.runEditDefault}, 70 | { self.zero.RUN_GET_DEEP: self.runDeepfromDB}, 71 | { self.zero.RUN_UPDATE_IMG: self.updateImg}, 72 | { self.zero.RUN_EXIT_DISP: exit}, 73 | ] 74 | 75 | #Core Functions to main 76 | def runSingelScan(self): 77 | #Setup zeroPoint 78 | self.sideTool.lastSearch() 79 | 80 | #Run Scan from zeroPoint 81 | self.mainFunc.setCurrentUser(self.zero.INSTA_USER) 82 | self.runCurrentScan() 83 | 84 | def updateImg(self): 85 | self.mainFunc.updateProfileImg() 86 | 87 | def selectUserAndLogin(self): 88 | #Set username 89 | self.sideTool.setupLogin() 90 | #Login Instagram 91 | self.loginInstagram(self.instagram) 92 | 93 | def autoSelectAndLogin(self): 94 | #Find user 95 | self.sideTool.autoSelectLogin() 96 | #Login instagram 97 | self.loginInstagram(self.instagram) 98 | 99 | def runCurrentScan(self): 100 | #Extract info from following list 101 | if self.mainFunc.loadFollowlist(False) == True: 102 | self.mainFunc.add_egde_from_list_insta(False) 103 | 104 | #Extract followed by 105 | if self.mainFunc.loadFollowlist(True) == True: 106 | self.mainFunc.add_egde_from_list_insta(True) 107 | 108 | #Update new_Insta 109 | print("\n- Scan complete") 110 | print("+ Setting {} ({}) to complete.".format(self.zero.INSTA_USER, self.zero.INSTA_USER_ID)) 111 | self.dbTool.inserttoTabel(self.dbConn, self.zero.DB_UPDATE_NEW_INSTA_DONE_TRUE, (self.zero.INSTA_USER_ID,)) 112 | 113 | def runFollowScan(self): 114 | self.mainFunc.scanFollowToInstaID() 115 | input("+ Press [Enter] to continue...") 116 | 117 | def runLoadUserNodeScan(self): 118 | self.mainFunc.updateNodeFromList() 119 | 120 | def runEditDefault(self): 121 | self.sideTool.editDefaultValue() 122 | 123 | def runDeepfromDB(self): 124 | self.mainFunc.deepScanAll() 125 | 126 | def dispHelp(self): 127 | print(self.zero.HELP_TEXT) 128 | input("\nPress [Enter] to continue...") 129 | 130 | def dispExport(self): 131 | self.mainFunc.exportDBData() 132 | input("+ Press [Enter] to 
continue...") 133 | 134 | def loginInstagram(self, instagram): 135 | #Initialize Instagram login 136 | print("\n- Connecting to Instagram") 137 | self.instagram.with_credentials(self.zero.LOGIN_USERNAME_INSTA, self.zero.LOGIN_PASSWORD_INSTA, '/cachepath') 138 | self.instagram.login(force=False,two_step_verificator=True) 139 | sleep(2) # Delay to mimic user 140 | 141 | def root_path(self): 142 | return os.path.abspath(os.sep) 143 | 144 | def createFolder(self, folder): 145 | if not os.path.exists(folder): 146 | os.mkdir(folder) 147 | self.zero.printText("+ Folder created: {}".format(folder), True) 148 | else: 149 | self.zero.printText("+ Folder located: {}".format(folder), True) 150 | 151 | def createRootfolder(self): 152 | self.zero.OP_ROOT_FOLDER_PATH_VALUE = self.root_path() 153 | self.zero.OP_ROOT_FOLDER_PATH_VALUE = self.zero.OP_ROOT_FOLDER_PATH_VALUE + self.zero.OP_ROOT_FOLDER_NAME_VALUE 154 | self.createFolder(self.zero.OP_ROOT_FOLDER_PATH_VALUE) 155 | 156 | #Setup INSTA_FOLDER 157 | self.zero.OP_INSTA_FOLDER_NAME_VALUE = self.zero.OP_ROOT_FOLDER_PATH_VALUE + self.zero.OP_INSTA_FOLDER_NAME_VALUE 158 | self.zero.OP_INSTA_PROFILEFOLDER_NAME_VALUE = self.zero.OP_INSTA_FOLDER_NAME_VALUE + self.zero.OP_INSTA_PROFILEFOLDER_NAME_VALUE 159 | self.zero.OP_INSTA_INSTAID_FOLDER_VALUE = self.zero.OP_INSTA_FOLDER_NAME_VALUE + self.zero.OP_INSTA_INSTAID_FOLDER_VALUE 160 | 161 | self.createFolder(self.zero.OP_INSTA_FOLDER_NAME_VALUE) 162 | self.createFolder(self.zero.OP_INSTA_PROFILEFOLDER_NAME_VALUE) 163 | self.createFolder(self.zero.OP_INSTA_INSTAID_FOLDER_VALUE) 164 | 165 | #Setting up full path starting 166 | self.zero.DB_DATABASE_FOLDER = self.zero.OP_ROOT_FOLDER_PATH_VALUE + self.zero.DB_DATABASE_FOLDER 167 | self.zero.DB_DATABASE_EXPORT_FOLDER = self.zero.OP_ROOT_FOLDER_PATH_VALUE + self.zero.DB_DATABASE_EXPORT_FOLDER 168 | self.zero.OP_ROOT_CONFIG = self.zero.OP_ROOT_FOLDER_PATH_VALUE + self.zero.OP_ROOT_CONFIG 169 | self.zero.printText("+ Database folder is 
located at {}".format(self.zero.DB_DATABASE_FOLDER), False) 170 | self.zero.printText("+ Export folder is located at {}".format(self.zero.DB_DATABASE_EXPORT_FOLDER), False) 171 | self.zero.printText("+ Config file is located at {}".format(self.zero.OP_ROOT_CONFIG), False) 172 | 173 | def run(): 174 | myOptracker = Optracker() 175 | while True: 176 | print("\n- Main menu") 177 | for item in myOptracker.MENU_ITEMS: 178 | print("[" + str(myOptracker.MENU_ITEMS.index(item)) + "] " + list(item.keys())[0]) 179 | choice = input(">> ") 180 | 181 | if choice.isdigit(): 182 | newInfo = int(choice) 183 | if newInfo < len(myOptracker.MENU_ITEMS): 184 | list(myOptracker.MENU_ITEMS[newInfo].values())[0]() 185 | else: 186 | myOptracker.dispHelp() 187 | else: 188 | myOptracker.dispHelp() 189 | -------------------------------------------------------------------------------- /optracker/zerodata.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | 4 | class zerodata(): 5 | #Define username and password 6 | LOGIN_USERNAME_INSTA = "" 7 | LOGIN_PASSWORD_INSTA = "" 8 | PROGRAM_NAME = "openSource Tracker v.1.3.6" 9 | 10 | #List log 11 | USER_FILES = ( ["user_insta.txt"], 12 | ["user_face.txt"], 13 | ["user_list.txt"] 14 | ) 15 | 16 | USER_FILE_SCAN_NODE_INSTA = "user_scan_insta.txt" 17 | 18 | #Menu variables 19 | HELP_TEXT_DISP = "Display Help" 20 | RUN_CURRENT_DISP = "Single Scan" 21 | RUN_FOLLOW_DISP = "Scan Followed by to user" 22 | RUN_CHANGE_USER = "Change user Instagram" 23 | RUN_EXPORT_DATA = "Export nodes and edges" 24 | RUN_EDIT_OPTIONS = "Change default values" 25 | RUN_LOAD_SCAN = "Deepscan from list" 26 | RUN_GET_DEEP = "Deepscan from database" 27 | RUN_UPDATE_IMG = "Update Profile Images" 28 | RUN_EXIT_DISP = "Exit" 29 | 30 | #ERROR codes 31 | ERROR_429 = "429 - To many request" 32 | 33 | #FOLDER Setup 34 | OP_ROOT_FOLDER_PATH_TEXT = "OP_ROOT_FOLDER_PATH" 35 | OP_ROOT_FOLDER_PATH_VALUE = "\\" 36 | 37 | 
OP_ROOT_FOLDER_NAME_TEXT = "OP_ROOT_FOLDER_NAME" 38 | OP_ROOT_FOLDER_NAME_VALUE = "optracker\\" 39 | 40 | OP_INSTA_FOLDER_NAME_TEXT = "INSTA_FOLDER_NAME" 41 | OP_INSTA_FOLDER_NAME_VALUE = "instadata\\" 42 | 43 | OP_INSTA_PROFILEFOLDER_NAME_TEXT = "INSTA_PROFILE_FOLDER_NAME" 44 | OP_INSTA_PROFILEFOLDER_NAME_VALUE = "profile_pic_insta\\" 45 | 46 | OP_INSTA_INSTAID_FOLDER_TEXT = "INSTA_INSTAID_FOLDER_NAME" 47 | OP_INSTA_INSTAID_FOLDER_VALUE = "post\\" 48 | 49 | #Config filename 50 | OP_ROOT_CONFIG = "optracker.config" 51 | 52 | #Database setup 53 | DB_DATABASE = "openSource-tracker.db" 54 | DB_DATABASE_FOLDER = "db\\" 55 | DB_DATABASE_EXPORT_FOLDER = "export\\" 56 | DB_DATABASE_EXPORT_NODES = "nodes.csv" 57 | DB_DATABASE_EXPORT_INSTA_EGDE = "edges_insta.csv" 58 | 59 | #DB MYSQL 60 | DB_MYSQL = "localhost" 61 | DB_MYSQL_USER = "optracker" 62 | DB_MYSQL_PASSWORD = "localpassword" 63 | DB_MYSQL_DATABASE = "openSource_tracker" 64 | DB_MYSQL_PORT = "3306" 65 | DB_MYSQL_ON = 0 66 | DB_MYSQL_COLLATION = "utf8mb4_general_ci" 67 | DB_MYSQL_CHARSET = "utf8mb4" 68 | 69 | DB_MYSQL_TEXT = "MYSQL_HOST" 70 | DB_MYSQL_USER_TEXT = "MYSQL_USER" 71 | DB_MYSQL_PASSWORD_TEXT = "MYSQL_PASSWORD" 72 | DB_MYSQL_DATABASE_TEXT = "MYSQL_DB" 73 | DB_MYSQL_PORT_TEXT = "MYSQL_PORT" 74 | DB_MYSQL_ON_TEXT = "MYSQL_ON" 75 | DB_MYSQL_COLLATION_TEXT = "MYSQL_COL" 76 | DB_MYSQL_CHARSET_TEXT = "MYSQL_CHAR" 77 | 78 | #SQLIte 79 | DB_TABLE_NODES = """ 80 | CREATE TABLE IF NOT EXISTS "nodes" ( 81 | "id" INTEGER PRIMARY KEY AUTOINCREMENT, 82 | "name" TEXT, 83 | "label" TEXT, 84 | "insta_id" INTEGER, 85 | "insta_img" TEXT, 86 | "insta_follow" INTEGER, 87 | "insta_follower" INTEGER, 88 | "insta_bio" TEXT, 89 | "insta_username" TEXT, 90 | "insta_private" INTEGER, 91 | "insta_verifyed" INTEGER, 92 | "insta_post" INTEGER, 93 | "insta_exturl" TEXT, 94 | "insta_deepscan" INTEGER DEFAULT 0 95 | );""" 96 | 97 | DB_TABLE_EGDES = """ 98 | CREATE TABLE IF NOT EXISTS "egdes_insta" ( 99 | "source" INTEGER, 100 | "target" 
INTEGER, 101 | "type" TEXT DEFAULT 'undirected', 102 | "weight" INTEGER DEFAULT 1 103 | );""" 104 | 105 | DB_TABLE_NEW_INSTA = """ 106 | CREATE TABLE IF NOT EXISTS "new_insta" ( 107 | "insta_id" INTEGER UNIQUE, 108 | "insta_user" INTEGER, 109 | "done" INTEGER DEFAULT 0, 110 | "wait" INTEGER DEFAULT 0, 111 | "followed_by_done" INTEGER DEFAULT 0 112 | ); 113 | """ 114 | 115 | DB_TABLE_LOGIN_INSTA = """ 116 | CREATE TABLE IF NOT EXISTS "accounts" ( 117 | "username" TEXT UNIQUE, 118 | "password" TEXT, 119 | "email" TEXT, 120 | "fullname" TEXT, 121 | "account_type" TEXT, 122 | "current_run" INTEGER DEFAULT 0, 123 | "last_used" TEXT 124 | ); 125 | """ 126 | 127 | DB_TABLE_OPTIONS = """ 128 | CREATE TABLE IF NOT EXISTS "options" ( 129 | "what" TEXT UNIQUE, 130 | "value" TEXT, 131 | "ref" TEXT 132 | ); 133 | """ 134 | 135 | #MYSQL 136 | DB_TABLE_MYSQL_NODES = """ 137 | CREATE TABLE IF NOT EXISTS nodes ( 138 | id BIGINT(20) NOT NULL AUTO_INCREMENT, 139 | name VARCHAR(64) NULL DEFAULT "N/A", 140 | label VARCHAR(64) NULL DEFAULT "N/A", 141 | insta_id BIGINT(20) NULL DEFAULT 0, 142 | insta_img TEXT NULL, 143 | insta_follow BIGINT(20) NULL DEFAULT 0, 144 | insta_follower BIGINT(20) NOT NULL DEFAULT 0, 145 | insta_bio TEXT NULL, 146 | insta_username VARCHAR(64) NOT NULL DEFAULT "N/A", 147 | insta_private INT(10) NOT NULL DEFAULT 0, 148 | insta_verifyed INT(10) NOT NULL DEFAULT 0, 149 | insta_post BIGINT(20) NOT NULL DEFAULT 0, 150 | insta_exturl TEXT NULL, 151 | insta_deepscan INT(20) NOT NULL DEFAULT '0', 152 | PRIMARY KEY (`id`)) ENGINE = MyISAM 153 | """ 154 | 155 | DB_TABLE_MYSQL_EGDES = """ 156 | CREATE TABLE IF NOT EXISTS egdes_insta ( 157 | source BIGINT(20) NOT NULL , 158 | target BIGINT(20) NOT NULL , 159 | type VARCHAR(64) NOT NULL DEFAULT 'undirected' , 160 | weight INT(20) NOT NULL DEFAULT '1' 161 | ) ENGINE = MyISAM;""" 162 | 163 | DB_TABLE_MYSQL_NEW_INSTA = """ 164 | CREATE TABLE IF NOT EXISTS new_insta ( 165 | insta_id BIGINT(20) NOT NULL UNIQUE, 166 | insta_user 
TEXT NULL, 167 | done INT(20) NOT NULL DEFAULT 0, 168 | wait INT(20) NOT NULL DEFAULT 0, 169 | followed_by_done INT(20) NOT NULL DEFAULT 0 170 | ) ENGINE = MyISAM; 171 | """ 172 | 173 | DB_TABLE_MYSQL_LOGIN_INSTA = """ 174 | CREATE TABLE IF NOT EXISTS accounts ( 175 | username VARCHAR(64) NOT NULL UNIQUE, 176 | password VARCHAR(64) NOT NULL, 177 | email VARCHAR(64) NOT NULL, 178 | fullname VARCHAR(64) NOT NULL, 179 | account_type VARCHAR(64) NOT NULL, 180 | current_run INT(20) NOT NULL DEFAULT 0, 181 | last_used VARCHAR(64) NOT NULL DEFAULT 0 182 | ) ENGINE = MyISAM;""" 183 | 184 | DB_TABLE_MYSQL_OPTIONS = """ 185 | CREATE TABLE IF NOT EXISTS options ( 186 | what VARCHAR(64) NOT NULL UNIQUE, 187 | value VARCHAR(64) NOT NULL, 188 | ref VARCHAR(64) NOT NULL DEFAULT 0 189 | ) ENGINE = MyISAM; 190 | """ 191 | 192 | #MySQL 193 | DB_INSERT_MYSQL_NODE = """INSERT INTO nodes (name, label, insta_id, insta_img, insta_follow, insta_follower, insta_bio, insta_username, insta_private, insta_verifyed, insta_post, insta_exturl, insta_deepscan) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s); SELECT id FROM nodes where insta_id = %s;""" 194 | DB_INSERT_MYSQL_INSTA_EGDE = 'INSERT INTO egdes_insta (source, target) VALUES (%s, %s);' 195 | DB_INSERT_MYSQL_NEW_INSTA = 'INSERT INTO new_insta (insta_id, insta_user) VALUES (%s, %s);' 196 | DB_INSERT_MYSQL_LOGIN_INSTA = 'INSERT INTO accounts (username, password, email, fullname, account_type, last_used) VALUES (%s, %s, %s, %s, %s, %s);' 197 | DB_INSERT_MYSQL_OPTIONS_LASTINSTA = 'INSERT INTO options (value, what) VALUES (%s, %s);' 198 | DB_SELECT_MYSQL_DEEPSCAN_NEED = 'SELECT insta_username FROM nodes WHERE insta_deepscan = 0' 199 | 200 | DB_UPDATE_MYSQL_LAST_INSTA = 'UPDATE options SET value = (%s) WHERE what = "LAST_INSTA";' 201 | DB_UPDATE_MYSQL_OPTIONS = 'UPDATE options SET value = (%s) WHERE what = %s;' 202 | DB_UPDATE_MYSQL_NEW_INSTA_DONE_TRUE = 'UPDATE new_insta SET done = 1 WHERE insta_id = %s;' 203 | 
DB_UPDATE_MYSQL_NEW_INSTA_DONE_FALSE = 'UPDATE new_insta SET done = 0 WHERE insta_id = %s;' 204 | DB_UPDATE_MYSQL_ACCOUNT_LAST_USED = 'UPDATE accounts SET last_used = %s WHERE username = %s' 205 | DB_UPDATE_MYSQL_NODES = 'UPDATE nodes SET name = %s, label = %s, insta_img = %s, insta_follow = %s, insta_follower = %s, insta_bio = %s, insta_username = %s, insta_private = %s, insta_verifyed = %s, insta_post = %s, insta_exturl = %s, insta_deepscan = %s WHERE insta_id = %s' 206 | 207 | DB_SELECT_MYSQL_IMG = 'SELECT insta_username, insta_id, insta_img FROM nodes WHERE insta_img IS NOT NULL AND insta_img != "None"' 208 | DB_SELECT_MYSQL_EXPORT_ID_USER = 'SELECT id, insta_username FROM nodes' 209 | DB_SELECT_MYSQL_ID_NODE = 'SELECT id FROM nodes WHERE insta_id = %s' 210 | DB_SELECT_MYSQL_USERNAME_NODE = 'SELECT insta_username FROM nodes WHERE insta_id = %s' 211 | DB_SELECT_MYSQL_DONE_NEW_INSTA = 'SELECT done, wait FROM new_insta WHERE insta_id = %s' 212 | DB_SELECT_MYSQL_TARGET_EDGE = 'SELECT target FROM egdes_insta WHERE source = %s AND target = %s' 213 | DB_SELECT_MYSQL_LOGIN_INSTA = 'SELECT * FROM accounts WHERE account_type = "instagram"' 214 | DB_SELECT_MYSQL_LOGIN_PASSWORD_INSTA = 'SELECT password FROM accounts WHERE username = %s AND account_type = "instagram"' 215 | DB_SELECT_MYSQL_OPTIONS = 'SELECT * FROM options WHERE what = %s' 216 | DB_SELECT_MYSQL_ALL_DONE_NEW_INSTA = 'SELECT * FROM new_insta WHERE done = 1' 217 | DB_SELECT_MYSQL_ALL_NODE = "SELECT * FROM nodes" 218 | DB_SELECT_MYSQL_ALL_INSTA_EDGES = "SELECT source, target, type, weight FROM egdes_insta" 219 | DB_SELECT_MYSQL_COUNT_NODES = "SELECT count(*) FROM nodes" 220 | DB_SELECT_MYSQL_COUNT_EDES_INSTA = "SELECT count(*) FROM egdes_insta" 221 | DB_SELECT_MYSQL_INSTA_FOLLOWER_NODE_ID = 'SELECT insta_follower FROM nodes WHERE id = %s' 222 | DB_SELECT_MYSQL_FOLLOW_OF = 'SELECT * FROM nodes as Node INNER JOIN egdes_insta as Edge ON Node.id = Edge.source WHERE Node.insta_private = 0 AND Edge.target = %s' 
223 | 224 | 225 | #SQLite 226 | DB_SELECT_EXPORT_ID_USER = 'SELECT id, insta_username FROM nodes' 227 | DB_INSERT_NODE = """INSERT INTO "main"."nodes" ("name", "label", "insta_id", "insta_img", "insta_follow", "insta_follower", "insta_bio", "insta_username", "insta_private", "insta_verifyed", "insta_post", "insta_exturl", "insta_deepscan") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?); SELECT id FROM nodes where insta_id = ?;""" 228 | DB_INSERT_INSTA_EGDE = 'INSERT INTO "main"."egdes_insta" ("source", "target") VALUES (?, ?);' 229 | DB_INSERT_NEW_INSTA = 'INSERT INTO "main"."new_insta" ("insta_id", "insta_user") VALUES (?, ?);' 230 | DB_INSERT_LOGIN_INSTA = 'INSERT INTO "main"."accounts" ("username", "password", "email", "fullname", "account_type", "last_used") VALUES (?, ?, ?, ?, ?, ?);' 231 | DB_INSERT_OPTIONS_LASTINSTA = 'INSERT INTO "main"."options" ("value", "what") VALUES (?, ?);' 232 | 233 | DB_UPDATE_LAST_INSTA = 'UPDATE "main"."options" SET "value" = (?) WHERE "what" = "LAST_INSTA";' 234 | DB_UPDATE_OPTIONS = 'UPDATE "main"."options" SET "value" = (?) WHERE "what" = ?;' 235 | DB_UPDATE_NEW_INSTA_DONE_TRUE = 'UPDATE "main"."new_insta" SET "done" = 1 WHERE "insta_id" = ?;' 236 | DB_UPDATE_NEW_INSTA_DONE_FALSE = 'UPDATE "main"."new_insta" SET "done" = 0 WHERE "insta_id" = ?;' 237 | DB_UPDATE_ACCOUNT_LAST_USED = 'UPDATE "main"."accounts" SET ("last_used") = ? WHERE username = ?' 238 | DB_UPDATE_NODES = 'UPDATE "main"."nodes" SET "name" = ?, "label" = ?, "insta_img" = ?, "insta_follow" = ?, "insta_follower" = ?, "insta_bio" = ?, "insta_username" = ?, "insta_private" = ?, "insta_verifyed" = ?, "insta_post" = ?, "insta_exturl" = ?, "insta_deepscan" = ? WHERE "insta_id" = ?' 
239 | 240 | DB_SELECT_IMG = 'SELECT insta_username, insta_id, insta_img FROM nodes WHERE insta_img IS NOT NULL AND insta_img IS NOT "None"' 241 | DB_SELECT_DEEPSCAN_NEED = 'SELECT insta_username FROM nodes WHERE insta_deepscan = 0' 242 | DB_SELECT_ID_NODE = 'SELECT id FROM "main"."nodes" WHERE ("insta_id") = ?' 243 | DB_SELECT_USERNAME_NODE = 'SELECT insta_username FROM "main"."nodes" WHERE insta_id = ?' 244 | DB_SELECT_DONE_NEW_INSTA = 'SELECT done, wait FROM "main"."new_insta" WHERE ("insta_id") = ?' 245 | DB_SELECT_TARGET_EDGE = 'SELECT target FROM "main"."egdes_insta" WHERE source = ? AND target = ?' 246 | DB_SELECT_LOGIN_INSTA = 'SELECT * FROM "main"."accounts" WHERE account_type = "instagram"' 247 | DB_SELECT_LOGIN_PASSWORD_INSTA = 'SELECT password FROM "main"."accounts" WHERE ("username") = ? AND account_type = "instagram"' 248 | DB_SELECT_OPTIONS = 'SELECT * FROM options WHERE what = ?' 249 | DB_SELECT_ALL_DONE_NEW_INSTA = 'SELECT * FROM "main"."new_insta" WHERE done = 1' 250 | DB_SELECT_ALL_NODE = "SELECT * FROM main.nodes" 251 | DB_SELECT_ALL_INSTA_EDGES = "SELECT source, target, type, weight FROM main.egdes_insta" 252 | DB_SELECT_COUNT_NODES = "SELECT count(*) FROM main.nodes" 253 | DB_SELECT_COUNT_EDES_INSTA = "SELECT count(*) FROM main.egdes_insta" 254 | DB_SELECT_INSTA_FOLLOWER_NODE_ID = 'SELECT insta_follower FROM "main"."nodes" WHERE id = ?' 255 | DB_SELECT_FOLLOW_OF = 'SELECT * FROM "main"."nodes" as Node INNER JOIN "main"."egdes_insta" as Edge ON Node.id = Edge.source WHERE Node.insta_private = 0 AND Edge.target = ?' 
256 | 257 | 258 | #Startpoint information 259 | INSTA_USER = "" 260 | INSTA_USER_ID = "" 261 | INSERT_DATA = "" 262 | DATETIME_MASK = "%Y-%m-%d %H:%M:%S.%f" 263 | TOTAL_USER_COUNT = 0 264 | WRITE_ENCODING = "utf-8" 265 | ON_ERROR_ENCODING = "replace" 266 | INSTA_FILE_EXT = ".jpg" 267 | 268 | INSTA_MAX_FOLLOW_SCAN_TEXT = "INSTA_MAX_FOLLOW_SCAN" 269 | INSTA_MAX_FOLLOW_SCAN_VALUE = 2000 270 | 271 | INSTA_MAX_FOLLOW_BY_SCAN_TEXT = "INSTA_MAX_FOLLOW_BY_SCAN" 272 | INSTA_MAX_FOLLOW_BY_SCAN_VALUE = 2000 273 | 274 | SURFACE_SCAN_TEXT = "SURFACE_SCAN" 275 | SURFACE_SCAN_VALUE = "1" 276 | 277 | DETAIL_PRINT_TEXT = "DETAIL_PRINT" 278 | DETAIL_PRINT_VALUE = "1" 279 | 280 | LAST_INSTA_TEXT = "LAST_INSTA" 281 | LAST_INSTA_VALUE = "" 282 | 283 | DOWNLOAD_PROFILE_INSTA_TEXT = "DOWNLOAD_PROFILE_INSTA" 284 | DOWNLOAD_PROFILE_INSTA_VALUE = "1" 285 | 286 | FACEREC_ON_TEXT = "FACE_REC_ON" 287 | FACEREC_ON_VALUE = "1" 288 | 289 | 290 | #Help TEXT 291 | HELP_TEXT = """ 292 | {} - HELP TEXT 293 | 294 | {} - Scan a specific node 295 | This mode runs a scan for a specific user and is your first step in generating nodes and edges. You need to enter a start point, which is an Instagram username. The program looks it up, finds who it follows and who follows it, and then adds it to the database with its connections. 296 | 297 | {} - Scan all followers 298 | You will be presented with a list of users that you have finished adding to your database. The program will then scan all the connections of the selected user as if each were a first-time scan and add the data to the database. In short, it scans the follows of a user's follows. 299 | 300 | {} - Allows you to change users 301 | This gives you a list of all available users so you can switch before the scan if you are not happy with the one chosen at startup. 302 | 303 | Nodes - Main database 304 | The node database is a collection of all the users that have been scanned. It contains basic data such as ID, username, Instagram description and more. 
305 | 306 | Edges - connections 307 | The edges database holds the connections between nodes. It is used to create a visual display of how a social network is connected. 308 | 309 | SQLite - The Database 310 | All data is saved in the database found in the folder 'db/'. You need to open it in an SQL browser and then export the data in the node table and the edges table to a .CSV file, which you can import into a visualisation program (e.g. Gephi). 311 | 312 | {} - RUN_EXPORT_DATA 313 | Gives you an overview of the data collected so far, and exports it to folder {}. 314 | 315 | {} 316 | Loads a list of users from the root folder, scrapes all info from Instagram and updates the node DB. 317 | 318 | Max Follows and Max Followed by 319 | When scanning the followed-by list of a user that has completed the single scan, you can set a limit on how many followers a user can have or how many accounts it can follow. This prevents scanning uninteresting profiles such as public organizations, as they can have up to 10K. The default is 2000, which is considered a normal number of follows/followers. 320 | 321 | Deepscan and Surfacescan 322 | By default SurfaceScan is turned off. With surface scan turned on, only the username and Instagram ID are extracted when scraping. This saves requests to the server so you can use one user for a longer period of time, and makes the scan go quicker if you are scraping a big network. You can later add specific users found in the graph to a text file and scan only the interesting ones to get all their data. 323 | 324 | Print Detail 325 | By default it is turned ON. You will be presented with all the output the scraper has. If turned OFF you will only get the minimum of info needed to see if it is working properly. 
326 | 327 | ERROR CODES - List of ERROR codes 328 | 001 - INSTAGRAM USER BLOCKED 329 | 002 - TOO MANY REQUESTS FROM CURRENT USER 330 | 003 - ERROR LOGIN 331 | 004 - USER DOES NOT HAVE ACCESS TO DATA, RETURNING JSON ERROR 332 | """.format(PROGRAM_NAME, RUN_CURRENT_DISP, RUN_FOLLOW_DISP, RUN_CHANGE_USER, RUN_EXPORT_DATA, DB_DATABASE_EXPORT_FOLDER, RUN_LOAD_SCAN) 333 | 334 | 335 | def printText(self, text, override): 336 | if int(self.DETAIL_PRINT_VALUE) == 1: 337 | print(text) 338 | else: 339 | if override == True: 340 | print(text) 341 | 342 | #Removes unwanted symbols in string 343 | def sanTuple(self, text): 344 | text = str(text) 345 | text = text.replace("'", "") 346 | text = text.replace('"', "") 347 | text = text.replace(";", "") 348 | 349 | return text 350 | 351 | def setupJSON(self, export): 352 | if export == True: 353 | self.printText("+ Config export started", False) 354 | DATA = {} 355 | DATA['DB'] = [] 356 | DATA['DB'].append({ 357 | self.DB_MYSQL_TEXT : self.DB_MYSQL, 358 | self.DB_MYSQL_USER_TEXT : self.DB_MYSQL_USER, 359 | self.DB_MYSQL_PASSWORD_TEXT : self.DB_MYSQL_PASSWORD, 360 | self.DB_MYSQL_DATABASE_TEXT : self.DB_MYSQL_DATABASE, 361 | self.DB_MYSQL_PORT_TEXT : self.DB_MYSQL_PORT, 362 | self.DB_MYSQL_ON_TEXT : self.DB_MYSQL_ON, 363 | self.DB_MYSQL_COLLATION_TEXT : self.DB_MYSQL_COLLATION, 364 | self.DB_MYSQL_CHARSET_TEXT : self.DB_MYSQL_CHARSET, 365 | self.DETAIL_PRINT_TEXT : self.DETAIL_PRINT_VALUE 366 | }) 367 | 368 | if os.path.exists(self.OP_ROOT_CONFIG): 369 | self.printText("+ File: {} exists, deleting it.".format(self.OP_ROOT_CONFIG), False) 370 | os.remove(self.OP_ROOT_CONFIG) 371 | 372 | with open(self.OP_ROOT_CONFIG, 'w') as outfile: 373 | json.dump(DATA, outfile) 374 | 375 | self.printText("+ Config export end", False) 376 | 377 | else: 378 | self.printText("+ Config import started", True) 379 | if os.path.exists(self.OP_ROOT_CONFIG): 380 | with open(self.OP_ROOT_CONFIG) as json_file: 381 | data = json.load(json_file) 382 | for p in 
data['DB']: 383 | self.DB_MYSQL = p[self.DB_MYSQL_TEXT] 384 | self.DB_MYSQL_USER = p[self.DB_MYSQL_USER_TEXT] 385 | self.DB_MYSQL_PASSWORD = p[self.DB_MYSQL_PASSWORD_TEXT] 386 | self.DB_MYSQL_DATABASE = p[self.DB_MYSQL_DATABASE_TEXT] 387 | self.DB_MYSQL_PORT = int(p[self.DB_MYSQL_PORT_TEXT]) 388 | self.DB_MYSQL_ON = int(p[self.DB_MYSQL_ON_TEXT]) 389 | self.DB_MYSQL_COLLATION = p[self.DB_MYSQL_COLLATION_TEXT] 390 | self.DB_MYSQL_CHARSET = p[self.DB_MYSQL_CHARSET_TEXT] 391 | self.DETAIL_PRINT_VALUE = int(p[self.DETAIL_PRINT_TEXT]) 392 | 393 | self.printText("+ {} are set to: {}".format(self.DB_MYSQL_TEXT, self.DB_MYSQL), False) 394 | self.printText("+ {} are set to: {}".format(self.DB_MYSQL_USER_TEXT, self.DB_MYSQL_USER), False) 395 | self.printText("+ {} are set to: {}".format(self.DB_MYSQL_PASSWORD_TEXT, self.DB_MYSQL_PASSWORD), False) 396 | self.printText("+ {} are set to: {}".format(self.DB_MYSQL_DATABASE_TEXT, self.DB_MYSQL_DATABASE), False) 397 | self.printText("+ {} are set to: {}".format(self.DB_MYSQL_PORT_TEXT, self.DB_MYSQL_PORT), False) 398 | self.printText("+ {} are set to: {}".format(self.DB_MYSQL_ON_TEXT, self.DB_MYSQL_ON), False) 399 | self.printText("+ {} are set to: {}".format(self.DB_MYSQL_COLLATION_TEXT, self.DB_MYSQL_COLLATION), False) 400 | self.printText("+ {} are set to: {}".format(self.DB_MYSQL_CHARSET_TEXT, self.DB_MYSQL_CHARSET), False) 401 | self.printText("+ {} are set to: {}".format(self.DETAIL_PRINT_TEXT, self.DETAIL_PRINT_VALUE), False) 402 | 403 | else: 404 | self.printText("+ Config file dosent exist - using standard", False) 405 | 406 | self.changeSQLquery() 407 | self.printText("+ Config import end", False) 408 | 409 | def changeSQLquery(self): 410 | if self.DB_MYSQL_ON == 1: 411 | self.DB_INSERT_NODE = self.DB_INSERT_MYSQL_NODE 412 | self.DB_INSERT_INSTA_EGDE = self.DB_INSERT_MYSQL_INSTA_EGDE 413 | self.DB_INSERT_NEW_INSTA = self.DB_INSERT_MYSQL_NEW_INSTA 414 | self.DB_INSERT_LOGIN_INSTA = self.DB_INSERT_MYSQL_LOGIN_INSTA 415 | 
self.DB_INSERT_OPTIONS_LASTINSTA = self.DB_INSERT_MYSQL_OPTIONS_LASTINSTA 416 | self.DB_UPDATE_LAST_INSTA = self.DB_UPDATE_MYSQL_LAST_INSTA 417 | self.DB_UPDATE_OPTIONS = self.DB_UPDATE_MYSQL_OPTIONS 418 | self.DB_UPDATE_NEW_INSTA_DONE_TRUE = self.DB_UPDATE_MYSQL_NEW_INSTA_DONE_TRUE 419 | self.DB_UPDATE_NEW_INSTA_DONE_FALSE = self.DB_UPDATE_MYSQL_NEW_INSTA_DONE_FALSE 420 | self.DB_UPDATE_ACCOUNT_LAST_USED = self.DB_UPDATE_MYSQL_ACCOUNT_LAST_USED 421 | self.DB_UPDATE_NODES = self.DB_UPDATE_MYSQL_NODES 422 | self.DB_SELECT_ID_NODE = self.DB_SELECT_MYSQL_ID_NODE 423 | self.DB_SELECT_USERNAME_NODE = self.DB_SELECT_MYSQL_USERNAME_NODE 424 | self.DB_SELECT_DONE_NEW_INSTA = self.DB_SELECT_MYSQL_DONE_NEW_INSTA 425 | self.DB_SELECT_TARGET_EDGE = self.DB_SELECT_MYSQL_TARGET_EDGE 426 | self.DB_SELECT_LOGIN_INSTA = self.DB_SELECT_MYSQL_LOGIN_INSTA 427 | self.DB_SELECT_LOGIN_PASSWORD_INSTA = self.DB_SELECT_MYSQL_LOGIN_PASSWORD_INSTA 428 | self.DB_SELECT_OPTIONS = self.DB_SELECT_MYSQL_OPTIONS 429 | self.DB_SELECT_ALL_DONE_NEW_INSTA = self.DB_SELECT_MYSQL_ALL_DONE_NEW_INSTA 430 | self.DB_SELECT_ALL_NODE = self.DB_SELECT_MYSQL_ALL_NODE 431 | self.DB_SELECT_ALL_INSTA_EDGES = self.DB_SELECT_MYSQL_ALL_INSTA_EDGES 432 | self.DB_SELECT_COUNT_NODES = self.DB_SELECT_MYSQL_COUNT_NODES 433 | self.DB_SELECT_COUNT_EDES_INSTA = self.DB_SELECT_MYSQL_COUNT_EDES_INSTA 434 | self.DB_SELECT_INSTA_FOLLOWER_NODE_ID = self.DB_SELECT_MYSQL_INSTA_FOLLOWER_NODE_ID 435 | self.DB_SELECT_FOLLOW_OF = self.DB_SELECT_MYSQL_FOLLOW_OF 436 | self.DB_SELECT_DEEPSCAN_NEED = self.DB_SELECT_MYSQL_DEEPSCAN_NEED 437 | self.DB_SELECT_EXPORT_ID_USER = self.DB_SELECT_MYSQL_EXPORT_ID_USER 438 | self.DB_SELECT_IMG = self.DB_SELECT_MYSQL_IMG 439 | 440 | def __init__(self): 441 | #Starting up 442 | print("- Starting {}".format(self.PROGRAM_NAME)) 443 | print("+ Text Libray loaded") 444 | -------------------------------------------------------------------------------- /requirements.txt: 
--------------------------------------------------------------------------------
python-slugify==3.0.2
unicodecsv==0.14.1
mysql-connector-python==8.0.18
cmake
Pillow
dlib>=19.7

--------------------------------------------------------------------------------
/run_tracker.py:
--------------------------------------------------------------------------------
from optracker.optracker import run

if __name__ == '__main__':
    run()

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
import setuptools
from pathlib import Path

setuptools.setup(
    name="optracker",
    version="1.3.6",
    description=('Scrapes media, likes and followers from social media and organizes them in a database for deeper analysis.'),
    long_description=Path("README.md").read_text(),
    long_description_content_type="text/markdown",
    packages=setuptools.find_packages(),
    package_data={'optracker': ['data/face_models/*.dat']},
    license="MIT",
    maintainer="suxSx",
    author='suxSx',
    author_email='marcuscrazy@gmail.com',
    keywords='scraper media social network mapper tracker instagram scrape like follow analyze',
    url='https://github.com/suxSx/openSource-tracker',
    entry_points={
        'console_scripts': [
            'optracker = optracker.optracker:run',
        ],
    },
    install_requires=[
        'python-slugify==3.0.2',
        'unicodecsv==0.14.1',
        'mysql-connector-python==8.0.18',
        'cmake',
        'Pillow',
        'dlib>=19.7'
    ],
    classifiers=[
        'Development Status :: 4 - Beta',
        'Environment :: Console',
        'Operating System :: OS Independent',
        'Intended Audience :: Developers',
        'Intended Audience :: Education',
        'Programming Language :: Python',
        'Programming Language :: Python :: 3.6',
        'Topic :: Education :: Testing',
        "License :: OSI Approved :: MIT License"
    ],
)

--------------------------------------------------------------------------------
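The `setupJSON` method in the text library listed earlier writes the MySQL settings to a JSON file as a single entry under a `DB` list, then reads them back and casts the numeric fields with `int(...)`. The sketch below shows that round trip in isolation. The key names (`mysql_port`, `mysql_on`, `detail_print`), the default values, the file name `config.json`, and the helpers `export_config` / `import_config` are all assumptions made for illustration; the project takes its real keys from the `*_TEXT` constants, whose values are not shown in this listing.

```python
import json
import os
import tempfile

# Hypothetical key names; the project reads them from its *_TEXT constants.
INT_KEYS = ("mysql_port", "mysql_on", "detail_print")


def export_config(path, settings):
    """Write settings as {"DB": [settings]}, the layout setupJSON dumps."""
    with open(path, "w") as outfile:  # mode "w" replaces any existing file
        json.dump({"DB": [settings]}, outfile)


def import_config(path, defaults):
    """Return defaults overridden by the file, or a copy of defaults if it is missing."""
    config = dict(defaults)
    if not os.path.exists(path):
        return config
    with open(path) as json_file:
        data = json.load(json_file)
    for entry in data["DB"]:
        config.update(entry)
    # Cast the numeric fields, mirroring the int(...) calls in the import branch.
    for key in INT_KEYS:
        if key in config:
            config[key] = int(config[key])
    return config


if __name__ == "__main__":
    # Illustrative defaults only, not the project's real values.
    defaults = {"mysql_port": 3306, "mysql_on": 0, "detail_print": 1}
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "config.json")
        print(import_config(path, defaults))  # no file yet, so defaults are returned
        export_config(path, {"mysql_port": "3307", "mysql_on": "1", "detail_print": "0"})
        print(import_config(path, defaults))  # values come from the file, cast to int
```

Returning a new dictionary instead of mutating attributes keeps the sketch easy to test; the original method instead assigns each value to an attribute on `self` and then calls `changeSQLquery()`.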