├── .travis.yml
├── AUTHORS.md
├── LICENSE
├── README.md
├── appveyor.yml
├── instatag
    ├── __init__.py
    ├── __main__.py
    ├── instatag.py
    ├── tags.tf
    └── test.py
├── requirements.txt
└── setup.py

/.travis.yml:
--------------------------------------------------------------------------------
sudo: true

language: python

python:
  - 3.5

install:
  - pip3 install -r requirements.txt
  - python3 setup.py install

script:
  - python3 -m instatag test
--------------------------------------------------------------------------------
/AUTHORS.md:
--------------------------------------------------------------------------------
# Core Developers #

----------
- Sepand Haghighi - Sharif University of Technology - ([http://github.com/sepandhaghighi](http://github.com/sepandhaghighi)) ([sepand@qpage.ir](mailto:sepand@qpage.ir))
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2017 Moduland Co

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# InstaTag

Extract Instagram users from tags (public, without API or login)

----------

By [Moduland Co](http://www.moduland.ir)

----------

## Installation
### Source Code
- Download [Version 0.1](https://github.com/moduland/instatag/archive/v0.1.zip) or [Latest Source](https://github.com/Moduland/instatag/archive/master.zip)

- `python3 setup.py install` or `python setup.py install`

### PyPI

- Check [Python Packaging User Guide](https://packaging.python.org/installing/)
- `pip install instatag` or `pip3 install instatag` (may need root access)

## Usage

- Manual input: `python -m instatag input tag1,tag2,tag3,...`
- File input: `python -m instatag file tags.tf`
- Default: `python -m instatag` (searches for a `tags.tf` file in the current working directory)
- Results are written to the `insta_data` folder

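The tag file is a single comma-separated line (the bundled `tags.tf` contains `Tag1,Tag2,Tag3,Tag4`), and the `input` mode takes the same comma-separated form on the command line. The sketch below shows one way such a string could be turned into a list of tags. `parse_tags` is a hypothetical helper, not part of the package: the package itself only strips the string and splits it on commas, while this sketch additionally assumes you want surrounding whitespace, a leading `#`, empty entries, and duplicates removed.

```python
def parse_tags(raw):
    """Split a comma-separated tag string into a clean list of tags.

    Hypothetical helper for illustration; not part of instatag.
    """
    tags = []
    for item in raw.split(","):
        # Assumption: surrounding whitespace and a leading '#' are not part of the tag.
        tag = item.strip().lstrip("#")
        # Skip empty entries (e.g. from a trailing comma) and duplicates.
        if tag and tag not in tags:
            tags.append(tag)
    return tags


print(parse_tags("Tag1,Tag2,Tag3,Tag4"))  # ['Tag1', 'Tag2', 'Tag3', 'Tag4']
```

Cleaning the list up front keeps empty or repeated tags from each triggering their own search.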
## Issues & Bug Reports

Just file an issue and describe it. We'll check it ASAP!
Or send an email to [info@moduland.ir](mailto:info@moduland.ir "info@moduland.ir").

## Contribution

You can fork the repository, improve or fix some part of it, and then send pull requests back if you want to see your changes here. I really appreciate that. ❤️

Remember to write a few tests for your code before sending pull requests.

## Donate to our project

If you feel our project is important, please consider supporting us.
Our project is not, and never will be, run for profit. We need the money only so that we can continue doing what we do.


Bitcoin:

```1XGr9qbZjBpUQJJSB6WtgBQbDTgrhPLPA```

Payping (For Iranian citizens):

## License

Moduland Website

--------------------------------------------------------------------------------
/appveyor.yml:
--------------------------------------------------------------------------------
build: false

environment:
  matrix:

    - PYTHON: "C:\\Python35"
      PYTHON_VERSION: "3.5.2"
      PYTHON_ARCH: "32"

    - PYTHON: "C:\\Python35"
      PYTHON_VERSION: "3.5.2"
      PYTHON_ARCH: "64"



init:
  - "ECHO %PYTHON% %PYTHON_VERSION% %PYTHON_ARCH%"

install:
  - "%PYTHON%/Scripts/pip.exe install -r requirements.txt"
  - "%PYTHON%/python.exe setup.py install"

test_script:
  - "%PYTHON%/python.exe -m instatag test"
--------------------------------------------------------------------------------
/instatag/__init__.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

from .instatag import *
--------------------------------------------------------------------------------
/instatag/__main__.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

from .instatag import *
import time
import os
import sys
import doctest
import multiprocessing as mu
tag_list=[]
if __name__=="__main__":
    args=sys.argv
    if len(args)>1:
        if args[1].upper()=="HELP":
            instatag_help()
            sys.exit()
        elif args[1].upper()=="TEST":
            doctest.testfile("test.py",verbose=True)
            sys.exit()
        if len(args) > 2:
            if args[1].upper()=="INPUT":
                tag_list=list(args[2].split(","))
            elif args[1].upper()=="FILE":
                tag_list=get_tags(args[2])
    timer_1=time.time()
    if "insta_data" not in os.listdir(os.getcwd()):
        os.mkdir("insta_data")
    if len(tag_list)==0:
        tag_list=get_tags()
    # one worker per CPU core; each worker handles one tag at a time
    p=mu.Pool(mu.cpu_count())
    p.map(user_list_gen,tag_list)
    timer_2=time.time()
    delta_time=int(timer_2-timer_1)
    print("InstaTag Data Generated In "+time_convert(delta_time))
--------------------------------------------------------------------------------
/instatag/instatag.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

import requests
import time
import socket
import io
from bs4 import BeautifulSoup
from random import randint
import os
import sys
from art import *
DEBUG=True
VERSION="0.1"
def instatag_help():
    tprint("Instatag")
    tprint("V"+VERSION)
    print("Help : \n")
    print(" - test --> (run tests)\n")
    print(" - input tags --> Example : 'python -m instatag input tag1,tag2,tag3'\n")
    print(" - file filename --> Example : 'python -m instatag file file.tf'\n")
    print(" - --> Example : 'python -m instatag' (Search for tags.tf file)")
def tag_url_maker(tag):
    return "http://www.instagram.com/explore/tags/"+tag
def post_url_maker(post_hash):
    return "https://www.instagram.com/p/"+post_hash

def step_2_url_maker(name):
    return "http://www.instagram.com/"+name+"/media/"
def print_line(number=30,char="-"):
    '''
    This function prints a line on screen
    :param number: number of items in each line
    :param char: each char of line
    :return: None
    '''
    line=""
    i=0
    while(i>> internet() # if there is stable internet connection
    True
    >>> internet() # if there is no stable internet connection
    False
    """
    try:
        socket.setdefaulttimeout(timeout)
        socket.socket(socket.AF_INET, socket.SOCK_STREAM).connect((host, port))
        return True
    except Exception:
        return False
def create_random_sleep(index=1,min_time=5,max_time=60):
    '''
    This function generates a random sleep time
    :param index: index to determine first page and messages (index = 0 is for first page)
    :param min_time: minimum time of sleep
    :param max_time: maximum time of sleep
    :type index:int
    :type min_time:int
    :type max_time:int
    :return: time of sleep as integer (a number between min and max)
    '''
    if index==0:
        time_sleep = 5
        if DEBUG==True:
            print("Wait "+str(time_sleep)+" sec for first search . . .")
    else:
        time_sleep = randint(min_time, max_time)
        if DEBUG==True:
            print("Wait "+str(time_sleep)+" sec for next search . . .")
    if DEBUG==True:
        print_line(70,"*")
    return time_sleep

def post_list_gen(tag,index=0):
    '''
    This function extracts instagram post links
    :param tag: hashtag
    :param index: index for resume search
    :type tag:str
    :type index:int
    :return: post links as a list
    '''
    try:
        print("Page Extracting . . .")
        post_list = []
        tag_url = tag_url_maker(tag)
        raw_file = get_html(tag_url)
        if "No posts yet." in raw_file:
            print("Invalid Tag")
            sys.exit()
        # scan the raw page for every "shortcode" entry until find() returns -1
        while(index!=-1):
            index=raw_file.find('"shortcode":',index+7,len(raw_file))
            length=raw_file[index:].find(',')
            post=raw_file[index+13:index+length-1]
            if len(post)<50:
                post_list.append(post)
        return post_list
    except Exception:
        return post_list

def step_2_gen(name_list,tag):
    '''
    This function extracts 2nd users (from 1st users follower and following list)
    :param name_list: 1st users id
    :param tag: hashtag
    :type name_list:list
    :type tag:str
    :return: None
    '''
    try:
        file=io.open("insta_data/users_2_"+tag+".txt","a",encoding="utf-8")
        user_list=[]
        for i in range(len(name_list)):
            print("ID : "+name_list[i])
            user_url=step_2_url_maker(name_list[i])
            raw_html=get_html(user_url,max_delay=8)
            raw_file = BeautifulSoup(raw_html, "html.parser").prettify()
            if raw_file.find('"more_available": false')!=-1:
                print("Account is private")
                continue
            index=0
            while (index != -1):
                index = raw_file.find('"username":', index + 13, len(raw_file))
                length = raw_file[index:].find('}')
                user = raw_file[index + 13:index + length - 1]
                if user[-1] == '"':
                    user = user[:-1]
                # compare as str: name_list holds str, so encoding to bytes first
                # would make user!=name_list[i] always true
                if len(user) < 50 and user not in user_list and user!=name_list[i]:
                    file.write(user + "\n")
                    user_list.append(user)
            print(str(len(user_list))+" ID Extracted")
        file.close()
    except Exception as ex:
        print(str(ex))



def user_list_gen(tag):
    '''
    This function extracts user_list for each tag in first step and then runs step_2_gen for second users
    :param tag: hashtag
    :type tag:str
    :return: None
    '''
    user_list=[]  # defined before the try block so the except branch can always use it
    try:
        hash_list = post_list_gen(tag)
        file=io.open("insta_data/users_1_"+tag+".txt","a",encoding="utf-8")
        for i in range(len(hash_list)):
            print("Code : "+hash_list[i])
            user_url = post_url_maker(hash_list[i])
            raw_html = get_html(user_url)
            raw_file=BeautifulSoup(raw_html,"html.parser").prettify()
            index = 0
            while(index!=-1):
                index=raw_file.find('"username":',index+13,len(raw_file))
                length=raw_file[index:].find('}')
                user=raw_file[index+12:index+length-1]
                if user[-1]=='"':
                    user=user[:-1]
                if len(user)<50 and (user not in user_list):
                    user_list.append(user)
                    user = user.encode("utf-8")
                    file.write(str(user,encoding="utf-8")+"\n")

        step_2_gen(user_list,tag)
        file.close()
    except Exception as ex:
        print(str(ex))
        instatag_help()
        step_2_gen(user_list, tag)

def get_tags(filename="tags.tf"):
    '''
    This function reads tags from a comma separated file (tags.tf)
    :return: Tags as a list
    '''
    try:
        if filename in os.listdir(os.getcwd()):
            file=io.open(filename,"r",encoding="utf-8")
            contain=file.read()
            return contain.strip().split(",")
        else:
            print("[Error] Tag File Missed!")
            instatag_help()
            sys.exit()
    except Exception:
        print("[Error] In Tags Read!")
        instatag_help()
        sys.exit()

--------------------------------------------------------------------------------
/instatag/tags.tf:
--------------------------------------------------------------------------------
Tag1,Tag2,Tag3,Tag4
--------------------------------------------------------------------------------
/instatag/test.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
'''
>>> import multiprocessing as mu
>>> from instatag import *
>>> tag_url_maker("test")
'http://www.instagram.com/explore/tags/test'
>>> step_2_url_maker("test")
'http://www.instagram.com/test/media/'
>>> print_line()
------------------------------
>>> print_line(30,"p")
pppppppppppppppppppppppppppppp
>>> zero_insert("w")
'0w'
>>> zero_insert("asdasdasd")
'asdasdasd'
>>> time_convert(10000)
'00 days, 02 hour, 46 minutes, 40 seconds'
>>> internet()
True
>>> import random
>>> random.seed(2)
>>> create_random_sleep(index=1,min_time=2,max_time=5)
Wait 2 sec for next search . . .
**********************************************************************
2

'''
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
requests
bs4
art

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
from distutils.core import setup
setup(
  name = 'instatag',
  packages = ['instatag'],
  version = '0.1',
  description = 'Extract users from tag in instagram',
  long_description="",
  author = 'Moduland Co',
  author_email = 'info@moduland.ir',
  url = 'https://github.com/Moduland/instatag',
  download_url = 'https://github.com/Moduland/instatag/tarball/v0.1',
  keywords = ['extract', 'scrap', 'instagram','python','tags','users'],
  install_requires=[
      'art',
      'bs4',
      'requests',
      ],
  classifiers = [
      'Development Status :: 3 - Alpha',
      'Intended Audience :: End Users/Desktop',
      'License :: OSI Approved :: MIT License',
      'Operating System :: OS Independent',
      'Programming Language :: Python :: 3.4',
      'Programming Language :: Python :: 3.5',
      'Programming Language :: Python :: 3.6',
      'Topic :: Internet',
  ],
  license='MIT',
)

--------------------------------------------------------------------------------
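The doctests in `/instatag/test.py` pin down the observable behaviour of `print_line`, `zero_insert`, and `time_convert`. The block below is a minimal sketch that satisfies those doctests. It is an assumption reconstructed from the expected outputs only, not necessarily the package's own implementation, and the parameter names `input_string` and `input_seconds` are hypothetical.

```python
def print_line(number=30, char="-"):
    """Print one line made of `number` copies of `char`."""
    print(char * number)


def zero_insert(input_string):
    """Left-pad a one-character string with "0"; return longer strings unchanged.

    `input_string` is a hypothetical parameter name.
    """
    if len(input_string) == 1:
        return "0" + input_string
    return input_string


def time_convert(input_seconds):
    """Format a duration in seconds as 'DD days, HH hour, MM minutes, SS seconds'.

    `input_seconds` is a hypothetical parameter name; the output wording follows
    the expected string in the doctest.
    """
    days, remainder = divmod(int(input_seconds), 24 * 3600)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return (zero_insert(str(days)) + " days, "
            + zero_insert(str(hours)) + " hour, "
            + zero_insert(str(minutes)) + " minutes, "
            + zero_insert(str(seconds)) + " seconds")


print(time_convert(10000))  # 00 days, 02 hour, 46 minutes, 40 seconds
```

Using `divmod` keeps the day, hour, minute, and second split in one place, and `zero_insert` only pads single-digit fields, which matches the `'0w'` and `'asdasdasd'` cases in the doctests.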