├── IGAudit.pdf
├── graph_out.png
├── terminal_out.png
├── IGAudit_mdfiles
    ├── output_19_1.png
    ├── output_28_1.png
    ├── output_33_2.png
    └── IGAudit.md
├── StatisticalMethod
    ├── README.md
    └── instabusted.py
├── README.md
├── test.csv
└── train.csv

/IGAudit.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/IGAudit.pdf

--------------------------------------------------------------------------------
/graph_out.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/graph_out.png

--------------------------------------------------------------------------------
/terminal_out.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/terminal_out.png

--------------------------------------------------------------------------------
/IGAudit_mdfiles/output_19_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/IGAudit_mdfiles/output_19_1.png

--------------------------------------------------------------------------------
/IGAudit_mdfiles/output_28_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/IGAudit_mdfiles/output_28_1.png

--------------------------------------------------------------------------------
/IGAudit_mdfiles/output_33_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/IGAudit_mdfiles/output_33_2.png

--------------------------------------------------------------------------------
/StatisticalMethod/README.md:
--------------------------------------------------------------------------------
# InstaBusted

#FakersGonnaFake: Using Simple Statistical Tools to Audit Instagram Accounts for Authenticity

## Sample output

Terminal output

Graph output

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# IGAudit
#FakersGonnaFake: Using Simple Statistical Tools and Machine Learning to Audit Instagram Accounts for Authenticity

**Motivation:** During lockdown, businesses have increasingly turned to social media influencers to market their products while their physical outlets are temporarily closed. Unfortunately, some influencers try to game the system for their own gain. In a world where a single influencer's post can be worth as much as an average 9-to-5 Joe's annual salary, fake followers and fake engagement are a price that brands shouldn't have to pay.

*Inspired by igaudit.io, a very useful tool that was unfortunately taken down by Facebook recently.*

## Got your attention? Great!
- If you want to read through the code and the outputs quickly, head over to the [PDF](https://github.com/athiyadeviyani/IGAudit/blob/master/IGAudit.pdf), [Markdown](https://github.com/athiyadeviyani/IGAudit/blob/master/IGAudit_mdfiles/IGAudit.md), or [HTML](https://igaudit-by-tia.glitch.me/) version of the Jupyter Notebook.
- If you want to get your hands dirty and understand the inner workings of the code, install the following requirements and run the [Jupyter Notebook](https://github.com/athiyadeviyani/IGAudit/blob/master/IGAudit.ipynb)!

         ``$ pip install numpy pandas seaborn matplotlib scikit-learn``
         ``$ pip install git+https://git@github.com/ping/instagram_private_api.git@1.6.0``

## Contents
- Short Introduction
- Part 1: Understanding and Splitting the Data
- Part 2: Comparing Classification Models
    - Logistic Regression
    - K-Nearest Neighbors
    - Decision Trees
    - Random Forest
- Part 3: Obtaining Instagram Data
    - Getter methods for the API
- Part 4: Preparing the Data
    - Understanding the data obtained
    - Filtering and extracting relevant information
- Part 5: Make the prediction!
- Part 6: Extension - Fake Likes
- Part 7: Comparison with Another User
- Part 8: Thoughts
    - Making sense of the result
    - So is this influencer worth investing in or not?
- Conclusion

--------------------------------------------------------------------------------
/StatisticalMethod/instabusted.py:
--------------------------------------------------------------------------------
from instagram_private_api import Client
from matplotlib import pyplot as plt

import getpass
import random


# INITIAL AUTHENTICATION
def login():
    username = input("username: ")
    password = getpass.getpass("password: ")
    api = Client(username, password)
    return api

api = login()

def get_ID(username):
    return api.username_info(username)['user']['pk']

def get_followers(userID, rank):
    # Page through the follower list until the API stops returning a next_max_id
    followers = []
    next_max_id = ''

    while True:
        f = api.user_followers(userID, rank, max_id=next_max_id)
        followers.extend(f.get('users', []))
        next_max_id = f.get('next_max_id', '')
        if not next_max_id:
            break

    user_fer = [dic['username'] for dic in followers]

    return user_fer


# GET TARGET USER INFORMATION
username = input("Instagram username for analysis: ")

uid = get_ID(username)
rank = api.generate_uuid()


# GET USER'S FOLLOWERS
followers = get_followers(uid, rank)

print("This user has " + str(len(followers)) + " followers.")

print("============================")


# GENERATE RANDOM SAMPLE (for efficiency)
samples = int(input("Number of random samples (recommended: 1-100): "))
# random.sample raises ValueError when asked for more items than the list holds
samples = min(samples, len(followers))
print("Generating " + str(samples) + " random samples for " + username + " followers!")
random_followers = random.sample(followers, samples)

print("Analyzing random samples...")


# START ANALYSIS OF THE RANDOM SAMPLE
suspicious = []
tuples = []

for i, follower in enumerate(random_followers, start=1):
    f_id = get_ID(follower)
    info = api.user_info(f_id)['user']  # one request per follower instead of two
    follower_count = info.get('follower_count')
    following_count = info.get('following_count')

    # Keep the raw counts for the scatter plot
    tuples.append((follower_count, following_count))

    # SMOOTH! (avoids division by zero)
    if follower_count == 0:
        follower_count = 1

    if following_count == 0:
        following_count = 1

    ratio = following_count / follower_count

    if i % 10 == 0:
        print(str(i) + " out of " + str(len(random_followers)) + " followers processed.")

    # 'FAKENESS' THRESHOLD
    # e.g. user_x has 100 followers and more than 2000 followings, user_x is flagged 'suspicious'
    if ratio > 20:
        suspicious.append(follower)

print(str(len(suspicious)) + " suspicious accounts detected!")


# CALCULATE THE OVERALL AUTHENTICITY OF THE USER
percentage_fake = len(suspicious)*100/len(random_followers)

print(username + " has " + str(100-percentage_fake) + "% authenticity!")


# GENERATE THE GRAPH
x = [t[0] for t in tuples]
y = [t[1] for t in tuples]

f, ax = plt.subplots(figsize=(16,10))
plt.scatter(x, y)
# Draw the 1:1 reference line from its two endpoints rather than one point per integer
limit = max(max(x), max(y))
plt.plot([0, limit], [0, limit],
         color='red',
         linewidth=2, label='following:followers = 1:1')
plt.text(2500, 4000, 'following:followers = 1:1', size=14)
plt.title('Following:Followers plot for user:' + username + ' Instagram Followers', size=20)
plt.xlabel('Followers', size=14)
plt.ylabel('Following', size=14)

plt.show()

--------------------------------------------------------------------------------
/test.csv:
--------------------------------------------------------------------------------
1 | profile pic,nums/length username,fullname words,nums/length fullname,name==username,description length,external URL,private,#posts,#followers,#follows,fake 2 | 1,0.33,1,0.33,1,30,0,1,35,488,604,0 3 | 1,0,5,0,0,64,0,1,3,35,6,0 4 | 1,0,2,0,0,82,0,1,319,328,668,0 5 | 1,0,1,0,0,143,0,1,273,14890,7369,0 6 | 1,0.5,1,0,0,76,0,1,6,225,356,0 7 | 1,0,1,0,0,0,0,1,6,362,424,0 8 | 1,0,1,0,0,132,0,1,9,213,254,0 9 | 1,0,2,0,0,0,0,1,19,552,521,0 10 | 1,0,2,0,0,96,0,1,17,122,143,0 11 | 1,0,1,0,0,78,0,1,9,834,358,0 12 | 1,0,1,0,0,0,0,1,53,229,492,0 13 | 1,0.14,1,0,0,78,1,1,97,1913,436,0 14 | 1,0.14,2,0,0,61,0,1,17,200,437,0 15 | 1,0.33,2,0,0,45,0,1,8,130,622,0 16 | 1,0.1,2,0,0,43,0,0,60,192,141,0 17 | 1,0,2,0,0,56,0,1,51,498,337,0 18 | 1,0.33,2,0,0,86,0,1,25,96,499,0 19 | 
1,0,1,0,0,97,0,1,188,202,605,0 20 | 1,0,3,0,0,46,0,1,590,175,199,0 21 | 1,0,2,0,0,39,0,1,251,223,694,0 22 | 1,0.5,1,0,0,0,0,1,0,189,276,0 23 | 1,0,0,0,0,28,0,1,58,486,862,0 24 | 1,0.22,2,0,0,63,0,1,46,464,367,0 25 | 1,0,2,0,0,24,0,1,19,150,157,0 26 | 1,0.1,1,0,0,4,0,1,250,2983,545,0 27 | 1,0,1,0,0,27,0,1,28,116,138,0 28 | 1,0,1,0,0,137,1,0,1065,155537,1395,0 29 | 1,0,1,0,0,69,0,1,209,248,490,0 30 | 1,0.33,1,0,0,0,0,1,5,348,347,0 31 | 1,0,2,0,0,147,1,0,1879,4021842,5514,0 32 | 1,0,2,0,0,0,0,0,9,366,552,0 33 | 1,0,2,0,0,138,1,0,253,1064,573,0 34 | 1,0,2,0,0,117,1,0,90,81267,963,0 35 | 1,0,2,0,0,0,0,1,8,400,449,0 36 | 1,0.17,1,0,0,0,0,0,12,361,562,0 37 | 1,0,2,0,0,14,0,0,13,228,346,0 38 | 1,0,2,0.3,0,18,0,0,59,855,151,0 39 | 1,0,4,0,0,54,1,0,77,777,148,0 40 | 1,0,1,0,0,0,1,0,14,264,151,0 41 | 1,0.15,2,0,0,73,1,0,330,1572,3504,0 42 | 1,0,0,0,0,47,0,1,2,510,185,0 43 | 1,0,2,0,0,84,1,0,932,1027419,293,0 44 | 1,0,1,0,0,39,1,0,6,710,549,0 45 | 1,0.09,2,0,0,77,0,0,463,2267,466,0 46 | 1,0,1,0,0,33,0,1,62,2055,993,0 47 | 1,0,2,0,0,149,1,1,247,814,1111,0 48 | 1,0,2,0,0,8,0,1,85,668,605,0 49 | 1,0,2,0,0,0,0,0,0,87,40,0 50 | 1,0,2,0,0,18,0,1,118,461,1055,0 51 | 1,0,9,0,0,133,1,0,307,602517,482,0 52 | 1,0,3,0,0,0,0,0,9,62,47,0 53 | 1,0,2,0,0,81,0,1,25,341,274,0 54 | 1,0,2,0,0,4,0,0,42,717,223,0 55 | 1,0,1,0,0,13,0,1,14,386,363,0 56 | 1,0,2,0,0,20,0,0,78,673,568,0 57 | 1,0,7,0,0,24,0,1,465,654,535,0 58 | 1,0,2,0,0,19,0,0,65,751,577,0 59 | 1,0.14,2,0,0,0,0,0,9,209,276,0 60 | 1,0,2,0,0,146,0,0,145,573,474,0 61 | 1,0,2,0,0,0,0,1,1,284,505,0 62 | 0,0.05,1,0.0,0,0,0,0,0,0,2,1 63 | 1,0.27,1,0.0,0,0,0,0,0,45,64,1 64 | 0,0.07,1,0.0,0,0,0,0,0,19,30,1 65 | 0,0.0,1,0.0,1,0,0,0,0,69,694,1 66 | 0,0.0,2,0.0,0,0,0,0,0,22,82,1 67 | 0,0.22,0,0,0,0,0,0,0,31,124,1 68 | 0,0.0,3,0.0,0,0,0,0,0,9,25,1 69 | 0,0.0,1,0.0,1,0,0,0,0,69,694,1 70 | 0,0.0,1,0.0,0,0,0,0,0,23,33,1 71 | 0,0.62,1,0.4,0,0,0,0,1,17,34,1 72 | 0,0.0,2,0.0,0,14,0,0,0,46,38,1 73 | 0,0.42,1,0.0,0,0,0,0,0,16,2,1 74 | 
0,0.62,1,0.4,0,0,0,0,0,0,18,1 75 | 0,0.5,1,0.4,0,0,0,0,0,21,1,1 76 | 0,0.31,0,0,0,0,0,0,0,52,15,1 77 | 0,0.27,1,0.0,0,0,0,0,1,24,2,1 78 | 0,0.5,2,0.0,0,0,0,0,0,13,22,1 79 | 0,0.75,1,1.0,0,0,0,0,0,227,353,1 80 | 1,0.43,1,0.0,0,0,0,0,2,10,24,1 81 | 0,0.25,1,0.0,0,0,0,0,0,46,18,1 82 | 1,0.88,1,1.0,0,0,0,0,3,57,22,1 83 | 1,0.89,1,1.0,0,0,0,0,8,341,2287,1 84 | 1,0.0,2,0.0,0,0,0,0,3,1789,6153,1 85 | 1,0.27,1,0.0,0,0,0,0,0,45,64,1 86 | 0,0.0,1,0.0,0,0,0,0,0,21,31,1 87 | 1,0.56,1,0.56,1,20,0,0,34,309,250,1 88 | 1,0.0,3,0.0,0,58,0,0,4,1742,6172,1 89 | 0,0.14,1,0.0,0,0,0,0,0,1906,2129,1 90 | 1,0.0,5,0.0,0,0,0,0,1,39,324,1 91 | 0,0.5,0,0,0,0,0,0,1,17,33,1 92 | 1,0.0,2,0.0,0,0,0,0,1,62,126,1 93 | 1,0.5,1,0.0,0,0,0,0,0,119,350,1 94 | 0,0.33,1,0.0,0,0,0,0,0,2997,764,1 95 | 1,0.0,0,0,0,0,0,0,15,772,3239,1 96 | 1,0.0,1,0.0,0,0,0,0,4,129,920,1 97 | 1,0.38,2,0.25,0,0,0,0,70,94,105,1 98 | 0,0.0,1,0.0,0,0,0,0,2,37,58,1 99 | 1,0.33,1,0.0,0,0,0,0,81,75,55,1 100 | 0,0.31,1,0.33,0,0,0,0,0,42,175,1 101 | 1,0.67,1,0.5,0,0,0,0,5,145,202,1 102 | 1,0.38,1,0.0,0,0,0,0,21,128,636,1 103 | 1,0.38,1,0.0,0,0,0,0,3,88,72,1 104 | 0,0.0,0,0,0,0,0,0,2,1987,7453,1 105 | 1,0.48,1,0.0,0,0,0,0,16,100,162,1 106 | 1,0.57,1,0.0,0,0,0,0,3,214,829,1 107 | 1,0.88,1,1.0,0,0,0,0,23,227,776,1 108 | 1,0.89,2,0.0,0,0,0,0,6,192,942,1 109 | 0,0.44,1,0.44,1,112,0,0,4,415,1445,1 110 | 0,0.0,1,0.0,0,0,0,0,0,926,4239,1 111 | 1,0.44,1,0.0,0,0,0,0,60,238,1381,1 112 | 1,0.0,1,0.0,0,0,0,0,1,193,669,1 113 | 1,0.0,1,0.25,0,0,0,0,0,49,235,1 114 | 0,0.44,1,0.0,0,0,0,0,0,13,7,1 115 | 1,0.45,1,0.4,0,0,0,0,2,74,270,1 116 | 1,0.33,1,0.0,0,0,0,0,8,88,76,1 117 | 1,0.29,1,0.0,0,0,0,0,13,114,811,1 118 | 1,0.4,1,0.0,0,0,0,0,4,150,164,1 119 | 1,0.0,2,0.0,0,0,0,0,3,833,3572,1 120 | 0,0.17,1,0.0,0,0,0,0,1,219,1695,1 121 | 1,0.44,1,0.0,0,0,0,0,3,39,68,1 -------------------------------------------------------------------------------- /train.csv: -------------------------------------------------------------------------------- 1 | profile 
pic,nums/length username,fullname words,nums/length fullname,name==username,description length,external URL,private,#posts,#followers,#follows,fake 2 | 1,0.27,0,0,0,53,0,0,32,1000,955,0 3 | 1,0,2,0,0,44,0,0,286,2740,533,0 4 | 1,0.1,2,0,0,0,0,1,13,159,98,0 5 | 1,0,1,0,0,82,0,0,679,414,651,0 6 | 1,0,2,0,0,0,0,1,6,151,126,0 7 | 1,0,4,0,0,81,1,0,344,669987,150,0 8 | 1,0,2,0,0,50,0,0,16,122,177,0 9 | 1,0,2,0,0,0,0,0,33,1078,76,0 10 | 1,0,0,0,0,71,0,0,72,1824,2713,0 11 | 1,0,2,0,0,40,1,0,213,12945,813,0 12 | 1,0,2,0,0,54,0,0,648,9884,1173,0 13 | 1,0,2,0,0,54,1,0,76,1188,365,0 14 | 1,0,2,0,0,0,1,0,298,945,583,0 15 | 1,0,2,0,0,103,1,0,117,12033,248,0 16 | 1,0,2,0,0,98,1,0,487,1962,2701,0 17 | 1,0,3,0,0,46,0,0,254,50374,900,0 18 | 1,0,3,0,0,0,0,0,59,7007,289,0 19 | 1,0.29,3,0,0,48,0,0,1570,1128,694,0 20 | 1,0,2,0,0,63,1,0,378,34670,1878,0 21 | 1,0,2,0,0,106,1,0,526,2338,776,0 22 | 1,0,2,0,0,40,0,0,228,3516,999,0 23 | 1,0,1,0,0,35,1,1,35,1809,416,0 24 | 1,0,2,0,0,30,0,0,281,427,470,0 25 | 1,0,1,0,0,27,0,0,285,759,956,0 26 | 1,0,0,0,0,0,0,0,148,15338538,61,0 27 | 1,0,1,0,0,109,1,1,57,109,179,0 28 | 1,0,6,0,0,0,0,1,17,536,665,0 29 | 1,0,2,0,0,132,1,0,511,121354,176,0 30 | 1,0,2,0,0,126,1,0,230,2284,130,0 31 | 1,0,2,0,0,122,0,1,15,186,174,0 32 | 1,0,2,0,0,138,0,1,980,687,1517,0 33 | 1,0.13,0,0,0,0,0,1,53,966,952,0 34 | 1,0,2,0,0,50,0,1,111,177,170,0 35 | 1,0,2,0,0,35,0,0,719,744,967,0 36 | 1,0,2,0,0,56,1,0,1164,542073,674,0 37 | 1,0.18,2,0,0,9,0,0,497,5315651,2703,0 38 | 1,0.33,0,0,0,0,0,1,18,267,328,0 39 | 1,0,2,0,0,81,0,0,50,691,680,0 40 | 1,0,2,0,0,134,0,1,74,120,112,0 41 | 1,0,2,0,0,0,0,0,8,105,98,0 42 | 1,0,0,0,0,2,0,0,7389,890969,11,0 43 | 1,0,2,0,0,0,1,0,420,361853,583,0 44 | 1,0,2,0,0,23,0,0,433,3678,1359,0 45 | 1,0,2,0,0,138,1,0,156,92192,16,0 46 | 1,0,4,0,0,35,0,0,4494,12397719,8,0 47 | 1,0,3,0,0,93,0,0,751,380510,0,0 48 | 1,0,2,0,0,4,0,1,4,132,183,0 49 | 1,0,2,0,0,1,0,1,27,162,208,0 50 | 1,0,1,0,0,4,0,0,91,369,546,0 51 | 1,0,0,0,0,23,0,0,262,1476,666,0 52 | 
1,0,3,0,0,91,1,0,274,1798,461,0 53 | 1,0,2,0,0,57,0,0,271,2118,1109,0 54 | 1,0,1,0,0,108,1,0,713,812,432,0 55 | 1,0,2,0.12,0,30,1,0,200,7217,761,0 56 | 1,0,0,0,0,82,0,0,12,313,376,0 57 | 1,0.12,1,0,0,12,1,0,26,64,261,0 58 | 1,0,2,0,0,54,0,0,75,1759,643,0 59 | 1,0,1,0,0,0,0,1,94,404,283,0 60 | 1,0,1,0,0,12,0,0,63,1843,598,0 61 | 1,0.12,2,0,0,0,0,0,69,320377,228,0 62 | 1,0,1,0,0,3,0,0,12,108,97,0 63 | 1,0,1,0,0,39,1,0,63,384,447,0 64 | 1,0.19,2,0,0,0,0,0,19,60,100,0 65 | 1,0,1,0,0,68,1,0,100,802,151,0 66 | 1,0,2,0,0,129,1,0,661,51145,528,0 67 | 1,0,2,0,0,57,1,0,149,1582,1882,0 68 | 1,0,2,0,0,64,0,0,22,223,266,0 69 | 1,0,1,0,0,42,0,0,400,18842,744,0 70 | 1,0,2,0,0,71,1,0,149,10240,1255,0 71 | 1,0,3,0,0,0,0,0,122,539,639,0 72 | 1,0.33,2,0,0,70,0,0,74,399,452,0 73 | 1,0,1,0,0,74,0,0,13,581,568,0 74 | 1,0,1,0,0,8,0,1,8,166,163,0 75 | 1,0,2,0,0,35,0,0,77,417,362,0 76 | 1,0.2,2,0,0,0,0,1,5,266,324,0 77 | 1,0,1,0,0,0,0,1,3,33,37,0 78 | 1,0,2,0,0,28,0,1,106,494,998,0 79 | 1,0,2,0,0,18,0,1,14,178,245,0 80 | 1,0,3,0,0,28,0,0,172,470,288,0 81 | 1,0.33,1,0,0,36,0,0,111,807,675,0 82 | 1,0,2,0,0,2,0,0,38,162,256,0 83 | 1,0.06,2,0,0,11,0,1,19,417,395,0 84 | 1,0,3,0,0,70,1,0,227,17303,360,0 85 | 1,0,2,0,0,29,0,0,221,1439,629,0 86 | 1,0,2,0,0,24,1,0,580,91446,526,0 87 | 1,0,3,0,0,21,1,0,40,824,489,0 88 | 1,0,1,0,0,81,0,0,101,741,1440,0 89 | 1,0,1,0,0,34,1,1,157,1267,899,0 90 | 1,0,0,0,0,40,0,0,197,4594,1713,0 91 | 1,0,1,0,0,12,0,0,61,1135,899,0 92 | 1,0,2,0,0,0,0,1,698,1926,1410,0 93 | 1,0,2,0,0,59,0,1,49,1068,1925,0 94 | 1,0,1,0,0,15,0,1,85,815,748,0 95 | 1,0,2,0,0,54,0,0,77,565,469,0 96 | 1,0,1,0,0,16,0,0,58,2556,1074,0 97 | 1,0,2,0,0,73,0,1,232,1312,935,0 98 | 1,0,1,0,0,24,0,0,20,699,599,0 99 | 1,0,0,0,0,0,0,0,98,4328,418,0 100 | 1,0,0,0,0,26,1,0,559,2487,999,0 101 | 1,0,3,0,0,0,0,0,189,673,438,0 102 | 1,0,2,0,0,0,0,1,388,730,413,0 103 | 1,0,0,0,0,0,0,1,28,59,55,0 104 | 1,0,3,0,0,0,0,1,156,289,222,0 105 | 1,0,1,0,0,28,0,1,6,19,20,0 106 | 1,0,2,0,0,55,1,0,69,3551,173,0 107 | 
1,0,1,0,0,140,1,0,775,19512,591,0 108 | 1,0,0,0,0,122,1,0,205,2756,638,0 109 | 1,0,2,0,0,113,1,0,209,5406,589,0 110 | 1,0,2,0,0,38,0,1,334,459,390,0 111 | 1,0,2,0,0,0,0,0,9,218,75,0 112 | 1,0,2,0,0,23,0,0,6,796,1155,0 113 | 1,0,1,0,0,0,0,0,416,1113,1854,0 114 | 1,0.2,1,0,0,89,0,1,13,138,208,0 115 | 1,0.44,1,0,0,30,0,1,33,205,164,0 116 | 1,0,1,0,0,0,0,0,1,331,333,0 117 | 1,0.1,1,0.1,0,0,0,0,711,748,4659,0 118 | 1,0,1,0,0,0,0,0,114,490,1093,0 119 | 1,0,1,0.08,0,12,0,1,16,456,413,0 120 | 1,0,0,0,0,123,1,0,107,971,2047,0 121 | 1,0.24,1,0.24,0,0,0,0,5,497,438,0 122 | 1,0.14,2,0,0,0,0,1,7,99,132,0 123 | 1,0,0,0,0,0,0,1,21,193,413,0 124 | 1,0,2,0,0,40,0,0,65,492,689,0 125 | 1,0.29,2,0,0,0,0,0,10,167,164,0 126 | 1,0,2,0,0,33,0,1,59,99,178,0 127 | 1,0,0,0,0,0,0,1,137,916,1142,0 128 | 1,0,2,0,0,5,0,1,10,196,209,0 129 | 1,0,1,0,0,0,0,1,571,765,424,0 130 | 1,0.18,2,0,0,23,0,1,24,45,80,0 131 | 1,0,1,0,0,35,1,0,328,634,719,0 132 | 1,0,4,0,0,150,1,0,161,1383,7500,0 133 | 1,0,1,0,0,26,0,1,280,650,703,0 134 | 1,0,2,0,0,149,0,1,92,484,3296,0 135 | 1,0,3,0,0,129,0,0,31,200,270,0 136 | 1,0,2,0,0,0,0,0,0,192,65,0 137 | 1,0,0,0,0,18,0,1,25,553,610,0 138 | 1,0,2,0,0,74,1,0,921,27477,7202,0 139 | 1,0,2,0,0,0,1,1,1020,464,1039,0 140 | 1,0,1,0,0,59,0,0,301,1057,524,0 141 | 1,0,3,0,0,148,0,0,9,413,138,0 142 | 1,0,0,0,0,0,1,0,158,389,806,0 143 | 1,0.29,1,0,0,15,0,0,43,505,503,0 144 | 1,0,12,0,0,46,0,1,24,941,1208,0 145 | 1,0,2,0,0,5,0,1,60,2598,802,0 146 | 1,0,1,0,0,98,0,0,4,59,111,0 147 | 1,0,2,0,0,55,0,0,220,622,475,0 148 | 1,0,2,0,0,19,1,0,1159,2719,1061,0 149 | 1,0.36,5,0,0,71,0,0,8,216,305,0 150 | 1,0.22,2,0,0,133,0,0,396,881,375,0 151 | 1,0.08,2,0,0,150,1,0,38,870,72,0 152 | 1,0,2,0,0,43,0,1,1,265,371,0 153 | 1,0.15,2,0,0,37,0,1,2,1204,633,0 154 | 1,0,1,0,0,35,0,1,43,655,1016,0 155 | 1,0,2,0,0,87,0,0,131,1662,1065,0 156 | 1,0.14,2,0,0,59,0,0,7,14222,7399,0 157 | 1,0,2,0,0,0,0,0,57,483,1216,0 158 | 1,0,3,0,0,0,0,0,36,1204,2928,0 159 | 1,0,2,0,0,9,0,1,91,408,635,0 160 | 
1,0,2,0,0,12,0,1,8,303,417,0 161 | 1,0.1,0,0,0,0,0,0,11,125,101,0 162 | 1,0.09,1,0,0,0,0,0,252,229,383,0 163 | 1,0.15,3,0,0,95,1,1,15,357,489,0 164 | 1,0,1,0,0,0,0,0,74,137,96,0 165 | 1,0,2,0,0,46,0,1,83,255,535,0 166 | 1,0,3,0,0,123,0,1,126,87,399,0 167 | 1,0,6,0,0,117,1,0,663,326946,3,0 168 | 1,0,1,0,0,26,0,1,0,114,446,0 169 | 1,0,0,0,0,0,0,1,10,167,387,0 170 | 1,0,2,0,0,58,0,1,8,1247,1196,0 171 | 1,0,1,0,0,0,0,0,24,585,1364,0 172 | 1,0,1,0,0,30,0,1,65,135,232,0 173 | 1,0,3,0,0,62,1,0,64,722,261,0 174 | 1,0.45,1,0.25,0,137,1,0,664,714,1159,0 175 | 1,0,2,0,0,149,1,0,130,39867,4664,0 176 | 0,0,0,0,0,14,0,1,131,533,1060,0 177 | 1,0,2,0,0,19,0,1,917,1158,3932,0 178 | 1,0,10,0,0,131,1,0,2,45834,280,0 179 | 1,0,2,0,0,0,0,0,142,876,529,0 180 | 1,0.09,5,0,0,5,0,0,165,3003,455,0 181 | 1,0.09,2,0,0,0,0,0,80,1298,407,0 182 | 1,0,2,0,0,11,0,1,32,3800,278,0 183 | 1,0,2,0,0,0,0,1,81,1358,127,0 184 | 1,0,2,0,0,27,1,0,334,6741307,111,0 185 | 1,0.38,2,0,0,10,1,0,8,791,456,0 186 | 1,0,2,0,0,72,1,0,373,732075,363,0 187 | 1,0,2,0,0,3,0,1,56,373,360,0 188 | 1,0,3,0,0,51,0,1,7,162,311,0 189 | 1,0,1,0,0,44,0,1,12,309,364,0 190 | 1,0,2,0,0,0,0,0,5,135,176,0 191 | 1,0,2,0,0,73,1,0,77,244,172,0 192 | 1,0.18,1,0,0,70,0,1,93,67,149,0 193 | 1,0,2,0,0,35,0,0,192,984,76,0 194 | 1,0,2,0,0,13,0,1,8,751,1223,0 195 | 1,0,4,0.25,0,105,0,1,145,781,529,0 196 | 1,0,1,0.33,0,91,0,0,135,1761,905,0 197 | 1,0.14,2,0,0,0,0,1,18,318,523,0 198 | 1,0,2,0,0,48,0,0,222,5282,652,0 199 | 1,0,2,0,0,48,0,0,222,5282,652,0 200 | 1,0,2,0,0,126,1,1,5,228,238,0 201 | 1,0,2,0,0,0,1,1,119,393,502,0 202 | 1,0,2,0,0,53,0,0,201,875,754,0 203 | 1,0,0,0,0,8,0,1,12,173,373,0 204 | 1,0,1,0,0,67,0,0,301,3896490,351,0 205 | 1,0,1,0,0,20,0,0,20,106,133,0 206 | 1,0.08,2,0,0,26,1,0,112,206,32,0 207 | 1,0,0,0,0,86,0,1,54,259,1371,0 208 | 1,0,1,0,0,51,0,1,98,1013,996,0 209 | 1,0.11,4,0,0,26,1,0,16,738,544,0 210 | 1,0,0,0,0,18,0,1,133,1008,517,0 211 | 1,0,2,0,0,96,1,0,142,2441,396,0 212 | 1,0,2,0,0,17,0,0,63,416,292,0 213 | 
1,0,2,0,0,0,0,1,70,516,463,0 214 | 1,0,2,0,0,62,1,0,403,8711,345,0 215 | 1,0,2,0,0,86,0,0,15,433,584,0 216 | 1,0,2,0,0,148,1,0,990,18515,1000,0 217 | 1,0,2,0,0,1,0,1,12,70,67,0 218 | 1,0.22,3,0,0,39,0,0,26,5863,157,0 219 | 1,0.36,2,0,0,35,0,0,2,1677,716,0 220 | 1,0,0,0,0,103,0,0,24,617,272,0 221 | 1,0,2,0,0,0,1,0,411,31182,414,0 222 | 1,0,2,0,0,61,0,1,217,1152,292,0 223 | 1,0.15,1,0,0,44,0,0,156,8578,164,0 224 | 1,0,1,0,0,0,0,0,389,4347,748,0 225 | 1,0,1,0,0,0,0,1,8,319,335,0 226 | 1,0,1,0,0,0,0,1,3,189,177,0 227 | 1,0,1,0,0,112,0,1,22,743,331,0 228 | 1,0.18,1,0,0,123,0,0,59,11204,124,0 229 | 1,0,3,0,0,24,0,1,144,419,271,0 230 | 1,0,2,0,0,34,0,0,2,81,268,0 231 | 1,0,1,0,0,19,0,0,78,947,582,0 232 | 1,0,2,0,0,0,0,0,28,206,333,0 233 | 1,0,2,0,0,42,0,0,111,541,701,0 234 | 1,0.18,3,0,0,50,0,0,6,392,237,0 235 | 1,0.22,5,0,0,67,0,0,4,4177,3678,0 236 | 1,0.17,2,0,0,134,1,0,31,265,321,0 237 | 1,0,2,0,0,101,0,0,86,272,1486,0 238 | 1,0,2,0,0,0,0,0,240,425,917,0 239 | 1,0,3,0,0,0,0,1,11,150,178,0 240 | 1,0,2,0,0,17,0,1,44,711,673,0 241 | 1,0,2,0,0,0,0,1,9,89,121,0 242 | 1,0,2,0,0,32,0,1,62,742,653,0 243 | 1,0,2,0,0,0,0,0,0,96,50,0 244 | 1,0,1,0,1,80,1,1,15,77,107,0 245 | 1,0,2,0,0,2,0,1,53,195,395,0 246 | 1,0,2,0,0,0,0,1,2,9,13,0 247 | 1,0,1,0,0,146,1,0,52,10794,3164,0 248 | 1,0.19,0,0,0,0,0,0,1,104,15,0 249 | 1,0,0,0,0,0,1,1,62,318,569,0 250 | 1,0.18,1,0,0,0,0,0,247,355,137,0 251 | 1,0.31,1,0,0,0,0,1,9,99,488,0 252 | 1,0,0,0,0,6,0,0,13,300,372,0 253 | 1,0.15,1,0,0,0,0,0,204,139,61,0 254 | 1,0,2,0,0,0,0,1,0,13,77,0 255 | 1,0.22,1,0.14,0,0,0,1,51,606,430,0 256 | 1,0,1,0,0,64,0,0,15,428,333,0 257 | 1,0,1,0,0,0,0,0,108,494,545,0 258 | 1,0.24,0,0,0,0,0,1,353,1261,2187,0 259 | 1,0,1,0,0,0,0,1,5,68,87,0 260 | 1,0,2,0,0,49,1,0,560,205558,53,0 261 | 1,0,2,0,0,23,0,0,85,668,609,0 262 | 1,0,2,0,0,120,1,0,95,1456,555,0 263 | 1,0.14,2,0,0,34,0,0,14,410,387,0 264 | 1,0,3,0,0,25,0,1,52,298,555,0 265 | 1,0,2,0,0,0,0,1,7,254,345,0 266 | 1,0,2,0,0,12,0,0,197,1070,1072,0 267 | 
1,0.17,2,0,0,0,0,1,89,1167,618,0 268 | 1,0,2,0,0,9,0,1,34,335,300,0 269 | 1,0.08,2,0,0,1,0,0,283,346,426,0 270 | 1,0,2,0,0,18,0,1,65,1746,1631,0 271 | 1,0,2,0,0,34,0,1,126,268,370,0 272 | 1,0,3,0,0,23,0,1,327,537,1002,0 273 | 1,0,1,0,0,19,0,1,42,805,488,0 274 | 1,0,2,0,0,139,0,0,36,1504,52,0 275 | 1,0,2,0,0,13,0,1,3,104,119,0 276 | 1,0,2,0,0,50,0,0,53,380,462,0 277 | 1,0,3,0,0,46,1,1,7,257,346,0 278 | 1,0,1,0,0,30,0,1,103,1775,7500,0 279 | 1,0.3,2,0,0,26,0,1,241,1456,1200,0 280 | 1,0,0,0,0,0,0,1,103,265,264,0 281 | 1,0,1,0,0,0,0,1,93,1051,694,0 282 | 1,0.11,1,0,0,0,0,0,1,77,34,0 283 | 0,0,1,0,0,27,0,1,16,220,323,0 284 | 1,0.07,2,0,0,37,1,1,32,178,81,0 285 | 1,0,1,0,0,31,0,1,1232,728,1213,0 286 | 1,0,1,0,0,20,0,1,75,668,294,0 287 | 1,0,2,0,0,7,0,0,5,406,408,0 288 | 1,0,2,0,0,0,0,0,3,37,22,0 289 | 1,0,2,0,0,0,0,1,30,56,114,0 290 | 0,0.22,1,0.0,0,0,0,0,0,90,333,1 291 | 0,0.38,1,0.0,0,0,0,0,0,60,31,1 292 | 0,0.43,1,0.0,0,0,0,1,2,271,445,1 293 | 1,0.0,0,0,0,0,0,1,3,1,80,1 294 | 1,0.5,3,0.0,0,24,0,1,13,158,309,1 295 | 0,0.31,2,0.0,0,0,0,0,0,39,64,1 296 | 0,0.22,1,0.22,0,43,0,1,1,0,11,1 297 | 0,0.0,1,0.0,0,0,0,1,0,0,853,1 298 | 1,0.25,1,0.0,0,0,0,1,10,0,23,1 299 | 0,0.0,2,0.0,0,0,0,0,0,12,5,1 300 | 0,0.33,1,0.0,0,0,0,0,0,10,18,1 301 | 0,0.33,1,0.0,0,0,0,1,0,1,34,1 302 | 0,0.0,1,0.0,0,0,0,0,0,1,8,1 303 | 0,0.0,0,0,0,0,0,1,0,0,0,1 304 | 0,0.12,1,0.0,0,0,0,1,0,31,213,1 305 | 0,0.18,1,0.0,0,0,0,1,0,5,10,1 306 | 0,0.5,1,0.0,0,0,0,1,5,18,58,1 307 | 1,0.5,1,0.0,0,0,0,1,0,12,77,1 308 | 0,0.5,1,0.0,0,0,0,1,0,6,37,1 309 | 1,0.57,1,0.0,0,0,0,0,0,47,10,1 310 | 0,0.45,3,0.0,0,0,0,0,0,33,4,1 311 | 0,0.5,1,0.0,0,0,0,1,141,4,279,1 312 | 0,0.57,1,0.0,0,0,0,1,6,5,15,1 313 | 1,0.0,1,0.0,0,0,0,1,1,0,44,1 314 | 0,0.44,1,0.43,0,0,0,1,0,5,17,1 315 | 0,0.0,1,0.0,0,0,0,1,1,107,42,1 316 | 1,0.88,1,0.0,0,0,0,1,39,8,60,1 317 | 0,0.0,1,0.0,0,0,0,0,0,48,6,1 318 | 0,0.22,0,0,0,0,0,1,9,2,215,1 319 | 0,0.0,2,0.0,0,0,0,1,0,51,126,1 320 | 1,0.0,1,0.0,0,43,0,1,21,5,48,1 321 | 0,0.0,1,0.0,0,0,0,1,0,0,802,1 
322 | 0,0.55,1,0.0,0,0,0,0,0,26,46,1 323 | 0,0.07,1,0.0,0,0,0,1,0,0,601,1 324 | 1,0.8,1,0.0,0,13,0,1,3,76,168,1 325 | 0,0.0,1,0.4,0,0,0,1,0,165,689,1 326 | 0,0.0,1,0.0,0,0,0,0,0,115,230,1 327 | 0,0.31,1,0.36,0,0,0,0,0,6,15,1 328 | 0,0.4,1,0.0,0,0,0,0,0,24,49,1 329 | 0,0.33,1,0.0,0,0,0,1,0,40,236,1 330 | 0,0.0,1,0.0,0,0,0,1,6,32,52,1 331 | 0,0.33,1,0.33,1,0,0,1,0,0,35,1 332 | 0,0.36,1,0.0,0,0,0,0,0,21,44,1 333 | 1,0.55,1,0.29,0,0,0,1,1,79,767,1 334 | 1,0.41,1,0.0,0,0,0,1,4,8,37,1 335 | 1,0.43,0,0,0,0,0,0,0,49,79,1 336 | 0,0.11,1,0.11,1,0,0,1,0,31,91,1 337 | 1,0.0,1,0.0,1,0,0,1,0,15,41,1 338 | 1,0.25,1,0.0,0,0,0,0,0,4,3,1 339 | 1,0.44,1,0.0,0,0,0,1,0,0,229,1 340 | 1,0.3,0,0,0,0,0,1,17,316,1165,1 341 | 0,0.1,1,0.0,0,0,0,1,0,0,0,1 342 | 0,0.0,1,0.0,0,0,0,0,1,3,1,1 343 | 0,0.0,2,0.0,0,0,0,0,0,2,30,1 344 | 1,0.29,2,0.0,0,9,0,1,7,221,244,1 345 | 0,0.0,1,0.0,0,0,0,0,13,181,935,1 346 | 0,0.0,2,0.0,0,0,0,1,0,9,167,1 347 | 0,0.67,1,0.0,0,0,0,0,0,2,0,1 348 | 1,0.31,1,0.31,1,0,0,1,0,25,86,1 349 | 1,0.0,3,0.0,0,18,0,0,5,26,129,1 350 | 1,0.89,1,0.89,0,0,0,1,124,15,305,1 351 | 1,0.0,1,0.0,0,10,0,0,9,59,48,1 352 | 0,0.2,1,0.2,1,61,0,1,5,7,47,1 353 | 0,0.12,1,0.0,0,0,0,1,2,40,179,1 354 | 0,0.0,1,0.0,0,0,0,0,0,51,41,1 355 | 1,0.0,5,0.0,0,0,0,0,150,133,750,1 356 | 1,0.0,0,0,0,0,0,1,29,25,39,1 357 | 1,0.18,1,0.0,0,0,0,1,108,864,3646,1 358 | 1,0.0,0,0,0,0,0,0,77,73,35,1 359 | 1,0.38,1,0.4,0,0,0,1,3,184,170,1 360 | 1,0.43,1,0.0,0,0,0,1,7,161,333,1 361 | 1,0.22,1,0.22,1,0,0,1,3,42,694,1 362 | 1,0.0,1,0.0,0,22,0,1,131,279,1124,1 363 | 1,0.0,1,0.0,0,0,0,0,40,219,155,1 364 | 1,0.24,1,0.24,1,0,0,1,4,18,106,1 365 | 1,0.44,1,0.0,0,0,0,1,84,106,171,1 366 | 1,0.5,0,0,0,2,0,1,299,34,108,1 367 | 0,0.0,1,0.0,0,0,0,0,11,42,26,1 368 | 1,0.16,0,0,0,146,0,1,0,1,13,1 369 | 0,0.89,0,0,0,0,0,1,0,38,68,1 370 | 1,0.58,1,0.36,0,6,0,1,1,34,44,1 371 | 1,0.29,1,0.0,0,50,0,1,4,23,151,1 372 | 1,0.3,1,0.0,0,0,0,1,0,23,37,1 373 | 0,0.5,2,0.0,0,0,0,0,0,83,139,1 374 | 0,0.4,1,0.0,0,0,0,0,0,25,28,1 375 | 
1,0.0,0,0,0,39,0,1,3,13,38,1 376 | 0,0.25,1,0.0,0,0,0,0,0,64,60,1 377 | 0,0.36,1,0.0,0,0,0,0,0,90,69,1 378 | 0,0.33,2,0.0,0,0,0,0,0,82,6,1 379 | 0,0.0,0,0,0,0,0,0,0,140,319,1 380 | 0,0.36,1,0.0,0,0,0,1,0,0,29,1 381 | 1,0.3,1,0.0,0,5,0,1,1,51,420,1 382 | 1,0.12,1,0.0,0,91,0,0,7,38,87,1 383 | 0,0.33,1,0.38,0,2,0,1,3,21,733,1 384 | 1,0.38,1,0.0,0,0,0,1,0,5,121,1 385 | 1,0.4,1,0.0,0,0,0,1,0,21,45,1 386 | 0,0.57,0,0,0,0,0,1,5,124,249,1 387 | 0,0.22,1,0.0,0,0,0,1,0,5,56,1 388 | 1,0.38,1,0.0,0,37,0,1,1,2,16,1 389 | 0,0.64,2,0.0,0,0,0,0,0,42,46,1 390 | 1,0.33,1,0.33,0,0,0,1,12,48,45,1 391 | 0,0.31,1,0.31,1,0,0,0,0,32,25,1 392 | 0,0.0,1,0.0,0,0,0,1,0,0,71,1 393 | 0,0.25,1,0.0,0,0,0,0,0,0,0,1 394 | 1,0.12,1,0.0,0,0,0,1,1,10,104,1 395 | 1,0.38,1,0.0,0,0,0,1,92,19,31,1 396 | 0,0.31,1,0.0,0,0,0,1,0,138,99,1 397 | 0,0.2,0,0,0,0,0,1,0,11,157,1 398 | 0,0.0,1,0.0,0,0,0,1,0,1,64,1 399 | 0,0.42,1,0.0,0,0,0,0,0,39,40,1 400 | 1,0.0,0,0,0,148,0,0,21,446,1762,1 401 | 1,0.57,1,0.0,0,0,0,1,7,9,82,1 402 | 0,0.0,0,0,0,0,0,0,9,589,2980,1 403 | 0,0.06,1,0.0,0,0,0,0,0,27,63,1 404 | 1,0.0,2,0.0,0,0,0,0,14,143,500,1 405 | 0,0.0,2,0.0,0,0,0,1,0,0,177,1 406 | 0,0.0,3,0.0,0,0,0,1,0,0,130,1 407 | 0,0.21,1,0.0,0,0,0,0,3,124,135,1 408 | 1,0.07,0,0,0,0,0,1,0,12,192,1 409 | 0,0.42,1,0.0,0,0,0,0,0,16,36,1 410 | 0,0.21,1,0.0,0,0,0,0,1,1,2,1 411 | 0,0.38,1,0.0,0,0,0,0,1,120,181,1 412 | 0,0.0,2,0.0,0,0,0,1,0,0,71,1 413 | 0,0.0,1,0.0,0,0,0,0,1,26,8,1 414 | 1,0.14,3,0.0,0,0,0,0,0,57,167,1 415 | 0,0.15,2,0.0,0,0,0,0,1,13,22,1 416 | 1,0.0,4,0.0,0,149,0,0,28,1031,208,1 417 | 1,0.18,1,0.0,0,0,0,0,0,42,146,1 418 | 1,0.33,2,0.0,0,0,0,0,25,70,64,1 419 | 1,0.18,1,0.0,0,0,0,1,0,0,8,1 420 | 1,0.3,1,0.0,0,0,0,0,0,46,92,1 421 | 1,0.43,2,0.0,0,0,0,0,111,834,4118,1 422 | 0,0.0,2,0.0,0,22,0,1,2,389,392,1 423 | 1,0.33,0,0,0,0,0,0,0,43,221,1 424 | 0,0.33,1,0.0,0,0,0,1,0,23,40,1 425 | 0,0.0,1,0.0,0,0,0,1,0,5,82,1 426 | 0,0.71,0,0,0,0,0,1,0,4,25,1 427 | 0,0.5,1,0.0,0,0,0,0,0,17,44,1 428 | 1,0.25,1,0.0,0,2,0,1,5,182,227,1 429 | 
1,0.33,1,0.0,0,0,0,0,10,108,304,1 430 | 0,0.5,1,0.33,0,0,0,0,3,43,123,1 431 | 0,0.29,1,0.0,0,0,0,0,19,77,8,1 432 | 0,0.09,1,0.0,0,0,0,0,0,32,87,1 433 | 0,0.25,1,0.0,0,0,0,1,0,0,11,1 434 | 1,0.13,1,0.0,0,0,0,1,8,40,66,1 435 | 0,0.33,1,0.33,1,20,0,0,2,53,303,1 436 | 1,0.14,2,0.0,0,0,0,0,10,67,90,1 437 | 1,0.0,3,0.0,0,0,0,0,29,122,336,1 438 | 0,0.0,2,0.0,0,0,0,0,0,18,49,1 439 | 0,0.4,1,0.0,0,0,0,1,0,24,88,1 440 | 1,0.0,1,0.0,0,0,0,0,2,35,136,1 441 | 1,0.4,1,0.4,1,0,0,1,3,119,151,1 442 | 1,0.1,1,0.0,0,0,0,1,5,6,24,1 443 | 0,0.43,1,0.4,0,0,0,0,0,34,26,1 444 | 1,0.0,3,0.0,0,148,0,0,63,272,295,1 445 | 1,0.44,2,0.0,0,0,0,0,1,53,137,1 446 | 1,0.33,0,0,0,50,0,1,4,818,618,1 447 | 1,0.0,3,0.0,0,0,0,0,34,82,74,1 448 | 1,0.38,2,0.0,0,2,0,0,0,40,233,1 449 | 1,0.4,0,0,0,0,0,1,0,21,76,1 450 | 1,0.33,1,0.0,0,0,0,0,6,59,38,1 451 | 1,0.12,1,0.0,0,0,0,0,2,102,109,1 452 | 1,0.29,0,0,0,0,0,1,7,576,474,1 453 | 0,0.57,2,0.4,0,0,0,0,0,19,16,1 454 | 1,0.0,0,0,0,0,0,1,0,66,161,1 455 | 1,0.0,1,0.0,0,0,0,1,5,310,894,1 456 | 1,0.0,1,0.0,1,34,0,1,0,16,17,1 457 | 0,0.0,1,0.0,0,0,0,0,0,15,4,1 458 | 1,0.07,1,0.0,0,0,0,1,0,47,98,1 459 | 1,0.0,1,0.0,1,0,0,0,4,88,61,1 460 | 1,0.27,1,0.0,0,0,0,1,1,51,59,1 461 | 1,0.0,2,0.0,0,0,0,0,26,505,2330,1 462 | 1,0.36,1,0.0,0,0,0,0,9,159,433,1 463 | 1,0.29,1,0.33,0,0,0,1,4,77,1269,1 464 | 1,0.83,1,0.0,0,32,0,0,4,61,76,1 465 | 0,0.67,1,0.0,0,0,0,1,0,7,14,1 466 | 1,0.38,1,0.33,0,0,0,1,0,11,60,1 467 | 0,0.58,1,0.0,0,0,0,0,0,49,22,1 468 | 0,0.0,1,0.0,0,0,0,0,0,15,595,1 469 | 1,0.2,2,0.0,0,0,0,0,1,1,9,1 470 | 0,0.0,1,0.0,0,0,0,0,0,17,1,1 471 | 0,0.29,2,0.0,0,0,0,0,0,15,1,1 472 | 1,0.25,1,0.0,0,59,0,0,28,358,1990,1 473 | 0,0.25,1,0.0,0,0,0,0,0,20,7,1 474 | 0,0.0,1,0.0,0,6,0,0,1,50,32,1 475 | 0,0.0,1,0.0,0,0,0,0,0,57,76,1 476 | 0,0.31,1,0.0,0,0,0,0,2,12,15,1 477 | 1,0.67,1,0.0,0,0,0,0,2,218,792,1 478 | 0,0.33,1,0.25,0,0,0,0,9,78,98,1 479 | 0,0.47,1,0.0,0,0,0,0,1,16,23,1 480 | 0,0.36,1,0.0,0,0,0,0,8,39,17,1 481 | 0,0.0,1,0.0,0,0,0,0,0,15,1,1 482 | 
0,0.4,0,0,0,0,0,0,0,30,48,1 483 | 0,0.33,1,0.0,0,0,0,0,0,85,638,1 484 | 0,0.33,1,0.0,0,0,0,0,0,77,747,1 485 | 0,0.55,2,0.0,0,0,0,0,0,49,82,1 486 | 0,0.27,1,0.27,1,0,0,0,0,16,24,1 487 | 0,0.44,1,0.0,0,0,0,0,0,8,6,1 488 | 0,0.44,1,0.0,0,0,0,0,2,34,22,1 489 | 0,0.47,1,0.0,0,0,0,0,2,45,83,1 490 | 0,0.33,1,0.0,0,0,0,0,0,49,0,1 491 | 0,0.0,2,0.0,0,0,0,0,0,15,5,1 492 | 0,0.25,1,0.0,0,0,0,0,0,92,403,1 493 | 1,0.91,1,0.0,0,0,0,0,0,75,26,1 494 | 0,0.44,1,0.0,0,0,0,0,0,10,0,1 495 | 1,0.12,1,0.0,0,1,0,0,0,23,26,1 496 | 0,0.2,1,0.0,0,0,0,0,0,22,11,1 497 | 0,0.24,2,0.0,0,0,0,0,0,55,46,1 498 | 0,0.27,1,0.0,0,0,0,0,0,16,0,1 499 | 0,0.25,1,0.0,0,0,0,0,0,7,20,1 500 | 0,0.28,1,0.24,0,0,0,0,0,86,0,1 501 | 0,0.38,1,0.0,0,0,0,0,0,3,11,1 502 | 0,0.57,2,0.0,0,0,0,0,0,12,30,1 503 | 1,0.44,1,0.0,0,0,0,0,1,14,56,1 504 | 0,0.47,1,0.0,0,0,0,0,0,24,22,1 505 | 0,0.0,2,0.0,0,0,0,0,0,52,1,1 506 | 0,0.44,1,0.0,0,0,0,0,1,16,27,1 507 | 0,0.44,1,0.0,0,0,0,0,4,17,20,1 508 | 0,0.57,2,0.0,0,0,0,0,0,50,49,1 509 | 0,0.54,1,0.0,0,0,0,0,5,136,1029,1 510 | 1,0.43,1,0.0,0,0,0,0,10,178,1417,1 511 | 0,0.89,0,0,0,0,0,0,1,50,39,1 512 | 1,0.38,1,0.0,0,0,0,0,2,207,2426,1 513 | 0,0.36,1,0.0,0,0,0,1,0,178,828,1 514 | 0,0.0,1,0.0,1,0,0,0,0,16,26,1 515 | 0,0.14,1,0.0,0,0,0,0,0,49,2,1 516 | 0,0.2,1,0.33,0,0,0,0,2,49,12,1 517 | 0,0.1,0,0,0,0,0,0,0,37,4,1 518 | 0,0.5,1,0.4,0,0,0,0,0,21,0,1 519 | 1,0.58,1,0.44,0,0,0,0,21,356,2176,1 520 | 0,0.44,2,0.0,0,0,0,0,0,46,4,1 521 | 0,0.22,1,0.33,0,0,0,0,0,49,0,1 522 | 0,0.0,1,0.0,0,0,0,0,0,49,58,1 523 | 0,0.0,1,0.0,0,0,0,0,0,44,45,1 524 | 0,0.13,2,0.0,0,0,0,0,0,42,240,1 525 | 0,0.38,1,0.33,0,0,0,0,0,47,9,1 526 | 0,0.1,2,0.0,0,0,0,0,0,48,1,1 527 | 1,0.91,1,0.0,0,0,0,0,0,75,26,1 528 | 0,0.08,1,0.0,0,0,0,0,0,115,289,1 529 | 0,0.88,1,1.0,0,0,0,0,5,53,66,1 530 | 0,0.67,1,0.0,0,0,0,0,0,32,23,1 531 | 0,0.44,1,0.0,0,0,0,0,0,39,153,1 532 | 0,0.18,1,0.0,0,0,0,0,0,3033,1155,1 533 | 0,0.25,1,0.0,0,0,0,0,0,3003,825,1 534 | 0,0.88,1,1.0,0,0,0,0,1,34,36,1 535 | 0,0.36,1,0.0,0,0,0,0,37,218,1528,1 
536 | 1,0.33,2,0.22,0,0,0,0,8,210,1543,1 537 | 1,0.2,2,0.0,0,19,0,0,4,1489,3715,1 538 | 0,0.5,1,0.5,1,0,0,0,0,10,2,1 539 | 0,0.33,1,0.0,0,0,0,0,28,201,812,1 540 | 0,0.0,2,0.0,0,0,0,0,0,351,2663,1 541 | 0,0.43,1,0.0,0,0,0,0,0,52,16,1 542 | 0,0.25,1,0.18,0,0,0,0,6,156,423,1 543 | 0,0.43,2,0.0,0,0,0,0,12,85,477,1 544 | 0,0.24,1,0.0,0,0,0,0,0,35,38,1 545 | 0,0.44,1,0.0,0,0,0,0,0,55,28,1 546 | 0,0.27,2,0.0,0,43,0,1,0,16,51,1 547 | 0,0.38,1,0.0,0,0,0,0,0,43,88,1 548 | 0,0.4,1,0.0,0,0,0,0,2,97,408,1 549 | 0,0.57,1,0.57,1,0,0,0,2,34,112,1 550 | 0,0.0,1,0.0,1,0,0,0,0,2346,7272,1 551 | 0,0.25,1,0.0,0,0,0,0,0,49,95,1 552 | 1,0.19,0,0,0,0,0,1,2,2,55,1 553 | 0,0.46,1,0.46,1,0,0,0,2,40,19,1 554 | 0,0.46,1,0.0,0,0,0,1,0,332,1333,1 555 | 0,0.73,1,0.0,0,0,0,0,0,14,542,1 556 | 1,0.6,1,0.5,0,0,0,0,12,65,162,1 557 | 1,0.46,1,0.5,0,0,0,0,1,20,52,1 558 | 0,0.31,1,0.31,1,0,0,0,0,26,27,1 559 | 1,0.43,1,0.43,0,0,0,0,0,72,434,1 560 | 1,0.0,1,0.0,0,0,0,0,1,55,2,1 561 | 1,0.5,1,0.4,0,33,0,0,12,77,108,1 562 | 1,0.0,4,0.25,0,0,0,1,59,31,215,1 563 | 0,0.86,2,0.18,0,0,0,0,0,57,130,1 564 | 0,0.62,1,0.0,0,0,0,1,0,58,347,1 565 | 1,0.33,2,0.0,0,0,0,0,23,47,139,1 566 | 1,0.16,1,0.12,0,43,0,0,3,31,37,1 567 | 1,0.92,1,1.0,0,0,0,0,11,96,242,1 568 | 1,0.27,1,0.0,0,19,0,0,8,126,860,1 569 | 1,0.25,1,0.0,0,0,0,1,102,39,229,1 570 | 1,0.43,1,0.0,0,5,0,0,6,66,161,1 571 | 1,0.31,3,0.0,0,0,0,0,25,87,698,1 572 | 1,0.2,1,0.0,0,28,0,0,0,15,64,1 573 | 1,0.55,1,0.44,0,0,0,0,33,166,596,1 574 | 1,0.38,1,0.33,0,21,0,0,44,66,75,1 575 | 1,0.57,2,0.0,0,0,0,0,4,96,339,1 576 | 1,0.57,1,0.0,0,11,0,0,0,57,73,1 577 | 1,0.27,1,0.0,0,0,0,0,2,150,487,1 -------------------------------------------------------------------------------- /IGAudit_mdfiles/IGAudit.md: -------------------------------------------------------------------------------- 1 | # IG Audit 2 | Objective: Using Simple Statistical Tools and Machine Learning to Audit Instagram Accounts for Authenticity 3 | 4 | Motivation: During lockdown, businesses have started 
increasing their use of social media influencers to market their products while their physical outlets are temporarily closed. Sadly, some people try to game the system for their own gain. In a world where a single influencer's post is worth as much as an average 9-5 Joe's annual salary, fake followers and fake engagement are a price that brands shouldn't have to pay.

*Inspired by igaudit.io, which was recently taken down by Facebook.*


```python
# Imports

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.model_selection import GridSearchCV, cross_val_score, StratifiedKFold, learning_curve
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier, ExtraTreesClassifier, VotingClassifier

from instagram_private_api import Client, ClientCompatPatch
import getpass

import random
```

## Part 1: Understanding and Splitting the Data
Dataset source: https://www.kaggle.com/eswarchandt/is-your-insta-fake-or-genuine

Import the data


```python
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")
```

Inspect the training data


```python
train.head()
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows | fake |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.27 | 0 | 0.0 | 0 | 53 | 0 | 0 | 32 | 1000 | 955 | 0 |
| 1 | 1 | 0.00 | 2 | 0.0 | 0 | 44 | 0 | 0 | 286 | 2740 | 533 | 0 |
| 2 | 1 | 0.10 | 2 | 0.0 | 0 | 0 | 0 | 1 | 13 | 159 | 98 | 0 |
| 3 | 1 | 0.00 | 1 | 0.0 | 0 | 82 | 0 | 0 | 679 | 414 | 651 | 0 |
| 4 | 1 | 0.00 | 2 | 0.0 | 0 | 0 | 0 | 1 | 6 | 151 | 126 | 0 |

The features in the training data are the following:
- profile pic: does the user have a profile picture?
- nums/length username: ratio of the number of numerical characters in the username to its length
- fullname words: how many words are in the user's full name?
- nums/length fullname: ratio of the number of numerical characters in the full name to its length
- name==username: is the user's full name the same as the username?
- description length: how many characters are in the user's Instagram bio?
- external URL: does the user have an external URL linked to their profile?
- private: is the user private?
- #posts: number of posts
- #followers: number of people following the user
- #follows: number of people the user follows
- fake: if the user is fake, fake=1, else fake=0


```python
train.describe()
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows | fake |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 5.760000e+02 | 576.000000 | 576.000000 |
| mean | 0.701389 | 0.163837 | 1.460069 | 0.036094 | 0.034722 | 22.623264 | 0.116319 | 0.381944 | 107.489583 | 8.530724e+04 | 508.381944 | 0.500000 |
| std | 0.458047 | 0.214096 | 1.052601 | 0.125121 | 0.183234 | 37.702987 | 0.320886 | 0.486285 | 402.034431 | 9.101485e+05 | 917.981239 | 0.500435 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000 | 0.000000 |
| 25% | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 3.900000e+01 | 57.500000 | 0.000000 |
| 50% | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 9.000000 | 1.505000e+02 | 229.500000 | 0.500000 |
| 75% | 1.000000 | 0.310000 | 2.000000 | 0.000000 | 0.000000 | 34.000000 | 0.000000 | 1.000000 | 81.500000 | 7.160000e+02 | 589.500000 | 1.000000 |
| max | 1.000000 | 0.920000 | 12.000000 | 1.000000 | 1.000000 | 150.000000 | 1.000000 | 1.000000 | 7389.000000 | 1.533854e+07 | 7500.000000 | 1.000000 |


```python
train.info()
```

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 576 entries, 0 to 575
    Data columns (total 12 columns):
     #   Column                Non-Null Count  Dtype
    ---  ------                --------------  -----
     0   profile pic           576 non-null    int64
     1   nums/length username  576 non-null    float64
     2   fullname words        576 non-null    int64
     3   nums/length fullname  576 non-null    float64
     4   name==username        576 non-null    int64
     5   description length    576 non-null    int64
     6   external URL          576 non-null    int64
     7   private               576 non-null    int64
     8   #posts                576 non-null    int64
     9   #followers            576 non-null    int64
     10  #follows              576 non-null    int64
     11  fake                  576 non-null    int64
    dtypes: float64(2), int64(10)
    memory usage: 54.1 KB



```python
train.shape
```




    (576, 12)



Inspect the test data


```python
test.head()
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows | fake |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.33 | 1 | 0.33 | 1 | 30 | 0 | 1 | 35 | 488 | 604 | 0 |
| 1 | 1 | 0.00 | 5 | 0.00 | 0 | 64 | 0 | 1 | 3 | 35 | 6 | 0 |
| 2 | 1 | 0.00 | 2 | 0.00 | 0 | 82 | 0 | 1 | 319 | 328 | 668 | 0 |
| 3 | 1 | 0.00 | 1 | 0.00 | 0 | 143 | 0 | 1 | 273 | 14890 | 7369 | 0 |
| 4 | 1 | 0.50 | 1 | 0.00 | 0 | 76 | 0 | 1 | 6 | 225 | 356 | 0 |


```python
test.describe()
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows | fake |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 1.200000e+02 | 120.000000 | 120.000000 |
| mean | 0.758333 | 0.179917 | 1.550000 | 0.071333 | 0.041667 | 27.200000 | 0.100000 | 0.308333 | 82.866667 | 4.959472e+04 | 779.266667 | 0.500000 |
| std | 0.429888 | 0.241492 | 1.187116 | 0.209429 | 0.200664 | 42.588632 | 0.301258 | 0.463741 | 230.468136 | 3.816126e+05 | 1409.383558 | 0.502096 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000e+00 | 1.000000 | 0.000000 |
| 25% | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 6.725000e+01 | 119.250000 | 0.000000 |
| 50% | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 8.000000 | 2.165000e+02 | 354.500000 | 0.500000 |
| 75% | 1.000000 | 0.330000 | 2.000000 | 0.000000 | 0.000000 | 45.250000 | 0.000000 | 1.000000 | 58.250000 | 5.932500e+02 | 668.250000 | 1.000000 |
| max | 1.000000 | 0.890000 | 9.000000 | 1.000000 | 1.000000 | 149.000000 | 1.000000 | 1.000000 | 1879.000000 | 4.021842e+06 | 7453.000000 | 1.000000 |


```python
test.info()
```

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 120 entries, 0 to 119
    Data columns (total 12 columns):
     #   Column                Non-Null Count  Dtype
    ---  ------                --------------  -----
     0   profile pic           120 non-null    int64
     1   nums/length username  120 non-null    float64
     2   fullname words        120 non-null    int64
     3   nums/length fullname  120 non-null    float64
     4   name==username        120 non-null    int64
     5   description length    120 non-null    int64
     6   external URL          120 non-null    int64
     7   private               120 non-null    int64
     8   #posts                120 non-null    int64
     9   #followers            120 non-null    int64
     10  #follows              120 non-null    int64
     11  fake                  120 non-null    int64
    dtypes: float64(2), int64(10)
    memory usage: 11.4 KB



```python
test.shape
```




    (120, 12)



Check for NULL values


```python
# Total number of missing cells in each table
print(train.isna().sum().sum())
print(test.isna().sum().sum())
```

    0
    0


Create a correlation matrix for the features in the training data to see which features are most strongly related to the `fake` label


```python
fig, ax = plt.subplots(figsize=(15,10))
corr = train.corr()
sns.heatmap(corr, annot=True)
```




![png](output_19_1.png)


Split the training set into data and labels


```python
# Labels
train_Y = train.fake
train_Y = pd.DataFrame(train_Y)

# Data
train_X = train.drop(columns='fake')
train_X.head()
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.27 | 0 | 0.0 | 0 | 53 | 0 | 0 | 32 | 1000 | 955 |
| 1 | 1 | 0.00 | 2 | 0.0 | 0 | 44 | 0 | 0 | 286 | 2740 | 533 |
| 2 | 1 | 0.10 | 2 | 0.0 | 0 | 0 | 0 | 1 | 13 | 159 | 98 |
| 3 | 1 | 0.00 | 1 | 0.0 | 0 | 82 | 0 | 0 | 679 | 414 | 651 |
| 4 | 1 | 0.00 | 2 | 0.0 | 0 | 0 | 0 | 1 | 6 | 151 | 126 |


Split the test set into data and labels


```python
# Labels
test_Y = test.fake
test_Y = pd.DataFrame(test_Y)

# Data
test_X = test.drop(columns='fake')
test_X.head()
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.33 | 1 | 0.33 | 1 | 30 | 0 | 1 | 35 | 488 | 604 |
| 1 | 1 | 0.00 | 5 | 0.00 | 0 | 64 | 0 | 1 | 3 | 35 | 6 |
| 2 | 1 | 0.00 | 2 | 0.00 | 0 | 82 | 0 | 1 | 319 | 328 | 668 |
| 3 | 1 | 0.00 | 1 | 0.00 | 0 | 143 | 0 | 1 | 273 | 14890 | 7369 |
| 4 | 1 | 0.50 | 1 | 0.00 | 0 | 76 | 0 | 1 | 6 | 225 | 356 |
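
The two splits above follow the same pattern, so they could be wrapped in one small helper. The sketch below is only an illustration and not part of the original notebook: `split_features_labels` and the tiny `demo` frame are hypothetical names, and the frame borrows just a few of the dataset's column names. The helper returns the labels as a 1-D Series, which is the shape scikit-learn estimators expect for `y`; a one-column DataFrame is what triggers scikit-learn's `DataConversionWarning`.

```python
import pandas as pd


def split_features_labels(df, label_col="fake"):
    """Return (X, y): X is the feature DataFrame, y is a 1-D Series of labels."""
    X = df.drop(columns=label_col)
    y = df[label_col]  # kept as a Series, i.e. shape (n_samples,)
    return X, y


# Tiny stand-in frame (hypothetical values, real column names)
demo = pd.DataFrame({
    "profile pic": [1, 0, 1],
    "#followers": [1000, 15, 159],
    "#follows": [955, 595, 98],
    "fake": [0, 1, 0],
})

demo_X, demo_y = split_features_labels(demo)
print(demo_X.shape, demo_y.shape)
```

With the real data this would be called as `split_features_labels(train)` and `split_features_labels(test)`, assuming both frames keep the `fake` column.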


## Part 2: Comparing Classification Models

**Baseline Classifier**
Classify everything as the majority class. The two classes are balanced in this dataset, so the baseline below simply labels every account as fake.


```python
# Baseline classifier
fakes = len([i for i in train.fake if i==1])
auth = len([i for i in train.fake if i==0])
fakes, auth

# classify everything as fake
pred = [1 for i in range(len(test_X))]
pred = np.array(pred)
print("Baseline accuracy: " + str(accuracy_score(pred, test_Y)))
```

    Baseline accuracy: 0.5


**Statistical Method**
Classify all users with a following-to-follower ratio at or above a certain threshold as 'fake'.
e.g. a user with 10 followers and 200 followings (a ratio of 20) will be classified as fake if the threshold r=20


```python
# Statistical method
def stat_predict(test_X, r):
    pred = []
    for row in range(len(test_X)):
        followers = test_X.loc[row]['#followers']
        followings = test_X.loc[row]['#follows']
        # Clamp zero counts to 1 before taking the ratio
        if followers == 0:
            followers = 1
        if followings == 0:
            followings = 1

        ratio = followings/followers

        if ratio >= r:
            pred.append(1)
        else:
            pred.append(0)

    return np.array(pred)

# Try thresholds from 0.5 to 25.0 in steps of 0.5
accuracies = []
for i in [x / 10.0 for x in range(5, 255, 5)]:
    prediction = stat_predict(test_X, i)
    accuracies.append(accuracy_score(prediction, test_Y))

f, ax = plt.subplots(figsize=(20,10))
plt.plot([x / 10.0 for x in range(5, 255, 5)], accuracies)
plt.plot([2.5 for i in range(len(accuracies))], accuracies, color='red')
plt.title("Accuracy for different thresholds", size=30)
plt.xlabel('Ratio', fontsize=20)
plt.ylabel('Accuracy', fontsize=20)
print("Maximum Accuracy for the statistical method: " + str(max(accuracies)))
```

    Maximum Accuracy for the statistical method: 0.7



![png](output_28_1.png)


**Logistic Regression**


```python
lm = LogisticRegression()

# Train the model
model1 = lm.fit(train_X, train_Y)

# Make a prediction
lm_predict = model1.predict(test_X)
```

    /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/sklearn/utils/validation.py:73: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
      return f(**kwargs)
    /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py:762: ConvergenceWarning: lbfgs failed to converge (status=1):
    STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

    Increase the number of iterations (max_iter) or scale the data as shown in:
        https://scikit-learn.org/stable/modules/preprocessing.html
    Please also refer to the documentation for alternative solver options:
        https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
      extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)



```python
# Compute the accuracy of the model
acc = accuracy_score(lm_predict, test_Y)
print("Logistic Regression accuracy: " + str(acc))
```

    Logistic Regression accuracy: 0.9083333333333333


**KNN Classifier**


```python
accuracies = []

# Compare the accuracies of using the KNN classifier with different number of neighbors
for i in range(1,10):
    knn = KNeighborsClassifier(n_neighbors=i)
    model_2 = knn.fit(train_X,train_Y)
    knn_predict = model_2.predict(test_X)
    accuracy = accuracy_score(knn_predict,test_Y)
    accuracies.append(accuracy)

# Find the number of neighbors with the highest accuracy
max_acc = (0, 0)
for i in range(1, 10):
    if accuracies[i-1] > max_acc[1]:
        max_acc = (i, accuracies[i-1])

max_acc

f, ax = plt.subplots(figsize=(20,10))
plt.plot([i for i in range(1,10)], accuracies)
plt.plot([7 for i in range(len(accuracies))], accuracies, color='red')
plt.title("Accuracy for different n-neighbors", size=30)
plt.xlabel('Number of neighbors', fontsize=20)
plt.ylabel('Accuracy', fontsize=20)

print("The highest accuracy obtained using KNN is " + str(max_acc[1]) + " achieved by a value of n=" + str(max_acc[0]))
```

    /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().



    The highest accuracy obtained using KNN is 0.8666666666666667 achieved by a value of n=7



![png](output_33_2.png)


**Decision Tree Classifier**


```python
DT = DecisionTreeClassifier()

# Train the model
model3 = DT.fit(train_X, train_Y)

# Make a prediction
DT_predict = model3.predict(test_X)
```


```python
# Compute the accuracy of the model
acc = accuracy_score(DT_predict, test_Y)
print("Decision Tree accuracy: " + str(acc))
```

    Decision Tree accuracy: 0.9


**Random Forest Classifier**


```python
rfc = RandomForestClassifier()

# Train the model
model_4 = rfc.fit(train_X, train_Y)

# Make a prediction
rfc_predict = model_4.predict(test_X)
```

    /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:4: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
      after removing the cwd from sys.path.



```python
# Compute the accuracy of the model
acc = accuracy_score(rfc_predict, test_Y)
print("Random Forest accuracy: " + str(acc))
```

    Random Forest accuracy: 0.925


## Part 3: Obtaining Instagram Data
We are going to use the hassle-free unofficial Instagram API.
To install: ```$ pip install git+https://git@github.com/ping/instagram_private_api.git@1.6.0```

Log in to your Instagram account (preferably not your personal one! I created one just for this project 😉)


```python
def login():
    username = input("username: ")
    password = getpass.getpass("password: ")
    api = Client(username, password)
    return api

api = login()
```

    username: ins.tapolice
    password: ········


Get the Instagram user ID


```python
def get_ID(username):
    return api.username_info(username)['user']['pk']
```


```python
# The user used for the experiment below is anonymised!
# i.e. this cell was run and then changed to protect the user's anonymity
userID = get_ID('')
```

The API needs some sort of rank token to query followers, posts, etc.


```python
rank = api.generate_uuid()
```

Get the list of the user's follower usernames (this may take a while, depending on how many followers the user has)


```python
def get_followers(userID, rank):
    followers = []
    next_max_id = True

    # Keep requesting pages until the API stops returning a next_max_id
    while next_max_id:
        if next_max_id is True: next_max_id = ''
        f = api.user_followers(userID, rank, max_id=next_max_id)
        followers.extend(f.get('users', []))
        next_max_id = f.get('next_max_id', '')

    # Keep only the usernames
    user_fer = [dic['username'] for dic in followers]

    return user_fer
```


```python
followers = get_followers(userID, rank)
```


```python
# You can check the number of followers if you'd like to
# len(followers)
```

## Part 4: Preparing the Data

Inspect the data (and what other data you can obtain from it) and compare it with the
train and test tables above. Find out what you need to do to obtain the features for a data point in order to make a prediction.

Recall that the features for a data point are the following:
- profile pic: does the user have a profile picture?
- nums/length username: ratio of the number of numerical characters in the username to its length
- fullname words: how many words are in the user's full name?
- nums/length fullname: ratio of the number of numerical characters in the full name to its length
- name==username: is the user's full name the same as the username?
- description length: how many characters are in the user's Instagram bio?
- external URL: does the user have an external URL linked to their profile?
- private: is the user private?
- #posts: number of posts
- #followers: number of people following the user
- #follows: number of people the user follows
- fake: if the user is fake, fake=1, else fake=0


```python
# This will print the first follower username on the list
# print(followers[0])
```


```python
# This will get the information on a certain user
info = api.user_info(get_ID(followers[0]))['user']

# Check what information is available for one particular user
info.keys()
```




    dict_keys(['pk', 'username', 'full_name', 'is_private', 'profile_pic_url', 'profile_pic_id', 'is_verified', 'has_anonymous_profile_picture', 'media_count', 'geo_media_count', 'follower_count', 'following_count', 'following_tag_count', 'biography', 'biography_with_entities', 'external_url', 'external_lynx_url', 'total_igtv_videos', 'total_clips_count', 'total_ar_effects', 'usertags_count', 'is_favorite', 'is_favorite_for_stories', 'is_favorite_for_highlights', 'live_subscription_status', 'is_interest_account', 'has_chaining', 'hd_profile_pic_versions', 'hd_profile_pic_url_info', 'mutual_followers_count', 'has_highlight_reels', 'can_be_reported_as_fraud', 'is_eligible_for_smb_support_flow', 'smb_support_partner', 'smb_delivery_partner', 'smb_donation_partner', 'smb_support_delivery_partner', 'displayed_action_button_type', 'direct_messaging', 'fb_page_call_to_action_id', 'address_street', 'business_contact_method', 'category', 'city_id', 'city_name', 'contact_phone_number', 'is_call_to_action_enabled', 'latitude', 'longitude', 'public_email', 'public_phone_country_code', 'public_phone_number', 'zip', 'instagram_location_id', 'is_business', 'account_type', 'professional_conversion_suggested_account_type', 'can_hide_category', 'can_hide_public_contacts', 'should_show_category', 'should_show_public_contacts', 'personal_account_ads_page_name', 'personal_account_ads_page_id', 'include_direct_blacklist_status', 'is_potential_business', 'show_post_insights_entry_point', 'is_bestie', 'has_unseen_besties_media', 'show_account_transparency_details', 'show_leave_feedback', 'robi_feedback_source', 'auto_expand_chaining', 'highlight_reshare_disabled', 'is_memorialized', 'open_external_url_with_in_app_browser'])



You can see that we have pretty much all the features needed to make a user data point for prediction, but we need to filter and extract them and perform some very minor calculations. The following function does just that:


```python
def get_data(info):
    """Extract the information from the returned JSON.

    This function will return the following array:
    data = [profile pic,
            nums/length username,
            full name words,
            nums/length full name,
            name==username,
            description length,
            external URL,
            private,
            #posts,
            #followers,
            #followings]
    """

    data = []

    # Does the user have a profile photo?
    profile_pic = not info['has_anonymous_profile_picture']
    if profile_pic == True:
        profile_pic = 1
    else:
        profile_pic = 0
    data.append(profile_pic)

    # Ratio of number of numerical chars in username to its length
    username = info['username']
    uname_ratio = len([x for x in username if x.isdigit()]) / float(len(username))
    data.append(uname_ratio)

    # Full name in word tokens
    full_name = info['full_name']
    fname_tokens = len(full_name.split(' '))
    data.append(fname_tokens)

    # Ratio of number of numerical characters in full name to its length
    if len(full_name) == 0:
        fname_ratio = 0
    else:
        fname_ratio = len([x for x in full_name if x.isdigit()]) / float(len(full_name))
    data.append(fname_ratio)

    # Is name == username?
    name_eq_uname = (full_name == username)
    if name_eq_uname == True:
        name_eq_uname = 1
    else:
        name_eq_uname = 0
    data.append(name_eq_uname)

    # Number of characters on user bio
    bio_length = len(info['biography'])
    data.append(bio_length)

    # Does the user have an external URL?
    ext_url = info['external_url'] != ''
    if ext_url == True:
        ext_url = 1
    else:
        ext_url = 0
    data.append(ext_url)

    # Is the user private or no?
    private = info['is_private']
    if private == True:
        private = 1
    else:
        private = 0
    data.append(private)

    # Number of posts
    posts = info['media_count']
    data.append(posts)

    # Number of followers
    followers = info['follower_count']
    data.append(followers)

    # Number of followings
    followings = info['following_count']
    data.append(followings)

    return data
```


```python
# Check if the function returns as expected
get_data(info)
```




    [1, 0.0, 3, 0.0, 0, 118, 1, 0, 589, 22227, 510]



Unfortunately, the Instagram Private API allows only a very limited number of API calls per hour, so we will not be able to analyse *all* of the user's followers.

Fortunately, I took Statistics and learned that **random sampling** lets us draw a smaller sample from a larger population and use it to make generalizations about the larger group.

This will allow us to approximate the user's authenticity despite the API limitations and still have data that is representative of the user's followers.
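
To give a feel for how much a sample can tell us, here is a minimal sketch that is not part of the original notebook. It assumes a simple random sample and uses the normal approximation for a proportion, which is only rough for small samples and ignores the finite size of the follower list. The names `sample_followers`, `margin_of_error` and `demo_followers`, as well as the "10 out of 50" figure, are made up for illustration.

```python
import math
import random


def sample_followers(followers, k=50, seed=None):
    """Draw up to k followers without replacement.

    random.sample raises ValueError when k exceeds the population size,
    so k is capped at the number of followers available.
    """
    rng = random.Random(seed)
    k = min(k, len(followers))
    return rng.sample(followers, k)


def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical follower list standing in for the real API result
demo_followers = [f"user_{i}" for i in range(1200)]
sample = sample_followers(demo_followers, k=50, seed=0)

# Suppose 10 of the 50 sampled accounts were predicted to be fake
p_hat = 10 / 50
moe = margin_of_error(p_hat, len(sample))
print(f"estimated fake share: {p_hat:.2f} +/- {moe:.2f}")
```

Under these assumptions a sample of 50 pins the fake share down to roughly plus or minus 11 percentage points, so the result is better read as a ballpark figure than as an exact count.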


```python
# Get a random sample of 50 followers
random_followers = random.sample(followers, 50)
```

Get user information for each follower


```python
f_infos = []

for follower in random_followers:
    info = api.user_info(get_ID(follower))['user']
    f_infos.append(info)
```

Extract the relevant features


```python
f_table = []

for info in f_infos:
    f_table.append(get_data(info))

f_table
```

    [[1, 0.0, 3, 0.0, 0, 43, 0, 1, 108, 788, 764],
     [1, 0.0, 1, 0, 0, 45, 0, 0, 1, 252, 483],
     [1, 0.0, 3, 0.0, 0, 90, 0, 0, 536, 1818, 7486],
     [1, 0.5, 3, 0.0, 0, 0, 0, 0, 157, 148, 813],
     [1, 0.0, 1, 0.0, 0, 102, 0, 1, 24, 481, 592],
     [1, 0.0, 1, 0.0, 0, 59, 0, 1, 19, 773, 3639],
     [1, 0.0, 1, 0, 0, 8, 0, 1, 0, 3, 3639],
     [1, 0.0, 3, 0.0, 0, 90, 1, 0, 27, 63, 19],
     [1, 0.0, 4, 0.0, 0, 148, 0, 1, 458, 682, 436],
     [1, 0.0, 2, 0.0, 0, 0, 0, 1, 35, 1054, 1046],
     [1, 0.36363636363636365, 1, 0.0, 0, 96, 0, 1, 96, 50, 98],
     [1, 0.0, 1, 0.0, 0, 0, 0, 1, 2, 10, 202],
     [1, 0.0, 2, 0.0, 0, 135, 1, 1, 159, 52, 240],
     [1, 0.0, 1, 0.0, 0, 20, 0, 0, 87, 1864, 692],
     [1, 0.0, 1, 0.0, 0, 0, 0, 1, 35, 275, 2039],
     [1, 0.0625, 3, 0.0, 0, 98, 0, 0, 9, 98, 847],
     [1, 0.0, 3, 0.0, 0, 92, 0, 1, 10, 11, 46],
     [1, 0.0, 2, 0.0, 0, 69, 0, 1, 16, 2686, 6570],
     [1, 0.0, 2, 0.0, 0, 68, 0, 1, 31, 18, 64],
     [1, 0.0, 3, 0.0, 0, 6, 0, 0, 27, 1628, 1037],
     [1, 0.0, 1, 0, 0, 2, 0, 0, 21, 1730, 1298],
     [0, 0.18181818181818182, 2, 0.0, 0, 0, 0, 1, 219, 183, 275],
     [1, 0.0, 2, 0.0, 0, 38, 0, 0, 11, 645, 4452],
     [1, 0.0, 2, 0.0, 0, 30, 1, 0, 42, 1258, 952],
     [1, 0.0, 1, 0.0, 0, 9, 0, 0, 2, 629, 485],
     [1, 0.23529411764705882, 1, 0.0, 0, 62, 0, 1, 12, 1270, 951],
     [1, 0.0, 1, 0.0, 0, 86, 0, 0, 299, 1669, 1133],
     [1, 0.0, 2, 0.0, 0, 14, 0, 0, 11, 753, 853],
     [1, 0.2, 2, 0.0, 0, 9, 0, 0, 0, 213, 700],
     [1, 0.0, 1, 0.0, 0, 133, 0, 1, 11, 28, 169],
     [1, 0.0, 2, 0.0, 0, 0, 0, 1, 3, 1395, 794],
     [1, 0.0, 2, 0.0, 0, 0, 0, 0, 71, 831, 1024],
     [1, 0.0, 3, 0.0, 0, 29, 0, 0, 61, 680, 566],
     [1, 0.0, 2, 0.0, 0, 64, 0, 0, 1729, 6114, 5758],
     [1, 0.0, 2, 0.0, 0, 17, 0, 0, 73, 2104, 7091],
     [1, 0.0, 3, 0.0, 0, 36, 0, 1, 20, 728, 4139],
     [1, 0.0, 2, 0.0, 0, 106, 0, 1, 23, 83, 458],
     [1, 0.0, 2, 0.0, 0, 31, 0, 1, 78, 2035, 1035],
     [1, 0.0, 2, 0.0, 0, 35, 0, 1, 12, 11549, 712],
     [1, 0.0, 3, 0.08333333333333333, 0, 100, 0, 1, 56, 39, 190],
     [1, 0.13333333333333333, 1, 0.0, 0, 103, 0, 1, 109, 1053, 6221],
     [1, 0.0, 1, 0.0, 0, 0, 0, 0, 49, 412, 520],
     [1, 0.0, 1, 0, 0, 7, 0, 0, 110, 317, 334],
     [1, 0.0, 1, 0.0, 0, 31, 1, 0, 141, 2490, 1043],
     [1, 0.18181818181818182, 2, 0.0, 0, 35, 1, 0, 320, 2345, 861],
     [1, 0.0, 3, 0.0, 0, 115, 0, 1, 1336, 1018, 1208],
     [1, 0.0, 1, 0.0, 0, 0, 0, 1, 39, 37, 611],
     [1, 0.0, 1, 0.0, 0, 0, 0, 1, 0, 513, 633],
     [1, 0.0, 2, 0.0, 0, 46, 0, 0, 23, 83, 306],
     [1, 0.0, 1, 0.0, 0, 0, 0, 0, 30, 126, 372]]

Create a pandas dataframe


```python
test_data = pd.DataFrame(f_table,
                         columns = ['profile pic',
                                    'nums/length username',
                                    'fullname words',
                                    'nums/length fullname',
                                    'name==username',
                                    'description length',
                                    'external URL',
                                    'private',
                                    '#posts',
                                    '#followers',
                                    '#follows'])
test_data
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.000000 | 3 | 0.000000 | 0 | 43 | 0 | 1 | 108 | 788 | 764 |
| 1 | 1 | 0.000000 | 1 | 0.000000 | 0 | 45 | 0 | 0 | 1 | 252 | 483 |
| 2 | 1 | 0.000000 | 3 | 0.000000 | 0 | 90 | 0 | 0 | 536 | 1818 | 7486 |
| 3 | 1 | 0.500000 | 3 | 0.000000 | 0 | 0 | 0 | 0 | 157 | 148 | 813 |
| 4 | 1 | 0.000000 | 1 | 0.000000 | 0 | 102 | 0 | 1 | 24 | 481 | 592 |
| 5 | 1 | 0.000000 | 1 | 0.000000 | 0 | 59 | 0 | 1 | 19 | 773 | 3639 |
| 6 | 1 | 0.000000 | 1 | 0.000000 | 0 | 8 | 0 | 1 | 0 | 3 | 3639 |
| 7 | 1 | 0.000000 | 3 | 0.000000 | 0 | 90 | 1 | 0 | 27 | 63 | 19 |
| 8 | 1 | 0.000000 | 4 | 0.000000 | 0 | 148 | 0 | 1 | 458 | 682 | 436 |
| 9 | 1 | 0.000000 | 2 | 0.000000 | 0 | 0 | 0 | 1 | 35 | 1054 | 1046 |
| 10 | 1 | 0.363636 | 1 | 0.000000 | 0 | 96 | 0 | 1 | 96 | 50 | 98 |
| 11 | 1 | 0.000000 | 1 | 0.000000 | 0 | 0 | 0 | 1 | 2 | 10 | 202 |
| 12 | 1 | 0.000000 | 2 | 0.000000 | 0 | 135 | 1 | 1 | 159 | 52 | 240 |
| 13 | 1 | 0.000000 | 1 | 0.000000 | 0 | 20 | 0 | 0 | 87 | 1864 | 692 |
| 14 | 1 | 0.000000 | 1 | 0.000000 | 0 | 0 | 0 | 1 | 35 | 275 | 2039 |
| 15 | 1 | 0.062500 | 3 | 0.000000 | 0 | 98 | 0 | 0 | 9 | 98 | 847 |
| 16 | 1 | 0.000000 | 3 | 0.000000 | 0 | 92 | 0 | 1 | 10 | 11 | 46 |
| 17 | 1 | 0.000000 | 2 | 0.000000 | 0 | 69 | 0 | 1 | 16 | 2686 | 6570 |
| 18 | 1 | 0.000000 | 2 | 0.000000 | 0 | 68 | 0 | 1 | 31 | 18 | 64 |
| 19 | 1 | 0.000000 | 3 | 0.000000 | 0 | 6 | 0 | 0 | 27 | 1628 | 1037 |
| 20 | 1 | 0.000000 | 1 | 0.000000 | 0 | 2 | 0 | 0 | 21 | 1730 | 1298 |
| 21 | 0 | 0.181818 | 2 | 0.000000 | 0 | 0 | 0 | 1 | 219 | 183 | 275 |
| 22 | 1 | 0.000000 | 2 | 0.000000 | 0 | 38 | 0 | 0 | 11 | 645 | 4452 |
| 23 | 1 | 0.000000 | 2 | 0.000000 | 0 | 30 | 1 | 0 | 42 | 1258 | 952 |
| 24 | 1 | 0.000000 | 1 | 0.000000 | 0 | 9 | 0 | 0 | 2 | 629 | 485 |
| 25 | 1 | 0.235294 | 1 | 0.000000 | 0 | 62 | 0 | 1 | 12 | 1270 | 951 |
| 26 | 1 | 0.000000 | 1 | 0.000000 | 0 | 86 | 0 | 0 | 299 | 1669 | 1133 |
| 27 | 1 | 0.000000 | 2 | 0.000000 | 0 | 14 | 0 | 0 | 11 | 753 | 853 |
| 28 | 1 | 0.200000 | 2 | 0.000000 | 0 | 9 | 0 | 0 | 0 | 213 | 700 |
| 29 | 1 | 0.000000 | 1 | 0.000000 | 0 | 133 | 0 | 1 | 11 | 28 | 169 |
| 30 | 1 | 0.000000 | 2 | 0.000000 | 0 | 0 | 0 | 1 | 3 | 1395 | 794 |
| 31 | 1 | 0.000000 | 2 | 0.000000 | 0 | 0 | 0 | 0 | 71 | 831 | 1024 |
| 32 | 1 | 0.000000 | 3 | 0.000000 | 0 | 29 | 0 | 0 | 61 | 680 | 566 |
| 33 | 1 | 0.000000 | 2 | 0.000000 | 0 | 64 | 0 | 0 | 1729 | 6114 | 5758 |
| 34 | 1 | 0.000000 | 2 | 0.000000 | 0 | 17 | 0 | 0 | 73 | 2104 | 7091 |
| 35 | 1 | 0.000000 | 3 | 0.000000 | 0 | 36 | 0 | 1 | 20 | 728 | 4139 |
| 36 | 1 | 0.000000 | 2 | 0.000000 | 0 | 106 | 0 | 1 | 23 | 83 | 458 |
| 37 | 1 | 0.000000 | 2 | 0.000000 | 0 | 31 | 0 | 1 | 78 | 2035 | 1035 |
| 38 | 1 | 0.000000 | 2 | 0.000000 | 0 | 35 | 0 | 1 | 12 | 11549 | 712 |
| 39 | 1 | 0.000000 | 3 | 0.083333 | 0 | 100 | 0 | 1 | 56 | 39 | 190 |
| 40 | 1 | 0.133333 | 1 | 0.000000 | 0 | 103 | 0 | 1 | 109 | 1053 | 6221 |
| 41 | 1 | 0.000000 | 1 | 0.000000 | 0 | 0 | 0 | 0 | 49 | 412 | 520 |
| 42 | 1 | 0.000000 | 1 | 0.000000 | 0 | 7 | 0 | 0 | 110 | 317 | 334 |
| 43 | 1 | 0.000000 | 1 | 0.000000 | 0 | 31 | 1 | 0 | 141 | 2490 | 1043 |
| 44 | 1 | 0.181818 | 2 | 0.000000 | 0 | 35 | 1 | 0 | 320 | 2345 | 861 |
| 45 | 1 | 0.000000 | 3 | 0.000000 | 0 | 115 | 0 | 1 | 1336 | 1018 | 1208 |
| 46 | 1 | 0.000000 | 1 | 0.000000 | 0 | 0 | 0 | 1 | 39 | 37 | 611 |
| 47 | 1 | 0.000000 | 1 | 0.000000 | 0 | 0 | 0 | 1 | 0 | 513 | 633 |
| 48 | 1 | 0.000000 | 2 | 0.000000 | 0 | 46 | 0 | 0 | 23 | 83 | 306 |
| 49 | 1 | 0.000000 | 1 | 0.000000 | 0 | 0 | 0 | 0 | 30 | 126 | 372 |

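The eleven column names have to line up, in order, with the values that `get_data` appends, and presumably with the feature columns of the training data from Part 2 (that part is outside this section, so treat it as an assumption). Below is a small sketch of keeping the names in one place and checking each row against them. `FEATURE_COLUMNS` and `rows_to_records` are hypothetical helpers, not part of the notebook.

```python
FEATURE_COLUMNS = [
    'profile pic',
    'nums/length username',
    'fullname words',
    'nums/length fullname',
    'name==username',
    'description length',
    'external URL',
    'private',
    '#posts',
    '#followers',
    '#follows',
]


def rows_to_records(rows, columns=FEATURE_COLUMNS):
    """Pair every feature row with its column names, failing loudly on a mismatch."""
    records = []
    for i, row in enumerate(rows):
        if len(row) != len(columns):
            raise ValueError(
                "row {} has {} values, expected {}".format(i, len(row), len(columns))
            )
        records.append(dict(zip(columns, row)))
    return records


# First row of f_table, copied from the output above
records = rows_to_records([[1, 0.0, 3, 0.0, 0, 43, 0, 1, 108, 788, 764]])
print(records[0]['#followers'])
```

With such a list defined once, `pd.DataFrame(f_table, columns=FEATURE_COLUMNS)` would build the same frame, and the list could be reused for the other tables in this notebook.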



## Part 5: Make the prediction!
In Part 2, we compared the different classifiers and found that the Random Forest Classifier had the highest accuracy, at 92.5%. We are therefore going to use this classifier to make the prediction.


```python
rfc = RandomForestClassifier()

# Train the model
# We've done this in Part 2 but I'm redoing it here for coherence ☺️
rfc_model = rfc.fit(train_X, train_Y)
```

    /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:5: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
      """


```python
rfc_labels = rfc_model.predict(test_data)
rfc_labels
```

    array([0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
           0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 1, 0, 0, 0])

Count the fake accounts (label `1`) in the random sample of 50 followers


```python
no_fakes = len([x for x in rfc_labels if x==1])
```

Calculate the Instagram user's authenticity,
where authenticity = (#followers - #fakes)*100 / #followers


```python
authenticity = (len(random_followers) - no_fakes) * 100 / len(random_followers)
print("User X's Instagram Followers is " + str(authenticity) + "% authentic.")
```

    User X's Instagram Followers is 82.0% authentic.


## Part 6: Extension - Fake Likes
The method above can also be extended to check for fake likes on a post.

Get the user's posts


```python
def get_user_posts(userID, min_posts_to_be_retrieved):
    # Retrieve posts from the user's profile, one page at a time
    my_posts = []
    has_more_posts = True
    max_id = ''

    while has_more_posts:
        feed = api.user_feed(userID, max_id=max_id)
        if feed.get('more_available') is not True:
            has_more_posts = False

        max_id = feed.get('next_max_id', '')
        my_posts.extend(feed.get('items'))

        # time.sleep(2) to avoid flooding

        # Stop early once enough posts have been collected
        if len(my_posts) > min_posts_to_be_retrieved:
            print('Total posts retrieved: ' + str(len(my_posts)))
            return my_posts

        if has_more_posts:
            print(str(len(my_posts)) + ' posts retrieved so far...')

    print('Total posts retrieved: ' + str(len(my_posts)))

    return my_posts
```


```python
posts = get_user_posts(userID, 10)
```

    Total posts retrieved: 18


Pick one post to analyse (here I'm just going to pick one at random)


```python
random_post = random.sample(posts, 1)
```

Get post likers


```python
random_post[0].keys()
```

    dict_keys(['taken_at', 'pk', 'id', 'device_timestamp', 'media_type', 'code', 'client_cache_key', 'filter_type', 'carousel_media_count', 'carousel_media', 'can_see_insights_as_brand', 'location', 'lat', 'lng', 'user', 'can_viewer_reshare', 'caption_is_edited', 'comment_likes_enabled', 'comment_threading_enabled', 'has_more_comments', 'next_max_id', 'max_num_visible_preview_comments', 'preview_comments', 'can_view_more_preview_comments', 'comment_count', 'inline_composer_display_condition', 'inline_composer_imp_trigger_time', 'like_count', 'has_liked', 'top_likers', 'photo_of_you', 'usertags', 'caption', 'can_viewer_save', 'organic_tracking_token'])


```python
likers = api.media_likers(random_post[0]['id'])
```

Get a list of usernames


```python
likers_usernames = [liker['username'] for liker in likers['users']]
```

Get a random sample of 50 users


```python
random_likers = random.sample(likers_usernames, 50)
```

Retrieve the information for the 50 users


```python
l_infos = []

for liker in random_likers:
    info = api.user_info(get_ID(liker))['user']
    l_infos.append(info)
```


```python
l_table = []

for info in l_infos:
    l_table.append(get_data(info))

l_table
```

    [[1, 0.0, 1, 0, 0, 30, 0, 0, 6, 21, 177],
     [1, 0.0, 1, 0.0, 0, 69, 0, 1, 131, 942, 1229],
     [1, 0.0, 2, 0.0, 0, 83, 0, 1, 609, 1558, 2925],
     [1, 0.0, 1, 0.0, 0, 39, 0, 0, 851, 2940, 1255],
     [1, 0.0, 1, 0.0, 0, 36, 1, 0, 106, 1626, 1050],
     [0, 0.0, 1, 0, 0, 0, 0, 1, 7, 371, 350],
     [1, 0.0, 2, 0.0, 0, 96, 1, 0, 405, 1656, 2843],
     [1, 0.0, 2, 0.0, 0, 5, 1, 0, 9, 1363, 854],
     [1, 0.0, 1, 0, 0, 1, 0, 1, 5, 433, 371],
     [1, 0.0, 6, 0.0, 0, 93, 1, 0, 73, 1356, 1081],
     [1, 0.0, 3, 0.0, 0, 80, 1, 1, 188, 966, 966],
     [1, 0.0, 3, 0.0, 0, 0, 0, 1, 156, 1401, 1249],
     [1, 0.0, 2, 0.0, 0, 118, 1, 0, 115, 6557, 2423],
     [1, 0.0, 1, 0.0, 0, 12, 0, 0, 84, 1552, 661],
     [1, 0.0, 1, 0.0, 0, 80, 0, 0, 99, 1413, 2479],
     [1, 0.0, 1, 0.0, 0, 23, 0, 1, 12, 1116, 1031],
     [1, 0.0, 1, 0.0, 0, 20, 0, 0, 87, 1864, 692],
     [1, 0.0, 3, 0.0, 0, 62, 1, 0, 17, 1266, 1107],
     [1, 0.0, 2, 0.0, 0, 20, 0, 1, 15, 636, 579],
     [1, 0.0, 4, 0.0, 0, 17, 0, 1, 127, 546, 536],
     [1, 0.0, 1, 0.0, 0, 18, 0, 0, 5, 918, 678],
     [1, 0.2857142857142857, 1, 0.0, 0, 0, 0, 1, 0, 20, 35],
     [1, 0.0, 2, 0.0, 0, 8, 0, 0, 39, 1490, 1321],
     [1, 0.0, 2, 0.0, 0, 0, 0, 0, 10, 519, 547],
     [1, 0.0, 2, 0.0, 0, 0, 0, 0, 43, 933, 1101],
     [1, 0.0, 2, 0.0, 0, 10, 0, 1, 19, 613, 612],
     [1, 0.25, 3, 0.0, 0, 139, 1, 0, 104, 1738, 999],
     [1, 0.0, 3, 0.0, 0, 42, 1, 0, 17, 2973, 1339],
     [1, 0.0, 1, 0.0, 0, 20, 0, 1, 107, 749, 857],
     [1, 0.0, 4, 0.0, 0, 119, 1, 0, 655, 675, 1904],
     [1, 0.0, 1, 0.0, 0, 103, 1, 0, 48, 10075, 2379],
     [1, 0.0, 1, 0.0, 0, 0, 0, 0, 12, 534, 563],
     [1, 0.0, 1, 0, 0, 0, 0, 1, 58, 2220, 1418],
     [1, 0.0, 1, 0.0, 0, 11, 1, 1, 18, 775, 514],
     [1, 0.0, 3, 0.0, 0, 30, 0, 0, 10, 1070, 1364],
     [1, 0.0, 1, 0.0, 0, 18, 0, 0, 108, 1148, 832],
     [1, 0.0, 2, 0.0, 0, 133, 0, 1, 52, 394, 432],
     [1, 0.0, 1, 0, 0, 30, 1, 0, 48, 3441, 1293],
     [1, 0.0, 2, 0.0, 0, 40, 1, 0, 1434, 1642, 1684],
     [1, 0.0, 1, 0.0, 0, 64, 1, 0, 33, 17955, 781],
     [1, 0.0, 2, 0.0, 0, 91, 1, 1, 217, 1014, 1409],
     [1, 0.0, 1, 0, 0, 0, 0, 1, 1, 1347, 872],
     [1, 0.3076923076923077, 1, 0.0, 0, 0, 0, 0, 59, 161, 544],
     [1, 0.0, 3, 0.0, 0, 141, 1, 1, 274, 922, 913],
     [1, 0.0, 1, 0.0, 0, 69, 1, 0, 69, 904, 596],
     [1, 0.0, 1, 0.0, 0, 42, 0, 0, 598, 1877, 6379],
     [1, 0.0, 2, 0.0, 0, 4, 0, 1, 11, 660, 643],
     [1, 0.0, 2, 0.0, 0, 24, 0, 0, 6, 345, 358],
     [1, 0.0, 2, 0.0, 0, 29, 0, 0, 23, 293, 538],
     [1, 0.0, 1, 0.0, 0, 10, 1, 1, 3, 690, 549]]

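The sampling step earlier in this part calls `random.sample(likers_usernames, 50)`. `random.sample` raises a `ValueError` when asked for more items than the population holds, which could happen for a post with fewer than 50 likers. A minimal sketch of a guard follows; `sample_up_to` is a hypothetical helper and the usernames are made up.

```python
import random


def sample_up_to(population, k, seed=None):
    """Return k random items, or every item (in random order) if fewer than k exist."""
    rng = random.Random(seed)  # passing a seed makes the sample reproducible
    population = list(population)
    return rng.sample(population, min(k, len(population)))


# Made-up usernames standing in for likers_usernames
few_likers = ['user_a', 'user_b', 'user_c']
print(sample_up_to(few_likers, 50, seed=0))
```

The same helper could replace the follower sampling in Part 4, where the population is the follower list instead.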

```python
# Generate pandas dataframe
l_test_data = pd.DataFrame(l_table,
                           columns = ['profile pic',
                                      'nums/length username',
                                      'fullname words',
                                      'nums/length fullname',
                                      'name==username',
                                      'description length',
                                      'external URL',
                                      'private',
                                      '#posts',
                                      '#followers',
                                      '#follows'])
l_test_data
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.000000 | 1 | 0.0 | 0 | 30 | 0 | 0 | 6 | 21 | 177 |
| 1 | 1 | 0.000000 | 1 | 0.0 | 0 | 69 | 0 | 1 | 131 | 942 | 1229 |
| 2 | 1 | 0.000000 | 2 | 0.0 | 0 | 83 | 0 | 1 | 609 | 1558 | 2925 |
| 3 | 1 | 0.000000 | 1 | 0.0 | 0 | 39 | 0 | 0 | 851 | 2940 | 1255 |
| 4 | 1 | 0.000000 | 1 | 0.0 | 0 | 36 | 1 | 0 | 106 | 1626 | 1050 |
| 5 | 0 | 0.000000 | 1 | 0.0 | 0 | 0 | 0 | 1 | 7 | 371 | 350 |
| 6 | 1 | 0.000000 | 2 | 0.0 | 0 | 96 | 1 | 0 | 405 | 1656 | 2843 |
| 7 | 1 | 0.000000 | 2 | 0.0 | 0 | 5 | 1 | 0 | 9 | 1363 | 854 |
| 8 | 1 | 0.000000 | 1 | 0.0 | 0 | 1 | 0 | 1 | 5 | 433 | 371 |
| 9 | 1 | 0.000000 | 6 | 0.0 | 0 | 93 | 1 | 0 | 73 | 1356 | 1081 |
| 10 | 1 | 0.000000 | 3 | 0.0 | 0 | 80 | 1 | 1 | 188 | 966 | 966 |
| 11 | 1 | 0.000000 | 3 | 0.0 | 0 | 0 | 0 | 1 | 156 | 1401 | 1249 |
| 12 | 1 | 0.000000 | 2 | 0.0 | 0 | 118 | 1 | 0 | 115 | 6557 | 2423 |
| 13 | 1 | 0.000000 | 1 | 0.0 | 0 | 12 | 0 | 0 | 84 | 1552 | 661 |
| 14 | 1 | 0.000000 | 1 | 0.0 | 0 | 80 | 0 | 0 | 99 | 1413 | 2479 |
| 15 | 1 | 0.000000 | 1 | 0.0 | 0 | 23 | 0 | 1 | 12 | 1116 | 1031 |
| 16 | 1 | 0.000000 | 1 | 0.0 | 0 | 20 | 0 | 0 | 87 | 1864 | 692 |
| 17 | 1 | 0.000000 | 3 | 0.0 | 0 | 62 | 1 | 0 | 17 | 1266 | 1107 |
| 18 | 1 | 0.000000 | 2 | 0.0 | 0 | 20 | 0 | 1 | 15 | 636 | 579 |
| 19 | 1 | 0.000000 | 4 | 0.0 | 0 | 17 | 0 | 1 | 127 | 546 | 536 |
| 20 | 1 | 0.000000 | 1 | 0.0 | 0 | 18 | 0 | 0 | 5 | 918 | 678 |
| 21 | 1 | 0.285714 | 1 | 0.0 | 0 | 0 | 0 | 1 | 0 | 20 | 35 |
| 22 | 1 | 0.000000 | 2 | 0.0 | 0 | 8 | 0 | 0 | 39 | 1490 | 1321 |
| 23 | 1 | 0.000000 | 2 | 0.0 | 0 | 0 | 0 | 0 | 10 | 519 | 547 |
| 24 | 1 | 0.000000 | 2 | 0.0 | 0 | 0 | 0 | 0 | 43 | 933 | 1101 |
| 25 | 1 | 0.000000 | 2 | 0.0 | 0 | 10 | 0 | 1 | 19 | 613 | 612 |
| 26 | 1 | 0.250000 | 3 | 0.0 | 0 | 139 | 1 | 0 | 104 | 1738 | 999 |
| 27 | 1 | 0.000000 | 3 | 0.0 | 0 | 42 | 1 | 0 | 17 | 2973 | 1339 |
| 28 | 1 | 0.000000 | 1 | 0.0 | 0 | 20 | 0 | 1 | 107 | 749 | 857 |
| 29 | 1 | 0.000000 | 4 | 0.0 | 0 | 119 | 1 | 0 | 655 | 675 | 1904 |
| 30 | 1 | 0.000000 | 1 | 0.0 | 0 | 103 | 1 | 0 | 48 | 10075 | 2379 |
| 31 | 1 | 0.000000 | 1 | 0.0 | 0 | 0 | 0 | 0 | 12 | 534 | 563 |
| 32 | 1 | 0.000000 | 1 | 0.0 | 0 | 0 | 0 | 1 | 58 | 2220 | 1418 |
| 33 | 1 | 0.000000 | 1 | 0.0 | 0 | 11 | 1 | 1 | 18 | 775 | 514 |
| 34 | 1 | 0.000000 | 3 | 0.0 | 0 | 30 | 0 | 0 | 10 | 1070 | 1364 |
| 35 | 1 | 0.000000 | 1 | 0.0 | 0 | 18 | 0 | 0 | 108 | 1148 | 832 |
| 36 | 1 | 0.000000 | 2 | 0.0 | 0 | 133 | 0 | 1 | 52 | 394 | 432 |
| 37 | 1 | 0.000000 | 1 | 0.0 | 0 | 30 | 1 | 0 | 48 | 3441 | 1293 |
| 38 | 1 | 0.000000 | 2 | 0.0 | 0 | 40 | 1 | 0 | 1434 | 1642 | 1684 |
| 39 | 1 | 0.000000 | 1 | 0.0 | 0 | 64 | 1 | 0 | 33 | 17955 | 781 |
| 40 | 1 | 0.000000 | 2 | 0.0 | 0 | 91 | 1 | 1 | 217 | 1014 | 1409 |
| 41 | 1 | 0.000000 | 1 | 0.0 | 0 | 0 | 0 | 1 | 1 | 1347 | 872 |
| 42 | 1 | 0.307692 | 1 | 0.0 | 0 | 0 | 0 | 0 | 59 | 161 | 544 |
| 43 | 1 | 0.000000 | 3 | 0.0 | 0 | 141 | 1 | 1 | 274 | 922 | 913 |
| 44 | 1 | 0.000000 | 1 | 0.0 | 0 | 69 | 1 | 0 | 69 | 904 | 596 |
| 45 | 1 | 0.000000 | 1 | 0.0 | 0 | 42 | 0 | 0 | 598 | 1877 | 6379 |
| 46 | 1 | 0.000000 | 2 | 0.0 | 0 | 4 | 0 | 1 | 11 | 660 | 643 |
| 47 | 1 | 0.000000 | 2 | 0.0 | 0 | 24 | 0 | 0 | 6 | 345 | 358 |
| 48 | 1 | 0.000000 | 2 | 0.0 | 0 | 29 | 0 | 0 | 23 | 293 | 538 |
| 49 | 1 | 0.000000 | 1 | 0.0 | 0 | 10 | 1 | 1 | 3 | 690 | 549 |


Finally, make the prediction!


```python
rfc = RandomForestClassifier()
rfc_model = rfc.fit(train_X, train_Y)
rfc_labels_likes = rfc_model.predict(l_test_data)
rfc_labels_likes
```

    /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().

    array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
           0, 0, 0, 0, 0, 0])

Count the fake accounts that liked the user's media


```python
no_fake_likes = len([x for x in rfc_labels_likes if x==1])
```

Calculate the authenticity of the media's likes


```python
media_authenticity = (len(random_likers) - no_fake_likes) * 100 / len(random_likers)
print("The media with the ID:XXXXX has " + str(media_authenticity) + "% authentic likes.")
```

    The media with the ID:XXXXX has 92.0% authentic likes.


## Part 7: Comparison With Another User
I specifically chose user X because I trust their social media 'game' and they seem to have a loyal and engaged following. Let's compare their metrics with those of user Y, a user who shows a noticeable follower growth spike when examined on SocialBlade.

I am going to skip the explanation here because it's just a repetition of the steps performed on user X.
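
Because the same steps are repeated for every account, they could be wrapped in a single function. The sketch below is an assumption about how that might look, not code from the notebook. `audit_authenticity` is a hypothetical name, and the three callables are stand-ins for `api.user_info(get_ID(...))['user']`, `get_data` and `rfc_model.predict`, passed in so that the example runs without logging in.

```python
import random


def audit_authenticity(usernames, fetch_info, extract_features, predict,
                       sample_size=50, seed=None):
    """Sample accounts, classify them, and return the % predicted genuine.

    predict must return one label per feature row, with 1 meaning fake.
    """
    usernames = list(usernames)
    if not usernames:
        raise ValueError("no usernames to sample from")
    rng = random.Random(seed)
    sample = rng.sample(usernames, min(sample_size, len(usernames)))
    table = [extract_features(fetch_info(name)) for name in sample]
    labels = predict(table)
    fakes = sum(1 for label in labels if label == 1)
    return (len(sample) - fakes) * 100 / len(sample)


# Toy stand-ins so the sketch runs offline (all hypothetical)
fake_db = {'real_{}'.format(i): {'follower_count': 500} for i in range(8)}
fake_db.update({'bot_{}'.format(i): {'follower_count': 3} for i in range(2)})

score = audit_authenticity(
    fake_db,
    fetch_info=fake_db.get,
    extract_features=lambda info: [info['follower_count']],
    predict=lambda rows: [1 if row[0] < 10 else 0 for row in rows],
)
print(str(score) + '% authentic')
```

In the notebook, the real callables would take the place of the toy ones, with the feature table wrapped in a `pd.DataFrame` before prediction as in the cells above.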


```python
# Re-login because of API call limits
api = login()
```

    username: ins.tafakebusters
    password: ········


```python
userID_y = get_ID('')
```


```python
rank = api.generate_uuid()
```

**USER Y FOLLOWERS ANALYSIS**


```python
y_followers = get_followers(userID_y, rank)
```


```python
y_random_followers = random.sample(y_followers, 50)
```


```python
y_infos = []

for follower in y_random_followers:
    info = api.user_info(get_ID(follower))['user']
    y_infos.append(info)
```


```python
y_table = []

for info in y_infos:
    y_table.append(get_data(info))

y_table
```

    [[1, 0.14285714285714285, 1, 0.0, 0, 0, 0, 0, 16, 32, 1549],
     [1, 0.2222222222222222, 1, 0.0, 0, 0, 0, 1, 15, 337, 2058],
     [1, 0.25, 2, 0.0, 0, 0, 0, 0, 5, 310, 6343],
     [1, 0.0, 4, 0.0, 0, 97, 0, 0, 1, 14107, 7514],
     [1, 0.36363636363636365, 2, 0.0, 0, 0, 0, 0, 16, 8, 1050],
     [1, 0.25, 2, 0.0, 0, 13, 0, 0, 15, 87, 6741],
     [1, 0.0, 1, 0, 0, 0, 0, 1, 21, 24, 5862],
     [1, 0.0, 1, 0, 0, 13, 0, 1, 27, 1289, 689],
     [1, 0.0, 1, 0.0, 0, 29, 0, 1, 0, 31, 148],
     [1, 0.0, 1, 0, 0, 119, 0, 0, 32, 636, 1293],
     [1, 0.0, 4, 0.0, 0, 20, 0, 0, 144, 3617, 1346],
     [1, 0.21428571428571427, 2, 0.0, 0, 0, 0, 0, 17, 71, 7495],
     [1, 0.13333333333333333, 2, 0.0, 0, 113, 0, 1, 3, 305, 303],
     [0, 0.4444444444444444, 2, 0.0, 0, 0, 0, 1, 1, 63, 283],
     [1, 0.0, 3, 0.0, 0, 0, 0, 0, 17, 115, 7506],
     [0, 0.0625, 2, 0.0, 0, 0, 0, 1, 272, 1446, 2362],
     [1, 0.15384615384615385, 2, 0.0, 0, 0, 0, 0, 6, 1150, 732],
     [1, 0.0, 2, 0.0, 0, 0, 0, 0, 15, 60, 1631],
     [1, 0.0, 1, 0, 0, 13, 0, 0, 15, 11, 221],
     [1, 0.0, 1, 0, 0, 1, 0, 1, 0, 21, 23],
     [1, 0.23076923076923078, 1, 0, 0, 0, 0, 0, 0, 4, 173],
     [1, 0.25, 1, 0.0, 0, 20, 0, 0, 1, 29, 457],
     [1, 0.5, 1, 0.0, 0, 0, 0, 0, 1, 831, 5424],
     [1, 0.0, 3, 0.0, 0, 150, 1, 0, 158, 7063, 1355],
     [1, 0.0, 1, 0.0, 0, 0, 0, 1, 15, 39, 2045],
     [1, 0.0, 4, 0.05555555555555555, 0, 127, 0, 0, 196, 486, 198],
     [1, 0.0, 1, 0.0, 0, 76, 0, 1, 7, 509, 372],
     [1, 0.0, 2, 0.0, 0, 48, 0, 0, 1, 5079, 879],
     [1, 0.0, 1, 0.0, 0, 19, 0, 1, 9, 1778, 1477],
     [1, 0.0, 2, 0.0, 0, 0, 0, 0, 15, 29, 543],
     [1, 0.0, 3, 0.0, 0, 77, 0, 1, 784, 526, 1235],
     [1, 0.0, 2, 0.0, 0, 81, 1, 0, 3, 9123, 6144],
     [1, 0.0, 2, 0.0, 0, 33, 0, 0, 15, 134, 416],
     [1, 0.0, 2, 0.0, 0, 79, 0, 1, 38, 506, 804],
     [1, 0.0, 2, 0.0, 0, 0, 0, 0, 20, 27, 2557],
     [1, 0.125, 2, 0.0, 0, 0, 0, 0, 15, 9, 1151],
     [1, 0.42105263157894735, 2, 0.0, 0, 0, 0, 0, 18, 12, 1212],
     [1, 0.0, 1, 0.0, 0, 0, 0, 0, 15, 14, 600],
     [1, 0.0, 5, 0.0, 0, 25, 0, 0, 12, 1224, 774],
     [1, 0.0, 1, 0.0, 0, 0, 0, 0, 15, 23, 2056],
     [1, 0.42857142857142855, 1, 0.0, 0, 0, 0, 0, 18, 27, 395],
     [1, 0.0, 2, 0.0, 0, 0, 0, 1, 10, 444, 1116],
     [1, 0.0, 1, 0.0, 0, 43, 0, 0, 57, 214, 2377],
     [1, 0.047619047619047616, 2, 0.0, 0, 0, 0, 1, 15, 15, 6047],
     [1, 0.05263157894736842, 2, 0.0, 0, 1, 0, 0, 15, 55, 5313],
     [1, 0.18181818181818182, 2, 0.0, 0, 0, 0, 0, 16, 95, 1228],
     [1, 0.15384615384615385, 1, 0.0, 0, 0, 0, 0, 16, 56, 3665],
     [1, 0.0, 1, 0, 0, 0, 0, 0, 15, 5, 1568],
     [0, 0.16666666666666666, 2, 0.0, 0, 0, 0, 1, 3, 8, 28],
     [1, 0.4117647058823529, 2, 0.0, 0, 0, 0, 0, 1, 69, 196]]


```python
# Generate pandas dataframe
y_test_data = pd.DataFrame(y_table,
                           columns = ['profile pic',
                                      'nums/length username',
                                      'fullname words',
                                      'nums/length fullname',
                                      'name==username',
                                      'description length',
                                      'external URL',
                                      'private',
                                      '#posts',
                                      '#followers',
                                      '#follows'])
y_test_data
```

profile picnums/length usernamefullname wordsnums/length fullnamename==usernamedescription lengthexternal URLprivate#posts#followers#follows
010.14285710.000000000016321549
110.22222210.0000000001153372058
210.25000020.000000000053106343
310.00000040.000000097001141077514
410.36363620.00000000001681050
510.25000020.0000000130015876741
610.00000010.000000000121245862
710.00000010.00000001301271289689
810.00000010.00000002901031148
910.00000010.000000011900326361293
1010.00000040.0000000200014436171346
1110.21428620.000000000017717495
1210.13333320.0000000113013305303
1300.44444420.0000000001163283
1410.00000030.0000000000171157506
1500.06250020.000000000127214462362
1610.15384620.000000000061150732
1710.00000020.000000000015601631
1810.00000010.000000013001511221
1910.00000010.000000010102123
2010.23076910.000000000004173
2110.25000010.00000002000129457
2210.50000010.000000000018315424
2310.00000030.00000001501015870631355
2410.00000010.000000000115392045
2510.00000040.055556012700196486198
2610.00000010.000000076017509372
2710.00000020.0000000480015079879
2810.00000010.00000001901917781477
2910.00000020.00000000001529543
3010.00000030.000000077017845261235
3110.00000020.00000008110391236144
3210.00000020.0000000330015134416
3310.00000020.0000000790138506804
3410.00000020.000000000020272557
3510.12500020.00000000001591151
3610.42105320.000000000018121212
3710.00000010.00000000001514600
3810.00000050.00000002500121224774
3910.00000010.000000000015232056
4010.42857110.00000000001827395
4110.00000020.0000000001104441116
4210.00000010.00000004300572142377
4310.04761920.000000000115156047
4410.05263220.000000010015555313
4510.18181820.000000000016951228
4610.15384610.000000000016563665
4710.00000010.00000000001551568
4800.16666720.00000000013828
4910.41176520.0000000000169196
4151 |

```python
# Predict (no retraining!)
rfc_labels_y = rfc_model.predict(y_test_data)
rfc_labels_y
```

    array([1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1,
           1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1,
           1, 1, 1, 1, 1, 1])

```python
# Calculate the number of fake accounts in the random sample of 50 followers
no_fakes_y = len([x for x in rfc_labels_y if x==1])
```

```python
# Calculate the authenticity
y_authenticity = (len(y_random_followers) - no_fakes_y) * 100 / len(y_random_followers)
print("User Y's Instagram Followers is " + str(y_authenticity) + "% authentic.")
```

    User Y's Instagram Followers is 38.0% authentic.

Ahh, the joys of being right!

**USER Y LIKES ANALYSIS**

```python
y_posts = get_user_posts(userID_y, 10)
```

    Total posts retrieved: 18

```python
y_random_post = random.sample(y_posts, 1)
```

```python
y_likers = api.media_likers(y_random_post[0]['id'])
```

```python
y_likers_usernames = [liker['username'] for liker in y_likers['users']]
```

```python
y_random_likers = random.sample(y_likers_usernames, 50)
```

```python
y_likers_infos = []

for liker in y_random_likers:
    info = api.user_info(get_ID(liker))['user']
    y_likers_infos.append(info)
```

```python
y_likers_table = []

for info in y_likers_infos:
    y_likers_table.append(get_data(info))

y_likers_table
```

    [[1, 0.0, 2, 0.0, 0, 0, 0, 0, 2, 897, 830],
     [0, 0.0, 2, 0.0, 0, 0, 0, 1, 0, 129, 132],
     [1, 0.0, 2, 0.0, 0, 8, 0, 1, 72, 1157, 698],
     [1, 0.0, 1, 0, 0, 10, 0, 1, 6, 1410, 619],
     [1, 0.0, 1, 0.0, 0, 0, 0, 0, 0, 1916, 731],
     [1, 0.2222222222222222, 3, 0.0, 0, 72, 0, 1, 13, 950, 649],
     [1, 0.0, 1, 0.0, 0, 19, 0, 1, 17, 1543, 1289],
     [1, 0.2, 5, 0.0, 0, 11, 0, 0, 33, 1076, 606],
     [1, 0.0, 1, 0.0, 0, 104, 0, 1, 6, 202, 485],
     [1, 0.2, 1, 0.0, 0, 15, 0, 0, 7, 1262, 679],
     [1, 0.15384615384615385, 2, 0.0, 0, 0, 0, 0, 6, 1150, 732],
     [1, 0.0, 1, 0.0, 0, 17, 1, 0, 28, 2442, 629],
     [1, 0.0, 2, 0.0, 0, 61, 0, 0, 159, 556, 765],
     [1, 0.0, 2, 0.0, 0, 34, 0, 1, 10, 531, 526],
     [1, 0.0, 3, 0.0, 0, 127, 0, 0, 23, 1137, 909],
     [1, 0.0, 2, 0.0, 0, 66, 0, 1, 25, 583, 805],
     [1, 0.13333333333333333, 2, 0.0, 0, 67, 1, 0, 141, 4615, 1948],
     [1, 0.0, 2, 0.0, 0, 47, 0, 1, 387, 75, 162],
     [1, 0.0, 1, 0.0, 0, 142, 0, 1, 8144, 664, 1527],
     [1, 0.0, 3, 0.0, 0, 4, 0, 1, 1, 466, 325],
     [1, 0.058823529411764705, 1, 0.0, 0, 32, 0, 0, 14, 419, 414],
     [1, 0.0, 3, 0.0, 0, 75, 1, 0, 353, 1399, 764],
     [1, 0.0, 1, 0, 0, 0, 0, 0, 9, 611, 554],
     [1, 0.0, 1, 0.0, 0, 29, 0, 1, 3, 2064, 1077],
     [1, 0.0, 1, 0.0, 0, 26, 0, 1, 37, 628, 714],
     [1, 0.0, 2, 0.0, 0, 89, 1, 1, 243, 2316, 1030],
     [1, 0.0, 2, 0.0, 0, 140, 1, 0, 666, 4460, 492],
     [1, 0.0, 2, 0.0, 0, 20, 0, 0, 71, 4101, 878],
     [1, 0.0, 2, 0.0, 0, 5, 0, 0, 148, 424, 716],
     [1, 0.0, 1, 0, 0, 0, 0, 1, 2, 640, 730],
     [1, 0.0, 2, 0.0, 0, 64, 0, 1, 8, 1141, 891],
     [1, 0.0, 3, 0.0, 0, 29, 0, 1, 10, 1378, 986],
     [1, 0.0, 2, 0.0, 0, 14, 0, 1, 3, 994, 698],
     [1, 0.0, 1, 0.0, 0, 29, 0, 1, 43, 181, 169],
     [1, 0.0, 1, 0.0, 0, 58, 1, 0, 24, 1144, 1091],
     [1, 0.0, 2, 0.0, 0, 25, 0, 1, 36, 687, 574],
     [1, 0.0, 3, 0.0, 0, 8, 0, 1, 33, 1846, 996],
     [1, 0.5714285714285714, 2, 0.0, 0, 18, 0, 1, 202, 1180, 600],
     [1, 0.0, 2, 0.0, 0, 7, 0, 0, 45, 1206, 676],
     [1, 0.0, 2, 0.0, 0, 76, 0, 0, 12, 661, 3004],
     [1, 0.0, 1, 0.0, 0, 9, 0, 1, 5, 759, 706],
     [0, 0.0, 3, 0.0, 0, 61, 0, 1, 9, 439, 612],
     [1, 0.16666666666666666, 1, 0.0, 0, 0, 0, 1, 3, 911, 822],
     [1, 0.4, 2, 0.0, 0, 82, 0, 0, 99, 556, 733],
     [1, 0.0, 2, 0.0, 0, 80, 0, 1, 21, 478, 385],
     [1, 0.0, 1, 0, 0, 0, 0, 1, 0, 653, 312],
     [1, 0.0, 1, 0.0, 0, 13, 0, 1, 40, 713, 657],
     [1, 0.0, 2, 0.0, 0, 0, 0, 1, 4, 113, 311],
     [1, 0.0, 2, 0.0, 0, 33, 0, 0, 74, 3564, 1051],
     [1, 0.0, 1, 0.0, 0, 121, 0, 0, 958, 904, 479]]

```python
y_likers_data = pd.DataFrame(y_likers_table,
                             columns = ['profile pic',
                                        'nums/length username',
                                        'fullname words',
                                        'nums/length fullname',
                                        'name==username',
                                        'description length',
                                        'external URL',
                                        'private',
                                        '#posts',
                                        '#followers',
                                        '#follows'])
y_likers_data
```

| | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.000000 | 2 | 0.0 | 0 | 0 | 0 | 0 | 2 | 897 | 830 |
| 1 | 0 | 0.000000 | 2 | 0.0 | 0 | 0 | 0 | 1 | 0 | 129 | 132 |
| 2 | 1 | 0.000000 | 2 | 0.0 | 0 | 8 | 0 | 1 | 72 | 1157 | 698 |
| 3 | 1 | 0.000000 | 1 | 0.0 | 0 | 10 | 0 | 1 | 6 | 1410 | 619 |
| 4 | 1 | 0.000000 | 1 | 0.0 | 0 | 0 | 0 | 0 | 0 | 1916 | 731 |
| 5 | 1 | 0.222222 | 3 | 0.0 | 0 | 72 | 0 | 1 | 13 | 950 | 649 |
| 6 | 1 | 0.000000 | 1 | 0.0 | 0 | 19 | 0 | 1 | 17 | 1543 | 1289 |
| 7 | 1 | 0.200000 | 5 | 0.0 | 0 | 11 | 0 | 0 | 33 | 1076 | 606 |
| 8 | 1 | 0.000000 | 1 | 0.0 | 0 | 104 | 0 | 1 | 6 | 202 | 485 |
| 9 | 1 | 0.200000 | 1 | 0.0 | 0 | 15 | 0 | 0 | 7 | 1262 | 679 |
| 10 | 1 | 0.153846 | 2 | 0.0 | 0 | 0 | 0 | 0 | 6 | 1150 | 732 |
| 11 | 1 | 0.000000 | 1 | 0.0 | 0 | 17 | 1 | 0 | 28 | 2442 | 629 |
| 12 | 1 | 0.000000 | 2 | 0.0 | 0 | 61 | 0 | 0 | 159 | 556 | 765 |
| 13 | 1 | 0.000000 | 2 | 0.0 | 0 | 34 | 0 | 1 | 10 | 531 | 526 |
| 14 | 1 | 0.000000 | 3 | 0.0 | 0 | 127 | 0 | 0 | 23 | 1137 | 909 |
| 15 | 1 | 0.000000 | 2 | 0.0 | 0 | 66 | 0 | 1 | 25 | 583 | 805 |
| 16 | 1 | 0.133333 | 2 | 0.0 | 0 | 67 | 1 | 0 | 141 | 4615 | 1948 |
| 17 | 1 | 0.000000 | 2 | 0.0 | 0 | 47 | 0 | 1 | 387 | 75 | 162 |
| 18 | 1 | 0.000000 | 1 | 0.0 | 0 | 142 | 0 | 1 | 8144 | 664 | 1527 |
| 19 | 1 | 0.000000 | 3 | 0.0 | 0 | 4 | 0 | 1 | 1 | 466 | 325 |
| 20 | 1 | 0.058824 | 1 | 0.0 | 0 | 32 | 0 | 0 | 14 | 419 | 414 |
| 21 | 1 | 0.000000 | 3 | 0.0 | 0 | 75 | 1 | 0 | 353 | 1399 | 764 |
| 22 | 1 | 0.000000 | 1 | 0.0 | 0 | 0 | 0 | 0 | 9 | 611 | 554 |
| 23 | 1 | 0.000000 | 1 | 0.0 | 0 | 29 | 0 | 1 | 3 | 2064 | 1077 |
| 24 | 1 | 0.000000 | 1 | 0.0 | 0 | 26 | 0 | 1 | 37 | 628 | 714 |
| 25 | 1 | 0.000000 | 2 | 0.0 | 0 | 89 | 1 | 1 | 243 | 2316 | 1030 |
| 26 | 1 | 0.000000 | 2 | 0.0 | 0 | 140 | 1 | 0 | 666 | 4460 | 492 |
| 27 | 1 | 0.000000 | 2 | 0.0 | 0 | 20 | 0 | 0 | 71 | 4101 | 878 |
| 28 | 1 | 0.000000 | 2 | 0.0 | 0 | 5 | 0 | 0 | 148 | 424 | 716 |
| 29 | 1 | 0.000000 | 1 | 0.0 | 0 | 0 | 0 | 1 | 2 | 640 | 730 |
| 30 | 1 | 0.000000 | 2 | 0.0 | 0 | 64 | 0 | 1 | 8 | 1141 | 891 |
| 31 | 1 | 0.000000 | 3 | 0.0 | 0 | 29 | 0 | 1 | 10 | 1378 | 986 |
| 32 | 1 | 0.000000 | 2 | 0.0 | 0 | 14 | 0 | 1 | 3 | 994 | 698 |
| 33 | 1 | 0.000000 | 1 | 0.0 | 0 | 29 | 0 | 1 | 43 | 181 | 169 |
| 34 | 1 | 0.000000 | 1 | 0.0 | 0 | 58 | 1 | 0 | 24 | 1144 | 1091 |
| 35 | 1 | 0.000000 | 2 | 0.0 | 0 | 25 | 0 | 1 | 36 | 687 | 574 |
| 36 | 1 | 0.000000 | 3 | 0.0 | 0 | 8 | 0 | 1 | 33 | 1846 | 996 |
| 37 | 1 | 0.571429 | 2 | 0.0 | 0 | 18 | 0 | 1 | 202 | 1180 | 600 |
| 38 | 1 | 0.000000 | 2 | 0.0 | 0 | 7 | 0 | 0 | 45 | 1206 | 676 |
| 39 | 1 | 0.000000 | 2 | 0.0 | 0 | 76 | 0 | 0 | 12 | 661 | 3004 |
| 40 | 1 | 0.000000 | 1 | 0.0 | 0 | 9 | 0 | 1 | 5 | 759 | 706 |
| 41 | 0 | 0.000000 | 3 | 0.0 | 0 | 61 | 0 | 1 | 9 | 439 | 612 |
| 42 | 1 | 0.166667 | 1 | 0.0 | 0 | 0 | 0 | 1 | 3 | 911 | 822 |
| 43 | 1 | 0.400000 | 2 | 0.0 | 0 | 82 | 0 | 0 | 99 | 556 | 733 |
| 44 | 1 | 0.000000 | 2 | 0.0 | 0 | 80 | 0 | 1 | 21 | 478 | 385 |
| 45 | 1 | 0.000000 | 1 | 0.0 | 0 | 0 | 0 | 1 | 0 | 653 | 312 |
| 46 | 1 | 0.000000 | 1 | 0.0 | 0 | 13 | 0 | 1 | 40 | 713 | 657 |
| 47 | 1 | 0.000000 | 2 | 0.0 | 0 | 0 | 0 | 1 | 4 | 113 | 311 |
| 48 | 1 | 0.000000 | 2 | 0.0 | 0 | 33 | 0 | 0 | 74 | 3564 | 1051 |
| 49 | 1 | 0.000000 | 1 | 0.0 | 0 | 121 | 0 | 0 | 958 | 904 | 479 |

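The percentage calculation used for user Y's followers is about to be repeated for the likers. As a minimal sketch, assuming (as in this notebook) that the classifier labels fake accounts as `1` and real ones as `0`, that step could be wrapped in a small helper. The name `authenticity_score` is hypothetical and does not appear in the original notebook, and the example labels are made up for illustration.

```python
def authenticity_score(labels, fake_label=1):
    """Return the percentage of predictions that are NOT flagged as fake."""
    labels = list(labels)
    if not labels:
        raise ValueError("need at least one prediction")
    # Count how many predictions carry the 'fake' label
    n_fake = sum(1 for label in labels if label == fake_label)
    return (len(labels) - n_fake) * 100 / len(labels)


# Made-up example: ten predictions, three of them flagged as fake
example_labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
print(authenticity_score(example_labels))  # 70.0
```

The helper divides by the number of predictions rather than by the length of the sampled username list; the two are the same whenever every sampled account gets a prediction, as is the case here.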

```python
# Predict!
y_likers_pred = rfc_model.predict(y_likers_data)
y_likers_pred
```

    array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
           0, 0, 0, 0, 0, 0])

```python
# Calculate the number of fake likes
no_fakes_yl = len([x for x in y_likers_pred if x==1])

# Calculate media likes authenticity
y_post_authenticity = (len(y_random_likers) - no_fakes_yl) * 100 / len(y_random_likers)
print("The media with the ID:YYYYY has " + str(y_post_authenticity) + "% authentic likes.")
```

    The media with the ID:YYYYY has 96.0% authentic likes.

Very high likes authenticity but very low follower authenticity? How is that possible?

We can use **engagement rates** to explain this phenomenon further.

Engagement rate = average number of engagements (likes + comments) / number of followers

```python
y_posts[0].keys()
```

    dict_keys(['taken_at', 'pk', 'id', 'device_timestamp', 'media_type', 'code', 'client_cache_key', 'filter_type', 'carousel_media_count', 'carousel_media', 'can_see_insights_as_brand', 'location', 'lat', 'lng', 'user', 'can_viewer_reshare', 'caption_is_edited', 'comment_likes_enabled', 'comment_threading_enabled', 'has_more_comments', 'max_num_visible_preview_comments', 'preview_comments', 'can_view_more_preview_comments', 'comment_count', 'inline_composer_display_condition', 'inline_composer_imp_trigger_time', 'like_count', 'has_liked', 'top_likers', 'photo_of_you', 'caption', 'can_viewer_save', 'organic_tracking_token'])

```python
count = 0

for post in y_posts:
    count += post['comment_count']
    count += post['like_count']

average_engagements = count / len(y_posts)
engagement_rate = average_engagements*100 / len(y_followers)

engagement_rate
```

    9.50268408791654

This means that, on average, a post by user Y receives engagements (likes plus comments) equal to only about 9.5% of their follower count.

## Part 8: Thoughts

**Making sense of the result**

So user X received an 82% follower authenticity score and a 92% media likes authenticity on one of their posts. Is that good enough? What about user Y with a 38% follower authenticity score and a 96% media likes authenticity?

Since this entire notebook is an exploratory analysis, there's not really a hard line between a 'good' influencer and a 'bad' influencer. For user X, we can tell that the user has authentic and loyal followers. For user Y, however, the follower authenticity score is rather low, yet the likes come mostly from real accounts. This suggests that user Y might have invested in buying followers, but not likes! That would explain the really low engagement rate.

In fact, with a little bit more research, you can sort of establish a pattern just by observation:
- High follower authenticity, high media authenticity, high engagement rate = authentic user
- Low follower authenticity, high media authenticity, low engagement rate = buys followers, does not buy likes
- Low follower authenticity, high media authenticity, high engagement rate = buys followers AND likes
- ... and so on!

**So is this influencer worth investing in or not?**

Remember that we used a *random sample* of 50 followers out of thousands. As objective as random sampling can be, it still isn't an *absolutely complete* picture of the user's followers. However, the follower authenticity combined with the media likes authenticity still provides insight for brands that are planning to invest in the influencer.

Personally, I feel like any number under 50% is rather suspicious, and there are other ways that you can confirm this suspicion:
- Low engagement rates (engagement rate = average number of engagements (likes + comments) / number of followers)
- Spikes in follower growth (uneven growth chart)
- Comments (loyal followers actually care about the user's content)

But of course, you have to be aware of tech-savvy influencers who cheat the audit system and try to avoid getting caught, such as influencers who buy 'drip-followers', i.e. followers that are bought in bulk but arrive slowly. This method makes their follower growth seem gradual.
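
The caveat about sampling only 50 followers out of thousands can be made more concrete by attaching a rough uncertainty range to the score. The following is a minimal sketch, not part of the original analysis, that computes a Wilson score interval for a sample proportion. The function name `wilson_interval` and the choice of a 95% level (`z = 1.96`) are assumptions made for illustration, and the interval only reflects sampling error, not any misclassification by the model.

```python
import math


def wilson_interval(n_authentic, n_sampled, z=1.96):
    """Approximate confidence interval for a proportion (Wilson score)."""
    if n_sampled <= 0:
        raise ValueError("n_sampled must be positive")
    p = n_authentic / n_sampled
    denom = 1 + z**2 / n_sampled
    centre = (p + z**2 / (2 * n_sampled)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / n_sampled + z**2 / (4 * n_sampled**2)) / denom
    )
    return centre - half_width, centre + half_width


# User Y: 19 of the 50 sampled followers were classified as authentic (38%)
low, high = wilson_interval(19, 50)
print(f"Follower authenticity: 38% (roughly {low:.0%} to {high:.0%})")
```

With a sample of 50 the range is fairly wide, which is why the score is better read as a rough indicator than as an exact figure; sampling more followers would narrow it.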

**Conclusion**

The rapid growth of technology allows anyone with a computer to create bots to follow users and like media on any platform. However, this also means that our ability to detect fake engagements must improve as well!

Businesses, small or large, invest in social media influencers to reach a wider audience, especially during a global pandemic when everyone is constantly on their phones! Less tech-savvy and less aware businesses are prone to this kind of misinformation.

For brands that rely on influencers for marketing, it is highly recommended to check out services such as SocialBlade to check user authenticity and engagement. Some services are pricier, but are definitely worth the investment!

--------------------------------------------------------------------------------