├── IGAudit.pdf
├── graph_out.png
├── terminal_out.png
├── IGAudit_mdfiles
│   ├── output_19_1.png
│   ├── output_28_1.png
│   ├── output_33_2.png
│   └── IGAudit.md
├── StatisticalMethod
│   ├── README.md
│   └── instabusted.py
├── README.md
├── test.csv
└── train.csv
/IGAudit.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/IGAudit.pdf
--------------------------------------------------------------------------------
/graph_out.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/graph_out.png
--------------------------------------------------------------------------------
/terminal_out.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/terminal_out.png
--------------------------------------------------------------------------------
/IGAudit_mdfiles/output_19_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/IGAudit_mdfiles/output_19_1.png
--------------------------------------------------------------------------------
/IGAudit_mdfiles/output_28_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/IGAudit_mdfiles/output_28_1.png
--------------------------------------------------------------------------------
/IGAudit_mdfiles/output_33_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/athiyadeviyani/IGAudit/HEAD/IGAudit_mdfiles/output_33_2.png
--------------------------------------------------------------------------------
/StatisticalMethod/README.md:
--------------------------------------------------------------------------------
1 | # InstaBusted
2 |
3 | #FakersGonnaFake: Using Simple Statistical Tools to Audit Instagram Accounts for Authenticity
4 |
5 | ## Sample output
6 |
7 | Terminal output
8 |
9 |
10 |
11 | Graph output
12 |
13 |
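To try it on an account yourself, run the script from this directory (assuming the dependencies listed in the main README are installed): ``$ python instabusted.py``. It will prompt for Instagram login credentials, a target username, and a sample size.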
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # IGAudit
2 | #FakersGonnaFake: Using Simple Statistical Tools and Machine Learning to Audit Instagram Accounts for Authenticity
3 |
4 | **Motivation:** During lockdown, businesses have increasingly turned to social media influencers to market their products while their physical outlets are temporarily closed. Sadly, some influencers will try to game the system for their own gain. In a world where a single influencer's post can be worth as much as an average 9-to-5 Joe's annual salary, fake followers and fake engagement are a price that brands shouldn't have to pay.
5 |
6 | *Inspired by igaudit.io, a very useful tool that was unfortunately taken down by Facebook recently.*
7 |
8 | ## Got your attention? Great!
9 | - If you want to read through the code and the outputs quickly, head over to the [PDF](https://github.com/athiyadeviyani/IGAudit/blob/master/IGAudit.pdf), [Markdown](https://github.com/athiyadeviyani/IGAudit/blob/master/IGAudit_mdfiles/IGAudit.md), or the [HTML](https://igaudit-by-tia.glitch.me/) version of the Jupyter Notebook
10 | - If you want to get your hands dirty and understand the inner workings of the code, make sure you install the following requirements and run the [Jupyter Notebook](https://github.com/athiyadeviyani/IGAudit/blob/master/IGAudit.ipynb)!
11 |
12 | ``$ pip install numpy pandas seaborn matplotlib sklearn``
13 |
``$ pip install git+https://git@github.com/ping/instagram_private_api.git@1.6.0``
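
If the API dependency installed correctly, a minimal login sketch looks like the following. This is only a sanity check, not part of the notebook; the credentials and the example username are placeholders.

```python
from instagram_private_api import Client

api = Client("your_username", "your_password")  # placeholder credentials
# Look up the numeric user ID ('pk') for any public username
print(api.username_info("instagram")["user"]["pk"])
```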
14 |
15 | ## Contents
16 | - Short Introduction
17 | - Part 1: Understanding and Splitting the Data
18 | - Part 2: Comparing Classification Models
19 | - Logistic Regression
20 | - K-Nearest Neighbors
21 | - Decision Trees
22 | - Random Forest
23 | - Part 3: Obtaining Instagram Data
24 | - Getter methods for the API
25 | - Part 4: Preparing the Data
26 | - Understanding the data obtained
27 | - Filtering and extracting relevant information
28 | - Part 5: Make the prediction!
29 | - Part 6: Extension - Fake Likes
30 | - Part 7: Comparison with Another User
31 | - Part 8: Thoughts
32 | - Making sense of the result
33 | - So is this influencer worth investing or not?
34 | - Conclusion
35 |
--------------------------------------------------------------------------------
/StatisticalMethod/instabusted.py:
--------------------------------------------------------------------------------
1 | from instagram_private_api import Client, ClientCompatPatch
2 | from matplotlib import pyplot as plt
3 |
4 | import getpass
5 | import random
6 |
7 |
8 | # INITIAL AUTHENTICATION
9 | def login():
10 | username = input("username: ")
11 | password = getpass.getpass("password: ")
12 | api = Client(username, password)
13 | return api
14 |
15 | api = login()
16 |
17 | def get_ID(username):
18 | return api.username_info(username)['user']['pk']
19 |
20 | def get_followers(userID, rank):
21 | followers = []
22 | next_max_id = True
23 |
24 | while next_max_id:
25 | if next_max_id == True: next_max_id=''
26 | f = api.user_followers(userID, rank, max_id=next_max_id)
27 | followers.extend(f.get('users', []))
28 | next_max_id = f.get('next_max_id', '')
29 |
30 | user_fer = [dic['username'] for dic in followers]
31 |
32 | return user_fer
33 |
34 |
35 | # GET TARGET USER INFORMATION
36 | username = input("Instagram username for analysis: ")
37 |
38 | uid = get_ID(username)
39 | rank = api.generate_uuid()
40 |
41 |
42 | # GET USER'S FOLLOWERS
43 | followers = get_followers(uid, rank)
44 |
45 | print("This user has " + str(len(followers)) + " followers.")
46 |
47 | print("============================")
48 |
49 |
50 | # GENERATE RANDOM SAMPLE (for efficiency)
51 | samples = int(input("Number of random samples (recommended: 1-100): "))
52 | print("Generating " + str(samples) + " random samples for " + username + " followers!")
53 | random_followers = random.sample(followers, samples)
54 |
55 | print("Analyzing random samples...")
56 |
57 |
58 | # START ANALYSIS OF THE RANDOM SAMPLE
59 | suspicious = []
60 | tuples = []
61 |
62 | i = 0
63 | for follower in random_followers:
64 | f_id = get_ID(follower)
65 |     f_followers = api.user_info(f_id)['user'].get('follower_count')
66 |     f_followings = api.user_info(f_id)['user'].get('following_count')
67 |
68 |     tuples.append((f_followers, f_followings))
69 |
70 |     # Smooth zero counts to avoid division by zero in the ratio below
71 |     if f_followers == 0:
72 |         f_followers = 1
73 |
74 |     if f_followings == 0:
75 |         f_followings = 1
76 |
77 |     ratio = f_followings/f_followers
78 |
79 | i += 1
80 |
81 | if (i%10==0):
82 | print(str(i) + " out of " + str(len(random_followers)) + " followers processed.")
83 |
84 | # 'FAKENESS' THRESHOLD
85 |     # e.g. user_x with 100 followers and more than 2,000 followings is flagged 'suspicious'
86 | if ratio > 20:
87 | suspicious.append(follower)
88 |
89 | print(str(len(suspicious)) + " suspicious accounts detected!")
90 |
91 |
92 | # CALCULATE THE OVERALL AUTHENTICITY OF THE USER
93 | percentage_fake = len(suspicious)*100/len(random_followers)
94 |
95 | print(username + " has " + str(100-percentage_fake) + "% authenticity!")
96 |
97 |
98 | # GENERATE THE GRAPH
99 | x = [x[0] for x in tuples]
100 | y = [x[1] for x in tuples]
101 |
102 | f, ax = plt.subplots(figsize=(16,10))
103 | plt.scatter(x,y)
104 | plt.plot([i for i in range(max(max(x), max(y)))],
105 | [i for i in range(max(max(x), max(y)))],
106 | color = 'red',
107 | linewidth = 2, label='following:followers = 1:1'
108 | )
109 | plt.text(2500, 4000, 'following:followers = 1:1', size=14)
110 | plt.title('Following:Followers plot for user:' + username + ' Instagram Followers', size=20)
111 | plt.xlabel('Followers', size=14)
112 | plt.ylabel('Following', size=14)
113 |
114 | plt.show()
115 |
--------------------------------------------------------------------------------
/test.csv:
--------------------------------------------------------------------------------
1 | profile pic,nums/length username,fullname words,nums/length fullname,name==username,description length,external URL,private,#posts,#followers,#follows,fake
2 | 1,0.33,1,0.33,1,30,0,1,35,488,604,0
3 | 1,0,5,0,0,64,0,1,3,35,6,0
4 | 1,0,2,0,0,82,0,1,319,328,668,0
5 | 1,0,1,0,0,143,0,1,273,14890,7369,0
6 | 1,0.5,1,0,0,76,0,1,6,225,356,0
7 | 1,0,1,0,0,0,0,1,6,362,424,0
8 | 1,0,1,0,0,132,0,1,9,213,254,0
9 | 1,0,2,0,0,0,0,1,19,552,521,0
10 | 1,0,2,0,0,96,0,1,17,122,143,0
11 | 1,0,1,0,0,78,0,1,9,834,358,0
12 | 1,0,1,0,0,0,0,1,53,229,492,0
13 | 1,0.14,1,0,0,78,1,1,97,1913,436,0
14 | 1,0.14,2,0,0,61,0,1,17,200,437,0
15 | 1,0.33,2,0,0,45,0,1,8,130,622,0
16 | 1,0.1,2,0,0,43,0,0,60,192,141,0
17 | 1,0,2,0,0,56,0,1,51,498,337,0
18 | 1,0.33,2,0,0,86,0,1,25,96,499,0
19 | 1,0,1,0,0,97,0,1,188,202,605,0
20 | 1,0,3,0,0,46,0,1,590,175,199,0
21 | 1,0,2,0,0,39,0,1,251,223,694,0
22 | 1,0.5,1,0,0,0,0,1,0,189,276,0
23 | 1,0,0,0,0,28,0,1,58,486,862,0
24 | 1,0.22,2,0,0,63,0,1,46,464,367,0
25 | 1,0,2,0,0,24,0,1,19,150,157,0
26 | 1,0.1,1,0,0,4,0,1,250,2983,545,0
27 | 1,0,1,0,0,27,0,1,28,116,138,0
28 | 1,0,1,0,0,137,1,0,1065,155537,1395,0
29 | 1,0,1,0,0,69,0,1,209,248,490,0
30 | 1,0.33,1,0,0,0,0,1,5,348,347,0
31 | 1,0,2,0,0,147,1,0,1879,4021842,5514,0
32 | 1,0,2,0,0,0,0,0,9,366,552,0
33 | 1,0,2,0,0,138,1,0,253,1064,573,0
34 | 1,0,2,0,0,117,1,0,90,81267,963,0
35 | 1,0,2,0,0,0,0,1,8,400,449,0
36 | 1,0.17,1,0,0,0,0,0,12,361,562,0
37 | 1,0,2,0,0,14,0,0,13,228,346,0
38 | 1,0,2,0.3,0,18,0,0,59,855,151,0
39 | 1,0,4,0,0,54,1,0,77,777,148,0
40 | 1,0,1,0,0,0,1,0,14,264,151,0
41 | 1,0.15,2,0,0,73,1,0,330,1572,3504,0
42 | 1,0,0,0,0,47,0,1,2,510,185,0
43 | 1,0,2,0,0,84,1,0,932,1027419,293,0
44 | 1,0,1,0,0,39,1,0,6,710,549,0
45 | 1,0.09,2,0,0,77,0,0,463,2267,466,0
46 | 1,0,1,0,0,33,0,1,62,2055,993,0
47 | 1,0,2,0,0,149,1,1,247,814,1111,0
48 | 1,0,2,0,0,8,0,1,85,668,605,0
49 | 1,0,2,0,0,0,0,0,0,87,40,0
50 | 1,0,2,0,0,18,0,1,118,461,1055,0
51 | 1,0,9,0,0,133,1,0,307,602517,482,0
52 | 1,0,3,0,0,0,0,0,9,62,47,0
53 | 1,0,2,0,0,81,0,1,25,341,274,0
54 | 1,0,2,0,0,4,0,0,42,717,223,0
55 | 1,0,1,0,0,13,0,1,14,386,363,0
56 | 1,0,2,0,0,20,0,0,78,673,568,0
57 | 1,0,7,0,0,24,0,1,465,654,535,0
58 | 1,0,2,0,0,19,0,0,65,751,577,0
59 | 1,0.14,2,0,0,0,0,0,9,209,276,0
60 | 1,0,2,0,0,146,0,0,145,573,474,0
61 | 1,0,2,0,0,0,0,1,1,284,505,0
62 | 0,0.05,1,0.0,0,0,0,0,0,0,2,1
63 | 1,0.27,1,0.0,0,0,0,0,0,45,64,1
64 | 0,0.07,1,0.0,0,0,0,0,0,19,30,1
65 | 0,0.0,1,0.0,1,0,0,0,0,69,694,1
66 | 0,0.0,2,0.0,0,0,0,0,0,22,82,1
67 | 0,0.22,0,0,0,0,0,0,0,31,124,1
68 | 0,0.0,3,0.0,0,0,0,0,0,9,25,1
69 | 0,0.0,1,0.0,1,0,0,0,0,69,694,1
70 | 0,0.0,1,0.0,0,0,0,0,0,23,33,1
71 | 0,0.62,1,0.4,0,0,0,0,1,17,34,1
72 | 0,0.0,2,0.0,0,14,0,0,0,46,38,1
73 | 0,0.42,1,0.0,0,0,0,0,0,16,2,1
74 | 0,0.62,1,0.4,0,0,0,0,0,0,18,1
75 | 0,0.5,1,0.4,0,0,0,0,0,21,1,1
76 | 0,0.31,0,0,0,0,0,0,0,52,15,1
77 | 0,0.27,1,0.0,0,0,0,0,1,24,2,1
78 | 0,0.5,2,0.0,0,0,0,0,0,13,22,1
79 | 0,0.75,1,1.0,0,0,0,0,0,227,353,1
80 | 1,0.43,1,0.0,0,0,0,0,2,10,24,1
81 | 0,0.25,1,0.0,0,0,0,0,0,46,18,1
82 | 1,0.88,1,1.0,0,0,0,0,3,57,22,1
83 | 1,0.89,1,1.0,0,0,0,0,8,341,2287,1
84 | 1,0.0,2,0.0,0,0,0,0,3,1789,6153,1
85 | 1,0.27,1,0.0,0,0,0,0,0,45,64,1
86 | 0,0.0,1,0.0,0,0,0,0,0,21,31,1
87 | 1,0.56,1,0.56,1,20,0,0,34,309,250,1
88 | 1,0.0,3,0.0,0,58,0,0,4,1742,6172,1
89 | 0,0.14,1,0.0,0,0,0,0,0,1906,2129,1
90 | 1,0.0,5,0.0,0,0,0,0,1,39,324,1
91 | 0,0.5,0,0,0,0,0,0,1,17,33,1
92 | 1,0.0,2,0.0,0,0,0,0,1,62,126,1
93 | 1,0.5,1,0.0,0,0,0,0,0,119,350,1
94 | 0,0.33,1,0.0,0,0,0,0,0,2997,764,1
95 | 1,0.0,0,0,0,0,0,0,15,772,3239,1
96 | 1,0.0,1,0.0,0,0,0,0,4,129,920,1
97 | 1,0.38,2,0.25,0,0,0,0,70,94,105,1
98 | 0,0.0,1,0.0,0,0,0,0,2,37,58,1
99 | 1,0.33,1,0.0,0,0,0,0,81,75,55,1
100 | 0,0.31,1,0.33,0,0,0,0,0,42,175,1
101 | 1,0.67,1,0.5,0,0,0,0,5,145,202,1
102 | 1,0.38,1,0.0,0,0,0,0,21,128,636,1
103 | 1,0.38,1,0.0,0,0,0,0,3,88,72,1
104 | 0,0.0,0,0,0,0,0,0,2,1987,7453,1
105 | 1,0.48,1,0.0,0,0,0,0,16,100,162,1
106 | 1,0.57,1,0.0,0,0,0,0,3,214,829,1
107 | 1,0.88,1,1.0,0,0,0,0,23,227,776,1
108 | 1,0.89,2,0.0,0,0,0,0,6,192,942,1
109 | 0,0.44,1,0.44,1,112,0,0,4,415,1445,1
110 | 0,0.0,1,0.0,0,0,0,0,0,926,4239,1
111 | 1,0.44,1,0.0,0,0,0,0,60,238,1381,1
112 | 1,0.0,1,0.0,0,0,0,0,1,193,669,1
113 | 1,0.0,1,0.25,0,0,0,0,0,49,235,1
114 | 0,0.44,1,0.0,0,0,0,0,0,13,7,1
115 | 1,0.45,1,0.4,0,0,0,0,2,74,270,1
116 | 1,0.33,1,0.0,0,0,0,0,8,88,76,1
117 | 1,0.29,1,0.0,0,0,0,0,13,114,811,1
118 | 1,0.4,1,0.0,0,0,0,0,4,150,164,1
119 | 1,0.0,2,0.0,0,0,0,0,3,833,3572,1
120 | 0,0.17,1,0.0,0,0,0,0,1,219,1695,1
121 | 1,0.44,1,0.0,0,0,0,0,3,39,68,1
--------------------------------------------------------------------------------
/train.csv:
--------------------------------------------------------------------------------
1 | profile pic,nums/length username,fullname words,nums/length fullname,name==username,description length,external URL,private,#posts,#followers,#follows,fake
2 | 1,0.27,0,0,0,53,0,0,32,1000,955,0
3 | 1,0,2,0,0,44,0,0,286,2740,533,0
4 | 1,0.1,2,0,0,0,0,1,13,159,98,0
5 | 1,0,1,0,0,82,0,0,679,414,651,0
6 | 1,0,2,0,0,0,0,1,6,151,126,0
7 | 1,0,4,0,0,81,1,0,344,669987,150,0
8 | 1,0,2,0,0,50,0,0,16,122,177,0
9 | 1,0,2,0,0,0,0,0,33,1078,76,0
10 | 1,0,0,0,0,71,0,0,72,1824,2713,0
11 | 1,0,2,0,0,40,1,0,213,12945,813,0
12 | 1,0,2,0,0,54,0,0,648,9884,1173,0
13 | 1,0,2,0,0,54,1,0,76,1188,365,0
14 | 1,0,2,0,0,0,1,0,298,945,583,0
15 | 1,0,2,0,0,103,1,0,117,12033,248,0
16 | 1,0,2,0,0,98,1,0,487,1962,2701,0
17 | 1,0,3,0,0,46,0,0,254,50374,900,0
18 | 1,0,3,0,0,0,0,0,59,7007,289,0
19 | 1,0.29,3,0,0,48,0,0,1570,1128,694,0
20 | 1,0,2,0,0,63,1,0,378,34670,1878,0
21 | 1,0,2,0,0,106,1,0,526,2338,776,0
22 | 1,0,2,0,0,40,0,0,228,3516,999,0
23 | 1,0,1,0,0,35,1,1,35,1809,416,0
24 | 1,0,2,0,0,30,0,0,281,427,470,0
25 | 1,0,1,0,0,27,0,0,285,759,956,0
26 | 1,0,0,0,0,0,0,0,148,15338538,61,0
27 | 1,0,1,0,0,109,1,1,57,109,179,0
28 | 1,0,6,0,0,0,0,1,17,536,665,0
29 | 1,0,2,0,0,132,1,0,511,121354,176,0
30 | 1,0,2,0,0,126,1,0,230,2284,130,0
31 | 1,0,2,0,0,122,0,1,15,186,174,0
32 | 1,0,2,0,0,138,0,1,980,687,1517,0
33 | 1,0.13,0,0,0,0,0,1,53,966,952,0
34 | 1,0,2,0,0,50,0,1,111,177,170,0
35 | 1,0,2,0,0,35,0,0,719,744,967,0
36 | 1,0,2,0,0,56,1,0,1164,542073,674,0
37 | 1,0.18,2,0,0,9,0,0,497,5315651,2703,0
38 | 1,0.33,0,0,0,0,0,1,18,267,328,0
39 | 1,0,2,0,0,81,0,0,50,691,680,0
40 | 1,0,2,0,0,134,0,1,74,120,112,0
41 | 1,0,2,0,0,0,0,0,8,105,98,0
42 | 1,0,0,0,0,2,0,0,7389,890969,11,0
43 | 1,0,2,0,0,0,1,0,420,361853,583,0
44 | 1,0,2,0,0,23,0,0,433,3678,1359,0
45 | 1,0,2,0,0,138,1,0,156,92192,16,0
46 | 1,0,4,0,0,35,0,0,4494,12397719,8,0
47 | 1,0,3,0,0,93,0,0,751,380510,0,0
48 | 1,0,2,0,0,4,0,1,4,132,183,0
49 | 1,0,2,0,0,1,0,1,27,162,208,0
50 | 1,0,1,0,0,4,0,0,91,369,546,0
51 | 1,0,0,0,0,23,0,0,262,1476,666,0
52 | 1,0,3,0,0,91,1,0,274,1798,461,0
53 | 1,0,2,0,0,57,0,0,271,2118,1109,0
54 | 1,0,1,0,0,108,1,0,713,812,432,0
55 | 1,0,2,0.12,0,30,1,0,200,7217,761,0
56 | 1,0,0,0,0,82,0,0,12,313,376,0
57 | 1,0.12,1,0,0,12,1,0,26,64,261,0
58 | 1,0,2,0,0,54,0,0,75,1759,643,0
59 | 1,0,1,0,0,0,0,1,94,404,283,0
60 | 1,0,1,0,0,12,0,0,63,1843,598,0
61 | 1,0.12,2,0,0,0,0,0,69,320377,228,0
62 | 1,0,1,0,0,3,0,0,12,108,97,0
63 | 1,0,1,0,0,39,1,0,63,384,447,0
64 | 1,0.19,2,0,0,0,0,0,19,60,100,0
65 | 1,0,1,0,0,68,1,0,100,802,151,0
66 | 1,0,2,0,0,129,1,0,661,51145,528,0
67 | 1,0,2,0,0,57,1,0,149,1582,1882,0
68 | 1,0,2,0,0,64,0,0,22,223,266,0
69 | 1,0,1,0,0,42,0,0,400,18842,744,0
70 | 1,0,2,0,0,71,1,0,149,10240,1255,0
71 | 1,0,3,0,0,0,0,0,122,539,639,0
72 | 1,0.33,2,0,0,70,0,0,74,399,452,0
73 | 1,0,1,0,0,74,0,0,13,581,568,0
74 | 1,0,1,0,0,8,0,1,8,166,163,0
75 | 1,0,2,0,0,35,0,0,77,417,362,0
76 | 1,0.2,2,0,0,0,0,1,5,266,324,0
77 | 1,0,1,0,0,0,0,1,3,33,37,0
78 | 1,0,2,0,0,28,0,1,106,494,998,0
79 | 1,0,2,0,0,18,0,1,14,178,245,0
80 | 1,0,3,0,0,28,0,0,172,470,288,0
81 | 1,0.33,1,0,0,36,0,0,111,807,675,0
82 | 1,0,2,0,0,2,0,0,38,162,256,0
83 | 1,0.06,2,0,0,11,0,1,19,417,395,0
84 | 1,0,3,0,0,70,1,0,227,17303,360,0
85 | 1,0,2,0,0,29,0,0,221,1439,629,0
86 | 1,0,2,0,0,24,1,0,580,91446,526,0
87 | 1,0,3,0,0,21,1,0,40,824,489,0
88 | 1,0,1,0,0,81,0,0,101,741,1440,0
89 | 1,0,1,0,0,34,1,1,157,1267,899,0
90 | 1,0,0,0,0,40,0,0,197,4594,1713,0
91 | 1,0,1,0,0,12,0,0,61,1135,899,0
92 | 1,0,2,0,0,0,0,1,698,1926,1410,0
93 | 1,0,2,0,0,59,0,1,49,1068,1925,0
94 | 1,0,1,0,0,15,0,1,85,815,748,0
95 | 1,0,2,0,0,54,0,0,77,565,469,0
96 | 1,0,1,0,0,16,0,0,58,2556,1074,0
97 | 1,0,2,0,0,73,0,1,232,1312,935,0
98 | 1,0,1,0,0,24,0,0,20,699,599,0
99 | 1,0,0,0,0,0,0,0,98,4328,418,0
100 | 1,0,0,0,0,26,1,0,559,2487,999,0
101 | 1,0,3,0,0,0,0,0,189,673,438,0
102 | 1,0,2,0,0,0,0,1,388,730,413,0
103 | 1,0,0,0,0,0,0,1,28,59,55,0
104 | 1,0,3,0,0,0,0,1,156,289,222,0
105 | 1,0,1,0,0,28,0,1,6,19,20,0
106 | 1,0,2,0,0,55,1,0,69,3551,173,0
107 | 1,0,1,0,0,140,1,0,775,19512,591,0
108 | 1,0,0,0,0,122,1,0,205,2756,638,0
109 | 1,0,2,0,0,113,1,0,209,5406,589,0
110 | 1,0,2,0,0,38,0,1,334,459,390,0
111 | 1,0,2,0,0,0,0,0,9,218,75,0
112 | 1,0,2,0,0,23,0,0,6,796,1155,0
113 | 1,0,1,0,0,0,0,0,416,1113,1854,0
114 | 1,0.2,1,0,0,89,0,1,13,138,208,0
115 | 1,0.44,1,0,0,30,0,1,33,205,164,0
116 | 1,0,1,0,0,0,0,0,1,331,333,0
117 | 1,0.1,1,0.1,0,0,0,0,711,748,4659,0
118 | 1,0,1,0,0,0,0,0,114,490,1093,0
119 | 1,0,1,0.08,0,12,0,1,16,456,413,0
120 | 1,0,0,0,0,123,1,0,107,971,2047,0
121 | 1,0.24,1,0.24,0,0,0,0,5,497,438,0
122 | 1,0.14,2,0,0,0,0,1,7,99,132,0
123 | 1,0,0,0,0,0,0,1,21,193,413,0
124 | 1,0,2,0,0,40,0,0,65,492,689,0
125 | 1,0.29,2,0,0,0,0,0,10,167,164,0
126 | 1,0,2,0,0,33,0,1,59,99,178,0
127 | 1,0,0,0,0,0,0,1,137,916,1142,0
128 | 1,0,2,0,0,5,0,1,10,196,209,0
129 | 1,0,1,0,0,0,0,1,571,765,424,0
130 | 1,0.18,2,0,0,23,0,1,24,45,80,0
131 | 1,0,1,0,0,35,1,0,328,634,719,0
132 | 1,0,4,0,0,150,1,0,161,1383,7500,0
133 | 1,0,1,0,0,26,0,1,280,650,703,0
134 | 1,0,2,0,0,149,0,1,92,484,3296,0
135 | 1,0,3,0,0,129,0,0,31,200,270,0
136 | 1,0,2,0,0,0,0,0,0,192,65,0
137 | 1,0,0,0,0,18,0,1,25,553,610,0
138 | 1,0,2,0,0,74,1,0,921,27477,7202,0
139 | 1,0,2,0,0,0,1,1,1020,464,1039,0
140 | 1,0,1,0,0,59,0,0,301,1057,524,0
141 | 1,0,3,0,0,148,0,0,9,413,138,0
142 | 1,0,0,0,0,0,1,0,158,389,806,0
143 | 1,0.29,1,0,0,15,0,0,43,505,503,0
144 | 1,0,12,0,0,46,0,1,24,941,1208,0
145 | 1,0,2,0,0,5,0,1,60,2598,802,0
146 | 1,0,1,0,0,98,0,0,4,59,111,0
147 | 1,0,2,0,0,55,0,0,220,622,475,0
148 | 1,0,2,0,0,19,1,0,1159,2719,1061,0
149 | 1,0.36,5,0,0,71,0,0,8,216,305,0
150 | 1,0.22,2,0,0,133,0,0,396,881,375,0
151 | 1,0.08,2,0,0,150,1,0,38,870,72,0
152 | 1,0,2,0,0,43,0,1,1,265,371,0
153 | 1,0.15,2,0,0,37,0,1,2,1204,633,0
154 | 1,0,1,0,0,35,0,1,43,655,1016,0
155 | 1,0,2,0,0,87,0,0,131,1662,1065,0
156 | 1,0.14,2,0,0,59,0,0,7,14222,7399,0
157 | 1,0,2,0,0,0,0,0,57,483,1216,0
158 | 1,0,3,0,0,0,0,0,36,1204,2928,0
159 | 1,0,2,0,0,9,0,1,91,408,635,0
160 | 1,0,2,0,0,12,0,1,8,303,417,0
161 | 1,0.1,0,0,0,0,0,0,11,125,101,0
162 | 1,0.09,1,0,0,0,0,0,252,229,383,0
163 | 1,0.15,3,0,0,95,1,1,15,357,489,0
164 | 1,0,1,0,0,0,0,0,74,137,96,0
165 | 1,0,2,0,0,46,0,1,83,255,535,0
166 | 1,0,3,0,0,123,0,1,126,87,399,0
167 | 1,0,6,0,0,117,1,0,663,326946,3,0
168 | 1,0,1,0,0,26,0,1,0,114,446,0
169 | 1,0,0,0,0,0,0,1,10,167,387,0
170 | 1,0,2,0,0,58,0,1,8,1247,1196,0
171 | 1,0,1,0,0,0,0,0,24,585,1364,0
172 | 1,0,1,0,0,30,0,1,65,135,232,0
173 | 1,0,3,0,0,62,1,0,64,722,261,0
174 | 1,0.45,1,0.25,0,137,1,0,664,714,1159,0
175 | 1,0,2,0,0,149,1,0,130,39867,4664,0
176 | 0,0,0,0,0,14,0,1,131,533,1060,0
177 | 1,0,2,0,0,19,0,1,917,1158,3932,0
178 | 1,0,10,0,0,131,1,0,2,45834,280,0
179 | 1,0,2,0,0,0,0,0,142,876,529,0
180 | 1,0.09,5,0,0,5,0,0,165,3003,455,0
181 | 1,0.09,2,0,0,0,0,0,80,1298,407,0
182 | 1,0,2,0,0,11,0,1,32,3800,278,0
183 | 1,0,2,0,0,0,0,1,81,1358,127,0
184 | 1,0,2,0,0,27,1,0,334,6741307,111,0
185 | 1,0.38,2,0,0,10,1,0,8,791,456,0
186 | 1,0,2,0,0,72,1,0,373,732075,363,0
187 | 1,0,2,0,0,3,0,1,56,373,360,0
188 | 1,0,3,0,0,51,0,1,7,162,311,0
189 | 1,0,1,0,0,44,0,1,12,309,364,0
190 | 1,0,2,0,0,0,0,0,5,135,176,0
191 | 1,0,2,0,0,73,1,0,77,244,172,0
192 | 1,0.18,1,0,0,70,0,1,93,67,149,0
193 | 1,0,2,0,0,35,0,0,192,984,76,0
194 | 1,0,2,0,0,13,0,1,8,751,1223,0
195 | 1,0,4,0.25,0,105,0,1,145,781,529,0
196 | 1,0,1,0.33,0,91,0,0,135,1761,905,0
197 | 1,0.14,2,0,0,0,0,1,18,318,523,0
198 | 1,0,2,0,0,48,0,0,222,5282,652,0
199 | 1,0,2,0,0,48,0,0,222,5282,652,0
200 | 1,0,2,0,0,126,1,1,5,228,238,0
201 | 1,0,2,0,0,0,1,1,119,393,502,0
202 | 1,0,2,0,0,53,0,0,201,875,754,0
203 | 1,0,0,0,0,8,0,1,12,173,373,0
204 | 1,0,1,0,0,67,0,0,301,3896490,351,0
205 | 1,0,1,0,0,20,0,0,20,106,133,0
206 | 1,0.08,2,0,0,26,1,0,112,206,32,0
207 | 1,0,0,0,0,86,0,1,54,259,1371,0
208 | 1,0,1,0,0,51,0,1,98,1013,996,0
209 | 1,0.11,4,0,0,26,1,0,16,738,544,0
210 | 1,0,0,0,0,18,0,1,133,1008,517,0
211 | 1,0,2,0,0,96,1,0,142,2441,396,0
212 | 1,0,2,0,0,17,0,0,63,416,292,0
213 | 1,0,2,0,0,0,0,1,70,516,463,0
214 | 1,0,2,0,0,62,1,0,403,8711,345,0
215 | 1,0,2,0,0,86,0,0,15,433,584,0
216 | 1,0,2,0,0,148,1,0,990,18515,1000,0
217 | 1,0,2,0,0,1,0,1,12,70,67,0
218 | 1,0.22,3,0,0,39,0,0,26,5863,157,0
219 | 1,0.36,2,0,0,35,0,0,2,1677,716,0
220 | 1,0,0,0,0,103,0,0,24,617,272,0
221 | 1,0,2,0,0,0,1,0,411,31182,414,0
222 | 1,0,2,0,0,61,0,1,217,1152,292,0
223 | 1,0.15,1,0,0,44,0,0,156,8578,164,0
224 | 1,0,1,0,0,0,0,0,389,4347,748,0
225 | 1,0,1,0,0,0,0,1,8,319,335,0
226 | 1,0,1,0,0,0,0,1,3,189,177,0
227 | 1,0,1,0,0,112,0,1,22,743,331,0
228 | 1,0.18,1,0,0,123,0,0,59,11204,124,0
229 | 1,0,3,0,0,24,0,1,144,419,271,0
230 | 1,0,2,0,0,34,0,0,2,81,268,0
231 | 1,0,1,0,0,19,0,0,78,947,582,0
232 | 1,0,2,0,0,0,0,0,28,206,333,0
233 | 1,0,2,0,0,42,0,0,111,541,701,0
234 | 1,0.18,3,0,0,50,0,0,6,392,237,0
235 | 1,0.22,5,0,0,67,0,0,4,4177,3678,0
236 | 1,0.17,2,0,0,134,1,0,31,265,321,0
237 | 1,0,2,0,0,101,0,0,86,272,1486,0
238 | 1,0,2,0,0,0,0,0,240,425,917,0
239 | 1,0,3,0,0,0,0,1,11,150,178,0
240 | 1,0,2,0,0,17,0,1,44,711,673,0
241 | 1,0,2,0,0,0,0,1,9,89,121,0
242 | 1,0,2,0,0,32,0,1,62,742,653,0
243 | 1,0,2,0,0,0,0,0,0,96,50,0
244 | 1,0,1,0,1,80,1,1,15,77,107,0
245 | 1,0,2,0,0,2,0,1,53,195,395,0
246 | 1,0,2,0,0,0,0,1,2,9,13,0
247 | 1,0,1,0,0,146,1,0,52,10794,3164,0
248 | 1,0.19,0,0,0,0,0,0,1,104,15,0
249 | 1,0,0,0,0,0,1,1,62,318,569,0
250 | 1,0.18,1,0,0,0,0,0,247,355,137,0
251 | 1,0.31,1,0,0,0,0,1,9,99,488,0
252 | 1,0,0,0,0,6,0,0,13,300,372,0
253 | 1,0.15,1,0,0,0,0,0,204,139,61,0
254 | 1,0,2,0,0,0,0,1,0,13,77,0
255 | 1,0.22,1,0.14,0,0,0,1,51,606,430,0
256 | 1,0,1,0,0,64,0,0,15,428,333,0
257 | 1,0,1,0,0,0,0,0,108,494,545,0
258 | 1,0.24,0,0,0,0,0,1,353,1261,2187,0
259 | 1,0,1,0,0,0,0,1,5,68,87,0
260 | 1,0,2,0,0,49,1,0,560,205558,53,0
261 | 1,0,2,0,0,23,0,0,85,668,609,0
262 | 1,0,2,0,0,120,1,0,95,1456,555,0
263 | 1,0.14,2,0,0,34,0,0,14,410,387,0
264 | 1,0,3,0,0,25,0,1,52,298,555,0
265 | 1,0,2,0,0,0,0,1,7,254,345,0
266 | 1,0,2,0,0,12,0,0,197,1070,1072,0
267 | 1,0.17,2,0,0,0,0,1,89,1167,618,0
268 | 1,0,2,0,0,9,0,1,34,335,300,0
269 | 1,0.08,2,0,0,1,0,0,283,346,426,0
270 | 1,0,2,0,0,18,0,1,65,1746,1631,0
271 | 1,0,2,0,0,34,0,1,126,268,370,0
272 | 1,0,3,0,0,23,0,1,327,537,1002,0
273 | 1,0,1,0,0,19,0,1,42,805,488,0
274 | 1,0,2,0,0,139,0,0,36,1504,52,0
275 | 1,0,2,0,0,13,0,1,3,104,119,0
276 | 1,0,2,0,0,50,0,0,53,380,462,0
277 | 1,0,3,0,0,46,1,1,7,257,346,0
278 | 1,0,1,0,0,30,0,1,103,1775,7500,0
279 | 1,0.3,2,0,0,26,0,1,241,1456,1200,0
280 | 1,0,0,0,0,0,0,1,103,265,264,0
281 | 1,0,1,0,0,0,0,1,93,1051,694,0
282 | 1,0.11,1,0,0,0,0,0,1,77,34,0
283 | 0,0,1,0,0,27,0,1,16,220,323,0
284 | 1,0.07,2,0,0,37,1,1,32,178,81,0
285 | 1,0,1,0,0,31,0,1,1232,728,1213,0
286 | 1,0,1,0,0,20,0,1,75,668,294,0
287 | 1,0,2,0,0,7,0,0,5,406,408,0
288 | 1,0,2,0,0,0,0,0,3,37,22,0
289 | 1,0,2,0,0,0,0,1,30,56,114,0
290 | 0,0.22,1,0.0,0,0,0,0,0,90,333,1
291 | 0,0.38,1,0.0,0,0,0,0,0,60,31,1
292 | 0,0.43,1,0.0,0,0,0,1,2,271,445,1
293 | 1,0.0,0,0,0,0,0,1,3,1,80,1
294 | 1,0.5,3,0.0,0,24,0,1,13,158,309,1
295 | 0,0.31,2,0.0,0,0,0,0,0,39,64,1
296 | 0,0.22,1,0.22,0,43,0,1,1,0,11,1
297 | 0,0.0,1,0.0,0,0,0,1,0,0,853,1
298 | 1,0.25,1,0.0,0,0,0,1,10,0,23,1
299 | 0,0.0,2,0.0,0,0,0,0,0,12,5,1
300 | 0,0.33,1,0.0,0,0,0,0,0,10,18,1
301 | 0,0.33,1,0.0,0,0,0,1,0,1,34,1
302 | 0,0.0,1,0.0,0,0,0,0,0,1,8,1
303 | 0,0.0,0,0,0,0,0,1,0,0,0,1
304 | 0,0.12,1,0.0,0,0,0,1,0,31,213,1
305 | 0,0.18,1,0.0,0,0,0,1,0,5,10,1
306 | 0,0.5,1,0.0,0,0,0,1,5,18,58,1
307 | 1,0.5,1,0.0,0,0,0,1,0,12,77,1
308 | 0,0.5,1,0.0,0,0,0,1,0,6,37,1
309 | 1,0.57,1,0.0,0,0,0,0,0,47,10,1
310 | 0,0.45,3,0.0,0,0,0,0,0,33,4,1
311 | 0,0.5,1,0.0,0,0,0,1,141,4,279,1
312 | 0,0.57,1,0.0,0,0,0,1,6,5,15,1
313 | 1,0.0,1,0.0,0,0,0,1,1,0,44,1
314 | 0,0.44,1,0.43,0,0,0,1,0,5,17,1
315 | 0,0.0,1,0.0,0,0,0,1,1,107,42,1
316 | 1,0.88,1,0.0,0,0,0,1,39,8,60,1
317 | 0,0.0,1,0.0,0,0,0,0,0,48,6,1
318 | 0,0.22,0,0,0,0,0,1,9,2,215,1
319 | 0,0.0,2,0.0,0,0,0,1,0,51,126,1
320 | 1,0.0,1,0.0,0,43,0,1,21,5,48,1
321 | 0,0.0,1,0.0,0,0,0,1,0,0,802,1
322 | 0,0.55,1,0.0,0,0,0,0,0,26,46,1
323 | 0,0.07,1,0.0,0,0,0,1,0,0,601,1
324 | 1,0.8,1,0.0,0,13,0,1,3,76,168,1
325 | 0,0.0,1,0.4,0,0,0,1,0,165,689,1
326 | 0,0.0,1,0.0,0,0,0,0,0,115,230,1
327 | 0,0.31,1,0.36,0,0,0,0,0,6,15,1
328 | 0,0.4,1,0.0,0,0,0,0,0,24,49,1
329 | 0,0.33,1,0.0,0,0,0,1,0,40,236,1
330 | 0,0.0,1,0.0,0,0,0,1,6,32,52,1
331 | 0,0.33,1,0.33,1,0,0,1,0,0,35,1
332 | 0,0.36,1,0.0,0,0,0,0,0,21,44,1
333 | 1,0.55,1,0.29,0,0,0,1,1,79,767,1
334 | 1,0.41,1,0.0,0,0,0,1,4,8,37,1
335 | 1,0.43,0,0,0,0,0,0,0,49,79,1
336 | 0,0.11,1,0.11,1,0,0,1,0,31,91,1
337 | 1,0.0,1,0.0,1,0,0,1,0,15,41,1
338 | 1,0.25,1,0.0,0,0,0,0,0,4,3,1
339 | 1,0.44,1,0.0,0,0,0,1,0,0,229,1
340 | 1,0.3,0,0,0,0,0,1,17,316,1165,1
341 | 0,0.1,1,0.0,0,0,0,1,0,0,0,1
342 | 0,0.0,1,0.0,0,0,0,0,1,3,1,1
343 | 0,0.0,2,0.0,0,0,0,0,0,2,30,1
344 | 1,0.29,2,0.0,0,9,0,1,7,221,244,1
345 | 0,0.0,1,0.0,0,0,0,0,13,181,935,1
346 | 0,0.0,2,0.0,0,0,0,1,0,9,167,1
347 | 0,0.67,1,0.0,0,0,0,0,0,2,0,1
348 | 1,0.31,1,0.31,1,0,0,1,0,25,86,1
349 | 1,0.0,3,0.0,0,18,0,0,5,26,129,1
350 | 1,0.89,1,0.89,0,0,0,1,124,15,305,1
351 | 1,0.0,1,0.0,0,10,0,0,9,59,48,1
352 | 0,0.2,1,0.2,1,61,0,1,5,7,47,1
353 | 0,0.12,1,0.0,0,0,0,1,2,40,179,1
354 | 0,0.0,1,0.0,0,0,0,0,0,51,41,1
355 | 1,0.0,5,0.0,0,0,0,0,150,133,750,1
356 | 1,0.0,0,0,0,0,0,1,29,25,39,1
357 | 1,0.18,1,0.0,0,0,0,1,108,864,3646,1
358 | 1,0.0,0,0,0,0,0,0,77,73,35,1
359 | 1,0.38,1,0.4,0,0,0,1,3,184,170,1
360 | 1,0.43,1,0.0,0,0,0,1,7,161,333,1
361 | 1,0.22,1,0.22,1,0,0,1,3,42,694,1
362 | 1,0.0,1,0.0,0,22,0,1,131,279,1124,1
363 | 1,0.0,1,0.0,0,0,0,0,40,219,155,1
364 | 1,0.24,1,0.24,1,0,0,1,4,18,106,1
365 | 1,0.44,1,0.0,0,0,0,1,84,106,171,1
366 | 1,0.5,0,0,0,2,0,1,299,34,108,1
367 | 0,0.0,1,0.0,0,0,0,0,11,42,26,1
368 | 1,0.16,0,0,0,146,0,1,0,1,13,1
369 | 0,0.89,0,0,0,0,0,1,0,38,68,1
370 | 1,0.58,1,0.36,0,6,0,1,1,34,44,1
371 | 1,0.29,1,0.0,0,50,0,1,4,23,151,1
372 | 1,0.3,1,0.0,0,0,0,1,0,23,37,1
373 | 0,0.5,2,0.0,0,0,0,0,0,83,139,1
374 | 0,0.4,1,0.0,0,0,0,0,0,25,28,1
375 | 1,0.0,0,0,0,39,0,1,3,13,38,1
376 | 0,0.25,1,0.0,0,0,0,0,0,64,60,1
377 | 0,0.36,1,0.0,0,0,0,0,0,90,69,1
378 | 0,0.33,2,0.0,0,0,0,0,0,82,6,1
379 | 0,0.0,0,0,0,0,0,0,0,140,319,1
380 | 0,0.36,1,0.0,0,0,0,1,0,0,29,1
381 | 1,0.3,1,0.0,0,5,0,1,1,51,420,1
382 | 1,0.12,1,0.0,0,91,0,0,7,38,87,1
383 | 0,0.33,1,0.38,0,2,0,1,3,21,733,1
384 | 1,0.38,1,0.0,0,0,0,1,0,5,121,1
385 | 1,0.4,1,0.0,0,0,0,1,0,21,45,1
386 | 0,0.57,0,0,0,0,0,1,5,124,249,1
387 | 0,0.22,1,0.0,0,0,0,1,0,5,56,1
388 | 1,0.38,1,0.0,0,37,0,1,1,2,16,1
389 | 0,0.64,2,0.0,0,0,0,0,0,42,46,1
390 | 1,0.33,1,0.33,0,0,0,1,12,48,45,1
391 | 0,0.31,1,0.31,1,0,0,0,0,32,25,1
392 | 0,0.0,1,0.0,0,0,0,1,0,0,71,1
393 | 0,0.25,1,0.0,0,0,0,0,0,0,0,1
394 | 1,0.12,1,0.0,0,0,0,1,1,10,104,1
395 | 1,0.38,1,0.0,0,0,0,1,92,19,31,1
396 | 0,0.31,1,0.0,0,0,0,1,0,138,99,1
397 | 0,0.2,0,0,0,0,0,1,0,11,157,1
398 | 0,0.0,1,0.0,0,0,0,1,0,1,64,1
399 | 0,0.42,1,0.0,0,0,0,0,0,39,40,1
400 | 1,0.0,0,0,0,148,0,0,21,446,1762,1
401 | 1,0.57,1,0.0,0,0,0,1,7,9,82,1
402 | 0,0.0,0,0,0,0,0,0,9,589,2980,1
403 | 0,0.06,1,0.0,0,0,0,0,0,27,63,1
404 | 1,0.0,2,0.0,0,0,0,0,14,143,500,1
405 | 0,0.0,2,0.0,0,0,0,1,0,0,177,1
406 | 0,0.0,3,0.0,0,0,0,1,0,0,130,1
407 | 0,0.21,1,0.0,0,0,0,0,3,124,135,1
408 | 1,0.07,0,0,0,0,0,1,0,12,192,1
409 | 0,0.42,1,0.0,0,0,0,0,0,16,36,1
410 | 0,0.21,1,0.0,0,0,0,0,1,1,2,1
411 | 0,0.38,1,0.0,0,0,0,0,1,120,181,1
412 | 0,0.0,2,0.0,0,0,0,1,0,0,71,1
413 | 0,0.0,1,0.0,0,0,0,0,1,26,8,1
414 | 1,0.14,3,0.0,0,0,0,0,0,57,167,1
415 | 0,0.15,2,0.0,0,0,0,0,1,13,22,1
416 | 1,0.0,4,0.0,0,149,0,0,28,1031,208,1
417 | 1,0.18,1,0.0,0,0,0,0,0,42,146,1
418 | 1,0.33,2,0.0,0,0,0,0,25,70,64,1
419 | 1,0.18,1,0.0,0,0,0,1,0,0,8,1
420 | 1,0.3,1,0.0,0,0,0,0,0,46,92,1
421 | 1,0.43,2,0.0,0,0,0,0,111,834,4118,1
422 | 0,0.0,2,0.0,0,22,0,1,2,389,392,1
423 | 1,0.33,0,0,0,0,0,0,0,43,221,1
424 | 0,0.33,1,0.0,0,0,0,1,0,23,40,1
425 | 0,0.0,1,0.0,0,0,0,1,0,5,82,1
426 | 0,0.71,0,0,0,0,0,1,0,4,25,1
427 | 0,0.5,1,0.0,0,0,0,0,0,17,44,1
428 | 1,0.25,1,0.0,0,2,0,1,5,182,227,1
429 | 1,0.33,1,0.0,0,0,0,0,10,108,304,1
430 | 0,0.5,1,0.33,0,0,0,0,3,43,123,1
431 | 0,0.29,1,0.0,0,0,0,0,19,77,8,1
432 | 0,0.09,1,0.0,0,0,0,0,0,32,87,1
433 | 0,0.25,1,0.0,0,0,0,1,0,0,11,1
434 | 1,0.13,1,0.0,0,0,0,1,8,40,66,1
435 | 0,0.33,1,0.33,1,20,0,0,2,53,303,1
436 | 1,0.14,2,0.0,0,0,0,0,10,67,90,1
437 | 1,0.0,3,0.0,0,0,0,0,29,122,336,1
438 | 0,0.0,2,0.0,0,0,0,0,0,18,49,1
439 | 0,0.4,1,0.0,0,0,0,1,0,24,88,1
440 | 1,0.0,1,0.0,0,0,0,0,2,35,136,1
441 | 1,0.4,1,0.4,1,0,0,1,3,119,151,1
442 | 1,0.1,1,0.0,0,0,0,1,5,6,24,1
443 | 0,0.43,1,0.4,0,0,0,0,0,34,26,1
444 | 1,0.0,3,0.0,0,148,0,0,63,272,295,1
445 | 1,0.44,2,0.0,0,0,0,0,1,53,137,1
446 | 1,0.33,0,0,0,50,0,1,4,818,618,1
447 | 1,0.0,3,0.0,0,0,0,0,34,82,74,1
448 | 1,0.38,2,0.0,0,2,0,0,0,40,233,1
449 | 1,0.4,0,0,0,0,0,1,0,21,76,1
450 | 1,0.33,1,0.0,0,0,0,0,6,59,38,1
451 | 1,0.12,1,0.0,0,0,0,0,2,102,109,1
452 | 1,0.29,0,0,0,0,0,1,7,576,474,1
453 | 0,0.57,2,0.4,0,0,0,0,0,19,16,1
454 | 1,0.0,0,0,0,0,0,1,0,66,161,1
455 | 1,0.0,1,0.0,0,0,0,1,5,310,894,1
456 | 1,0.0,1,0.0,1,34,0,1,0,16,17,1
457 | 0,0.0,1,0.0,0,0,0,0,0,15,4,1
458 | 1,0.07,1,0.0,0,0,0,1,0,47,98,1
459 | 1,0.0,1,0.0,1,0,0,0,4,88,61,1
460 | 1,0.27,1,0.0,0,0,0,1,1,51,59,1
461 | 1,0.0,2,0.0,0,0,0,0,26,505,2330,1
462 | 1,0.36,1,0.0,0,0,0,0,9,159,433,1
463 | 1,0.29,1,0.33,0,0,0,1,4,77,1269,1
464 | 1,0.83,1,0.0,0,32,0,0,4,61,76,1
465 | 0,0.67,1,0.0,0,0,0,1,0,7,14,1
466 | 1,0.38,1,0.33,0,0,0,1,0,11,60,1
467 | 0,0.58,1,0.0,0,0,0,0,0,49,22,1
468 | 0,0.0,1,0.0,0,0,0,0,0,15,595,1
469 | 1,0.2,2,0.0,0,0,0,0,1,1,9,1
470 | 0,0.0,1,0.0,0,0,0,0,0,17,1,1
471 | 0,0.29,2,0.0,0,0,0,0,0,15,1,1
472 | 1,0.25,1,0.0,0,59,0,0,28,358,1990,1
473 | 0,0.25,1,0.0,0,0,0,0,0,20,7,1
474 | 0,0.0,1,0.0,0,6,0,0,1,50,32,1
475 | 0,0.0,1,0.0,0,0,0,0,0,57,76,1
476 | 0,0.31,1,0.0,0,0,0,0,2,12,15,1
477 | 1,0.67,1,0.0,0,0,0,0,2,218,792,1
478 | 0,0.33,1,0.25,0,0,0,0,9,78,98,1
479 | 0,0.47,1,0.0,0,0,0,0,1,16,23,1
480 | 0,0.36,1,0.0,0,0,0,0,8,39,17,1
481 | 0,0.0,1,0.0,0,0,0,0,0,15,1,1
482 | 0,0.4,0,0,0,0,0,0,0,30,48,1
483 | 0,0.33,1,0.0,0,0,0,0,0,85,638,1
484 | 0,0.33,1,0.0,0,0,0,0,0,77,747,1
485 | 0,0.55,2,0.0,0,0,0,0,0,49,82,1
486 | 0,0.27,1,0.27,1,0,0,0,0,16,24,1
487 | 0,0.44,1,0.0,0,0,0,0,0,8,6,1
488 | 0,0.44,1,0.0,0,0,0,0,2,34,22,1
489 | 0,0.47,1,0.0,0,0,0,0,2,45,83,1
490 | 0,0.33,1,0.0,0,0,0,0,0,49,0,1
491 | 0,0.0,2,0.0,0,0,0,0,0,15,5,1
492 | 0,0.25,1,0.0,0,0,0,0,0,92,403,1
493 | 1,0.91,1,0.0,0,0,0,0,0,75,26,1
494 | 0,0.44,1,0.0,0,0,0,0,0,10,0,1
495 | 1,0.12,1,0.0,0,1,0,0,0,23,26,1
496 | 0,0.2,1,0.0,0,0,0,0,0,22,11,1
497 | 0,0.24,2,0.0,0,0,0,0,0,55,46,1
498 | 0,0.27,1,0.0,0,0,0,0,0,16,0,1
499 | 0,0.25,1,0.0,0,0,0,0,0,7,20,1
500 | 0,0.28,1,0.24,0,0,0,0,0,86,0,1
501 | 0,0.38,1,0.0,0,0,0,0,0,3,11,1
502 | 0,0.57,2,0.0,0,0,0,0,0,12,30,1
503 | 1,0.44,1,0.0,0,0,0,0,1,14,56,1
504 | 0,0.47,1,0.0,0,0,0,0,0,24,22,1
505 | 0,0.0,2,0.0,0,0,0,0,0,52,1,1
506 | 0,0.44,1,0.0,0,0,0,0,1,16,27,1
507 | 0,0.44,1,0.0,0,0,0,0,4,17,20,1
508 | 0,0.57,2,0.0,0,0,0,0,0,50,49,1
509 | 0,0.54,1,0.0,0,0,0,0,5,136,1029,1
510 | 1,0.43,1,0.0,0,0,0,0,10,178,1417,1
511 | 0,0.89,0,0,0,0,0,0,1,50,39,1
512 | 1,0.38,1,0.0,0,0,0,0,2,207,2426,1
513 | 0,0.36,1,0.0,0,0,0,1,0,178,828,1
514 | 0,0.0,1,0.0,1,0,0,0,0,16,26,1
515 | 0,0.14,1,0.0,0,0,0,0,0,49,2,1
516 | 0,0.2,1,0.33,0,0,0,0,2,49,12,1
517 | 0,0.1,0,0,0,0,0,0,0,37,4,1
518 | 0,0.5,1,0.4,0,0,0,0,0,21,0,1
519 | 1,0.58,1,0.44,0,0,0,0,21,356,2176,1
520 | 0,0.44,2,0.0,0,0,0,0,0,46,4,1
521 | 0,0.22,1,0.33,0,0,0,0,0,49,0,1
522 | 0,0.0,1,0.0,0,0,0,0,0,49,58,1
523 | 0,0.0,1,0.0,0,0,0,0,0,44,45,1
524 | 0,0.13,2,0.0,0,0,0,0,0,42,240,1
525 | 0,0.38,1,0.33,0,0,0,0,0,47,9,1
526 | 0,0.1,2,0.0,0,0,0,0,0,48,1,1
527 | 1,0.91,1,0.0,0,0,0,0,0,75,26,1
528 | 0,0.08,1,0.0,0,0,0,0,0,115,289,1
529 | 0,0.88,1,1.0,0,0,0,0,5,53,66,1
530 | 0,0.67,1,0.0,0,0,0,0,0,32,23,1
531 | 0,0.44,1,0.0,0,0,0,0,0,39,153,1
532 | 0,0.18,1,0.0,0,0,0,0,0,3033,1155,1
533 | 0,0.25,1,0.0,0,0,0,0,0,3003,825,1
534 | 0,0.88,1,1.0,0,0,0,0,1,34,36,1
535 | 0,0.36,1,0.0,0,0,0,0,37,218,1528,1
536 | 1,0.33,2,0.22,0,0,0,0,8,210,1543,1
537 | 1,0.2,2,0.0,0,19,0,0,4,1489,3715,1
538 | 0,0.5,1,0.5,1,0,0,0,0,10,2,1
539 | 0,0.33,1,0.0,0,0,0,0,28,201,812,1
540 | 0,0.0,2,0.0,0,0,0,0,0,351,2663,1
541 | 0,0.43,1,0.0,0,0,0,0,0,52,16,1
542 | 0,0.25,1,0.18,0,0,0,0,6,156,423,1
543 | 0,0.43,2,0.0,0,0,0,0,12,85,477,1
544 | 0,0.24,1,0.0,0,0,0,0,0,35,38,1
545 | 0,0.44,1,0.0,0,0,0,0,0,55,28,1
546 | 0,0.27,2,0.0,0,43,0,1,0,16,51,1
547 | 0,0.38,1,0.0,0,0,0,0,0,43,88,1
548 | 0,0.4,1,0.0,0,0,0,0,2,97,408,1
549 | 0,0.57,1,0.57,1,0,0,0,2,34,112,1
550 | 0,0.0,1,0.0,1,0,0,0,0,2346,7272,1
551 | 0,0.25,1,0.0,0,0,0,0,0,49,95,1
552 | 1,0.19,0,0,0,0,0,1,2,2,55,1
553 | 0,0.46,1,0.46,1,0,0,0,2,40,19,1
554 | 0,0.46,1,0.0,0,0,0,1,0,332,1333,1
555 | 0,0.73,1,0.0,0,0,0,0,0,14,542,1
556 | 1,0.6,1,0.5,0,0,0,0,12,65,162,1
557 | 1,0.46,1,0.5,0,0,0,0,1,20,52,1
558 | 0,0.31,1,0.31,1,0,0,0,0,26,27,1
559 | 1,0.43,1,0.43,0,0,0,0,0,72,434,1
560 | 1,0.0,1,0.0,0,0,0,0,1,55,2,1
561 | 1,0.5,1,0.4,0,33,0,0,12,77,108,1
562 | 1,0.0,4,0.25,0,0,0,1,59,31,215,1
563 | 0,0.86,2,0.18,0,0,0,0,0,57,130,1
564 | 0,0.62,1,0.0,0,0,0,1,0,58,347,1
565 | 1,0.33,2,0.0,0,0,0,0,23,47,139,1
566 | 1,0.16,1,0.12,0,43,0,0,3,31,37,1
567 | 1,0.92,1,1.0,0,0,0,0,11,96,242,1
568 | 1,0.27,1,0.0,0,19,0,0,8,126,860,1
569 | 1,0.25,1,0.0,0,0,0,1,102,39,229,1
570 | 1,0.43,1,0.0,0,5,0,0,6,66,161,1
571 | 1,0.31,3,0.0,0,0,0,0,25,87,698,1
572 | 1,0.2,1,0.0,0,28,0,0,0,15,64,1
573 | 1,0.55,1,0.44,0,0,0,0,33,166,596,1
574 | 1,0.38,1,0.33,0,21,0,0,44,66,75,1
575 | 1,0.57,2,0.0,0,0,0,0,4,96,339,1
576 | 1,0.57,1,0.0,0,11,0,0,0,57,73,1
577 | 1,0.27,1,0.0,0,0,0,0,2,150,487,1
--------------------------------------------------------------------------------
/IGAudit_mdfiles/IGAudit.md:
--------------------------------------------------------------------------------
1 | # IG Audit
2 | Objective: Using Simple Statistical Tools and Machine Learning to Audit Instagram Accounts for Authenticity
3 |
4 | Motivation: During lockdown, businesses have increasingly turned to social media influencers to market their products while their physical outlets are temporarily closed. Sadly, some influencers will try to game the system for their own gain. In a world where a single influencer's post can be worth as much as an average 9-to-5 Joe's annual salary, fake followers and fake engagement are a price that brands shouldn't have to pay.
5 |
6 | *Inspired by igaudit.io, which was taken down by Facebook only recently.*
7 |
8 |
9 | ```python
10 | # Imports
11 |
12 | import numpy as np
13 | import pandas as pd
14 | import seaborn as sns
15 | import matplotlib.pyplot as plt
16 | %matplotlib inline
17 |
18 | from sklearn.linear_model import LogisticRegression
19 | from sklearn.neighbors import KNeighborsClassifier
20 | from sklearn.tree import DecisionTreeClassifier
21 | from sklearn.ensemble import RandomForestClassifier
22 | from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
23 | from sklearn.model_selection import GridSearchCV, cross_val_score, StratifiedKFold, learning_curve
24 | from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier, ExtraTreesClassifier, VotingClassifier
25 |
26 | from instagram_private_api import Client, ClientCompatPatch
27 | import getpass
28 |
29 | import random
30 | ```
31 |
32 | ## Part 1: Understanding and Splitting the Data
33 | Dataset source: https://www.kaggle.com/eswarchandt/is-your-insta-fake-or-genuine
34 |
35 | Import the data
36 |
37 |
38 | ```python
39 | train = pd.read_csv("train.csv")
40 | test = pd.read_csv("test.csv")
41 | ```
42 |
43 | Inspect the training data
44 |
45 |
46 | ```python
47 | train.head()
48 | ```
49 |
|   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows | fake |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.27 | 0 | 0.0 | 0 | 53 | 0 | 0 | 32 | 1000 | 955 | 0 |
| 1 | 1 | 0.00 | 2 | 0.0 | 0 | 44 | 0 | 0 | 286 | 2740 | 533 | 0 |
| 2 | 1 | 0.10 | 2 | 0.0 | 0 | 0 | 0 | 1 | 13 | 159 | 98 | 0 |
| 3 | 1 | 0.00 | 1 | 0.0 | 0 | 82 | 0 | 0 | 679 | 414 | 651 | 0 |
| 4 | 1 | 0.00 | 2 | 0.0 | 0 | 0 | 0 | 1 | 6 | 151 | 126 | 0 |

167 | The features in the training data are the following (a sketch of how they might be computed from a raw profile appears after this list):
168 | - profile pic: does the user have a profile picture?
169 | - nums/length username: ratio of numerical to alphabetical characters in the username
170 | - fullname words: how many words are in the user's full name?
171 | - nums/length fullname: ratio of numerical to alphabetical characters in the full name
172 | - name==username: is the user's full name the same as the username?
173 | - description length: how many characters are in the user's Instagram bio?
174 | - external URL: does the user have an external URL linked to their profile?
175 | - private: is the user private?
176 | - #posts: number of posts
177 | - #followers: number of people following the user
178 | - #follows: number of people the user follows
179 | - fake: if the user is fake, fake=1, else fake=0
180 |
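As referenced above, here is a rough sketch of how these features might be computed from a raw profile. This is not from the original notebook: the field names, the helper, and the example profile are hypothetical, and "nums/length" is read here as the count of digits divided by the string length.

```python
def digit_ratio(s):
    """Fraction of characters in s that are digits (0.0 for an empty string)."""
    return sum(c.isdigit() for c in s) / len(s) if s else 0.0

def extract_features(profile):
    # 'profile' is a plain dict with hypothetical field names
    username = profile["username"]
    full_name = profile["full_name"]
    return {
        "profile pic": int(profile["has_profile_pic"]),
        "nums/length username": round(digit_ratio(username), 2),
        "fullname words": len(full_name.split()),
        "nums/length fullname": round(digit_ratio(full_name), 2),
        "name==username": int(full_name.replace(" ", "").lower() == username.lower()),
        "description length": len(profile["biography"]),
        "external URL": int(bool(profile["external_url"])),
        "private": int(profile["is_private"]),
        "#posts": profile["media_count"],
        "#followers": profile["follower_count"],
        "#follows": profile["following_count"],
    }

# Hypothetical example profile
example = {
    "username": "user1234",
    "full_name": "Example User",
    "has_profile_pic": True,
    "biography": "Just here for the plants.",
    "external_url": "",
    "is_private": False,
    "media_count": 12,
    "follower_count": 150,
    "following_count": 300,
}
print(extract_features(example))
```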
181 |
182 | ```python
183 | train.describe()
184 | ```
185 |
|   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows | fake |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 576.000000 | 5.760000e+02 | 576.000000 | 576.000000 |
| mean | 0.701389 | 0.163837 | 1.460069 | 0.036094 | 0.034722 | 22.623264 | 0.116319 | 0.381944 | 107.489583 | 8.530724e+04 | 508.381944 | 0.500000 |
| std | 0.458047 | 0.214096 | 1.052601 | 0.125121 | 0.183234 | 37.702987 | 0.320886 | 0.486285 | 402.034431 | 9.101485e+05 | 917.981239 | 0.500435 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000 | 0.000000 |
| 25% | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 3.900000e+01 | 57.500000 | 0.000000 |
| 50% | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 9.000000 | 1.505000e+02 | 229.500000 | 0.500000 |
| 75% | 1.000000 | 0.310000 | 2.000000 | 0.000000 | 0.000000 | 34.000000 | 0.000000 | 1.000000 | 81.500000 | 7.160000e+02 | 589.500000 | 1.000000 |
| max | 1.000000 | 0.920000 | 12.000000 | 1.000000 | 1.000000 | 150.000000 | 1.000000 | 1.000000 | 7389.000000 | 1.533854e+07 | 7500.000000 | 1.000000 |
348 |
349 | ```python
350 | train.info()
351 | ```
352 |
353 |
354 | RangeIndex: 576 entries, 0 to 575
355 | Data columns (total 12 columns):
356 | # Column Non-Null Count Dtype
357 | --- ------ -------------- -----
358 | 0 profile pic 576 non-null int64
359 | 1 nums/length username 576 non-null float64
360 | 2 fullname words 576 non-null int64
361 | 3 nums/length fullname 576 non-null float64
362 | 4 name==username 576 non-null int64
363 | 5 description length 576 non-null int64
364 | 6 external URL 576 non-null int64
365 | 7 private 576 non-null int64
366 | 8 #posts 576 non-null int64
367 | 9 #followers 576 non-null int64
368 | 10 #follows 576 non-null int64
369 | 11 fake 576 non-null int64
370 | dtypes: float64(2), int64(10)
371 | memory usage: 54.1 KB
372 |
373 |
374 |
375 | ```python
376 | train.shape
377 | ```
378 |
379 |
380 |
381 |
382 | (576, 12)
383 |
384 |
385 |
386 | Inspect the test data
387 |
388 |
389 | ```python
390 | test.head()
391 | ```
392 |
|   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows | fake |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.33 | 1 | 0.33 | 1 | 30 | 0 | 1 | 35 | 488 | 604 | 0 |
| 1 | 1 | 0.00 | 5 | 0.00 | 0 | 64 | 0 | 1 | 3 | 35 | 6 | 0 |
| 2 | 1 | 0.00 | 2 | 0.00 | 0 | 82 | 0 | 1 | 319 | 328 | 668 | 0 |
| 3 | 1 | 0.00 | 1 | 0.00 | 0 | 143 | 0 | 1 | 273 | 14890 | 7369 | 0 |
| 4 | 1 | 0.50 | 1 | 0.00 | 0 | 76 | 0 | 1 | 6 | 225 | 356 | 0 |
510 |
511 | ```python
512 | test.describe()
513 | ```
514 |
|   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows | fake |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 120.000000 | 1.200000e+02 | 120.000000 | 120.000000 |
| mean | 0.758333 | 0.179917 | 1.550000 | 0.071333 | 0.041667 | 27.200000 | 0.100000 | 0.308333 | 82.866667 | 4.959472e+04 | 779.266667 | 0.500000 |
| std | 0.429888 | 0.241492 | 1.187116 | 0.209429 | 0.200664 | 42.588632 | 0.301258 | 0.463741 | 230.468136 | 3.816126e+05 | 1409.383558 | 0.502096 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000e+00 | 1.000000 | 0.000000 |
| 25% | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 6.725000e+01 | 119.250000 | 0.000000 |
| 50% | 1.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 8.000000 | 2.165000e+02 | 354.500000 | 0.500000 |
| 75% | 1.000000 | 0.330000 | 2.000000 | 0.000000 | 0.000000 | 45.250000 | 0.000000 | 1.000000 | 58.250000 | 5.932500e+02 | 668.250000 | 1.000000 |
| max | 1.000000 | 0.890000 | 9.000000 | 1.000000 | 1.000000 | 149.000000 | 1.000000 | 1.000000 | 1879.000000 | 4.021842e+06 | 7453.000000 | 1.000000 |
677 |
678 | ```python
679 | test.info()
680 | ```
681 |
682 |
683 | RangeIndex: 120 entries, 0 to 119
684 | Data columns (total 12 columns):
685 | # Column Non-Null Count Dtype
686 | --- ------ -------------- -----
687 | 0 profile pic 120 non-null int64
688 | 1 nums/length username 120 non-null float64
689 | 2 fullname words 120 non-null int64
690 | 3 nums/length fullname 120 non-null float64
691 | 4 name==username 120 non-null int64
692 | 5 description length 120 non-null int64
693 | 6 external URL 120 non-null int64
694 | 7 private 120 non-null int64
695 | 8 #posts 120 non-null int64
696 | 9 #followers 120 non-null int64
697 | 10 #follows 120 non-null int64
698 | 11 fake 120 non-null int64
699 | dtypes: float64(2), int64(10)
700 | memory usage: 11.4 KB
701 |
702 |
703 |
704 | ```python
705 | test.shape
706 | ```
707 |
708 |
709 |
710 |
711 | (120, 12)
712 |
713 |
714 |
715 | Check for NULL values
716 |
717 |
718 | ```python
719 | print(train.isna().values.any().sum())
720 | print(test.isna().values.any().sum())
721 | ```
722 |
723 | 0
724 | 0
725 |
726 |
727 | Create a correlation matrix for the features in the training data to see which features correlate most strongly with the 'fake' label
728 |
729 |
730 | ```python
731 | fig, ax = plt.subplots(figsize=(15,10))
732 | corr=train.corr()
733 | sns.heatmap(corr, annot=True)
734 | ```
735 |
736 |
737 |
738 |
739 |
740 |
741 |
742 |
743 |
744 | 
745 |
746 |
747 | Split the training set into data and labels
748 |
749 |
750 | ```python
751 | # Labels
752 | train_Y = train.fake
753 | train_Y = pd.DataFrame(train_Y)
754 |
755 | # Data
756 | train_X = train.drop(columns='fake')
757 | train_X.head()
758 | ```
759 |
|   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.27 | 0 | 0.0 | 0 | 53 | 0 | 0 | 32 | 1000 | 955 |
| 1 | 1 | 0.00 | 2 | 0.0 | 0 | 44 | 0 | 0 | 286 | 2740 | 533 |
| 2 | 1 | 0.10 | 2 | 0.0 | 0 | 0 | 0 | 1 | 13 | 159 | 98 |
| 3 | 1 | 0.00 | 1 | 0.0 | 0 | 82 | 0 | 0 | 679 | 414 | 651 |
| 4 | 1 | 0.00 | 2 | 0.0 | 0 | 0 | 0 | 1 | 6 | 151 | 126 |

871 | Split the test set into data and labels
872 |
873 |
874 | ```python
875 | # Labels
876 | test_Y = test.fake
877 | test_Y = pd.DataFrame(test_Y)
878 |
879 | # Data
880 | test_X = test.drop(columns='fake')
881 | test_X.head()
882 | ```
883 |
884 |
|   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.33 | 1 | 0.33 | 1 | 30 | 0 | 1 | 35 | 488 | 604 |
| 1 | 1 | 0.00 | 5 | 0.00 | 0 | 64 | 0 | 1 | 3 | 35 | 6 |
| 2 | 1 | 0.00 | 2 | 0.00 | 0 | 82 | 0 | 1 | 319 | 328 | 668 |
| 3 | 1 | 0.00 | 1 | 0.00 | 0 | 143 | 0 | 1 | 273 | 14890 | 7369 |
| 4 | 1 | 0.50 | 1 | 0.00 | 0 | 76 | 0 | 1 | 6 | 225 | 356 |

995 | ## Part 2: Comparing Classification Models
996 |
997 | **Baseline Classifier**
998 |
Classify everything as the majority class.
999 |
1000 |
1001 | ```python
1002 | # Baseline classifier
1003 | fakes = len([i for i in train.fake if i==1])
1004 | auth = len([i for i in train.fake if i==0])
1005 | fakes, auth
1006 |
1007 | # classify everything as fake
1008 | pred = [1 for i in range(len(test_X))]
1009 | pred = np.array(pred)
1010 | print("Baseline accuracy: " + str(accuracy_score(pred, test_Y)))
1011 | ```
1012 |
1013 | Baseline accuracy: 0.5
1014 |
1015 |
1016 | **Statistical Method**
1017 |
Classify all users whose following-to-follower ratio is at or above a certain threshold as 'fake'.
1018 |
e.g. a user with 10 followers and 200 followings (ratio 200/10 = 20) will be classified as fake at a threshold of r = 20
1019 |
1020 |
1021 | ```python
1022 | # Statistical method
1023 | def stat_predict(test_X, r):
1024 | pred = []
1025 | for row in range(len(test_X)):
1026 | followers = test_X.loc[row]['#followers']
1027 | followings = test_X.loc[row]['#follows']
1028 | if followers == 0:
1029 | followers = 1
1030 | if followings == 0:
1031 |             followings = 1
1032 |
1033 | ratio = followings/followers
1034 |
1035 | if ratio >= r:
1036 | pred.append(1)
1037 | else:
1038 | pred.append(0)
1039 |
1040 | return np.array(pred)
1041 | accuracies = []
1042 | for i in [x / 10.0 for x in range(5, 255, 5)]:
1043 | prediction = stat_predict(test_X, i)
1044 | accuracies.append(accuracy_score(prediction, test_Y))
1045 |
1046 | f, ax = plt.subplots(figsize=(20,10))
1047 | plt.plot([x / 10.0 for x in range(5, 255, 5)], accuracies)
1048 | plt.plot([2.5 for i in range(len(accuracies))], accuracies, color='red')
1049 | plt.title("Accuracy for different thresholds", size=30)
1050 | plt.xlabel('Ratio', fontsize=20)
1051 | plt.ylabel('Accuracy', fontsize=20)
1052 | print("Maximum Accuracy for the statistical method: " + str(max(accuracies)))
1053 | ```
1054 |
1055 | Maximum Accuracy for the statistical method: 0.7
1056 |
1057 |
1058 |
1059 | 
1060 |
1061 |
1062 | **Logistic Regression**
1063 |
1064 |
1065 | ```python
1066 | lm = LogisticRegression()
1067 |
1068 | # Train the model
1069 | model1 = lm.fit(train_X, train_Y)
1070 |
1071 | # Make a prediction
1072 | lm_predict = model1.predict(test_X)
1073 | ```
1074 |
1075 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/sklearn/utils/validation.py:73: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1076 | return f(**kwargs)
1077 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py:762: ConvergenceWarning: lbfgs failed to converge (status=1):
1078 | STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
1079 |
1080 | Increase the number of iterations (max_iter) or scale the data as shown in:
1081 | https://scikit-learn.org/stable/modules/preprocessing.html
1082 | Please also refer to the documentation for alternative solver options:
1083 | https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
1084 | extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
1085 |
1086 |
1087 |
1088 | ```python
1089 | # Compute the accuracy of the model
1090 | acc = accuracy_score(lm_predict, test_Y)
1091 | print("Logistic Regression accuracy: " + str(acc))
1092 | ```
1093 |
1094 | Logistic Regression accuracy: 0.9083333333333333
1095 |
1096 |
1097 | **KNN Classifier**
1098 |
1099 |
1100 | ```python
1101 | accuracies = []
1102 |
1103 | # Compare the accuracies of using the KNN classifier with different number of neighbors
1104 | for i in range(1,10):
1105 | knn = KNeighborsClassifier(n_neighbors=i)
1106 | model_2 = knn.fit(train_X,train_Y)
1107 | knn_predict = model_2.predict(test_X)
1108 | accuracy = accuracy_score(knn_predict,test_Y)
1109 | accuracies.append(accuracy)
1110 |
1111 | max_acc = (0, 0)
1112 | for i in range(1, 10):
1113 | if accuracies[i-1] > max_acc[1]:
1114 | max_acc = (i, accuracies[i-1])
1115 |
1116 | max_acc
1117 |
1118 | f, ax = plt.subplots(figsize=(20,10))
1119 | plt.plot([i for i in range(1,10)], accuracies)
1120 | plt.plot([7 for i in range(len(accuracies))], accuracies, color='red')
1121 | plt.title("Accuracy for different n-neighbors", size=30)
1122 | plt.xlabel('Number of neighbors', fontsize=20)
1123 | plt.ylabel('Accuracy', fontsize=20)
1124 |
1125 | print("The highest accuracy obtained using KNN is " + str(max_acc[1]) + " achieved by a value of n=" + str(max_acc[0]))
1126 | ```
1127 |
1128 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1129 |
1130 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1131 |
1132 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1133 |
1134 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1135 |
1136 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1137 |
1138 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1139 |
1140 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1141 |
1142 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1143 |
1144 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
1145 |
1146 |
1147 |
1148 | The highest accuracy obtained using KNN is 0.8666666666666667 achieved by a value of n=7
1149 |
1150 |
1151 |
1152 | 
1153 |
1154 |
1155 | **Decision Tree Classifier**
1156 |
1157 |
1158 | ```python
1159 | DT = DecisionTreeClassifier()
1160 |
1161 | # Train the model
1162 | model3 = DT.fit(train_X, train_Y)
1163 |
1164 | # Make a prediction
1165 | DT_predict = model3.predict(test_X)
1166 | ```
1167 |
1168 |
1169 | ```python
1170 | # Compute the accuracy of the model
1171 | acc = accuracy_score(DT_predict, test_Y)
1172 | print("Decision Tree accuracy: " + str(acc))
1173 | ```
1174 |
1175 | Decision Tree accuracy: 0.9
1176 |
1177 |
1178 | **Random Forest Classifier**
1179 |
1180 |
1181 | ```python
1182 | rfc = RandomForestClassifier()
1183 |
1184 | # Train the model
1185 | model_4 = rfc.fit(train_X, train_Y)
1186 |
1187 | # Make a prediction
1188 | rfc_predict = model_4.predict(test_X)
1189 | ```
1190 |
1191 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:4: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
1192 | after removing the cwd from sys.path.
1193 |
1194 |
1195 |
1196 | ```python
1197 | # Compute the accuracy of the model
1198 | acc = accuracy_score(rfc_predict, test_Y)
1199 | print("Random Forest accuracy: " + str(acc))
1200 | ```
1201 |
1202 | Random Forest accuracy: 0.925
1203 |
1204 |
1205 | ## Part 3: Obtaining Instagram Data
1206 | We are going to use the hassle-free, unofficial Instagram Private API. To install it: `$ pip install git+https://git@github.com/ping/instagram_private_api.git@1.6.0`
1207 |
1208 | Log in to your Instagram account (preferably not your personal one! I created one just for this project 😉)
1209 |
1210 |
1211 | ```python
1212 | def login():
1213 | username = input("username: ")
1214 | password = getpass.getpass("password: ")
1215 | api = Client(username, password)
1216 | return api
1217 |
1218 | api = login()
1219 | ```
1220 |
1221 | username: ins.tapolice
1222 | password: ········
1223 |
1224 |
1225 | Get the Instagram user ID
1226 |
1227 |
1228 | ```python
1229 | def get_ID(username):
1230 | return api.username_info(username)['user']['pk']
1231 | ```
1232 |
1233 |
1234 | ```python
1235 | # The user used for the experiment below is anonymised!
1236 | # i.e. this cell was run and then changed to protect the user's anonymity
1237 | userID = get_ID('')
1238 | ```
1239 |
1240 | The API needs a rank token (a generated UUID) to query followers, posts, etc.
1241 |
1242 |
1243 | ```python
1244 | rank = api.generate_uuid()
1245 | ```
1246 |
1247 | Get the list of the user's follower usernames (this may take a while, depending on how many followers the user has)
1248 |
1249 |
1250 | ```python
1251 | def get_followers(userID, rank):
1252 | followers = []
1253 | next_max_id = True
1254 |
1255 | while next_max_id:
1256 | if next_max_id == True: next_max_id=''
1257 | f = api.user_followers(userID, rank, max_id=next_max_id)
1258 | followers.extend(f.get('users', []))
1259 | next_max_id = f.get('next_max_id', '')
1260 |
1261 | user_fer = [dic['username'] for dic in followers]
1262 |
1263 | return user_fer
1264 | ```
1265 |
1266 |
1267 | ```python
1268 | followers = get_followers(userID, rank)
1269 | ```
1270 |
1271 |
1272 | ```python
1273 | # You can check the number of followers if you'd like to
1274 | # len(followers)
1275 | ```
1276 |
1277 | ## Part 4: Preparing the Data
1278 |
1279 | Inspect the data (and see what other information you can obtain from it) and compare it with the train and test tables above, to work out what is needed to build the features for a data point before making a prediction.
1280 |
1281 | Recall that the features for a data point are the following:
1282 | - profile pic: does the user have a profile picture?
1283 | - nums/length username: ratio of numerical to alphabetical characters in the username
1284 | - fullname words: how many words are in the user's full name?
1285 | - nums/length fullname: ratio of numerical to alphabetical characters in the full name
1286 | - name==username: is the user's full name the same as the username?
1287 | - description length: how many characters are in the user's Instagram bio?
1288 | - external URL: does the user have an external URL linked to their profile?
1289 | - private: is the user private?
1290 | - #posts: number of posts
1291 | - #followers: number of people following the user
1292 | - #follows: number of people the user follows
1293 | - fake: if the user is fake, fake=1, else fake=0
1294 |
1295 |
1296 | ```python
1297 | # This will print the first follower username on the list
1298 | # print(followers[0])
1299 | ```
1300 |
1301 |
1302 | ```python
1303 | # This will get the information on a certain user
1304 | info = api.user_info(get_ID(followers[0]))['user']
1305 |
1306 | # Check what information is available for one particular user
1307 | info.keys()
1308 | ```
1309 |
1310 |
1311 |
1312 |
1313 | dict_keys(['pk', 'username', 'full_name', 'is_private', 'profile_pic_url', 'profile_pic_id', 'is_verified', 'has_anonymous_profile_picture', 'media_count', 'geo_media_count', 'follower_count', 'following_count', 'following_tag_count', 'biography', 'biography_with_entities', 'external_url', 'external_lynx_url', 'total_igtv_videos', 'total_clips_count', 'total_ar_effects', 'usertags_count', 'is_favorite', 'is_favorite_for_stories', 'is_favorite_for_highlights', 'live_subscription_status', 'is_interest_account', 'has_chaining', 'hd_profile_pic_versions', 'hd_profile_pic_url_info', 'mutual_followers_count', 'has_highlight_reels', 'can_be_reported_as_fraud', 'is_eligible_for_smb_support_flow', 'smb_support_partner', 'smb_delivery_partner', 'smb_donation_partner', 'smb_support_delivery_partner', 'displayed_action_button_type', 'direct_messaging', 'fb_page_call_to_action_id', 'address_street', 'business_contact_method', 'category', 'city_id', 'city_name', 'contact_phone_number', 'is_call_to_action_enabled', 'latitude', 'longitude', 'public_email', 'public_phone_country_code', 'public_phone_number', 'zip', 'instagram_location_id', 'is_business', 'account_type', 'professional_conversion_suggested_account_type', 'can_hide_category', 'can_hide_public_contacts', 'should_show_category', 'should_show_public_contacts', 'personal_account_ads_page_name', 'personal_account_ads_page_id', 'include_direct_blacklist_status', 'is_potential_business', 'show_post_insights_entry_point', 'is_bestie', 'has_unseen_besties_media', 'show_account_transparency_details', 'show_leave_feedback', 'robi_feedback_source', 'auto_expand_chaining', 'highlight_reshare_disabled', 'is_memorialized', 'open_external_url_with_in_app_browser'])
1314 |
1315 |
1316 |
1317 | You can see that we have pretty much all the features to make a user data point for prediction, but we need to filter and extract them, and perform some very minor calculations. The following function will do just that:
1318 |
1319 |
1320 | ```python
1321 | def get_data(info):
1322 |
1323 | """Extract the information from the returned JSON.
1324 |
1325 | This function will return the following array:
1326 | data = [profile pic,
1327 | nums/length username,
1328 | full name words,
1329 | nums/length full name,
1330 | name==username,
1331 | description length,
1332 | external URL,
1333 | private,
1334 | #posts,
1335 | #followers,
1336 | #followings]
1337 | """
1338 |
1339 | data = []
1340 |
1341 | # Does the user have a profile photo?
1342 | profile_pic = not info['has_anonymous_profile_picture']
1343 | if profile_pic == True:
1344 | profile_pic = 1
1345 | else:
1346 | profile_pic = 0
1347 | data.append(profile_pic)
1348 |
1349 | # Ratio of number of numerical chars in username to its length
1350 | username = info['username']
1351 | uname_ratio = len([x for x in username if x.isdigit()]) / float(len(username))
1352 | data.append(uname_ratio)
1353 |
1354 | # Full name in word tokens
1355 | full_name = info['full_name']
1356 | fname_tokens = len(full_name.split(' '))
1357 | data.append(fname_tokens)
1358 |
1359 | # Ratio of number of numerical characters in full name to its length
1360 | if len(full_name) == 0:
1361 | fname_ratio = 0
1362 | else:
1363 | fname_ratio = len([x for x in full_name if x.isdigit()]) / float(len(full_name))
1364 | data.append(fname_ratio)
1365 |
1366 | # Is name == username?
1367 | name_eq_uname = (full_name == username)
1368 | if name_eq_uname == True:
1369 | name_eq_uname = 1
1370 | else:
1371 | name_eq_uname = 0
1372 | data.append(name_eq_uname)
1373 |
1374 | # Number of characters on user bio
1375 | bio_length = len(info['biography'])
1376 | data.append(bio_length)
1377 |
1378 | # Does the user have an external URL?
1379 | ext_url = info['external_url'] != ''
1380 | if ext_url == True:
1381 | ext_url = 1
1382 | else:
1383 | ext_url = 0
1384 | data.append(ext_url)
1385 |
1386 | # Is the user private or no?
1387 | private = info['is_private']
1388 | if private == True:
1389 | private = 1
1390 | else:
1391 | private = 0
1392 | data.append(private)
1393 |
1394 | # Number of posts
1395 | posts = info['media_count']
1396 | data.append(posts)
1397 |
1398 | # Number of followers
1399 | followers = info['follower_count']
1400 | data.append(followers)
1401 |
1402 | # Number of followings
1403 | followings = info['following_count']
1404 | data.append(followings)
1405 |
1406 |
1407 | return data
1408 | ```
1409 |
1410 |
1411 | ```python
1412 | # Check if the function returns as expected
1413 | get_data(info)
1414 | ```
1415 |
1416 |
1417 |
1418 |
1419 | [1, 0.0, 3, 0.0, 0, 118, 1, 0, 589, 22227, 510]
1420 |
1421 |
1422 |
1423 | Unfortunately, the Instagram Private API has a very limited number of API calls per hour, so we will not be able to analyse *all* of the user's followers.
1424 | 
1425 | Fortunately, I took Statistics and learned that **random sampling** lets us draw a smaller sample from a larger population and use it to make generalizations about the whole group.
1426 | 
1427 | This allows us to approximate user authenticity despite the API limitations, while still working with data that is representative of the user's followers.
1428 |
1429 |
1430 | ```python
1431 | # Get a random sample of 50 followers
1432 | random_followers = random.sample(followers, 50)
1433 | ```
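
A sample of 50 keeps us inside the rate limit, but it also means the authenticity figures below carry sampling error; at worst (a fake share near 50%) the 95% margin of error is roughly ±14 percentage points. A rough sketch of that interval using the normal approximation (the 9-out-of-50 figure is purely a hypothetical example):

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation 95% confidence interval for a sampled proportion."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical example: 9 of the 50 sampled followers flagged as fake
low, high = proportion_ci(9 / 50, 50)
print(f"Estimated fake share: 18% (95% CI roughly {low:.0%}-{high:.0%})")
```

So the percentages reported below are ballpark estimates for the full follower base rather than exact audits.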
1434 |
1435 | Get user information for each follower
1436 |
1437 |
1438 | ```python
1439 | f_infos = []
1440 |
1441 | for follower in random_followers:
1442 | info = api.user_info(get_ID(follower))['user']
1443 | f_infos.append(info)
1444 | ```
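
If this loop trips the hourly call limit (it makes two API calls per follower), spacing out the requests helps. Here is a minimal throttled variant of the same loop; the 2-second pause is an arbitrary choice, not a documented API requirement:

```python
import time

f_infos = []

for follower in random_followers:
    # One call to resolve the ID, one for the profile info
    f_infos.append(api.user_info(get_ID(follower))['user'])
    time.sleep(2)  # crude throttle between followers
```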
1445 |
1446 | Extract the relevant features
1447 |
1448 |
1449 | ```python
1450 | f_table = []
1451 |
1452 | for info in f_infos:
1453 | f_table.append(get_data(info))
1454 |
1455 | f_table
1456 | ```
1457 |
1458 |
1459 |
1460 |
1461 | [[1, 0.0, 3, 0.0, 0, 43, 0, 1, 108, 788, 764],
1462 | [1, 0.0, 1, 0, 0, 45, 0, 0, 1, 252, 483],
1463 | [1, 0.0, 3, 0.0, 0, 90, 0, 0, 536, 1818, 7486],
1464 | [1, 0.5, 3, 0.0, 0, 0, 0, 0, 157, 148, 813],
1465 | [1, 0.0, 1, 0.0, 0, 102, 0, 1, 24, 481, 592],
1466 | [1, 0.0, 1, 0.0, 0, 59, 0, 1, 19, 773, 3639],
1467 | [1, 0.0, 1, 0, 0, 8, 0, 1, 0, 3, 3639],
1468 | [1, 0.0, 3, 0.0, 0, 90, 1, 0, 27, 63, 19],
1469 | [1, 0.0, 4, 0.0, 0, 148, 0, 1, 458, 682, 436],
1470 | [1, 0.0, 2, 0.0, 0, 0, 0, 1, 35, 1054, 1046],
1471 | [1, 0.36363636363636365, 1, 0.0, 0, 96, 0, 1, 96, 50, 98],
1472 | [1, 0.0, 1, 0.0, 0, 0, 0, 1, 2, 10, 202],
1473 | [1, 0.0, 2, 0.0, 0, 135, 1, 1, 159, 52, 240],
1474 | [1, 0.0, 1, 0.0, 0, 20, 0, 0, 87, 1864, 692],
1475 | [1, 0.0, 1, 0.0, 0, 0, 0, 1, 35, 275, 2039],
1476 | [1, 0.0625, 3, 0.0, 0, 98, 0, 0, 9, 98, 847],
1477 | [1, 0.0, 3, 0.0, 0, 92, 0, 1, 10, 11, 46],
1478 | [1, 0.0, 2, 0.0, 0, 69, 0, 1, 16, 2686, 6570],
1479 | [1, 0.0, 2, 0.0, 0, 68, 0, 1, 31, 18, 64],
1480 | [1, 0.0, 3, 0.0, 0, 6, 0, 0, 27, 1628, 1037],
1481 | [1, 0.0, 1, 0, 0, 2, 0, 0, 21, 1730, 1298],
1482 | [0, 0.18181818181818182, 2, 0.0, 0, 0, 0, 1, 219, 183, 275],
1483 | [1, 0.0, 2, 0.0, 0, 38, 0, 0, 11, 645, 4452],
1484 | [1, 0.0, 2, 0.0, 0, 30, 1, 0, 42, 1258, 952],
1485 | [1, 0.0, 1, 0.0, 0, 9, 0, 0, 2, 629, 485],
1486 | [1, 0.23529411764705882, 1, 0.0, 0, 62, 0, 1, 12, 1270, 951],
1487 | [1, 0.0, 1, 0.0, 0, 86, 0, 0, 299, 1669, 1133],
1488 | [1, 0.0, 2, 0.0, 0, 14, 0, 0, 11, 753, 853],
1489 | [1, 0.2, 2, 0.0, 0, 9, 0, 0, 0, 213, 700],
1490 | [1, 0.0, 1, 0.0, 0, 133, 0, 1, 11, 28, 169],
1491 | [1, 0.0, 2, 0.0, 0, 0, 0, 1, 3, 1395, 794],
1492 | [1, 0.0, 2, 0.0, 0, 0, 0, 0, 71, 831, 1024],
1493 | [1, 0.0, 3, 0.0, 0, 29, 0, 0, 61, 680, 566],
1494 | [1, 0.0, 2, 0.0, 0, 64, 0, 0, 1729, 6114, 5758],
1495 | [1, 0.0, 2, 0.0, 0, 17, 0, 0, 73, 2104, 7091],
1496 | [1, 0.0, 3, 0.0, 0, 36, 0, 1, 20, 728, 4139],
1497 | [1, 0.0, 2, 0.0, 0, 106, 0, 1, 23, 83, 458],
1498 | [1, 0.0, 2, 0.0, 0, 31, 0, 1, 78, 2035, 1035],
1499 | [1, 0.0, 2, 0.0, 0, 35, 0, 1, 12, 11549, 712],
1500 | [1, 0.0, 3, 0.08333333333333333, 0, 100, 0, 1, 56, 39, 190],
1501 | [1, 0.13333333333333333, 1, 0.0, 0, 103, 0, 1, 109, 1053, 6221],
1502 | [1, 0.0, 1, 0.0, 0, 0, 0, 0, 49, 412, 520],
1503 | [1, 0.0, 1, 0, 0, 7, 0, 0, 110, 317, 334],
1504 | [1, 0.0, 1, 0.0, 0, 31, 1, 0, 141, 2490, 1043],
1505 | [1, 0.18181818181818182, 2, 0.0, 0, 35, 1, 0, 320, 2345, 861],
1506 | [1, 0.0, 3, 0.0, 0, 115, 0, 1, 1336, 1018, 1208],
1507 | [1, 0.0, 1, 0.0, 0, 0, 0, 1, 39, 37, 611],
1508 | [1, 0.0, 1, 0.0, 0, 0, 0, 1, 0, 513, 633],
1509 | [1, 0.0, 2, 0.0, 0, 46, 0, 0, 23, 83, 306],
1510 | [1, 0.0, 1, 0.0, 0, 0, 0, 0, 30, 126, 372]]
1511 |
1512 |
1513 |
1514 | Create a pandas dataframe
1515 |
1516 |
1517 | ```python
1518 | test_data = pd.DataFrame(f_table,
1519 | columns = ['profile pic',
1520 | 'nums/length username',
1521 | 'fullname words',
1522 | 'nums/length fullname',
1523 | 'name==username',
1524 | 'description length',
1525 | 'external URL',
1526 | 'private',
1527 | '#posts',
1528 | '#followers',
1529 | '#follows'])
1530 | test_data
1531 | ```
1532 |
1533 | |   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
1534 | |---|---|---|---|---|---|---|---|---|---|---|---|
1535 | | 0 | 1 | 0.000000 | 3 | 0.000000 | 0 | 43 | 0 | 1 | 108 | 788 | 764 |
1536 | | 1 | 1 | 0.000000 | 1 | 0.000000 | 0 | 45 | 0 | 0 | 1 | 252 | 483 |
1537 | | 2 | 1 | 0.000000 | 3 | 0.000000 | 0 | 90 | 0 | 0 | 536 | 1818 | 7486 |
1538 | | 3 | 1 | 0.500000 | 3 | 0.000000 | 0 | 0 | 0 | 0 | 157 | 148 | 813 |
1539 | | 4 | 1 | 0.000000 | 1 | 0.000000 | 0 | 102 | 0 | 1 | 24 | 481 | 592 |
1540 | | … | … | … | … | … | … | … | … | … | … | … | … |
1541 | 
1542 | *50 rows × 11 columns (the full sample is listed in the `f_table` output above).*
1543 | 
2273 |
2274 | ## Part 5: Make the prediction!
2275 | In Part 2 we compared the different classifiers and found that the Random Forest Classifier had the highest accuracy at 92.5%, so we will use it to make the prediction.
2276 |
2277 |
2278 | ```python
2279 | rfc = RandomForestClassifier()
2280 |
2281 | # Train the model
2282 | # We've done this in Part 2 but I'm redoing it here for coherence ☺️
2283 | rfc_model = rfc.fit(train_X, train_Y)
2284 | ```
2285 |
2286 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:5: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
2287 | """
2288 |
2289 |
2290 |
2291 | ```python
2292 | rfc_labels = rfc_model.predict(test_data)
2293 | rfc_labels
2294 | ```
2295 |
2296 |
2297 |
2298 |
2299 | array([0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
2300 | 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
2301 | 0, 0, 1, 0, 0, 0])
2302 |
2303 |
2304 |
2305 | Calculate the number of fake accounts in the random sample of 50 followers
2306 |
2307 |
2308 | ```python
2309 | no_fakes = len([x for x in rfc_labels if x==1])
2310 | ```
2311 |
2312 | Calculate the Instagram user's authenticity, where authenticity = (#sampled followers - #fakes in the sample) * 100 / #sampled followers
2313 | 
2314 |
2315 |
2316 | ```python
2317 | authenticity = (len(random_followers) - no_fakes) * 100 / len(random_followers)
2318 | print("User X's Instagram Followers is " + str(authenticity) + "% authentic.")
2319 | ```
2320 |
2321 | User X's Instagram Followers is 82.0% authentic.
2322 |
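Since this same calculation is repeated below for post likes and again for user Y, a small helper keeps it in one place; a sketch reusing the prediction labels defined above (the function name is my own):

```python
def authenticity_pct(predicted_labels):
    """Percentage of sampled accounts predicted as real (label 0)."""
    fakes = sum(1 for label in predicted_labels if label == 1)
    return (len(predicted_labels) - fakes) * 100 / len(predicted_labels)

# authenticity_pct(rfc_labels) reproduces the 82.0% figure printed above
```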
2323 |
2324 | ## Part 6: Extension - Fake Likes
2325 | The method above can also be extended to check for fake likes on a post.
2326 |
2327 | Get the user's posts
2328 |
2329 |
2330 | ```python
2331 | def get_user_posts(userID, min_posts_to_be_retrieved):
2332 | # Retrieve all posts from my profile
2333 | my_posts = []
2334 | has_more_posts = True
2335 | max_id = ''
2336 |
2337 | while has_more_posts:
2338 | feed = api.user_feed(userID, max_id=max_id)
2339 | if feed.get('more_available') is not True:
2340 | has_more_posts = False
2341 |
2342 | max_id = feed.get('next_max_id', '')
2343 | my_posts.extend(feed.get('items'))
2344 |
2345 | # time.sleep(2) to avoid flooding
2346 |
2347 | if len(my_posts) > min_posts_to_be_retrieved:
2348 | print('Total posts retrieved: ' + str(len(my_posts)))
2349 | return my_posts
2350 |
2351 | if has_more_posts:
2352 | print(str(len(my_posts)) + ' posts retrieved so far...')
2353 |
2354 | print('Total posts retrieved: ' + str(len(my_posts)))
2355 |
2356 | return my_posts
2357 | ```
2358 |
2359 |
2360 | ```python
2361 | posts = get_user_posts(userID, 10)
2362 | ```
2363 |
2364 | Total posts retrieved: 18
2365 |
2366 |
2367 | Pick one post to analyse (here I'm just going to pick one at random)
2368 |
2369 |
2370 | ```python
2371 | random_post = random.sample(posts, 1)
2372 | ```
2373 |
2374 | Get post likers
2375 |
2376 |
2377 | ```python
2378 | random_post[0].keys()
2379 | ```
2380 |
2381 |
2382 |
2383 |
2384 | dict_keys(['taken_at', 'pk', 'id', 'device_timestamp', 'media_type', 'code', 'client_cache_key', 'filter_type', 'carousel_media_count', 'carousel_media', 'can_see_insights_as_brand', 'location', 'lat', 'lng', 'user', 'can_viewer_reshare', 'caption_is_edited', 'comment_likes_enabled', 'comment_threading_enabled', 'has_more_comments', 'next_max_id', 'max_num_visible_preview_comments', 'preview_comments', 'can_view_more_preview_comments', 'comment_count', 'inline_composer_display_condition', 'inline_composer_imp_trigger_time', 'like_count', 'has_liked', 'top_likers', 'photo_of_you', 'usertags', 'caption', 'can_viewer_save', 'organic_tracking_token'])
2385 |
2386 |
2387 |
2388 |
2389 | ```python
2390 | likers = api.media_likers(random_post[0]['id'])
2391 | ```
2392 |
2393 | Get a list of usernames
2394 |
2395 |
2396 | ```python
2397 | likers_usernames = [liker['username'] for liker in likers['users']]
2398 | ```
2399 |
2400 | Get a random sample of 50 users
2401 |
2402 |
2403 | ```python
2404 | random_likers = random.sample(likers_usernames, 50)
2405 | ```
2406 |
2407 | Retrieve the information for the 50 users
2408 |
2409 |
2410 | ```python
2411 | l_infos = []
2412 |
2413 | for liker in random_likers:
2414 | info = api.user_info(get_ID(liker))['user']
2415 | l_infos.append(info)
2416 | ```
2417 |
2418 |
2419 | ```python
2420 | l_table = []
2421 |
2422 | for info in l_infos:
2423 | l_table.append(get_data(info))
2424 |
2425 | l_table
2426 | ```
2427 |
2428 |
2429 |
2430 |
2431 | [[1, 0.0, 1, 0, 0, 30, 0, 0, 6, 21, 177],
2432 | [1, 0.0, 1, 0.0, 0, 69, 0, 1, 131, 942, 1229],
2433 | [1, 0.0, 2, 0.0, 0, 83, 0, 1, 609, 1558, 2925],
2434 | [1, 0.0, 1, 0.0, 0, 39, 0, 0, 851, 2940, 1255],
2435 | [1, 0.0, 1, 0.0, 0, 36, 1, 0, 106, 1626, 1050],
2436 | [0, 0.0, 1, 0, 0, 0, 0, 1, 7, 371, 350],
2437 | [1, 0.0, 2, 0.0, 0, 96, 1, 0, 405, 1656, 2843],
2438 | [1, 0.0, 2, 0.0, 0, 5, 1, 0, 9, 1363, 854],
2439 | [1, 0.0, 1, 0, 0, 1, 0, 1, 5, 433, 371],
2440 | [1, 0.0, 6, 0.0, 0, 93, 1, 0, 73, 1356, 1081],
2441 | [1, 0.0, 3, 0.0, 0, 80, 1, 1, 188, 966, 966],
2442 | [1, 0.0, 3, 0.0, 0, 0, 0, 1, 156, 1401, 1249],
2443 | [1, 0.0, 2, 0.0, 0, 118, 1, 0, 115, 6557, 2423],
2444 | [1, 0.0, 1, 0.0, 0, 12, 0, 0, 84, 1552, 661],
2445 | [1, 0.0, 1, 0.0, 0, 80, 0, 0, 99, 1413, 2479],
2446 | [1, 0.0, 1, 0.0, 0, 23, 0, 1, 12, 1116, 1031],
2447 | [1, 0.0, 1, 0.0, 0, 20, 0, 0, 87, 1864, 692],
2448 | [1, 0.0, 3, 0.0, 0, 62, 1, 0, 17, 1266, 1107],
2449 | [1, 0.0, 2, 0.0, 0, 20, 0, 1, 15, 636, 579],
2450 | [1, 0.0, 4, 0.0, 0, 17, 0, 1, 127, 546, 536],
2451 | [1, 0.0, 1, 0.0, 0, 18, 0, 0, 5, 918, 678],
2452 | [1, 0.2857142857142857, 1, 0.0, 0, 0, 0, 1, 0, 20, 35],
2453 | [1, 0.0, 2, 0.0, 0, 8, 0, 0, 39, 1490, 1321],
2454 | [1, 0.0, 2, 0.0, 0, 0, 0, 0, 10, 519, 547],
2455 | [1, 0.0, 2, 0.0, 0, 0, 0, 0, 43, 933, 1101],
2456 | [1, 0.0, 2, 0.0, 0, 10, 0, 1, 19, 613, 612],
2457 | [1, 0.25, 3, 0.0, 0, 139, 1, 0, 104, 1738, 999],
2458 | [1, 0.0, 3, 0.0, 0, 42, 1, 0, 17, 2973, 1339],
2459 | [1, 0.0, 1, 0.0, 0, 20, 0, 1, 107, 749, 857],
2460 | [1, 0.0, 4, 0.0, 0, 119, 1, 0, 655, 675, 1904],
2461 | [1, 0.0, 1, 0.0, 0, 103, 1, 0, 48, 10075, 2379],
2462 | [1, 0.0, 1, 0.0, 0, 0, 0, 0, 12, 534, 563],
2463 | [1, 0.0, 1, 0, 0, 0, 0, 1, 58, 2220, 1418],
2464 | [1, 0.0, 1, 0.0, 0, 11, 1, 1, 18, 775, 514],
2465 | [1, 0.0, 3, 0.0, 0, 30, 0, 0, 10, 1070, 1364],
2466 | [1, 0.0, 1, 0.0, 0, 18, 0, 0, 108, 1148, 832],
2467 | [1, 0.0, 2, 0.0, 0, 133, 0, 1, 52, 394, 432],
2468 | [1, 0.0, 1, 0, 0, 30, 1, 0, 48, 3441, 1293],
2469 | [1, 0.0, 2, 0.0, 0, 40, 1, 0, 1434, 1642, 1684],
2470 | [1, 0.0, 1, 0.0, 0, 64, 1, 0, 33, 17955, 781],
2471 | [1, 0.0, 2, 0.0, 0, 91, 1, 1, 217, 1014, 1409],
2472 | [1, 0.0, 1, 0, 0, 0, 0, 1, 1, 1347, 872],
2473 | [1, 0.3076923076923077, 1, 0.0, 0, 0, 0, 0, 59, 161, 544],
2474 | [1, 0.0, 3, 0.0, 0, 141, 1, 1, 274, 922, 913],
2475 | [1, 0.0, 1, 0.0, 0, 69, 1, 0, 69, 904, 596],
2476 | [1, 0.0, 1, 0.0, 0, 42, 0, 0, 598, 1877, 6379],
2477 | [1, 0.0, 2, 0.0, 0, 4, 0, 1, 11, 660, 643],
2478 | [1, 0.0, 2, 0.0, 0, 24, 0, 0, 6, 345, 358],
2479 | [1, 0.0, 2, 0.0, 0, 29, 0, 0, 23, 293, 538],
2480 | [1, 0.0, 1, 0.0, 0, 10, 1, 1, 3, 690, 549]]
2481 |
2482 |
2483 |
2484 |
2485 | ```python
2486 | # Generate pandas dataframe
2487 | l_test_data = pd.DataFrame(l_table,
2488 | columns = ['profile pic',
2489 | 'nums/length username',
2490 | 'fullname words',
2491 | 'nums/length fullname',
2492 | 'name==username',
2493 | 'description length',
2494 | 'external URL',
2495 | 'private',
2496 | '#posts',
2497 | '#followers',
2498 | '#follows'])
2499 | l_test_data
2500 | ```
2501 |
2502 | |   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
2503 | |---|---|---|---|---|---|---|---|---|---|---|---|
2504 | | 0 | 1 | 0.000000 | 1 | 0.0 | 0 | 30 | 0 | 0 | 6 | 21 | 177 |
2505 | | 1 | 1 | 0.000000 | 1 | 0.0 | 0 | 69 | 0 | 1 | 131 | 942 | 1229 |
2506 | | 2 | 1 | 0.000000 | 2 | 0.0 | 0 | 83 | 0 | 1 | 609 | 1558 | 2925 |
2507 | | 3 | 1 | 0.000000 | 1 | 0.0 | 0 | 39 | 0 | 0 | 851 | 2940 | 1255 |
2508 | | 4 | 1 | 0.000000 | 1 | 0.0 | 0 | 36 | 1 | 0 | 106 | 1626 | 1050 |
2509 | | … | … | … | … | … | … | … | … | … | … | … | … |
2510 | 
2511 | *50 rows × 11 columns (the full sample is listed in the `l_table` output above).*
2512 | 
3242 |
3243 | Finally, make the prediction!
3244 |
3245 |
3246 | ```python
3247 | rfc = RandomForestClassifier()
3248 | rfc_model = rfc.fit(train_X, train_Y)
3249 | rfc_labels_likes = rfc_model.predict(l_test_data)
3250 | rfc_labels_likes
3251 | ```
3252 |
3253 | /Users/athiyadeviyani/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
3254 |
3255 |
3256 |
3257 |
3258 |
3259 |
3260 | array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
3261 | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
3262 | 0, 0, 0, 0, 0, 0])
3263 |
3264 |
3265 |
3266 | Calculate the number of fake accounts that liked the user's media
3267 |
3268 |
3269 | ```python
3270 | no_fake_likes = len([x for x in rfc_labels_likes if x==1])
3271 | ```
3272 |
3273 | Calculate the authenticity of the media's likes
3274 |
3275 |
3276 | ```python
3277 | media_authenticity = (len(random_likers) - no_fake_likes) * 100 / len(random_likers)
3278 | print("The media with the ID:XXXXX has " + str(media_authenticity) + "% authentic likes.")
3279 | ```
3280 |
3281 | The media with the ID:XXXXX has 92.0% authentic likes.
3282 |
3283 |
3284 | ## Part 7: Comparison With Another User
3285 | I specifically chose user X because I trusted their social media 'game' and they seemed to have a loyal and engaged following. Let's compare their metrics with user Y, an account with a noticeable follower-growth spike when examined on SocialBlade.
3286 |
3287 | I am going to skip the explanation here because it's just a repetition of the steps performed on user X.
3288 |
3289 |
3290 | ```python
3291 | # Re-login because of API call limits
3292 | api = login()
3293 | ```
3294 |
3295 | username: ins.tafakebusters
3296 | password: ········
3297 |
3298 |
3299 |
3300 | ```python
3301 | userID_y = get_ID('')
3302 | ```
3303 |
3304 |
3305 | ```python
3306 | rank = api.generate_uuid()
3307 | ```
3308 |
3309 | **USER Y FOLLOWERS ANALYSIS**
3310 |
3311 |
3312 | ```python
3313 | y_followers = get_followers(userID_y, rank)
3314 | ```
3315 |
3316 |
3317 | ```python
3318 | y_random_followers = random.sample(y_followers, 50)
3319 | ```
3320 |
3321 |
3322 | ```python
3323 | y_infos = []
3324 |
3325 | for follower in y_random_followers:
3326 | info = api.user_info(get_ID(follower))['user']
3327 | y_infos.append(info)
3328 | ```
3329 |
3330 |
3331 | ```python
3332 | y_table = []
3333 |
3334 | for info in y_infos:
3335 | y_table.append(get_data(info))
3336 |
3337 | y_table
3338 | ```
3339 |
3340 |
3341 |
3342 |
3343 | [[1, 0.14285714285714285, 1, 0.0, 0, 0, 0, 0, 16, 32, 1549],
3344 | [1, 0.2222222222222222, 1, 0.0, 0, 0, 0, 1, 15, 337, 2058],
3345 | [1, 0.25, 2, 0.0, 0, 0, 0, 0, 5, 310, 6343],
3346 | [1, 0.0, 4, 0.0, 0, 97, 0, 0, 1, 14107, 7514],
3347 | [1, 0.36363636363636365, 2, 0.0, 0, 0, 0, 0, 16, 8, 1050],
3348 | [1, 0.25, 2, 0.0, 0, 13, 0, 0, 15, 87, 6741],
3349 | [1, 0.0, 1, 0, 0, 0, 0, 1, 21, 24, 5862],
3350 | [1, 0.0, 1, 0, 0, 13, 0, 1, 27, 1289, 689],
3351 | [1, 0.0, 1, 0.0, 0, 29, 0, 1, 0, 31, 148],
3352 | [1, 0.0, 1, 0, 0, 119, 0, 0, 32, 636, 1293],
3353 | [1, 0.0, 4, 0.0, 0, 20, 0, 0, 144, 3617, 1346],
3354 | [1, 0.21428571428571427, 2, 0.0, 0, 0, 0, 0, 17, 71, 7495],
3355 | [1, 0.13333333333333333, 2, 0.0, 0, 113, 0, 1, 3, 305, 303],
3356 | [0, 0.4444444444444444, 2, 0.0, 0, 0, 0, 1, 1, 63, 283],
3357 | [1, 0.0, 3, 0.0, 0, 0, 0, 0, 17, 115, 7506],
3358 | [0, 0.0625, 2, 0.0, 0, 0, 0, 1, 272, 1446, 2362],
3359 | [1, 0.15384615384615385, 2, 0.0, 0, 0, 0, 0, 6, 1150, 732],
3360 | [1, 0.0, 2, 0.0, 0, 0, 0, 0, 15, 60, 1631],
3361 | [1, 0.0, 1, 0, 0, 13, 0, 0, 15, 11, 221],
3362 | [1, 0.0, 1, 0, 0, 1, 0, 1, 0, 21, 23],
3363 | [1, 0.23076923076923078, 1, 0, 0, 0, 0, 0, 0, 4, 173],
3364 | [1, 0.25, 1, 0.0, 0, 20, 0, 0, 1, 29, 457],
3365 | [1, 0.5, 1, 0.0, 0, 0, 0, 0, 1, 831, 5424],
3366 | [1, 0.0, 3, 0.0, 0, 150, 1, 0, 158, 7063, 1355],
3367 | [1, 0.0, 1, 0.0, 0, 0, 0, 1, 15, 39, 2045],
3368 | [1, 0.0, 4, 0.05555555555555555, 0, 127, 0, 0, 196, 486, 198],
3369 | [1, 0.0, 1, 0.0, 0, 76, 0, 1, 7, 509, 372],
3370 | [1, 0.0, 2, 0.0, 0, 48, 0, 0, 1, 5079, 879],
3371 | [1, 0.0, 1, 0.0, 0, 19, 0, 1, 9, 1778, 1477],
3372 | [1, 0.0, 2, 0.0, 0, 0, 0, 0, 15, 29, 543],
3373 | [1, 0.0, 3, 0.0, 0, 77, 0, 1, 784, 526, 1235],
3374 | [1, 0.0, 2, 0.0, 0, 81, 1, 0, 3, 9123, 6144],
3375 | [1, 0.0, 2, 0.0, 0, 33, 0, 0, 15, 134, 416],
3376 | [1, 0.0, 2, 0.0, 0, 79, 0, 1, 38, 506, 804],
3377 | [1, 0.0, 2, 0.0, 0, 0, 0, 0, 20, 27, 2557],
3378 | [1, 0.125, 2, 0.0, 0, 0, 0, 0, 15, 9, 1151],
3379 | [1, 0.42105263157894735, 2, 0.0, 0, 0, 0, 0, 18, 12, 1212],
3380 | [1, 0.0, 1, 0.0, 0, 0, 0, 0, 15, 14, 600],
3381 | [1, 0.0, 5, 0.0, 0, 25, 0, 0, 12, 1224, 774],
3382 | [1, 0.0, 1, 0.0, 0, 0, 0, 0, 15, 23, 2056],
3383 | [1, 0.42857142857142855, 1, 0.0, 0, 0, 0, 0, 18, 27, 395],
3384 | [1, 0.0, 2, 0.0, 0, 0, 0, 1, 10, 444, 1116],
3385 | [1, 0.0, 1, 0.0, 0, 43, 0, 0, 57, 214, 2377],
3386 | [1, 0.047619047619047616, 2, 0.0, 0, 0, 0, 1, 15, 15, 6047],
3387 | [1, 0.05263157894736842, 2, 0.0, 0, 1, 0, 0, 15, 55, 5313],
3388 | [1, 0.18181818181818182, 2, 0.0, 0, 0, 0, 0, 16, 95, 1228],
3389 | [1, 0.15384615384615385, 1, 0.0, 0, 0, 0, 0, 16, 56, 3665],
3390 | [1, 0.0, 1, 0, 0, 0, 0, 0, 15, 5, 1568],
3391 | [0, 0.16666666666666666, 2, 0.0, 0, 0, 0, 1, 3, 8, 28],
3392 | [1, 0.4117647058823529, 2, 0.0, 0, 0, 0, 0, 1, 69, 196]]
3393 |
3394 |
3395 |
3396 |
3397 | ```python
3398 | # Generate pandas dataframe
3399 | y_test_data = pd.DataFrame(y_table,
3400 | columns = ['profile pic',
3401 | 'nums/length username',
3402 | 'fullname words',
3403 | 'nums/length fullname',
3404 | 'name==username',
3405 | 'description length',
3406 | 'external URL',
3407 | 'private',
3408 | '#posts',
3409 | '#followers',
3410 | '#follows'])
3411 | y_test_data
3412 | ```
3413 |
3414 | |   | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
3415 | |---|---|---|---|---|---|---|---|---|---|---|---|
3416 | | 0 | 1 | 0.142857 | 1 | 0.000000 | 0 | 0 | 0 | 0 | 16 | 32 | 1549 |
3417 | | 1 | 1 | 0.222222 | 1 | 0.000000 | 0 | 0 | 0 | 1 | 15 | 337 | 2058 |
3418 | | 2 | 1 | 0.250000 | 2 | 0.000000 | 0 | 0 | 0 | 0 | 5 | 310 | 6343 |
3419 | | 3 | 1 | 0.000000 | 4 | 0.000000 | 0 | 97 | 0 | 0 | 1 | 14107 | 7514 |
3420 | | 4 | 1 | 0.363636 | 2 | 0.000000 | 0 | 0 | 0 | 0 | 16 | 8 | 1050 |
3421 | | … | … | … | … | … | … | … | … | … | … | … | … |
3422 | 
3423 | *50 rows × 11 columns (the full sample is listed in the `y_table` output above).*
3424 | 
4155 |
4156 | ```python
4157 | # Predict (no retraining!)
4158 | rfc_labels_y = rfc_model.predict(y_test_data)
4159 | rfc_labels_y
4160 | ```
4161 |
4162 |
4163 |
4164 |
4165 | array([1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1,
4166 | 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1,
4167 | 1, 1, 1, 1, 1, 1])
4168 |
4169 |
4170 |
4171 |
4172 | ```python
4173 | # Calculate the number of fake accounts in the random sample of 50 followers
4174 | no_fakes_y = len([x for x in rfc_labels_y if x==1])
4175 | ```
4176 |
4177 |
4178 | ```python
4179 | # Calculate the authenticity
4180 | y_authenticity = (len(y_random_followers) - no_fakes_y) * 100 / len(y_random_followers)
4181 | print("User Y's Instagram Followers is " + str(y_authenticity) + "% authentic.")
4182 | ```
4183 |
4184 | User Y's Instagram Followers is 38.0% authentic.
4185 |
4186 |
4187 | Ahh, the joys of being right!
4188 |
4189 | **USER Y LIKES ANALYSIS**
4190 |
4191 |
4192 | ```python
4193 | y_posts = get_user_posts(userID_y, 10)
4194 | ```
4195 |
4196 | Total posts retrieved: 18
4197 |
4198 |
4199 |
4200 | ```python
4201 | y_random_post = random.sample(y_posts, 1)
4202 | ```
4203 |
4204 |
4205 | ```python
4206 | y_likers = api.media_likers(y_random_post[0]['id'])
4207 | ```
4208 |
4209 |
4210 | ```python
4211 | y_likers_usernames = [liker['username'] for liker in y_likers['users']]
4212 | ```
4213 |
4214 |
4215 | ```python
4216 | y_random_likers = random.sample(y_likers_usernames, 50)
4217 | ```
4218 |
4219 |
4220 | ```python
4221 | y_likers_infos = []
4222 |
4223 | for liker in y_random_likers:
4224 | info = api.user_info(get_ID(liker))['user']
4225 | y_likers_infos.append(info)
4226 | ```
4227 |
4228 |
4229 | ```python
4230 | y_likers_table = []
4231 |
4232 | for info in y_likers_infos:
4233 | y_likers_table.append(get_data(info))
4234 |
4235 | y_likers_table
4236 | ```
4237 |
4238 |
4239 |
4240 |
4241 | [[1, 0.0, 2, 0.0, 0, 0, 0, 0, 2, 897, 830],
4242 | [0, 0.0, 2, 0.0, 0, 0, 0, 1, 0, 129, 132],
4243 | [1, 0.0, 2, 0.0, 0, 8, 0, 1, 72, 1157, 698],
4244 | [1, 0.0, 1, 0, 0, 10, 0, 1, 6, 1410, 619],
4245 | [1, 0.0, 1, 0.0, 0, 0, 0, 0, 0, 1916, 731],
4246 | [1, 0.2222222222222222, 3, 0.0, 0, 72, 0, 1, 13, 950, 649],
4247 | [1, 0.0, 1, 0.0, 0, 19, 0, 1, 17, 1543, 1289],
4248 | [1, 0.2, 5, 0.0, 0, 11, 0, 0, 33, 1076, 606],
4249 | [1, 0.0, 1, 0.0, 0, 104, 0, 1, 6, 202, 485],
4250 | [1, 0.2, 1, 0.0, 0, 15, 0, 0, 7, 1262, 679],
4251 | [1, 0.15384615384615385, 2, 0.0, 0, 0, 0, 0, 6, 1150, 732],
4252 | [1, 0.0, 1, 0.0, 0, 17, 1, 0, 28, 2442, 629],
4253 | [1, 0.0, 2, 0.0, 0, 61, 0, 0, 159, 556, 765],
4254 | [1, 0.0, 2, 0.0, 0, 34, 0, 1, 10, 531, 526],
4255 | [1, 0.0, 3, 0.0, 0, 127, 0, 0, 23, 1137, 909],
4256 | [1, 0.0, 2, 0.0, 0, 66, 0, 1, 25, 583, 805],
4257 | [1, 0.13333333333333333, 2, 0.0, 0, 67, 1, 0, 141, 4615, 1948],
4258 | [1, 0.0, 2, 0.0, 0, 47, 0, 1, 387, 75, 162],
4259 | [1, 0.0, 1, 0.0, 0, 142, 0, 1, 8144, 664, 1527],
4260 | [1, 0.0, 3, 0.0, 0, 4, 0, 1, 1, 466, 325],
4261 | [1, 0.058823529411764705, 1, 0.0, 0, 32, 0, 0, 14, 419, 414],
4262 | [1, 0.0, 3, 0.0, 0, 75, 1, 0, 353, 1399, 764],
4263 | [1, 0.0, 1, 0, 0, 0, 0, 0, 9, 611, 554],
4264 | [1, 0.0, 1, 0.0, 0, 29, 0, 1, 3, 2064, 1077],
4265 | [1, 0.0, 1, 0.0, 0, 26, 0, 1, 37, 628, 714],
4266 | [1, 0.0, 2, 0.0, 0, 89, 1, 1, 243, 2316, 1030],
4267 | [1, 0.0, 2, 0.0, 0, 140, 1, 0, 666, 4460, 492],
4268 | [1, 0.0, 2, 0.0, 0, 20, 0, 0, 71, 4101, 878],
4269 | [1, 0.0, 2, 0.0, 0, 5, 0, 0, 148, 424, 716],
4270 | [1, 0.0, 1, 0, 0, 0, 0, 1, 2, 640, 730],
4271 | [1, 0.0, 2, 0.0, 0, 64, 0, 1, 8, 1141, 891],
4272 | [1, 0.0, 3, 0.0, 0, 29, 0, 1, 10, 1378, 986],
4273 | [1, 0.0, 2, 0.0, 0, 14, 0, 1, 3, 994, 698],
4274 | [1, 0.0, 1, 0.0, 0, 29, 0, 1, 43, 181, 169],
4275 | [1, 0.0, 1, 0.0, 0, 58, 1, 0, 24, 1144, 1091],
4276 | [1, 0.0, 2, 0.0, 0, 25, 0, 1, 36, 687, 574],
4277 | [1, 0.0, 3, 0.0, 0, 8, 0, 1, 33, 1846, 996],
4278 | [1, 0.5714285714285714, 2, 0.0, 0, 18, 0, 1, 202, 1180, 600],
4279 | [1, 0.0, 2, 0.0, 0, 7, 0, 0, 45, 1206, 676],
4280 | [1, 0.0, 2, 0.0, 0, 76, 0, 0, 12, 661, 3004],
4281 | [1, 0.0, 1, 0.0, 0, 9, 0, 1, 5, 759, 706],
4282 | [0, 0.0, 3, 0.0, 0, 61, 0, 1, 9, 439, 612],
4283 | [1, 0.16666666666666666, 1, 0.0, 0, 0, 0, 1, 3, 911, 822],
4284 | [1, 0.4, 2, 0.0, 0, 82, 0, 0, 99, 556, 733],
4285 | [1, 0.0, 2, 0.0, 0, 80, 0, 1, 21, 478, 385],
4286 | [1, 0.0, 1, 0, 0, 0, 0, 1, 0, 653, 312],
4287 | [1, 0.0, 1, 0.0, 0, 13, 0, 1, 40, 713, 657],
4288 | [1, 0.0, 2, 0.0, 0, 0, 0, 1, 4, 113, 311],
4289 | [1, 0.0, 2, 0.0, 0, 33, 0, 0, 74, 3564, 1051],
4290 | [1, 0.0, 1, 0.0, 0, 121, 0, 0, 958, 904, 479]]
4291 |
4292 |
4293 |
4294 |
4295 | ```python
4296 | y_likers_data = pd.DataFrame(y_likers_table,
4297 | columns = ['profile pic',
4298 | 'nums/length username',
4299 | 'fullname words',
4300 | 'nums/length fullname',
4301 | 'name==username',
4302 | 'description length',
4303 | 'external URL',
4304 | 'private',
4305 | '#posts',
4306 | '#followers',
4307 | '#follows'])
4308 | y_likers_data
4309 | ```
4310 |
4311 |
4312 |
4313 |
4314 |
4315 |
4328 |
4329 |
4330 |
|    | profile pic | nums/length username | fullname words | nums/length fullname | name==username | description length | external URL | private | #posts | #followers | #follows |
|----|-------------|----------------------|----------------|----------------------|----------------|--------------------|--------------|---------|--------|------------|----------|
| 0  | 1 | 0.000000 | 2 | 0.0 | 0 | 0   | 0 | 0 | 2    | 897  | 830  |
| 1  | 0 | 0.000000 | 2 | 0.0 | 0 | 0   | 0 | 1 | 0    | 129  | 132  |
| 2  | 1 | 0.000000 | 2 | 0.0 | 0 | 8   | 0 | 1 | 72   | 1157 | 698  |
| 3  | 1 | 0.000000 | 1 | 0.0 | 0 | 10  | 0 | 1 | 6    | 1410 | 619  |
| 4  | 1 | 0.000000 | 1 | 0.0 | 0 | 0   | 0 | 0 | 0    | 1916 | 731  |
| 5  | 1 | 0.222222 | 3 | 0.0 | 0 | 72  | 0 | 1 | 13   | 950  | 649  |
| 6  | 1 | 0.000000 | 1 | 0.0 | 0 | 19  | 0 | 1 | 17   | 1543 | 1289 |
| 7  | 1 | 0.200000 | 5 | 0.0 | 0 | 11  | 0 | 0 | 33   | 1076 | 606  |
| 8  | 1 | 0.000000 | 1 | 0.0 | 0 | 104 | 0 | 1 | 6    | 202  | 485  |
| 9  | 1 | 0.200000 | 1 | 0.0 | 0 | 15  | 0 | 0 | 7    | 1262 | 679  |
| 10 | 1 | 0.153846 | 2 | 0.0 | 0 | 0   | 0 | 0 | 6    | 1150 | 732  |
| 11 | 1 | 0.000000 | 1 | 0.0 | 0 | 17  | 1 | 0 | 28   | 2442 | 629  |
| 12 | 1 | 0.000000 | 2 | 0.0 | 0 | 61  | 0 | 0 | 159  | 556  | 765  |
| 13 | 1 | 0.000000 | 2 | 0.0 | 0 | 34  | 0 | 1 | 10   | 531  | 526  |
| 14 | 1 | 0.000000 | 3 | 0.0 | 0 | 127 | 0 | 0 | 23   | 1137 | 909  |
| 15 | 1 | 0.000000 | 2 | 0.0 | 0 | 66  | 0 | 1 | 25   | 583  | 805  |
| 16 | 1 | 0.133333 | 2 | 0.0 | 0 | 67  | 1 | 0 | 141  | 4615 | 1948 |
| 17 | 1 | 0.000000 | 2 | 0.0 | 0 | 47  | 0 | 1 | 387  | 75   | 162  |
| 18 | 1 | 0.000000 | 1 | 0.0 | 0 | 142 | 0 | 1 | 8144 | 664  | 1527 |
| 19 | 1 | 0.000000 | 3 | 0.0 | 0 | 4   | 0 | 1 | 1    | 466  | 325  |
| 20 | 1 | 0.058824 | 1 | 0.0 | 0 | 32  | 0 | 0 | 14   | 419  | 414  |
| 21 | 1 | 0.000000 | 3 | 0.0 | 0 | 75  | 1 | 0 | 353  | 1399 | 764  |
| 22 | 1 | 0.000000 | 1 | 0.0 | 0 | 0   | 0 | 0 | 9    | 611  | 554  |
| 23 | 1 | 0.000000 | 1 | 0.0 | 0 | 29  | 0 | 1 | 3    | 2064 | 1077 |
| 24 | 1 | 0.000000 | 1 | 0.0 | 0 | 26  | 0 | 1 | 37   | 628  | 714  |
| 25 | 1 | 0.000000 | 2 | 0.0 | 0 | 89  | 1 | 1 | 243  | 2316 | 1030 |
| 26 | 1 | 0.000000 | 2 | 0.0 | 0 | 140 | 1 | 0 | 666  | 4460 | 492  |
| 27 | 1 | 0.000000 | 2 | 0.0 | 0 | 20  | 0 | 0 | 71   | 4101 | 878  |
| 28 | 1 | 0.000000 | 2 | 0.0 | 0 | 5   | 0 | 0 | 148  | 424  | 716  |
| 29 | 1 | 0.000000 | 1 | 0.0 | 0 | 0   | 0 | 1 | 2    | 640  | 730  |
| 30 | 1 | 0.000000 | 2 | 0.0 | 0 | 64  | 0 | 1 | 8    | 1141 | 891  |
| 31 | 1 | 0.000000 | 3 | 0.0 | 0 | 29  | 0 | 1 | 10   | 1378 | 986  |
| 32 | 1 | 0.000000 | 2 | 0.0 | 0 | 14  | 0 | 1 | 3    | 994  | 698  |
| 33 | 1 | 0.000000 | 1 | 0.0 | 0 | 29  | 0 | 1 | 43   | 181  | 169  |
| 34 | 1 | 0.000000 | 1 | 0.0 | 0 | 58  | 1 | 0 | 24   | 1144 | 1091 |
| 35 | 1 | 0.000000 | 2 | 0.0 | 0 | 25  | 0 | 1 | 36   | 687  | 574  |
| 36 | 1 | 0.000000 | 3 | 0.0 | 0 | 8   | 0 | 1 | 33   | 1846 | 996  |
| 37 | 1 | 0.571429 | 2 | 0.0 | 0 | 18  | 0 | 1 | 202  | 1180 | 600  |
| 38 | 1 | 0.000000 | 2 | 0.0 | 0 | 7   | 0 | 0 | 45   | 1206 | 676  |
| 39 | 1 | 0.000000 | 2 | 0.0 | 0 | 76  | 0 | 0 | 12   | 661  | 3004 |
| 40 | 1 | 0.000000 | 1 | 0.0 | 0 | 9   | 0 | 1 | 5    | 759  | 706  |
| 41 | 0 | 0.000000 | 3 | 0.0 | 0 | 61  | 0 | 1 | 9    | 439  | 612  |
| 42 | 1 | 0.166667 | 1 | 0.0 | 0 | 0   | 0 | 1 | 3    | 911  | 822  |
| 43 | 1 | 0.400000 | 2 | 0.0 | 0 | 82  | 0 | 0 | 99   | 556  | 733  |
| 44 | 1 | 0.000000 | 2 | 0.0 | 0 | 80  | 0 | 1 | 21   | 478  | 385  |
| 45 | 1 | 0.000000 | 1 | 0.0 | 0 | 0   | 0 | 1 | 0    | 653  | 312  |
| 46 | 1 | 0.000000 | 1 | 0.0 | 0 | 13  | 0 | 1 | 40   | 713  | 657  |
| 47 | 1 | 0.000000 | 2 | 0.0 | 0 | 0   | 0 | 1 | 4    | 113  | 311  |
| 48 | 1 | 0.000000 | 2 | 0.0 | 0 | 33  | 0 | 0 | 74   | 3564 | 1051 |
| 49 | 1 | 0.000000 | 1 | 0.0 | 0 | 121 | 0 | 0 | 958  | 904  | 479  |
5045 |
5046 |
5047 |
5048 |
5049 |
5050 |
5051 |
5052 |
5053 | ```python
5054 | # Predict!
5055 | y_likers_pred = rfc_model.predict(y_likers_data)
5056 | y_likers_pred
5057 | ```
5058 |
5059 |
5060 |
5061 |
5062 | array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5063 | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
5064 | 0, 0, 0, 0, 0, 0])
5065 |
5066 |
5067 |
5068 |
5069 | ```python
5070 | # Calculate the number of fake likes
5071 | no_fakes_yl = len([x for x in y_likers_pred if x==1])
5072 |
5073 | # Calculate media likes authenticity
5074 | y_post_authenticity = (len(y_random_likers) - no_fakes_yl) * 100 / len(y_random_likers)
5075 | print("The media with the ID:YYYYY has " + str(y_post_authenticity) + "% authentic likes.")
5076 | ```
5077 |
5078 | The media with the ID:YYYYY has 96.0% authentic likes.
5079 |
5080 |
5081 | Very high likes authenticity but very low follower authenticity? How is that possible?
5082 |
We can use **engagement rates** to explain this phenomenon further.
5084 |
Engagement rate = average number of engagements per post (likes + comments) / number of followers, expressed as a percentage
5086 |
5087 |
5088 | ```python
5089 | y_posts[0].keys()
5090 | ```
5091 |
5092 |
5093 |
5094 |
5095 | dict_keys(['taken_at', 'pk', 'id', 'device_timestamp', 'media_type', 'code', 'client_cache_key', 'filter_type', 'carousel_media_count', 'carousel_media', 'can_see_insights_as_brand', 'location', 'lat', 'lng', 'user', 'can_viewer_reshare', 'caption_is_edited', 'comment_likes_enabled', 'comment_threading_enabled', 'has_more_comments', 'max_num_visible_preview_comments', 'preview_comments', 'can_view_more_preview_comments', 'comment_count', 'inline_composer_display_condition', 'inline_composer_imp_trigger_time', 'like_count', 'has_liked', 'top_likers', 'photo_of_you', 'caption', 'can_viewer_save', 'organic_tracking_token'])
5096 |
5097 |
5098 |
5099 |
5100 | ```python
5101 | count = 0
5102 |
5103 | for post in y_posts:
5104 | count += post['comment_count']
5105 | count += post['like_count']
5106 |
5107 | average_engagements = count / len(y_posts)
5108 | engagement_rate = average_engagements*100 / len(y_followers)
5109 |
5110 | engagement_rate
5111 | ```
5112 |
5113 |
5114 |
5115 |
5116 | 9.50268408791654
5117 |
5118 |
5119 |
This means that, on average, the engagements on one of user Y's posts amount to only roughly 9.5% of their follower count.
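
As a rough illustration (not part of the audit itself), we can relate this back to the follower-authenticity estimate: if the bought followers never engage, the headline engagement rate is being diluted by them. The numbers below are simply re-used from the outputs above.

```python
# Sketch: engagement rate measured against the *estimated* authentic followers only
engagement_rate = 9.5          # % of all followers engaging per post (from above)
follower_authenticity = 38.0   # % of sampled followers predicted authentic (from above)

adjusted_engagement_rate = engagement_rate / (follower_authenticity / 100)
print("Engagement per authentic follower: {:.1f}%".format(adjusted_engagement_rate))
# -> roughly 25%, consistent with a genuine core audience padded with inactive bought followers
```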
5121 |
5122 | ## Part 8: Thoughts
5123 |
5124 | **Making sense of the result**
5125 |
So user X received an 82% follower authenticity score and a 92% media likes authenticity on one of their posts. Is that good enough? What about user Y, with a 38% follower authenticity score and a 96% media likes authenticity?
5127 |
Since this entire notebook is an exploratory analysis, there isn't really a hard line between a 'good' influencer and a 'bad' influencer. For user X, we can tell that the user has authentic and loyal followers. User Y, on the other hand, has a rather low follower authenticity score, yet their likes come from real accounts. This suggests that user Y may have invested in buying followers, but not likes, which is also what drags their engagement rate down.
5129 |
In fact, with a little more research, you can start to establish a pattern just by observation (a toy sketch of this heuristic follows after the list):
- High follower authenticity, high media authenticity, high engagement rate = authentic user
- Low follower authenticity, high media authenticity, low engagement rate = buys followers, does not buy likes
- Low follower authenticity, low media authenticity, high engagement rate = buys followers AND likes
- ... and so on!
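
Purely for illustration, these observations could be wrapped into a small helper. The thresholds below are hypothetical guesses rather than calibrated values, and the function is not part of the notebook's model:

```python
def rough_verdict(follower_auth, likes_auth, engagement_rate,
                  auth_threshold=50.0, engagement_threshold=10.0):
    """Toy heuristic combining the three signals discussed above (all in %)."""
    followers_ok = follower_auth >= auth_threshold
    likes_ok = likes_auth >= auth_threshold
    engaged = engagement_rate >= engagement_threshold

    if followers_ok and likes_ok and engaged:
        return "looks authentic"
    if not followers_ok and likes_ok and not engaged:
        return "likely buys followers, not likes"
    if not followers_ok and not likes_ok and engaged:
        return "likely buys followers AND likes"
    return "inconclusive - inspect manually"

# User Y's numbers from above
print(rough_verdict(38.0, 96.0, 9.5))  # -> likely buys followers, not likes
```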
5135 |
**So is this influencer worth investing in or not?**
5137 |
Remember that we used a *random sample* of 50 followers out of thousands. As objective as random sampling is, it still isn't an *absolutely complete* picture of the user's followers. However, the follower authenticity combined with the media likes authenticity still provides useful insight for brands that are planning to invest in the influencer.
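
To get a feel for how much a 50-follower sample can sway the estimate, here's a quick back-of-the-envelope confidence interval for the authenticity percentage, using the normal approximation to the binomial (a rough sketch, not part of the original analysis):

```python
import math

def authenticity_interval(authentic_pct, sample_size=50, z=1.96):
    """Approximate 95% confidence interval for an authenticity percentage."""
    p = authentic_pct / 100
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return max(0.0, p - margin) * 100, min(1.0, p + margin) * 100

low, high = authenticity_interval(38.0)   # user Y's 38% from a sample of 50
print("95% CI: roughly {:.0f}% to {:.0f}% authentic".format(low, high))
# -> roughly 25% to 51%, so a single 50-follower sample is only a coarse signal
```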
5139 |
Personally, I feel like any number under 50% is rather suspicious, and there are other ways that you can confirm this suspicion:
- Low engagement rates (engagement rate = average number of engagements (likes + comments) / number of followers)
- Spikes in follower growth, i.e. an uneven growth chart (see the sketch after this list)
- Comments (loyal followers actually care about the user's content)
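
As a sketch of the second point, growth spikes can be flagged from a daily follower-count series. The data and the threshold below are hypothetical; the notebook itself does not collect growth history:

```python
# Hypothetical daily follower counts (e.g. exported from a stats service)
daily_followers = [10200, 10230, 10260, 10290, 13900, 13930, 13960, 14000]

# Day-over-day gains
gains = [b - a for a, b in zip(daily_followers, daily_followers[1:])]
typical_gain = sorted(gains)[len(gains) // 2]  # median daily gain

# Flag days whose gain is far above the typical gain (the 10x factor is a guess)
spikes = [i + 1 for i, g in enumerate(gains) if g > 10 * max(typical_gain, 1)]
print("Suspicious growth spikes on days:", spikes)  # -> [4]
```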
5144 |
But of course, you have to be aware of tech-savvy influencers who cheat the audit system and try to avoid getting caught, for example by buying 'drip-followers': followers bought in bulk that arrive slowly, which makes the follower growth seem gradual.
5146 |
5147 | **Conclusion**
5148 |
The rapid growth of technology allows anyone with a computer to create bots that follow users and like media on any platform. This also means that our ability to detect fake engagement has to improve just as quickly!
5150 |
Businesses, small or large, invest in social media influencers to reach a wider audience, especially during a global pandemic when everyone is constantly on their phones! The less tech-savvy and less aware among them are particularly prone to this kind of fraud.
5152 |
For brands that rely on influencers for marketing, it is highly recommended to use services such as SocialBlade to verify user authenticity and engagement. Some services are on the pricey side, but they are definitely worth the investment!
5154 |
5155 |
--------------------------------------------------------------------------------