├── FotMob_PremScraper.py
├── PL_matchweek_1.csv
└── README.md


/FotMob_PremScraper.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""
Created on Tue Aug 24 12:09:40 2021

@author: Dean Patel

FotMob Premier League Matchweek Stat Scraper
**FOR EDUCATIONAL PURPOSES ONLY**
"""
#%%
# Libraries
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.select import Select
import pandas as pd

#%%
# Setting up driver and url
DRIVER_PATH = 'C:/Users/deanp/Downloads/chromedriver_win32/chromedriver.exe'  # YOUR PATH GOES HERE
# Selenium 4 takes the driver path through a Service object; recent releases
# no longer accept executable_path or the find_element_by_* helpers.
driver = webdriver.Chrome(service=Service(DRIVER_PATH))
driver.get('https://www.fotmob.com/leagues/47/matches/premier-league/by-round')

#%%
MATCHWEEK = 1  # HERE PUT THE MATCHWEEK YOU WANT TO SCRAPE STATS FOR
# identify dropdown with Select class
sel = Select(driver.find_element(By.CLASS_NAME, 'css-ph6ea9-Select'))
sel.select_by_value(str(MATCHWEEK))
#%%
# Block to retrieve match links for the matchweek
match_divs = driver.find_elements(By.CLASS_NAME, 'css-1ty7xja-LeagueMatchCSS')
match_links = []
for match in match_divs:
    link = match.find_element(By.TAG_NAME, 'a').get_attribute('href')
    link = link.replace("matchfacts", "stats")
    match_links.append(link)

#%%
# Function to get match statistics given a match link and driver element
def get_match_stats(driver, match_link):
    driver.get(match_link)
    div1 = driver.find_element(By.CLASS_NAME, 'css-hhl4y6-MFStatsContainer')
    statsContainers = div1.find_elements(By.CLASS_NAME, 'css-1khnrgx-StatsContainer')

    # Setup
    home_stats = []
    away_stats = []

    # Block to get team names and goals for, against
    header = driver.find_element(By.CLASS_NAME, 'eb9uzl80')
    match_info = header.find_elements(By.TAG_NAME, 'span')
    home_team = match_info[0].text
    away_team = match_info[3].text
    home_stats.append(home_team)
    away_stats.append(away_team)
    score = match_info[1].text
    # The score text is assumed to look like "2 - 0" (the original read
    # characters 0 and 4); splitting on whitespace also handles double digits.
    score_parts = score.split()
    home_goals = score_parts[0]
    away_goals = score_parts[-1]
    home_gd = int(home_goals) - int(away_goals)
    away_gd = home_gd * -1
    home_stats.append(home_goals)
    home_stats.append(away_goals)
    home_stats.append(home_gd)
    away_stats.append(away_goals)
    away_stats.append(home_goals)
    away_stats.append(away_gd)

    # Block to get possession stats from a single match
    possession_graph = statsContainers[0].find_element(By.CLASS_NAME, 'css-pqh9gi-MFSGraphContainer')
    possession_nums = possession_graph.find_elements(By.TAG_NAME, "div")
    possession_home = possession_nums[0].get_attribute('width')
    possession_away = possession_nums[1].get_attribute('width')
    home_stats.append(possession_home)
    away_stats.append(possession_away)

    # Block to loop through stat containers, extracting data for home and away
    for statContainer in statsContainers:
        statRowWrappers = statContainer.find_elements(By.CLASS_NAME, 'e1car83v2')
        for statRowWrapper in statRowWrappers:
            statRow = statRowWrapper.find_elements(By.TAG_NAME, 'span')
            home_stats.append(statRow[0].text)
            away_stats.append(statRow[2].text)
    return home_stats, away_stats

#%%
# Block to create DataFrame and export as a csv
matchweek_lists = []
labels = ['Team', 'GF', 'GA', 'GD', 'Posession', 'Expected goals (xG)', 'Total shots',
          'Chances created', 'Big chances', 'Accurate passes', 'Pass success',
          'Fouls conceded', 'Corners', 'Offsides', 'Shots', 'Shots on target',
          'Shots off target', 'Blocked shots', 'Shots woodwork', 'Shots inside box',
          'Shots outside box', 'Expected goals (xG)', 'xG first half', 'xG second half',
          'xG on target (xGOT)', 'xG open play', 'xG set play', 'Accurate passes',
          'Own half', 'Opposition half', 'Passes', 'Pass success', 'Touches', 'Long balls',
          'Accurate long balls', 'Crosses', 'Accurate crosses', 'Throws', 'Duels won',
          'Duels', 'Dribbles attempteds', 'Dribbles succeeded', 'Tackles attempted',
          'Tackles succeeded', 'Aerials won', 'Interceptions', 'Discipline', 'Yellow cards',
          'Red cards', 'Keeper', 'Saves', 'Diving saves', 'Saves inside box',
          'Acted as sweeper', 'Punches']

for match_link in match_links:
    home_stats, away_stats = get_match_stats(driver, match_link)
    # A match with a penalty yields 56 values instead of 55; check both lists
    # explicitly (the original condition only compared away_stats to 56).
    if len(home_stats) == 56 and len(away_stats) == 56:  # deletes penalty xG data
        del home_stats[27]
        del away_stats[27]
    matchweek_lists.append(home_stats)
    matchweek_lists.append(away_stats)

matchweek_df = pd.DataFrame(matchweek_lists, columns=labels)
# 'Discipline' and 'Keeper' are section headings, not statistics
del matchweek_df['Discipline']
del matchweek_df['Keeper']

#%%

# Export as csv
matchweek_df.to_csv('PL_matchweek_' + str(MATCHWEEK) + '.csv', index=False)


--------------------------------------------------------------------------------
/PL_matchweek_1.csv:
--------------------------------------------------------------------------------
,Team,GF,GA,GD,Posession,Expected goals (xG),Total shots,Chances created,Big chances,Accurate passes,Pass success,Fouls conceded,Corners,Offsides,Shots,Shots on target,Shots off target,Blocked shots,Shots woodwork,Shots inside box,Shots outside box,Expected goals (xG),xG first half,xG second half,xG on target (xGOT),xG open play,xG set play,Accurate passes,Own half,Opposition half,Passes,Pass success,Touches,Long balls,Accurate long balls,Crosses,Accurate crosses,Throws,Duels won,Duels,Dribbles attempteds,Dribbles succeeded,Tackles attempted,Tackles succeeded,Aerials won,Interceptions,Yellow cards,Red cards,Saves,Diving saves,Saves inside box,Acted as sweeper,Punches
0,Brentford,2,0,2,35,1.43,8,7,1,201,65%,12,2,1,8,3,5,0,1,7,1,1.43,0.34,1.10,1.00,0.39,1.05,201,106,95,309,65%,479,65,22,9,2,9,51,101,11,7,20,12,17,9,0,0,4,2,2,1,0
1,Arsenal,0,2,-2,65,1.20,22,19,0,488,86%,8,5,1,22,4,14,4,0,13,9,1.20,0.37,0.84,0.37,0.91,0.29,488,205,283,568,86%,776,47,20,22,10,21,50,101,20,9,9,3,20,7,0,0,1,0,1,0,0
2,Manchester United,5,1,4,49,1.47,16,19,2,334,79%,11,5,2,16,8,5,3,0,12,4,1.47,0.88,0.58,1.64,1.39,0.07,334,193,141,422,79%,604,47,25,11,1,25,38,89,24,12,6,4,11,8,1,0,2,1,0,0,0
3,Leeds United,1,5,-4,51,0.60,10,8,2,343,78%,9,4,3,10,3,6,1,0,3,7,0.60,0.22,0.38,1.01,0.37,0.23,343,242,101,438,78%,623,62,23,14,4,15,51,89,12,9,23,13,9,14,2,0,3,1,1,1,2
4,Burnley,1,2,-1,36,1.40,14,10,1,181,70%,10,7,1,14,3,8,3,2,8,6,1.40,1.03,0.36,0.94,0.34,1.06,181,76,105,259,70%,463,50,17,36,5,20,61,110,18,12,16,12,26,10,2,0,4,0,4,0,0
5,Brighton & Hove Albion,2,1,1,64,1.73,14,12,2,424,82%,7,6,0,14,8,4,2,0,12,2,1.73,0.29,1.44,1.11,1.49,0.24,424,273,151,518,82%,705,72,25,14,6,22,49,110,7,3,16,9,20,6,1,0,2,0,1,0,2
6,Chelsea,3,0,3,62,1.14,13,7,1,623,92%,15,5,0,13,6,5,2,0,5,8,1.14,0.98,0.16,1.67,0.85,0.29,623,249,374,678,92%,834,52,38,22,4,17,46,98,14,8,21,9,7,11,0,0,1,1,1,0,0
7,Crystal Palace,0,3,-3,38,0.29,4,4,0,363,86%,11,2,1,4,1,2,1,0,4,0,0.29,0.00,0.29,0.18,0.11,0.18,363,264,99,423,86%,590,39,18,5,3,13,52,98,20,13,11,7,15,9,0,0,3,3,2,1,1
8,Everton,3,1,2,48,2.09,14,14,2,235,70%,13,6,0,14,6,5,3,0,11,3,2.09,0.36,1.72,2.96,1.65,0.44,235,113,122,337,70%,564,68,26,25,6,23,68,129,14,7,23,9,25,6,2,0,2,1,2,1,1
9,Southampton,1,3,-2,52,0.78,6,5,1,256,69%,15,8,0,6,3,1,2,0,5,1,0.78,0.50,0.28,1.09,0.49,0.29,256,114,142,370,69%,581,49,10,22,4,17,61,129,23,10,21,12,19,5,0,0,3,2,2,0,2
10,Leicester City,1,0,1,56,0.51,9,9,0,505,86%,6,5,5,9,5,3,1,0,6,3,0.51,0.37,0.14,1.18,0.51,0.00,505,329,176,584,86%,781,35,16,7,2,21,42,101,20,7,15,11,10,11,1,0,3,2,2,1,0
11,Wolverhampton Wanderers,0,1,-1,44,1.49,17,12,1,366,83%,10,4,4,17,3,7,7,0,12,5,1.49,0.60,0.89,0.22,1.45,0.04,366,193,173,443,83%,645,58,36,16,6,20,59,101,32,21,20,10,12,13,2,0,4,3,2,0,0
12,Watford,3,2,1,38,0.86,13,13,1,234,74%,18,2,0,13,7,4,2,0,9,4,0.86,0.62,0.24,1.57,0.67,0.19,234,156,78,317,74%,487,63,21,8,2,9,65,127,18,13,15,6,24,9,3,0,0,0,0,0,0
13,Aston Villa,2,3,-1,62,1.25,11,7,1,435,86%,13,4,2,11,2,6,3,0,4,7,1.25,0.09,1.16,1.32,0.38,0.08,435,250,185,508,86%,702,50,28,26,2,17,62,127,16,14,13,11,17,11,1,0,4,2,2,0,0
14,Norwich City,0,3,-3,50,1.61,14,11,1,441,85%,4,3,2,14,3,8,3,0,9,5,1.61,0.16,1.44,1.71,0.36,1.24,441,320,121,520,85%,733,69,34,19,4,16,51,91,18,10,19,13,8,9,1,0,3,1,5,1,0
15,Liverpool,3,0,3,50,1.48,19,16,2,432,84%,14,11,0,19,8,7,4,0,13,6,1.48,0.75,0.73,1.17,1.10,0.38,432,233,199,512,84%,690,44,29,25,6,17,40,91,20,7,16,8,14,5,1,0,3,1,3,0,0
16,Newcastle United,2,4,-2,46,1.79,17,14,1,340,85%,4,7,1,17,3,7,7,0,9,8,1.79,1.48,0.31,1.53,1.25,0.54,340,170,170,401,85%,596,51,26,27,6,14,37,95,23,14,8,5,12,9,1,0,4,1,4,0,0
17,West Ham United,4,2,2,54,3.04,18,15,3,405,86%,3,6,2,18,9,4,5,1,12,6,3.04,0.96,2.08,3.98,1.89,0.36,405,172,233,473,86%,654,38,18,23,4,20,58,95,15,11,23,14,20,7,0,0,1,0,1,0,0
18,Tottenham Hotspur,1,0,1,34,1.17,13,11,0,227,77%,11,3,3,13,3,5,5,0,5,8,1.17,0.43,0.74,0.75,0.85,0.32,227,93,134,293,77%,494,51,25,12,3,13,51,117,19,14,15,9,14,10,2,0,4,1,3,0,0
19,Manchester City,0,1,-1,66,1.91,18,13,2,469,86%,8,11,1,18,4,9,5,0,11,7,1.91,0.87,1.04,0.44,1.03,0.88,469,195,274,548,86%,731,22,9,30,4,14,66,117,27,20,16,11,19,3,1,0,1,1,1,0,0

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# FotMob-PL-Webscraper

This is the code repository for scraping football (real football) data from [FotMob](https://fotmob.com). FotMob is a popular app for all things football, from live lineups and player ratings to key match statistics and injury news.

## About

Currently, the code scrapes Premier League match statistics one Matchweek at a time.
It works from [this page](https://www.fotmob.com/leagues/47/matches/premier-league/by-round), going through each match and gathering statistics. It is powered by [Selenium](https://selenium-python.readthedocs.io/), a browser automation package. The final output is a CSV file containing match statistics such as xG (expected goals), xGOT (expected goals on target), opposition half passes, possession, interceptions, duels, and much more for each team that played in the matchweek you specify.

## How To Use

First, download the Python file and make sure the selenium and pandas packages are installed on your computer. Second, you will need a chromedriver, which you can download [here](https://sites.google.com/chromium.org/driver/). Once it is downloaded, copy the file path of the chromedriver.exe file into the code (where you see the global variable `DRIVER_PATH`). Now you are all set to scrape! You can choose which Matchweek to scrape by changing the `MATCHWEEK` global variable.

## Example

To see an example output, download the CSV file in the repo (`PL_matchweek_1.csv`).

## Future Work

Currently, this scraper covers only Premier League data. It also does not yet report shot map data (location, shot type, shot result). Future commits will add collection and reporting of these data, as well as support for other leagues (such as La Liga, Bundesliga, Serie A, etc.).

## Disclaimer

This project is strictly for educational and research purposes. This software is not intended for commercial use of any kind.
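
## Reading the Output

As a rough illustration of how the exported CSV might be used, the sketch below parses a small excerpt of `PL_matchweek_1.csv` with Python's standard `csv` module and compares goals scored with xG. The excerpt holds a few selected columns of the first two rows, copied from the example file. The helper names `load_rows` and `goals_minus_xg` are hypothetical and are not part of the scraper.

```python
import csv
import io

# Excerpt of PL_matchweek_1.csv: selected columns of the first two rows,
# copied from the example file in this repo.
SAMPLE = """Team,GF,GA,GD,Expected goals (xG)
Brentford,2,0,2,1.43
Arsenal,0,2,-2,1.20
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def goals_minus_xg(row):
    """Goals scored minus expected goals (hypothetical helper)."""
    return round(int(row["GF"]) - float(row["Expected goals (xG)"]), 2)


rows = load_rows(SAMPLE)
for row in rows:
    print(row["Team"], goals_minus_xg(row))
```

To run this against the real file, pass the contents of `PL_matchweek_1.csv` to `load_rows` instead of `SAMPLE`. Note that the full file repeats some column names (for example `Expected goals (xG)`), and `csv.DictReader` keeps only the last occurrence of a repeated name. In the example file the repeated columns appear to hold the same values, but that is worth checking before relying on it.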