├── README.md
├── images
│   ├── trends_data.png
│   ├── trends_data_comparison.png
│   └── trends_data_csv.png
└── scraper.py

/README.md:
--------------------------------------------------------------------------------
# How to Scrape Google Trends Data With Python

[![Oxylabs promo code](https://raw.githubusercontent.com/oxylabs/how-to-scrape-google-scholar/refs/heads/main/Google-Scraper-API-1090x275.png)](https://oxylabs.io/products/scraper-api/serp/google?utm_source=877&utm_medium=affiliate&groupid=877&utm_content=how-to-scrape-google-trends-github&transaction_id=102c8d36f7f0d0e5797b8f26152160)

[![](https://dcbadge.limes.pink/api/server/Pds3gBmKMH?style=for-the-badge&theme=discord)](https://discord.gg/Pds3gBmKMH) [![YouTube](https://img.shields.io/badge/YouTube-Oxylabs-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/@oxylabs)

- [Why scrape Google Trends data?](#why-scrape-google-trends-data)
- [1. Install libraries](#1-install-libraries)
- [2. Send a request](#2-send-a-request)
- [3. Save results to CSV](#3-save-results-to-csv)
- [4. Create a result comparison](#4-create-a-result-comparison)

This article provides step-by-step instructions on how to get Google Trends data with Python and [SERP Scraper API](https://developers.oxylabs.io/scraper-apis/web-scraper-api) (a part of Web Scraper API), which requires a **paid subscription** or a **free trial**.

## Why scrape Google Trends data?

Here are some common uses for scraped Google Trends data:

- **Keyword research:** Google Trends is widely used among SEO specialists and content marketers. Since it provides insights into the past and present popularity of search terms, these professionals can tailor their marketing strategies to gain more website traffic.

- **Market research:** Google Trends data can be used for market research, helping businesses understand consumer interests and preferences over time. For example, e-commerce businesses can use Google Trends search insights for product development.

- **Societal research:** The Google Trends website is a valuable resource for journalists and researchers, offering a glimpse into societal trends and public interest in specific topics.

These are just a few examples. Google Trends data can also help with investment decisions, brand reputation monitoring, and other use cases.

## 1. Install libraries

For this guide, you'll need the following:

- Credentials for [SERP Scraper API](https://oxylabs.io/products/scraper-api/serp) – you can claim a **7-day free trial** by registering on the [dashboard](https://dashboard.oxylabs.io/en/);
- [Python](https://www.python.org/downloads/);
- [Requests](https://requests.readthedocs.io/en/latest/) library to make requests;
- [Pandas](https://pandas.pydata.org/docs/index.html) library to manipulate received data.

Open your terminal and run the following `pip` command:

```bash
pip install requests pandas
```

Then, import these libraries in a new Python file:

```python
import requests
import pandas as pd
```
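Optionally, you can keep the dependencies isolated in a virtual environment before installing. This is a minimal sketch for a Unix-like shell, not part of the original tutorial (on Windows, the activation command differs):

```bash
# Create and activate a virtual environment, then install the libraries
python3 -m venv .venv
source .venv/bin/activate
pip install requests pandas
```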
## 2. Send a request

Let's begin by building an initial request to the API:

```python
import requests
from pprint import pprint

USERNAME = "YourUsername"
PASSWORD = "YourPassword"

query = "persian cat"

print(f"Getting data from Google Trends for {query} keyword..")

url = "https://realtime.oxylabs.io/v1/queries"
auth = (USERNAME, PASSWORD)

payload = {
    "source": "google_trends_explore",
    "query": query,
}

try:
    response = requests.post(url, auth=auth, json=payload, timeout=180)
except requests.exceptions.RequestException as e:
    print("Caught exception while getting trend data")
    raise e

data = response.json()
content = data["results"][0]["content"]
pprint(content)
```

For more information about possible parameters, check our [documentation](https://developers.oxylabs.io/scraper-apis/web-scraper-api/google/trends-explore).

If everything's in order, running the code should print the raw results of the query to the terminal window like this:

![](images/trends_data.png)

## 3. Save results to CSV

Now that you have the results, adjust the formatting and save them in CSV format – this way, it'll be easier to analyze the data. All of this can be done with the help of the `pandas` Python library.

The response you get from the API provides four categories of information: `interest_over_time`, `breakdown_by_region`, `related_topics`, and `related_queries`. Let's split each category into its own separate CSV file.

Begin by converting each into a `pandas` dataframe:

```python
import json
from typing import List


def flatten_topic_data(topics_data: List[dict]) -> List[dict]:
    """Flattens related_topics data"""
    topics_items = []
    for item in topics_data[0]["items"]:
        item_dict = {
            "mid": item["topic"]["mid"],
            "title": item["topic"]["title"],
            "type": item["topic"]["type"],
            "value": item["value"],
            "formatted_value": item["formatted_value"],
            "link": item["link"],
            "keyword": topics_data[0]["keyword"],
        }
        topics_items.append(item_dict)

    return topics_items


trend_data = json.loads(content)
print("Creating dataframes..")

# Interest over time
iot_df = pd.DataFrame(trend_data["interest_over_time"][0]["items"])
iot_df["keyword"] = trend_data["interest_over_time"][0]["keyword"]

# Breakdown by region
bbr_df = pd.DataFrame(trend_data["breakdown_by_region"][0]["items"])
bbr_df["keyword"] = trend_data["breakdown_by_region"][0]["keyword"]

# Related topics
rt_data = flatten_topic_data(trend_data["related_topics"])
rt_df = pd.DataFrame(rt_data)

# Related queries
rq_df = pd.DataFrame(trend_data["related_queries"][0]["items"])
rq_df["keyword"] = trend_data["related_queries"][0]["keyword"]
```

Since the `related_topics` data is nested, it has to be flattened into a single-level structure – that's what the `flatten_topic_data` function above does.
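To see what the flattening does, here's a minimal, hypothetical `related_topics` payload – the values are made up for illustration, but the structure mirrors what the function expects:

```python
sample_topics = [
    {
        "keyword": "persian cat",
        "items": [
            {
                "topic": {"mid": "/m/01yrx", "title": "Cat", "type": "Animal"},
                "value": 100,
                "formatted_value": "100",
                "link": "/trends/explore?q=/m/01yrx",
            }
        ],
    }
]

# Each nested item becomes one flat row, keyed by the parent keyword.
print(flatten_topic_data(sample_topics))
# [{'mid': '/m/01yrx', 'title': 'Cat', 'type': 'Animal', 'value': 100,
#   'formatted_value': '100', 'link': '/trends/explore?q=/m/01yrx',
#   'keyword': 'persian cat'}]
```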
The only thing left is to save the data to a file:

```python
import os

CSV_FILE_DIR = "./csv/"

keyword = trend_data["interest_over_time"][0]["keyword"]
keyword_path = os.path.join(CSV_FILE_DIR, keyword)
try:
    os.makedirs(keyword_path, exist_ok=True)
except OSError as e:
    print("Caught exception while creating directories")
    raise e

print("Dumping to csv..")
iot_df.to_csv(f"{keyword_path}/interest_over_time.csv", index=False)
bbr_df.to_csv(f"{keyword_path}/breakdown_by_region.csv", index=False)
rt_df.to_csv(f"{keyword_path}/related_topics.csv", index=False)
rq_df.to_csv(f"{keyword_path}/related_queries.csv", index=False)
```

You've now created a folder structure that holds all of your separate CSV files grouped by keyword:

![](images/trends_data_csv.png)
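As a quick sanity check – this isn't part of the original script – you can load one of the generated files back into `pandas` and inspect it; the folder name below assumes the `persian cat` keyword from earlier:

```python
import pandas as pd

# Read a generated CSV file back and peek at the first rows.
df = pd.read_csv("./csv/persian cat/interest_over_time.csv")
print(df.head())
```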
## 4. Create a result comparison

Let's begin with multiple keyword handling. To make the code easy to rerun for several keywords, split it into reusable functions.

First, extract the code for the request to the API into a function that takes a query as an argument and returns the response:

```python
def get_trend_data(query: str) -> dict:
    """Gets a dictionary of trends based on the given query string from Google Trends via SERP Scraper API"""
    print(f"Getting data from Google Trends for {query} keyword..")
    url = "https://realtime.oxylabs.io/v1/queries"
    auth = (USERNAME, PASSWORD)
    payload = {
        "source": "google_trends_explore",
        "query": query,
    }
    try:
        response = requests.post(url, auth=auth, json=payload, timeout=180)
    except requests.exceptions.RequestException as e:
        print("Caught exception while getting trend data")
        raise e

    data = response.json()
    content = data["results"][0]["content"]
    return json.loads(content)
```

Next, you need a function that transforms a raw response into `pandas` dataframes, saves said dataframes as CSV files, and returns them:

```python
def dump_trend_data_to_csv(trend_data: dict) -> dict:
    """Dumps given trend data to generated CSV files"""
    CSV_FILE_DIR = "./csv/"
    # Interest over time
    print("Creating dataframes..")
    iot_df = pd.DataFrame(trend_data["interest_over_time"][0]["items"])
    iot_df["keyword"] = trend_data["interest_over_time"][0]["keyword"]

    # Breakdown by region
    bbr_df = pd.DataFrame(trend_data["breakdown_by_region"][0]["items"])
    bbr_df["keyword"] = trend_data["breakdown_by_region"][0]["keyword"]

    # Related topics
    rt_data = flatten_topic_data(trend_data["related_topics"])
    rt_df = pd.DataFrame(rt_data)

    # Related queries
    rq_df = pd.DataFrame(trend_data["related_queries"][0]["items"])
    rq_df["keyword"] = trend_data["related_queries"][0]["keyword"]

    keyword = trend_data["interest_over_time"][0]["keyword"]
    keyword_path = os.path.join(CSV_FILE_DIR, keyword)
    try:
        os.makedirs(keyword_path, exist_ok=True)
    except OSError as e:
        print("Caught exception while creating directories")
        raise e

    print("Dumping to csv..")
    iot_df.to_csv(f"{keyword_path}/interest_over_time.csv", index=False)
    bbr_df.to_csv(f"{keyword_path}/breakdown_by_region.csv", index=False)
    rt_df.to_csv(f"{keyword_path}/related_topics.csv", index=False)
    rq_df.to_csv(f"{keyword_path}/related_queries.csv", index=False)

    result_set = {
        "iot": iot_df,
        "bbr": bbr_df,
        "rt": rt_df,
        "rq": rq_df,
    }

    return result_set
```

Now that the request and dataframe creation are covered, you can create comparisons:

```python
def create_comparison(trend_dataframes: List[dict]) -> None:
    """Merges the dataframes of all keywords into comparison CSV files"""
    comparison = trend_dataframes[0]
    i = 1

    for df in trend_dataframes[1:]:
        comparison["iot"] = pd.merge(comparison["iot"], df["iot"], on="time", suffixes=("", f"_{i}"))
        comparison["bbr"] = pd.merge(comparison["bbr"], df["bbr"], on="geo_code", suffixes=("", f"_{i}"))
        comparison["rt"] = pd.merge(comparison["rt"], df["rt"], on="title", how="inner", suffixes=("", f"_{i}"))
        comparison["rq"] = pd.merge(comparison["rq"], df["rq"], on="query", how="inner", suffixes=("", f"_{i}"))
        i += 1

    comparison["iot"].to_csv("comparison_interest_over_time.csv", index=False)
    comparison["bbr"].to_csv("comparison_breakdown_by_region.csv", index=False)
    comparison["rt"].to_csv("comparison_related_topics.csv", index=False)
    comparison["rq"].to_csv("comparison_related_queries.csv", index=False)
```

This function accepts the dataframes for all the queries you have created, goes over them, and merges them for comparison on key metrics.
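Keep in mind that `pd.merge` defaults to an inner join, so regions, topics, or queries that don't appear for every keyword are dropped from the comparison files. Here's a tiny standalone illustration of the merge and suffix behavior used above, with made-up values:

```python
import pandas as pd

a = pd.DataFrame({"geo_code": ["US", "DE"], "value": [80, 65]})
b = pd.DataFrame({"geo_code": ["US", "FR"], "value": [90, 40]})

# The inner join keeps only geo_codes present in both frames;
# the second frame's clashing column gets the "_1" suffix.
merged = pd.merge(a, b, on="geo_code", suffixes=("", "_1"))
print(merged)
#   geo_code  value  value_1
# 0       US     80       90
```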
trend_data["breakdown_by_region"][0]["keyword"] 308 | 309 | # Related topics 310 | rt_data = flatten_topic_data(trend_data["related_topics"]) 311 | rt_df = pd.DataFrame(rt_data) 312 | 313 | # Related queries 314 | rq_df = pd.DataFrame(trend_data["related_queries"][0]["items"]) 315 | rq_df["keyword"] = trend_data["related_queries"][0]["keyword"] 316 | 317 | keyword = trend_data["interest_over_time"][0]["keyword"] 318 | keyword_path = os.path.join(CSV_FILE_DIR, keyword) 319 | try: 320 | os.makedirs(keyword_path, exist_ok=True) 321 | except OSError as e: 322 | print("Caught exception while creating directories") 323 | raise e 324 | 325 | print("Dumping to csv..") 326 | iot_df.to_csv(f"{keyword_path}/interest_over_time.csv", index=False) 327 | bbr_df.to_csv(f"{keyword_path}/breakdown_by_region.csv", index=False) 328 | rt_df.to_csv(f"{keyword_path}/related_topics.csv", index=False) 329 | rq_df.to_csv(f"{keyword_path}/related_queries.csv", index=False) 330 | 331 | result_set = {} 332 | result_set["iot"] = iot_df 333 | result_set["bbr"] = bbr_df 334 | result_set["rt"] = rt_df 335 | result_set["rq"] = rq_df 336 | 337 | return result_set 338 | 339 | def create_comparison(trend_dataframes : dict) -> None: 340 | comparison = trend_dataframes[0] 341 | i = 1 342 | 343 | for df in trend_dataframes[1:]: 344 | comparison["iot"] = pd.merge(comparison["iot"], df["iot"], on="time", suffixes=("", f"_{i}")) 345 | comparison["bbr"] = pd.merge(comparison["bbr"], df["bbr"], on="geo_code", suffixes=("", f"_{i}")) 346 | comparison["rt"] = pd.merge(comparison["rt"], df["rt"], on="title", how="inner", suffixes=("", f"_{i}")) 347 | comparison["rq"] = pd.merge(comparison["rq"], df["rq"], on="query", how="inner", suffixes=("", f"_{i}")) 348 | i = i + 1 349 | 350 | comparison["iot"].to_csv("comparison_interest_over_time.csv", index=False) 351 | comparison["bbr"].to_csv("comparison_breakdown_by_region.csv", index=False) 352 | comparison["rt"].to_csv("comparison_related_topics.csv", index=False) 353 | comparison["rq"].to_csv("comparison_related_queries.csv", index=False) 354 | 355 | def main(): 356 | keywords = ["cat", "cats"] 357 | 358 | results = [] 359 | 360 | for keyword in keywords: 361 | trend_data = get_trend_data(keyword) 362 | df_set = dump_trend_data_to_csv(trend_data) 363 | results.append(df_set) 364 | 365 | create_comparison(results) 366 | 367 | if __name__ == "__main__": 368 | main() 369 | ``` 370 | 371 | Running the code will create comparison CSV files that have the combined information of the supplied keywords on each of the categories: 372 | 373 | - `interest_over_time` 374 | 375 | - `breakdown_by_region` 376 | 377 | - `related_topics` 378 | 379 | - `related_queries` 380 | 381 | ![](images/trends_data_comparison.png) 382 | 383 | Looking to scrape data from other Google sources? 
Looking to scrape data from other Google sources? Check out these guides: [Google Sheets for Basic Web Scraping](https://github.com/oxylabs/web-scraping-google-sheets), [How to Scrape Google Shopping Results](https://github.com/oxylabs/scrape-google-shopping), [Google Play Scraper](https://github.com/oxylabs/google-play-scraper), [How To Scrape Google Jobs](https://github.com/oxylabs/how-to-scrape-google-jobs), [Google News Scraper](https://github.com/oxylabs/google-news-scraper), [How to Scrape Google Scholar](https://github.com/oxylabs/how-to-scrape-google-scholar), [How to Scrape Google Flights with Python](https://github.com/oxylabs/how-to-scrape-google-flights), [Scrape Google Search Results](https://github.com/oxylabs/scrape-google-python)
--------------------------------------------------------------------------------
/images/trends_data.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/oxylabs/how-to-scrape-google-trends/4acfa8f454eab3ffc94fbe1c87454776b86da145/images/trends_data.png
--------------------------------------------------------------------------------
/images/trends_data_comparison.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/oxylabs/how-to-scrape-google-trends/4acfa8f454eab3ffc94fbe1c87454776b86da145/images/trends_data_comparison.png
--------------------------------------------------------------------------------
/images/trends_data_csv.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/oxylabs/how-to-scrape-google-trends/4acfa8f454eab3ffc94fbe1c87454776b86da145/images/trends_data_csv.png
--------------------------------------------------------------------------------
/scraper.py:
--------------------------------------------------------------------------------
import json
import os
from typing import List

import pandas as pd
import requests


def get_trend_data(query: str) -> dict:
    """Gets a dictionary of trends based on the given query string from Google Trends via SERP Scraper API"""
    USERNAME = "yourUsername"
    PASSWORD = "yourPassword"
    print(f"Getting data from Google Trends for {query} keyword..")
    url = "https://realtime.oxylabs.io/v1/queries"
    auth = (USERNAME, PASSWORD)
    payload = {
        "source": "google_trends_explore",
        "query": query,
    }
    try:
        response = requests.post(url, auth=auth, json=payload, timeout=180)
    except requests.exceptions.RequestException as e:
        print("Caught exception while getting trend data")
        raise e

    data = response.json()
    content = data["results"][0]["content"]
    return json.loads(content)


def flatten_topic_data(topics_data: List[dict]) -> List[dict]:
    """Flattens related_topics data"""
    topics_items = []
    for item in topics_data[0]["items"]:
        item_dict = {
            "mid": item["topic"]["mid"],
            "title": item["topic"]["title"],
            "type": item["topic"]["type"],
            "value": item["value"],
            "formatted_value": item["formatted_value"],
            "link": item["link"],
            "keyword": topics_data[0]["keyword"],
        }
        topics_items.append(item_dict)

    return topics_items


def dump_trend_data_to_csv(trend_data: dict) -> dict:
    """Dumps given trend data to generated CSV files"""
    CSV_FILE_DIR = "./csv/"
    # Interest over time
    print("Creating dataframes..")
    iot_df = pd.DataFrame(trend_data["interest_over_time"][0]["items"])
pd.DataFrame(trend_data["interest_over_time"][0]["items"]) 55 | iot_df["keyword"] = trend_data["interest_over_time"][0]["keyword"] 56 | 57 | # Breakdown by region 58 | bbr_df = pd.DataFrame(trend_data["breakdown_by_region"][0]["items"]) 59 | bbr_df["keyword"] = trend_data["breakdown_by_region"][0]["keyword"] 60 | 61 | # Related topics 62 | rt_data = flatten_topic_data(trend_data["related_topics"]) 63 | rt_df = pd.DataFrame(rt_data) 64 | 65 | # Related queries 66 | rq_df = pd.DataFrame(trend_data["related_queries"][0]["items"]) 67 | rq_df["keyword"] = trend_data["related_queries"][0]["keyword"] 68 | 69 | keyword = trend_data["interest_over_time"][0]["keyword"] 70 | keyword_path = os.path.join(CSV_FILE_DIR, keyword) 71 | try: 72 | os.makedirs(keyword_path, exist_ok=True) 73 | except OSError as e: 74 | print("Caught exception while creating directories") 75 | raise e 76 | 77 | print("Dumping to csv..") 78 | iot_df.to_csv(f"{keyword_path}/interest_over_time.csv", index=False) 79 | bbr_df.to_csv(f"{keyword_path}/breakdown_by_region.csv", index=False) 80 | rt_df.to_csv(f"{keyword_path}/related_topics.csv", index=False) 81 | rq_df.to_csv(f"{keyword_path}/related_queries.csv", index=False) 82 | 83 | result_set = {} 84 | result_set["iot"] = iot_df 85 | result_set["bbr"] = bbr_df 86 | result_set["rt"] = rt_df 87 | result_set["rq"] = rq_df 88 | 89 | return result_set 90 | 91 | def create_comparison(trend_dataframes : dict) -> None: 92 | comparison = trend_dataframes[0] 93 | i = 1 94 | 95 | for df in trend_dataframes[1:]: 96 | comparison["iot"] = pd.merge(comparison["iot"], df["iot"], on="time", suffixes=("", f"_{i}")) 97 | comparison["bbr"] = pd.merge(comparison["bbr"], df["bbr"], on="geo_code", suffixes=("", f"_{i}")) 98 | comparison["rt"] = pd.merge(comparison["rt"], df["rt"], on="title", how="inner", suffixes=("", f"_{i}")) 99 | comparison["rq"] = pd.merge(comparison["rq"], df["rq"], on="query", how="inner", suffixes=("", f"_{i}")) 100 | i = i + 1 101 | 102 | comparison["iot"].to_csv("comparison_interest_over_time.csv", index=False) 103 | comparison["bbr"].to_csv("comparison_breakdown_by_region.csv", index=False) 104 | comparison["rt"].to_csv("comparison_related_topics.csv", index=False) 105 | comparison["rq"].to_csv("comparison_related_queries.csv", index=False) 106 | 107 | def main(): 108 | keywords = ["cat", "cats"] 109 | 110 | results = [] 111 | 112 | for keyword in keywords: 113 | trend_data = get_trend_data(keyword) 114 | df_set = dump_trend_data_to_csv(trend_data) 115 | results.append(df_set) 116 | 117 | create_comparison(results) 118 | 119 | if __name__ == "__main__": 120 | main() 121 | --------------------------------------------------------------------------------