├── javascript ├── overview │ ├── npm.md │ ├── README.md │ └── installation.md ├── javascript-code-examples │ ├── README.md │ ├── seo-scraper.md │ ├── page-speed.md │ ├── indexing │ │ ├── page-speed.md │ │ └── README.md │ └── indexing.md └── chrome-devtools │ ├── README.md │ └── chrome-seo-without-a-plugin.md ├── python ├── overview │ ├── libraries │ │ ├── README.md │ │ └── pandas.md │ ├── README.md │ └── installation.md ├── code-examples │ ├── rendering.md │ ├── knowledge-graph.md │ ├── README.md │ ├── data-extraction.md │ ├── query-management.md │ ├── reporting.md │ ├── pagespeed.md │ └── crawling.md ├── data-science │ ├── README.md │ ├── time-series.md │ └── machine-learning.md └── helpful-code.md ├── .gitbook └── assets │ ├── image (1).png │ ├── image (2).png │ ├── image (3).png │ ├── image (4).png │ ├── image (6).png │ ├── image (4) (1).png │ ├── image (6) (1).png │ ├── icodeseo192x192.png │ ├── icodeseo512x512.png │ ├── contentking-logo.png │ ├── rendering-pipeline.png │ ├── screen-shot-2020-03-17-at-4.50.33-am.png │ └── screen-shot-2020-03-17-at-4.53.01-am.png ├── technical-seo └── overview │ ├── learning-center │ ├── README.md │ ├── 1.-what-is-technical-seo.md │ ├── 3.-rendering.md │ └── 2.-crawling.md │ └── README.md ├── r-stats ├── r-stats-code-examples │ ├── README.md │ ├── knowledge-graph.md │ ├── crawling.md │ └── query-management.md └── r-stats │ ├── README.md │ └── intro-and-installation.md ├── seo-datasets-1 ├── sitemap-datasets.md ├── apis.md ├── crawl-datasets.md └── serp-datasets.md ├── README.md ├── about └── contributing-content.md └── SUMMARY.md /javascript/overview/npm.md: -------------------------------------------------------------------------------- 1 | # NPM 2 | 3 | -------------------------------------------------------------------------------- /python/overview/libraries/README.md: -------------------------------------------------------------------------------- 1 | # Libraries 2 | 3 | -------------------------------------------------------------------------------- /javascript/javascript-code-examples/README.md: -------------------------------------------------------------------------------- 1 | # JavaScript Code Examples 2 | 3 | -------------------------------------------------------------------------------- /.gitbook/assets/image (1).png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (1).png -------------------------------------------------------------------------------- /.gitbook/assets/image (2).png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (2).png -------------------------------------------------------------------------------- /.gitbook/assets/image (3).png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (3).png -------------------------------------------------------------------------------- /.gitbook/assets/image (4).png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (4).png -------------------------------------------------------------------------------- /.gitbook/assets/image (6).png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (6).png -------------------------------------------------------------------------------- /.gitbook/assets/image (4) (1).png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (4) (1).png -------------------------------------------------------------------------------- /.gitbook/assets/image (6) (1).png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (6) (1).png -------------------------------------------------------------------------------- /.gitbook/assets/icodeseo192x192.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/icodeseo192x192.png -------------------------------------------------------------------------------- /.gitbook/assets/icodeseo512x512.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/icodeseo512x512.png -------------------------------------------------------------------------------- /.gitbook/assets/contentking-logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/contentking-logo.png -------------------------------------------------------------------------------- /.gitbook/assets/rendering-pipeline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/rendering-pipeline.png -------------------------------------------------------------------------------- /technical-seo/overview/learning-center/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO Learning Center 3 | --- 4 | 5 | # Learning Center 6 | 7 | -------------------------------------------------------------------------------- /python/code-examples/rendering.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Rendering. 3 | --- 4 | 5 | # Rendering 6 | 7 | ## 8 | 9 | -------------------------------------------------------------------------------- /python/code-examples/knowledge-graph.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on the Knowledge Graph. 
3 | --- 4 | 5 | # Knowledge Graph 6 | 7 | -------------------------------------------------------------------------------- /.gitbook/assets/screen-shot-2020-03-17-at-4.50.33-am.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/screen-shot-2020-03-17-at-4.50.33-am.png -------------------------------------------------------------------------------- /.gitbook/assets/screen-shot-2020-03-17-at-4.53.01-am.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/screen-shot-2020-03-17-at-4.53.01-am.png -------------------------------------------------------------------------------- /python/code-examples/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Various code examples which are relevant to aspects of Technical SEO. 3 | --- 4 | 5 | # Python Code Examples 6 | 7 | -------------------------------------------------------------------------------- /r-stats/r-stats-code-examples/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Various code examples which are relevant to aspects of Technical SEO. 3 | --- 4 | 5 | # R Stats Code Examples 6 | 7 | -------------------------------------------------------------------------------- /python/overview/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Python coding resources for Technical SEO 3 | --- 4 | 5 | # Python Overview 6 | 7 | {% page-ref page="installation.md" %} 8 | 9 | {% page-ref page="libraries/pandas.md" %} 10 | 11 | 12 | 13 | -------------------------------------------------------------------------------- /javascript/overview/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: JavaScript coding resources for SEO 3 | --- 4 | 5 | # JavaScript Overview 6 | 7 | {% page-ref page="installation.md" %} 8 | 9 | {% page-ref page="../chrome-devtools/" %} 10 | 11 | 12 | 13 | -------------------------------------------------------------------------------- /python/data-science/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Data Science coding resources for Technical SEO 3 | --- 4 | 5 | # Data Science 6 | 7 | {% page-ref page="machine-learning.md" %} 8 | 9 | {% page-ref page="time-series.md" %} 10 | 11 | 12 | 13 | -------------------------------------------------------------------------------- /r-stats/r-stats/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: '#rstats coding resources for Technical SEO' 3 | --- 4 | 5 | # R Stats Overview 6 | 7 | {% page-ref page="intro-and-installation.md" %} 8 | 9 | {% page-ref page="../r-stats-code-examples/" %} 10 | 11 | 12 | 13 | -------------------------------------------------------------------------------- /technical-seo/overview/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO Coding Resources 3 | --- 4 | 5 | # Technical SEO Overview 6 | 7 | {% page-ref page="learning-center/" %} 8 | 9 | ## Code Examples 10 | 11 | {% page-ref page="../../python/code-examples/rendering.md" %} 12 | 13 | {% page-ref 
page="../../python/code-examples/crawling.md" %} 14 | 15 | {% page-ref page="../../python/code-examples/query-management.md" %} 16 | 17 | {% page-ref page="../../python/code-examples/pagespeed.md" %} 18 | 19 | 20 | 21 | -------------------------------------------------------------------------------- /javascript/javascript-code-examples/seo-scraper.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Scraping. 3 | --- 4 | # SEO Scraper 5 | ### Scrape anything that you want with Node.js 6 | Scrape SEO elements by default or whatever you need with easy customizations with this Web Scraper built in Node.js 7 | | Author | Article | Source | Notebook | 8 | | :--- | :--- | :--- | :--- | 9 | | [Nacho Mascort](https://twitter.com/NachoMascort) | [Link](https://www.npmjs.com/package/seo-scraper) | [Link](https://github.com/NachoSEO/seo-scraper) | | -------------------------------------------------------------------------------- /python/code-examples/data-extraction.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Page Speed. 3 | --- 4 | 5 | # Data Extraction 6 | 7 | ### Scraping People Also Asked from Google Search 8 | 9 | This project uses Python and Selenium in order to extract data from the People Also Ask feature of a Google SERP. 10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | Alessio Nittoli | [Link](https://nitto.li/scraping-people-also-asked/) | [Link](https://github.com/nittolese/gquestions) | | 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /seo-datasets-1/sitemap-datasets.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: XML Sitemaps in DataFrame format for different industries 3 | --- 4 | 5 | # Sitemap Datasets 6 | 7 | 8 | 9 | ### News Websites' XML Sitemap 10 | 11 | A set of XML sitemaps for news websites to analyze publishing trends. Constantly being updated. 12 | 13 | | Author | Source | Notebook | 14 | | :--- | :--- | :--- | 15 | | [Elias Dabbas](https://github.com/eliasdabbas) | [Kaggle Dataset](https://www.kaggle.com/eliasdabbas/news-sitemaps) | [Kaggle Notebook](https://www.kaggle.com/eliasdabbas/bbc-com-sitemaps-analysis) | 16 | 17 | -------------------------------------------------------------------------------- /python/code-examples/query-management.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Query Management. 3 | --- 4 | 5 | # Query Management 6 | 7 | ### Auto-Categorizing Queries with Apriori and BERT 8 | 9 | Auto-categorize Google Search Console queries semantically in 6 different languages via BERT. 10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | [Vincent Terrasi](https://twitter.com/VincentTerrasi) | [Link](https://dataseolabs.com/en/google-search-console-clustering-2/) | | [Link](https://colab.research.google.com/drive/14JC2uQniiVDNAUpVEjdNTyK7rmepwjWB) | 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /javascript/overview/installation.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Some really good guides to installing Node.js and various other related tools. 
15 | 16 | 17 | -------------------------------------------------------------------------------- /javascript/overview/installation.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Some really good guides to installing Node.js and various other related tools. 3 | --- 4 | 5 | # Installation 6 | 7 | ## How to Install Node.js and npm on Windows 8 | 9 | Overview of installing Node.js and npm on Windows using the official installer. Also describes the different versions and shows how to test whether installation was a success. 10 | 11 | **Link**: [https://www.freecodecamp.org/news/how-to-install-node-js-and-npm-on-windows/](https://www.freecodecamp.org/news/how-to-install-node-js-and-npm-on-windows/) 12 | 13 | **By**: Unknown on [www.freecodecamp.org](https://www.freecodecamp.org/) 14 | 15 | -------------------------------------------------------------------------------- /javascript/javascript-code-examples/page-speed.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Page Speed. 3 | --- 4 | 5 | # Page Speed 6 | 7 | ### Aggregate & Automate Performance Reporting With Lighthouse & Google Data Studio 8 | 9 | This tool automates Google Lighthouse reporting using Node.js + Puppeteer and leverages Cloud SQL to allow visualization in Google Data Studio. 10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | Dan Leibson | [Link](https://www.localseoguide.com/aggregate-automate-performance-reporting-with-lighthouse-google-data-studio/) | [Link](https://github.com/LocalSEOGuide/lighthouse-reporter) | | 14 | 15 | -------------------------------------------------------------------------------- /javascript/javascript-code-examples/indexing/page-speed.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Page Speed. 3 | --- 4 | 5 | # Page Speed 6 | 7 | ## Aggregate & Automate Performance Reporting With Lighthouse & Google Data Studio 8 | 9 | This tool automates Google Lighthouse reporting using Node.js + Puppeteer and leverages Cloud SQL to allow visualization in Google Data Studio. 10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | Dan Leibson | [Link](https://www.localseoguide.com/aggregate-automate-performance-reporting-with-lighthouse-google-data-studio/) | [Link](https://github.com/LocalSEOGuide/lighthouse-reporter) | | 14 | 15 | -------------------------------------------------------------------------------- /javascript/javascript-code-examples/indexing.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Indexing. 3 | --- 4 | 5 | # Indexing 6 | 7 | ### Scaling Google indexation checks with Node.js 8 | 9 | New techniques developed internally at Builtvisible to fetch Google indexation status data, particularly for large-scale sites. Requires a free [ScraperAPI](https://www.scraperapi.com/) account. 10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | [Jose Hernando](https://twitter.com/jlhernando) | [Link](https://builtvisible.com/scaling-google-indexation-checks-with-node-js/) | [Link](https://github.com/alvaro-escalante/google-index-checker) | | 14 | 15 | 16 | 17 |
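The linked checker is written in Node.js. Purely as a cross-language illustration, the core of such a check could be sketched in Python as follows. The ScraperAPI request format shown follows their documented GET pattern at the time of writing, and the naive "URL appears in the result HTML" test is an assumption for illustration only.

```python
# A cross-language sketch of an indexation check (the linked tool is Node.js).
import requests
from urllib.parse import quote

API_KEY = "YOUR_SCRAPERAPI_KEY"  # placeholder

def is_indexed(url: str) -> bool:
    # Fetch the Google SERP for a site: query through ScraperAPI.
    serp_url = f"https://www.google.com/search?q=site:{quote(url, safe='')}"
    resp = requests.get(
        "http://api.scraperapi.com",
        params={"api_key": API_KEY, "url": serp_url},
    )
    resp.raise_for_status()
    # Naive check: does the URL show up anywhere in the returned HTML?
    return url in resp.text

print(is_indexed("https://example.com/some-page/"))
```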
-------------------------------------------------------------------------------- /javascript/javascript-code-examples/indexing/README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Indexing. 3 | --- 4 | 5 | # Indexing 6 | 7 | ### Scaling Google indexation checks with Node.js 8 | 9 | New techniques developed internally at Builtvisible to fetch Google indexation status data, particularly for large-scale sites. Requires a free [ScraperAPI](https://www.scraperapi.com/) account. 10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | [Jose Hernando](https://twitter.com/jlhernando) | [Link](https://builtvisible.com/scaling-google-indexation-checks-with-node-js/) | [Link](https://github.com/alvaro-escalante/google-index-checker) | | 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /python/code-examples/reporting.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Reporting. 3 | --- 4 | 5 | # Reporting 6 | 7 | ### Using Python to integrate Google Trends into Google Data Studio 8 | 9 | By using the code in this article, you can connect Google Spreadsheets and Jupyter Notebook to import data into Google Data Studio and easily share the analysis with your team. 10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | Hülya Çoban | [Link](https://searchengineland.com/learn-how-to-chart-and-track-google-trends-in-data-studio-using-python-329119) | [Link](https://github.com/hulyacobans/google-trends-to-sheets/blob/master/pytrends-to-sheets.ipynb) | | 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /seo-datasets-1/apis.md: -------------------------------------------------------------------------------- 1 | # APIs documentation 2 | 3 | ### SEO Tools APIs 4 | 5 | | Tool | Docs | Notes | 6 | | :--- | :--- | :--- | 7 | | SEMrush | [www](https://www.semrush.com/api-documentation/) | generic | 8 | | Ahrefs | [www](https://ahrefs.com/api) | generic | 9 | | Dataforseo | [www](https://dataforseo.com/) | generic | 10 | | OPR | [www](https://www.domcop.com/openpagerank/documentation) | Alternative to PageRank | 11 | | OnCrawl | [www](http://developer.oncrawl.com/) | Get data from your account | 12 | | SEOmoz | [www](https://moz.com/api) | generic | 13 | 14 | ### Google APIs 15 | 16 | | API | Docs | 17 | | :--- | :--- | 18 | | Google Analytics | [www](https://developers.google.com/analytics/devguides/reporting/core/v4) | 19 | | Google Search Console | [www](https://developers.google.com/webmaster-tools) | 20 | 21 | 22 | 23 | -------------------------------------------------------------------------------- /python/code-examples/pagespeed.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Page Speed. 3 | --- 4 | 5 | # Page Speed 6 | 7 | ### Automating PageSpeed Tests with Python 8 | 9 | Google's PageSpeed Insights is a super useful tool to view a summary of a web page's performance, using both field and lab data to generate results. It is a great way to gain an overview of a handful of URLs, as it is used on a page-per-page basis. However, if you are working on a large site and wish to gain insights at scale, the API can be used to analyze a number of pages at a time, without needing to plug in the URLs individually.
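To make the idea concrete, here is a minimal sketch of that batch approach, hitting the PageSpeed Insights v5 API for a list of URLs and printing the Lighthouse performance score. The URL list is an assumption for illustration, and the linked article and notebook below go much further; an API key is optional for light use but recommended for larger batches.

```python
# A minimal batch sketch of the PageSpeed Insights v5 API (illustrative only).
import requests

ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
urls = ["https://example.com/", "https://example.com/blog/"]  # assumed list

for url in urls:
    resp = requests.get(ENDPOINT, params={"url": url, "strategy": "mobile"})
    resp.raise_for_status()
    data = resp.json()
    # Lighthouse reports the performance score as 0-1; scale it to 0-100.
    score = data["lighthouseResult"]["categories"]["performance"]["score"]
    print(url, round(score * 100))
```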
10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | [Ruth Everett](https://twitter.com/rvtheverett) | [Link](https://dev.to/rvtheverett/python-and-google-s-page-speed-api-4dbi) | | [Link](https://colab.research.google.com/drive/1Oe1VTocg21KIVDqROXSt15H6CoO905D0) | 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /r-stats/r-stats/intro-and-installation.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Some really good guides to installing R Stats and various other related tools. 3 | --- 4 | 5 | # Intro & Installation 6 | 7 | ### Official links 8 | 9 | [The R Foundation Official Website](https://www.r-project.org/) 10 | R is an official part of the Free Software Foundation's GNU project. 11 | 12 | [RStudio](https://rstudio.com/) 13 | The most popular development environment \(IDE\) for R. It can also run Python. The desktop version is open source. 14 | Download link: [https://rstudio.com/products/rstudio/](https://rstudio.com/products/rstudio/) 15 | 16 | 17 | 18 | ### R FOR SEO: SURVIVAL GUIDE 19 | 20 | An introduction to the use of R for SEO: how to install it and which packages are the most interesting. 21 | 22 | A great collection of the simplest and most useful commands. 23 | 24 | **Link**: [https://remibacha.com/en/r-seo-guide/](https://remibacha.com/en/r-seo-guide/) 25 | 26 | **By**: [Remi Bacha](https://twitter.com/remibacha) on [remibacha.com](https://remibacha.com/) 27 | 28 | -------------------------------------------------------------------------------- /seo-datasets-1/crawl-datasets.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: SEO datasets dealing with crawled content. 3 | --- 4 | 5 | # Crawl Datasets 6 | 7 | ### Common Crawl 8 | 9 | An open repository of web crawl data that can be accessed and analyzed by anyone. 10 | 11 | | Author | Source | 12 | | :--- | :--- | 13 | | [Common Crawl Team](https://commoncrawl.org/about/team/) | [www](https://commoncrawl.org/) \(direct download\) | 14 | 15 | ### Crawl of the top-ranking pages for airline ticket keywords 16 | 17 | This dataset contains SERPs of the 100 most popular tourist destinations, two variations each, and for two countries \(400 queries\). The landing pages that ranked for those keywords were scraped and the two tables were merged into one big table. 18 | 19 | | Author | Source | Notebook | 20 | | :--- | :--- | :--- | 21 | | [Elias Dabbas](https://github.com/eliasdabbas) | [Kaggle Dataset](https://www.kaggle.com/eliasdabbas/flights-serps-and-landing-pages) | [Kaggle Notebook](https://www.kaggle.com/eliasdabbas/airline-tickets-serps-and-landing-pages) | 22 | 23 | -------------------------------------------------------------------------------- /technical-seo/overview/learning-center/1.-what-is-technical-seo.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Overview of Technical SEO. 3 | --- 4 | 5 | # 1. What is Technical SEO? 6 | 7 | Historically, Technical SEO \(Search Engine Optimization\) has included work by specialists dealing with client and server-side technologies that focus on search engine crawling, rendering, and indexing. 8 | 9 | Merkle defines Technical SEO as: 10 | 11 | > Technical SEO is defined by configurations that can be implemented to the website and server \(e.g.
page elements, HTTP header responses, XML Sitemaps, redirects, meta data, etc.\). Technical SEO work has either a direct or indirect impact on search engine crawling, indexing and ultimately ranking. As such, Technical SEO doesn't include analytics, keyword research, backlink profile development or social media strategies. \([source](https://technicalseo.com/)\) 12 | 13 | In 2017, [Russ Jones](https://twitter.com/rjonesx) broadened the term, in a nod to [Arthur C. Clarke](https://en.wikipedia.org/wiki/Arthur_C._Clarke), by defining Technical SEO as: 14 | 15 | > Any **sufficiently technical** action undertaken with the **intent to improve search results**. 16 | 17 | 18 | 19 | 20 | 21 | -------------------------------------------------------------------------------- /python/code-examples/crawling.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Crawling. 3 | --- 4 | 5 | # Crawling 6 | 7 | ### Comparing two crawls using Google Colab and Screaming Frog 8 | 9 | Combining Screaming Frog, Google Colab and Google Sheets to generate change detection reports without any limit. 10 | 11 | | Author | Article | Source | Notebook | 12 | | :--- | :--- | :--- | :--- | 13 | | [Alessio Nittoli](https://twitter.com/nittolese) | [Link](https://nitto.li/screaming-frog-colab/) | | [Link](https://colab.research.google.com/drive/1-CwO0GkC7RizVoZVxOE6a2ldK4d7TRHR) | 14 | 15 | ### Leverage Python and Google Cloud to extract meaningful SEO insights from server log data 16 | 17 | This is the first of a two-part series about how to scale your analyses to larger data sets from your server logs. 18 | 19 | | Author | Article | Source | Notebook | 20 | | :--- | :--- | :--- | :--- | 21 | | [Charly Wargnier](https://twitter.com/DataChaz) | [Link](https://searchengineland.com/leverage-python-and-google-cloud-to-extract-meaningful-seo-insights-from-server-log-data-329199) | [Link](https://github.com/CharlyWargnier/Server_Log_Analyser_for_SEO) | [Link](https://colab.research.google.com/drive/1h3IdoDucFg7tIEiSGTjqksuNprgkcced) | 22 | 23 | -------------------------------------------------------------------------------- /python/overview/installation.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Some really good guides to installing Python and various other related tools. 3 | --- 4 | 5 | # Installation 6 | 7 | ### Installation of Python for Machine Learning 8 | 9 | In this step-by-step tutorial, you’ll cover the basics of setting up a Python numerical computation environment for machine learning on a Windows machine using the Anaconda Python distribution. 10 | 11 | **Link**: [https://realpython.com/python-windows-machine-learning-setup/\#author](https://realpython.com/python-windows-machine-learning-setup/#author) 12 | 13 | **By**: [Renato Candido](https://realpython.com/team/rcandido/) on [realpython.com](https://realpython.com/) 14 | 15 | 16 | 17 | ### Installation and Overview of Jupyter Lab 18 | 19 | An overview of JupyterLab, the next generation of the Jupyter Notebook. 20 | 21 | **Link**: [https://towardsdatascience.com/jupyter-lab-evolution-of-the-jupyter-notebook-5297cacde6b](https://towardsdatascience.com/jupyter-lab-evolution-of-the-jupyter-notebook-5297cacde6b) 22 | 23 | **By**: [Parul Pandey](https://towardsdatascience.com/@parulnith?source=post_page-----5297cacde6b----------------------) on [towardsdatascience.com](https://towardsdatascience.com/). 
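Whichever guide you follow, a quick way to confirm that the environment is ready for the code examples in this section is to import the core libraries and print their versions. This is a generic check, not part of the tutorials above:

```python
# Generic post-install sanity check (not from the guides above).
import sys
import pandas as pd
import numpy as np

print("Python:", sys.version.split()[0])
print("pandas:", pd.__version__)
print("numpy:", np.__version__)
```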
24 | 25 | 26 | 27 | 28 | 29 | -------------------------------------------------------------------------------- /r-stats/r-stats-code-examples/knowledge-graph.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code to extract data from APIs 3 | --- 4 | 5 | # APIs 6 | 7 | ### GoogleKnowledgeGraphR 8 | 9 | An R 📦 to retrieve information from the Google Knowledge Graph API. 10 | 11 | **Code Link**: [https://github.com/dschmeh/GoogleKnowledgeGraphR](https://github.com/dschmeh/GoogleKnowledgeGraphR) 12 | 13 | **By**: [Daniel Schmeh](https://twitter.com/dschmeh) 14 | 15 | ### How to Connect Google Analytics with R 16 | 17 | A step-by-step guide to getting your GA data into R and doing some basic manipulation with that data. 18 | 19 | **Tutorial link:** [https://www.adswerve.com/blog/ga-r-heatmap-tutorial/](https://www.adswerve.com/blog/ga-r-heatmap-tutorial/) 20 | **Code link:** [https://github.com/analytics-pros/R-GA-Heatmap](https://github.com/analytics-pros/R-GA-Heatmap) 21 | 22 | **By:** [Luka Cempre](https://twitter.com/lukaslo) 23 | 24 | ### Google Search Console API R: Guide to Get Started 25 | 26 | Shows you how to set up a daily automated pull of Google Search Console data using R. 27 | 28 | **Tutorial link:** [https://www.ryanpraski.com/google-search-console-api-r-guide-to-get-started/](https://www.ryanpraski.com/google-search-console-api-r-guide-to-get-started/) 29 | 30 | **By:** [Ryan Praskievicz](https://twitter.com/ryanpraski) 31 | 32 | -------------------------------------------------------------------------------- /python/data-science/time-series.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Resources for Time Series Prediction in Python 3 | --- 4 | 5 | # Time Series 6 | 7 | ## [AtsPy](https://github.com/firmai/atspy) 8 | 9 | Easily develop state-of-the-art time series models to forecast univariate data series. Simply load your data and select which models you want to test. This is the largest repository of automated structural and machine learning time series models. Please get in contact if you want to contribute a model. 10 | 11 | ### Install 12 | 13 | ```text 14 | pip install atspy 15 | ``` 16 | 17 | ### Automated Models 18 | 19 | 1. `ARIMA` - Automated ARIMA Modelling 20 | 2. `Prophet` - Modeling Multiple Seasonality With Linear or Non-linear Growth 21 | 3. `HWAAS` - Exponential Smoothing With Additive Trend and Additive Seasonality 22 | 4. `HWAMS` - Exponential Smoothing with Additive Trend and Multiplicative Seasonality 23 | 5. `NBEATS` - Neural basis expansion analysis \(now fixed at 20 Epochs\) 24 | 6. `Gluonts` - RNN-based Model \(now fixed at 20 Epochs\) 25 | 7. `TATS` - Seasonal and Trend no Box Cox 26 | 8. `TBAT` - Trend and Box Cox 27 | 9. `TBATS1` - Trend, Seasonal \(one\), and Box Cox 28 | 10. `TBATP1` - TBATS1 but Seasonal Inference is Hardcoded by Periodicity 29 | 11.
`TBATS2` - TBATS1 With Two Seasonal Periods 30 | 31 | {% hint style="info" %} 32 | See full details at: [https://github.com/firmai/atspy](https://github.com/firmai/atspy) 33 | {% endhint %} 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /python/data-science/machine-learning.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Machine Learning Resources for SEO 3 | --- 4 | 5 | # Machine Learning 6 | 7 | ## Pytorch 8 | 9 | ### [HuggingFace Transformers](https://github.com/huggingface/transformers) 10 | 11 | Transformers \(formerly known as `pytorch-transformers` and `pytorch-pretrained-bert`\) provides state-of-the-art general-purpose architectures \(BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet, CTRL...\) for Natural Language Understanding \(NLU\) and Natural Language Generation \(NLG\) with over 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch. 12 | 13 | {% hint style="info" %} 14 | See full details at: [https://github.com/huggingface/transformers](https://github.com/huggingface/transformers) 15 | {% endhint %} 16 | 17 | 18 | 19 | ## TensorFlow 20 | 21 | ### [Uber Ludwig](https://github.com/uber/ludwig) 22 | 23 | Ludwig is a toolbox built on top of TensorFlow that allows users to train and test deep learning models without the need to write code. 24 | 25 | All you need to provide is a CSV file containing your data, a list of columns to use as inputs, and a list of columns to use as outputs, Ludwig will do the rest. Simple commands can be used to train models both locally and in a distributed way, and to use them to predict new data. 26 | 27 | {% hint style="info" %} 28 | See full details at: [https://github.com/uber/ludwig](https://github.com/uber/ludwig) 29 | {% endhint %} 30 | 31 | 32 | 33 | 34 | 35 | ## Other 36 | 37 | 38 | 39 | 40 | 41 | -------------------------------------------------------------------------------- /r-stats/r-stats-code-examples/crawling.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Crawling. 3 | --- 4 | 5 | # Crawling 6 | 7 | ### Log File Analysis for SEO – Working with data visually 8 | 9 | **Link**: [https://canonicalized.com/log-file-analysis-seo/](https://canonicalized.com/log-file-analysis-seo/) 10 | 11 | **By**: [Dorian Banutoiu](https://twitter.com/canonicalizedco) on [Canonicalized](https://canonicalized.com/) 12 | 13 | 14 | 15 | ## Improve internal linking for SEO: Calculate Internal PageRank 16 | 17 | Determine what pages on your site might be seen as authoritative by search engines by computing your Internal PageRank. 
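The article below implements this in R. Purely as a cross-language illustration, the same idea in Python with networkx might look like the following sketch; the crawl-export file and its column names are assumptions.

```python
# Illustrative Python take on internal PageRank (the linked article uses R).
import pandas as pd
import networkx as nx

# Assumes a crawler export of internal links with "source"/"target" columns.
edges = pd.read_csv("internal_links.csv")
graph = nx.from_pandas_edgelist(
    edges, source="source", target="target", create_using=nx.DiGraph
)

# PageRank over the internal link graph; higher = more internally authoritative.
scores = nx.pagerank(graph)
for url, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:20]:
    print(f"{score:.5f}  {url}")
```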
18 | 19 | **Link:** [https://searchengineland.com/improve-internal-linking-calculate-internal-pagerank-r-246883](https://searchengineland.com/improve-internal-linking-calculate-internal-pagerank-r-246883) 20 | 21 | **By**: [Paul Shapiro](https://twitter.com/fighto) on [https://searchengineland.com/](https://searchengineland.com/) 22 | 23 | ## Crawling & metadata extraction with RCrawler 24 | 25 | How to crawl a website using the Rcrawler package. 26 | 27 | **Link:** [https://www.gokam.co.uk/seo-crawling-metadata-extraction-with-r-rcrawler/](https://www.gokam.co.uk/seo-crawling-metadata-extraction-with-r-rcrawler/) 28 | 29 | **By**: [François JOLY](https://twitter.com/tuf) 30 | 31 | ## Classic Packages for Web Crawling 32 | 33 | ### rvest 34 | 35 | Perfect for crawling a handful of URLs; easy to use CSS-style or XPath selectors. 36 | 37 | **Link:** [https://cran.r-project.org/web/packages/rvest/rvest.pdf](https://cran.r-project.org/web/packages/rvest/rvest.pdf) 38 | 39 | **By**: [Hadley Wickham](https://twitter.com/hadleywickham) 40 | 41 | ### Rcrawler 42 | 43 | Good for crawling a website and extracting metadata. 44 | 45 | **Link:** [https://cran.r-project.org/web/packages/Rcrawler/Rcrawler.pdf](https://cran.r-project.org/web/packages/Rcrawler/Rcrawler.pdf) 46 | 47 | **By**: [Salim Khalil](https://orcid.org/0000-0002-7804-4041) 48 | 49 | -------------------------------------------------------------------------------- /r-stats/r-stats-code-examples/query-management.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Technical SEO code focusing on Query Management. 3 | --- 4 | 5 | # Query Management 6 | 7 | ### SEO keyword research using searchConsoleR and googleAnalyticsR 8 | 9 | A method to estimate where to prioritise your SEO resources by estimating which keywords will give the greatest increase in revenue if you could improve their Google rank. 10 | 11 | **Link**: [https://code.markedmondson.me/search-console-google-analytics-r-keyword-research/](https://code.markedmondson.me/search-console-google-analytics-r-keyword-research/) 12 | 13 | **By**: [Mark Edmondson](https://twitter.com/holomarked) on [Coding in digital analytics](https://code.markedmondson.me/) 14 | 15 | 16 | 17 | ### R AND KEYWORD PLANNER 18 | 19 | Using R and Google's Keyword Planner to evaluate the size and competitiveness of international markets.
20 | 21 | **Link**: [https://keyword-hero.com/markets-go-using-r-googles-keyword-planner-evaluate-size-competitiveness-international-markets](https://keyword-hero.com/markets-go-using-r-googles-keyword-planner-evaluate-size-competitiveness-international-markets) 22 | 23 | **By:** Max on [Keyword Hero](https://keyword-hero.com/) 24 | 25 | ### Automate Google Search Console data downloads with R 26 | 27 | A guide on scheduling searchConsoleR to save your data to a database. 28 | 29 | **Link**: [https://www.rubenvezzoli.online/automate-google-search-console-downloads/](https://www.rubenvezzoli.online/automate-google-search-console-downloads/) 30 | 31 | **By:** [Ruben Vezzoli](https://twitter.com/rubenvezzoli/) on [https://www.rubenvezzoli.online/](https://www.rubenvezzoli.online/) 32 | 33 | ### How to enhance keyword exploration in R with the googleSuggestQueriesR package 34 | 35 | A guide on how to research similar search queries at scale with R: 36 | 37 | * broad keyword exploration 38 | * customizing recommendations 39 | * specific topic exploration 40 | 41 | **Link**: [https://leszeksieminski.me/r/keyword-exploration-googlesuggestqueriesr/](https://leszeksieminski.me/r/keyword-exploration-googlesuggestqueriesr/) 42 | 43 | **By:** [Leszek Siemiński](https://www.linkedin.com/in/leszek-sieminski/) 44 | 45 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: >- 3 | i.codeseo.dev is a repository of code and a reference library for Technical SEOs 4 | who are interested in development and data science. 5 | --- 6 | 7 | # Resources for Technical SEOs 8 | 9 | ## About 10 | 11 | We started this journey by wanting to create a Wikipedia page for "Technical Search Engine Optimization". What we learned is that there is an incredible amount of content about the subject, but it was very difficult to find authoritative sources \(or at least ones that would be viewed that way from outside the SEO world\). Most of the content was in article form and dealt with only certain aspects or only the key benefits of Technical SEO. 12 | 13 | The goals of this documentation are: 14 | 15 | * Create Wikipedia-style \(authoritatively sourced\) information about Technical SEO. 16 | * List open-source projects that members of the SEO community have contributed that include a description \(article\) as well as clean, documented code that others may use. 17 | * Provide documentation for popular workflows and code examples in Python and JavaScript \(and others?\) 18 | 19 | ## Contributing 20 | 21 | We have made a connected repo [here](https://github.com/jroakes/iCodeSEO) that is public. We strongly encourage pull requests to grow this resource. Also, with [Gitbook](https://www.gitbook.com/) \(the SaaS used for this documentation\), we may need to upgrade to a paid plan \($40/mo\) at some point. That includes 5 seats. We are looking for a couple of permanent contributors to own the Technical SEO and JavaScript sections. Knowledge of Git and Wikipedia-style writing is required. In addition, we would love to have sponsors. Please email me at [jroakes@gmail.com](mailto:jroakes@gmail.com) if you are interested. 22 | 23 | ## Inclusions of Code 24 | 25 | To be included in the Code Examples sections, the following conditions must be true: 26 | 27 | 1. The code must be explained, either in an article or well-documented in the source. 28 | 2.
The code must be related to a function of [Technical SEO](technical-seo/overview/learning-center/1.-what-is-technical-seo.md). 29 | 3. The code should be novel or not commonly available already. 30 | 4. The code should be easily repeatable by others, with reasonable cost or work. 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | ![Hosting Sponsor for i.CodeSEO.dev](.gitbook/assets/contentking-logo.png) 39 | 40 | {% hint style="info" %} 41 | For more information visit: [ContentKing](https://www.contentkingapp.com/) 42 | {% endhint %} 43 | 44 | 45 | 46 | -------------------------------------------------------------------------------- /about/contributing-content.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: How to contribute content to iCodeSEO.dev. 3 | --- 4 | 5 | # Contributing Content 6 | 7 | ### Overview 8 | 9 | This documentation is organized into the following areas: 10 | 11 | 1. **Technical SEO**: This is to be an organized, Wikipedia-style area covering the major informational areas of Technical SEO. We will go with the Russ Jones definition of Technical SEO and extend the realm into: Crawling, Rendering, Indexing, Data Science, and Machine Learning. Initial areas to construct are the first three. 12 | 2. **JavaScript**: Resources for Beginning with JavaScript, Code Examples, Chrome DevTools information. It would be good to separate Node and Google Apps Script code. 13 | 3. **Python**: Resources for Beginning with Python, Code Examples, Data Science Resources, and other helpful code snippets. 14 | 4. **R Stats**: Resources for Beginning with R Stats, Code Examples, etc. 15 | 16 | 17 | 18 | ### Working with Pages 19 | 20 | Here are a few points about pages: 21 | 22 | #### If the parent page is left blank, it will include a section listing of child pages. 23 | 24 | ![](../.gitbook/assets/image%20%284%29%20%281%29.png) 25 | 26 | 27 | 28 | #### Key page information should be filled in just like a normal web page 29 | 30 | ![](../.gitbook/assets/image%20%282%29.png) 31 | 32 | #### Use the available content types to make the page as clear as possible. 33 | 34 | With this said, it is critical to keep formatting consistent between sections. For a specific content type, if a convention is used in one section, it should be used in all. Users should expect the same experience and understanding of where to find things across sections. 35 | 36 | ![](../.gitbook/assets/image%20%284%29.png) 37 | 38 | 39 | 40 | ### Training Resources 41 | 42 | **<Language> Overview** sections are areas of training for the language. This should include information on installation, libraries, etc. Major thematic areas within a language can be broken out into separate sections. In many cases it is a good idea to link to really well-done resources online covering different aspects of a language. If we are referencing a resource online, below is the correct format: 43 | 44 | ![](../.gitbook/assets/image%20%281%29.png) 45 | 46 | Where the resources online are narrowly relevant to Technical SEO needs, we should provide content tailored to typical workflows. 47 | 48 | 49 | 50 | ### Code Examples 51 | 52 | All code examples sections should be organized into consistent major thematic areas of Technical SEO.
Currently: 53 | 54 | * Knowledge Graph 55 | * Page Speed 56 | * Query Management 57 | * Crawling 58 | * Rendering 59 | * Indexing 60 | 61 | #### Format for code examples: 62 | 63 | ![](../.gitbook/assets/image%20%283%29.png) 64 | 65 | 66 | 67 | -------------------------------------------------------------------------------- /SUMMARY.md: -------------------------------------------------------------------------------- 1 | # Table of contents 2 | 3 | * [Resources for Technical SEOs](README.md) 4 | 5 | ## Technical SEO 6 | 7 | * [Technical SEO Overview](technical-seo/overview/README.md) 8 | * [Learning Center](technical-seo/overview/learning-center/README.md) 9 | * [1. What is Technical SEO?](technical-seo/overview/learning-center/1.-what-is-technical-seo.md) 10 | * [2. Crawling](technical-seo/overview/learning-center/2.-crawling.md) 11 | * [3. Rendering](technical-seo/overview/learning-center/3.-rendering.md) 12 | 13 | ## Javascript 14 | 15 | * [JavaScript Overview](javascript/overview/README.md) 16 | * [Installation](javascript/overview/installation.md) 17 | * [NPM](javascript/overview/npm.md) 18 | * [Chrome DevTools](javascript/chrome-devtools/README.md) 19 | * [Google Chrome SEO Without A Plugin](javascript/chrome-devtools/chrome-seo-without-a-plugin.md) 20 | * [JavaScript Code Examples](javascript/javascript-code-examples/README.md) 21 | * [Page Speed](javascript/javascript-code-examples/page-speed.md) 22 | * [Indexing](javascript/javascript-code-examples/indexing.md) 23 | 24 | ## Python 25 | 26 | * [Python Overview](python/overview/README.md) 27 | * [Installation](python/overview/installation.md) 28 | * [Libraries](python/overview/libraries/README.md) 29 | * [Pandas](python/overview/libraries/pandas.md) 30 | * [Python Code Examples](python/code-examples/README.md) 31 | * [Reporting](python/code-examples/reporting.md) 32 | * [Data Extraction](python/code-examples/data-extraction.md) 33 | * [Knowledge Graph](python/code-examples/knowledge-graph.md) 34 | * [Page Speed](python/code-examples/pagespeed.md) 35 | * [Query Management](python/code-examples/query-management.md) 36 | * [Crawling](python/code-examples/crawling.md) 37 | * [Rendering](python/code-examples/rendering.md) 38 | * [Data Science](python/data-science/README.md) 39 | * [Machine Learning](python/data-science/machine-learning.md) 40 | * [Time Series](python/data-science/time-series.md) 41 | * [Helpful Code](python/helpful-code.md) 42 | 43 | ## R Stats 44 | 45 | * [R Stats Overview](r-stats/r-stats/README.md) 46 | * [Intro & Installation](r-stats/r-stats/intro-and-installation.md) 47 | * [R Stats Code Examples](r-stats/r-stats-code-examples/README.md) 48 | * [Crawling](r-stats/r-stats-code-examples/crawling.md) 49 | * [Query Management](r-stats/r-stats-code-examples/query-management.md) 50 | * [APIs](r-stats/r-stats-code-examples/knowledge-graph.md) 51 | 52 | ## SEO Datasets 53 | 54 | * [SERP Datasets](seo-datasets-1/serp-datasets.md) 55 | * [Crawl Datasets](seo-datasets-1/crawl-datasets.md) 56 | * [Sitemap Datasets](seo-datasets-1/sitemap-datasets.md) 57 | * [APIs documentation](seo-datasets-1/apis.md) 58 | 59 | ## About 60 | 61 | * [Contributing Content](about/contributing-content.md) 62 | 63 | -------------------------------------------------------------------------------- /seo-datasets-1/serp-datasets.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: Datasets for SEO dealing with search engine results pages (SERPs) 3 | --- 4 | 5 | # SERP Datasets 6 | 7 | ### Flights 
and airline ticket SERPs 8 | 9 | An export of Google SERPs for `flights to ` and `tickets to ` for the top 100 travel destinations \(cities\). 10 | 100 destinations x 2 countries x 2 keyword variations x 10 results each = 4,000 rows. 11 | The dataset is updated once every two weeks to check progress and changes. 12 | 13 | | Author | Article | Source | Notebook | 14 | | :--- | :--- | :--- | :--- | 15 | | [Elias Dabbas](https://github.com/eliasdabbas/) | [Tutorial on SEMrush](https://www.semrush.com/blog/analyzing-search-engine-results-pages/) | [Dataset on Kaggle](https://www.kaggle.com/eliasdabbas/search-engine-results-flights-tickets-keywords) | [Binder link](https://mybinder.org/v2/gh/eliasdabbas/SEMRush_serp_tutorial/master?urlpath=lab/tree/semrush_serp_analysis.ipynb) | 16 | 17 | ### Cars Google SERPs 18 | 19 | Google SERPs for ` for sale` and ` price` keywords for fifty of the top cars, and for two countries. 20 | 21 | | Author | Article | Source | Notebook | 22 | | :--- | :--- | :--- | :--- | 23 | | [Elias Dabbas](https://github.com/eliasdabbas) | | [Dataset on Kaggle](https://www.kaggle.com/eliasdabbas/google-search-results-pages-used-cars-us) | [Kaggle Tutorial](https://www.kaggle.com/eliasdabbas/search-engine-results-pages-serps-research) | 24 | 25 | ### 2017 Local SEO Ranking Factors Data 26 | 27 | ~150k rows of data \(each row representing a different business listing on Google My Business\). That data was scraped by Places Scout and joined with a bunch of their own data, as well as link API data from Ahrefs and Majestic. All in all there are ~150 data points per listing/business. 28 | 29 | | Author | Article | Source | 30 | | :--- | :--- | :--- | 31 | | [Dan Leibson](https://twitter.com/DanLeibson) | [Link](https://www.localseoguide.com/open-sourcing-2017-local-seo-ranking-factors-data/) | [Dataset on Google Drive](https://drive.google.com/file/d/12vCCNOs_HrLOK4VC2fpeD_eYFIB3pIs7/view?usp=sharing) | 32 | 33 | ### Recipes of Popular Dishes: YouTube and Google SERPs 34 | 35 | 243 national dishes, with two keyword variations each: ` recipe` and `how to make `. 36 | YouTube provides a much richer dataset than Google, as it contains video and channel statistics to provide context and metadata for the videos \(and channels\). 37 | Both Google and YouTube SERPs are provided for the same keywords. 38 | 39 | 40 | | Author | Article | Source | Notebook | 41 | | :--- | :--- | :--- | :--- | 42 | | [Elias Dabbas](https://github.com/eliasdabbas) | | [Dataset on Kaggle](https://www.kaggle.com/eliasdabbas/recipes-search-engine-results-data) | [Kaggle Tutorial](https://www.kaggle.com/eliasdabbas/recipes-keywords-ranking-on-google-and-youtube) | 43 | 44 | -------------------------------------------------------------------------------- /technical-seo/overview/learning-center/3.-rendering.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: >- 3 | Rendering is the process of taking crawled content and using a WRS \(Web 4 | Rendering Service\) to build the DOM from the HTML and embedded assets like 5 | JavaScript. 6 | --- 7 | 8 | # 3. Rendering 9 | 10 | ![Source: https://web.dev/javascript-and-google-search-io-2019/](../../../.gitbook/assets/rendering-pipeline.png) 11 | 12 | Key items we know about Google's Web Rendering Service \(WRS\): 13 | 14 | * It uses a near-up-to-date version of Google Chrome. 15 | * Google doesn't use cookies \(they do seem to persist cookies across 30X requests\). 16 | * Service workers are not allowed by Googlebot. 17 | * Google overrides random number functions to ensure a predictable state. 18 | * The Live Test \(Fetch and Render\) in Google Search Console is different from Googlebot. The Live Test is time-sensitive; Googlebot is not. 19 | * Googlebot will try to wait until there is no longer network activity from the headless browser. 20 | * The date and time of crawling and indexing may differ. Google may index what they have if the crawl and render fails. 21 | * Googlebot does not need to paint pixels as the service runs Chromium headless, so it omits that step of the render process. 22 | 23 | Source: [Martin Splitt's talk at TechSEO Boost, 2019](https://www.catalystdigital.com/techseoboost-livestream-2019/).
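One practical consequence of the pipeline above is worth internalizing: content that only appears after JavaScript runs depends entirely on the WRS to be seen. As a small illustrative check (not from the sources above), you can look at what is present in the raw, unrendered HTML; the URL and target phrase here are placeholders.

```python
# Does important content appear in the *raw* HTML, before any JavaScript runs?
# Pages that fail this check rely on the WRS to expose that content to Google.
import requests
from bs4 import BeautifulSoup

url = "https://example.com/"             # placeholder URL
needle = "Some phrase from the page"     # content you expect to be indexable

html = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}).text
soup = BeautifulSoup(html, "html.parser")

print("title:", soup.title.string if soup.title else None)
print("phrase in raw HTML:", needle in soup.get_text())
```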
24 | 25 | ## Recent Articles on JavaScript Rendering 26 | 27 | Google's Martin Splitt has spent a tremendous amount of time over the last year educating developers and SEOs on the limitations of Google's ability to render JavaScript websites. He is represented in many of the articles below as the authoritative source. 28 | 29 | 30 | 31 | ### Understand the JavaScript SEO basics 32 | 33 | JavaScript is an important part of the web platform because it provides many features that turn the web into a powerful application platform. Making your JavaScript-powered web applications discoverable via Google Search can help you find new users and re-engage existing users as they search for the content your web app provides. 34 | 35 | **Link**: [https://developers.google.com/search/docs/guides/javascript-seo-basics](https://developers.google.com/search/docs/guides/javascript-seo-basics) 36 | 37 | 38 | 39 | ### Fix Search-related JavaScript problems 40 | 41 | This guide helps you identify and fix JavaScript issues that may be blocking your page, or specific content on JavaScript-powered pages, from showing up in Google Search. While Googlebot does run JavaScript, there are some differences and limitations that you need to account for when designing your pages and applications to accommodate how crawlers access and render your content. 42 | 43 | **Link**: [https://developers.google.com/search/docs/guides/fix-search-javascript](https://developers.google.com/search/docs/guides/fix-search-javascript) 44 | 45 | 46 | 47 | ### Making JavaScript and Google Search work together 48 | 49 | We introduced a new Googlebot at Google I/O and took the opportunity to discuss improvements and best practices for making JavaScript web apps work well with Google Search. 50 | 51 | **Link**: [https://web.dev/javascript-and-google-search-io-2019/](https://web.dev/javascript-and-google-search-io-2019/) 52 | 53 | **By**: [Martin Splitt](https://twitter.com/g33konaut) and [Lizzi Harvey](https://twitter.com/HarveyLizzi) 54 | 55 | 56 | 57 | ### Making JavaScript Work for Search with Martin Splitt & Ashley Berman Hale 58 | 59 | Read our recap of the Q&A webinar we hosted with Google's Martin Splitt on how to make JavaScript-powered websites discoverable and indexable in search. 60 | 61 | **Link**: [https://www.deepcrawl.com/blog/webinars/making-javascript-work-for-search-martin-splitt/](https://www.deepcrawl.com/blog/webinars/making-javascript-work-for-search-martin-splitt/) 62 | 63 | **By**: [Rachel Costello](https://twitter.com/rachellcostello) 64 | 65 | 66 | 67 | -------------------------------------------------------------------------------- /python/overview/libraries/pandas.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: >- 3 | Parse Excel and CSV files quickly and easily with the pandas library for 4 | Python. 5 | --- 6 | 7 | # Pandas 8 | 9 | ## Overview 10 | 11 | **pandas** is a [Python](https://www.python.org) package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, **real world** data analysis in Python. Additionally, it has the broader goal of becoming **the most powerful and flexible open source data analysis / manipulation tool available in any language**. It is already well on its way toward this goal. \([source](https://pandas.pydata.org/pandas-docs/stable/getting_started/overview.html)\) 12 | 13 | ## Installing Pandas 14 | 15 | Using Anaconda: 16 | 17 | ```bash 18 | $ conda install pandas 19 | ``` 20 | 21 | Using PyPI: 22 | 23 | ```bash 24 | $ pip install pandas 25 | ``` 26 | 27 | {% hint style="info" %} 28 | Full details found here: [https://pandas.pydata.org/pandas-docs/stable/getting\_started/install.html](https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html) 29 | {% endhint %} 30 | 31 | ## Reading Data 32 | 33 | Pandas comes with functionality to read data from many file types, including: 34 | 35 | * CSV 36 | * Excel 37 | * HTML \(tables\) 38 | * SQL 39 | * BigQuery 40 | * etc. 41 | 42 | For most file-based reading, Pandas will accept as the first argument a local file \(e.g. `pd.read_csv('data.csv')`\), or a web URL \(e.g. `pd.read_csv('https://domain.com/data.csv')`\). 43 | 44 | ### CSV 45 | 46 | For CSVs, Pandas handles the datatype setting for you. The most common named parameter sent with the `read_csv` method is `skiprows=1`, for when the inputted CSV has non-heading rows prior to the header row. 47 | 48 | {% hint style="info" %} 49 | Pandas API Reference for read\_csv function: [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read\_csv.html](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) 50 | {% endhint %} 51 | 52 | ```python 53 | import pandas as pd 54 | df = pd.read_csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/airline-safety/airline-safety.csv') 55 | df.head() 56 | ``` 57 | 58 | | | airline | avail\_seat\_km\_per\_week | incidents\_85\_99 | fatal\_accidents\_85\_99 | fatalities\_85\_99 | incidents\_00\_14 | fatal\_accidents\_00\_14 | fatalities\_00\_14 | 59 | | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | 60 | | 0 | Aer Lingus | 320906734 | 2 | 0 | 0 | 0 | 0 | 0 | 61 | | 1 | Aeroflot\* | 1197672318 | 76 | 14 | 128 | 6 | 1 | 88 | 62 | | 2 | Aerolineas Argentinas | 385803648 | 6 | 0 | 0 | 1 | 0 | 0 | 63 | | 3 | Aeromexico\* | 596871813 | 3 | 1 | 64 | 5 | 0 | 0 | 64 | | 4 | Air Canada | 1865253802 | 2 | 0 | 0 | 2 | 0 | 0 | 65 | 66 | ### Excel 67 | 68 | For me, I use the `read_excel` method much less frequently than `read_csv`, mostly due to habit and the assumption that the data is cleaner with a CSV.
Common named parameters would be `sheet_name="Sheet1"` or `skiprows=[1]`. 69 | 70 | {% hint style="info" %} 71 | Pandas API Reference for read\_excel function: [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read\_excel.html](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html) 72 | {% endhint %} 73 | 74 | ```python 75 | import pandas as pd 76 | df = pd.read_excel('https://www2.census.gov/programs-surveys/popest/tables/2010-2019/state/totals/nst-est2019-01.xlsx', skiprows=3) 77 | df.head() 78 | ``` 79 | 80 | | | Unnamed: 0 | Census | Estimates Base | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 81 | | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | 82 | | 0 | United States | 308745538.0 | 308758105.0 | 309321666.0 | 311556874.0 | 313830990.0 | 315993715.0 | 318301008.0 | 320635163.0 | 322941311.0 | 324985539.0 | 326687501.0 | 328239523.0 | 83 | | 1 | Northeast | 55317240.0 | 55318443.0 | 55380134.0 | 55604223.0 | 55775216.0 | 55901806.0 | 56006011.0 | 56034684.0 | 56042330.0 | 56059240.0 | 56046620.0 | 55982803.0 | 84 | | 2 | Midwest | 66927001.0 | 66929725.0 | 66974416.0 | 67157800.0 | 67336743.0 | 67560379.0 | 67745167.0 | 67860583.0 | 67987540.0 | 68126781.0 | 68236628.0 | 68329004.0 | 85 | | 3 | South | 114555744.0 | 114563030.0 | 114866680.0 | 116006522.0 | 117241208.0 | 118364400.0 | 119624037.0 | 120997341.0 | 122351760.0 | 123542189.0 | 124569433.0 | 125580448.0 | 86 | | 4 | West | 71945553.0 | 71946907.0 | 72100436.0 | 72788329.0 | 73477823.0 | 74167130.0 | 74925793.0 | 75742555.0 | 76559681.0 | 77257329.0 | 77834820.0 | 78347268.0 | 87 | 88 | ### HTML Tables 89 | 90 | This is a very handy function for when there is a table on a web page that you want to grab the data from. 91 | 92 | {% hint style="info" %} 93 | Pandas API Reference for read\_html function: [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read\_html.html](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_html.html) 94 | {% endhint %} 95 | 96 | ```python 97 | import pandas as pd 98 | df = pd.read_html('https://countrycode.org/', attrs = {'class': 'main-table'}) 99 | df[0].head() 100 | ``` 101 | 102 | | | COUNTRY | COUNTRY CODE | ISO CODES | POPULATION | AREA KM2 | GDP $USD | 103 | | :--- | :--- | :--- | :--- | :--- | :--- | :--- | 104 | | 0 | Afghanistan | 93 | AF / AFG | 29121286 | 647500 | 20.65 Billion | 105 | | 1 | Albania | 355 | AL / ALB | 2986952 | 28748 | 12.8 Billion | 106 | | 2 | Algeria | 213 | DZ / DZA | 34586184 | 2381740 | 215.7 Billion | 107 | | 3 | American Samoa | 1-684 | AS / ASM | 57881 | 199 | 462.2 Million | 108 | | 4 | Andorra | 376 | AD / AND | 84000 | 468 | 4.8 Billion | 109 | 110 | 111 | 112 |
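The same ease applies in the other direction: once a table is loaded, pandas can write it back out for use in spreadsheets or other tools. An illustrative round trip (the output file names are arbitrary):

```python
# Round-trip example: read a table from the web, write it out locally.
import pandas as pd

df = pd.read_html("https://countrycode.org/", attrs={"class": "main-table"})[0]
df.to_csv("country_codes.csv", index=False)     # plain CSV
df.to_excel("country_codes.xlsx", index=False)  # requires the openpyxl package
```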
111 | 
112 | 
--------------------------------------------------------------------------------
/technical-seo/overview/learning-center/2.-crawling.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: >-
3 |   Crawling is how search engines discover new links and process the pages they
4 |   find on the internet. Here we provide a good overview based only on reliable
5 |   sources.
6 | ---
7 | 
8 | # 2. Crawling
9 | 
10 | Google's Search Index contains [hundreds of billions of webpages and is over 100,000,000 gigabytes \(100,000 terabytes\) in size](https://www.google.com/search/howsearchworks/crawling-indexing/). In order to feed that index, Google uses [Googlebot](https://support.google.com/webmasters/answer/182072?hl=en), the generic name for Google's web crawling infrastructure, to search for new pages and add them to its list of known pages. Bing uses [Bingbot](https://www.bing.com/webmaster/help/which-crawlers-does-bing-use-8c184ec0) as its standard crawler. Both Google and Bing use desktop and mobile User-Agents to crawl the web.
11 | 
12 | The web has grown into an enormous resource of available information. It is important to note that Google \(and Bing\) only know about a portion of the web, called the **surface web**. The surface web is the portion that is publicly accessible to crawlers via web pages. Other parts of the web are the **deep web** and the **hidden web**. The deep web is estimated to be 4,000 to 5,000 times larger than the surface web. This [article by the University of Washington](https://guides.lib.uw.edu/c.php?g=342031&p=2300191) details the differences.
13 | 
14 | Google uses a distributed crawling infrastructure that spreads load over many machines. It is also an [incremental crawler](https://www.ijarcce.com/upload/2016/january-16/IJARCCE%2052.pdf), in that it continually refreshes its collection of pages with up-to-date versions based on the perceived importance of each page to users. Crawling is distinct from rendering, although both processes are often attributed to Googlebot.
15 | 
16 | 
17 | 
18 | ### Crawl Budget
19 | 
20 | In 2017, Google provided [some guidance](https://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html) on how to think about crawl budget. "Crawl budget" is a term coined by SEOs for the amount of crawling resources Google will allocate to a given website. It is a bit of a misnomer in that several distinct areas can affect the rate at which Google crawls a website's URLs.
21 | 
22 | > First, we'd like to emphasize that crawl budget, as described below, is not something most publishers have to worry about. If new pages tend to be crawled the same day they're published, crawl budget is not something webmasters need to focus on. Likewise, if a site has fewer than a few thousand URLs, most of the time it will be crawled efficiently. \([source](https://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html)\)
23 | 
24 | Essentially, the goal of Technical SEOs is to ensure that, for larger websites, Google spends its available resources crawling and rendering the pages that are important from a search perspective, and little to no time crawling non-valuable content. Google indicates the following categories as non-value-add URLs \(in order of significance\):
25 | 
26 | * [Faceted navigation](https://webmasters.googleblog.com/2014/02/faceted-navigation-best-and-5-of-worst.html) and [session identifiers](https://webmasters.googleblog.com/2007/09/google-duplicate-content-caused-by-url.html)
27 | * [On-site duplicate content](https://webmasters.googleblog.com/2007/09/google-duplicate-content-caused-by-url.html)
28 | * [Soft error pages](https://webmasters.googleblog.com/2010/06/crawl-errors-now-reports-soft-404s.html)
29 | * Hacked pages
30 | * [Infinite spaces](https://webmasters.googleblog.com/2008/08/to-infinity-and-beyond-no.html) and proxies
31 | * Low quality and spam content
32 | 
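One of the crawl-rate factors listed below, robots.txt blocking, is easy to check programmatically. A minimal sketch using Python's standard library (the URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Placeholder site; point this at a real robots.txt to test.
rp = RobotFileParser()
rp.set_url('https://www.example.com/robots.txt')
rp.read()

for url in ['https://www.example.com/', 'https://www.example.com/search?q=shoes']:
    print(url, '->', 'allowed' if rp.can_fetch('Googlebot', url) else 'blocked')
```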
33 | Major areas that can affect the rate at which Google crawls your URLs are:
34 | 
35 | * Blocked content in robots.txt
36 | * Server crawl health. Google tries to be a good citizen and is especially tuned to feedback from your server, such as slowing response times and server errors.
37 | * [Crawl rate limits set in Google Search Console](https://support.google.com/webmasters/answer/48620).
38 | * Site-wide events like site moves or major content changes.
39 | * Popularity of your URLs with users.
40 | * Staleness of the URLs in Google's index. Google tries to keep its index fresh.
41 | * Reliance on many embedded assets to render pages, like JS, CSS, and XHR calls.
42 | * The category of website \(e.g. news, or other frequently changing unique content\).
43 | 
44 | 
45 | 
46 | ## Cloaking
47 | 
48 | {% hint style="info" %}
49 | Google defines cloaking as the practice of presenting different content or URLs to human users and search engines. Cloaking is considered a violation of Google’s [Webmaster Guidelines](https://support.google.com/webmasters/answer/answer.py?answer=35769) because it provides our users with different results than they expected. \([source](https://support.google.com/webmasters/answer/66355?hl=en)\)
50 | {% endhint %}
51 | 
52 | In 2016, Google trained a classification model that was able to accurately detect cloaking 95.5% of the time with a false positive rate of 0.9%. JavaScript redirection was one of the strongest features predictive of a positive classification. \([source](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45365.pdf)\)
53 | 
54 | ## Research Articles on Crawling
55 | 
56 | * [http://www.ijera.com/papers/vol8no11/p1/A0811010108.pdf](http://www.ijera.com/papers/vol8no11/p1/A0811010108.pdf)
57 | * [https://www.ijarcce.com/upload/2016/january-16/IJARCCE%2052.pdf](https://www.ijarcce.com/upload/2016/january-16/IJARCCE%2052.pdf)
58 | * [http://infolab.stanford.edu/~olston/publications/crawling\_survey.pdf](http://infolab.stanford.edu/~olston/publications/crawling_survey.pdf)
59 | * [https://web.stanford.edu/class/cs276/handouts/lecture16-crawling-jun4-6per.pdf](https://web.stanford.edu/class/cs276/handouts/lecture16-crawling-jun4-6per.pdf)
60 | * [http://web.eecs.umich.edu/~mihalcea/498IR/Lectures/WebCrawling.pdf](http://web.eecs.umich.edu/~mihalcea/498IR/Lectures/WebCrawling.pdf)
61 | 
62 | 
--------------------------------------------------------------------------------
/javascript/chrome-devtools/README.md:
--------------------------------------------------------------------------------
1 | # Chrome DevTools
2 | 
3 | {% hint style="danger" %}
4 | This page is a mirror of [https://developers.google.com/web/tools/chrome-devtools](https://developers.google.com/web/tools/chrome-devtools) and needs to be updated.
5 | {% endhint %}
6 | 
7 | ## Chrome DevTools
8 | 
9 | Chrome DevTools is a set of web developer tools built directly into the [Google Chrome](https://www.google.com/chrome/) browser. DevTools can help you edit pages on-the-fly and diagnose problems quickly, which ultimately helps you build better websites, faster.
10 | 
11 | Check out the video for live demonstrations of core DevTools workflows, including debugging CSS, prototyping CSS, debugging JavaScript, and analyzing load performance.
12 | 
13 | ### Open DevTools
14 | 
15 | There are many ways to open DevTools, because different users want quick access to different parts of the DevTools UI.
16 | 17 | * When you want to work with the DOM or CSS, right-click an element on the page and select **Inspect** to jump into the **Elements** panel. Or press Command+Option+C \(Mac\) or Control+Shift+C \(Windows, Linux, Chrome OS\). 18 | * When you want to see logged messages or run JavaScript, press Command+Option+J \(Mac\) or Control+Shift+J \(Windows, Linux, Chrome OS\) to jump straight into the **Console** panel. 19 | 20 | See [Open Chrome DevTools](./) for more details and workflows. 21 | 22 | ### Get started 23 | 24 | If you're a more experienced web developer, here are the recommended starting points for learning how DevTools can improve your productivity: 25 | 26 | * [View and Change the DOM](https://developers.google.com/web/tools/chrome-devtools/dom) 27 | * [View and Change a Page's Styles \(CSS\)](./) 28 | * [Debug JavaScript](https://developers.google.com/web/tools/chrome-devtools/javascript) 29 | * [View Messages and Run JavaScript in the Console](https://developers.google.com/web/tools/chrome-devtools/console/get-started) 30 | * [Optimize Website Speed](https://developers.google.com/web/tools/chrome-devtools/speed/get-started) 31 | * [Inspect Network Activity](./) 32 | 33 | ### Discover DevTools 34 | 35 | The DevTools UI can be a little overwhelming... there are so many tabs! But, if you take some time to get familiar with each tab to understand what's possible, you may discover that DevTools can seriously boost your productivity. 36 | 37 | #### Device Mode 38 | 39 | Simulate mobile devices. 40 | 41 | * [Device Mode](https://developers.google.com/web/tools/chrome-devtools/device-mode) 42 | * [Test Responsive and Device-specific Viewports](https://developers.google.com/web/tools/chrome-devtools/device-mode/emulate-mobile-viewports) 43 | * [Emulate Sensors: Geolocation & Accelerometer](https://developers.google.com/web/tools/chrome-devtools/device-mode/device-input-and-sensors) 44 | 45 | #### Elements panel 46 | 47 | View and change the DOM and CSS. 48 | 49 | * [Get Started With Viewing And Changing The DOM](https://developers.google.com/web/tools/chrome-devtools/dom) 50 | * [Get Started With Viewing And Changing CSS](./) 51 | * [Inspect and Tweak Your Pages](https://developers.google.com/web/tools/chrome-devtools/inspect-styles) 52 | * [Edit Styles](./) 53 | * [Edit the DOM](https://developers.google.com/web/tools/chrome-devtools/inspect-styles/edit-dom) 54 | * [Inspect Animations](./) 55 | * [Find Unused CSS](./) 56 | 57 | #### Console panel 58 | 59 | View messages and run JavaScript from the Console. 60 | 61 | * [Get Started With The Console](https://developers.google.com/web/tools/chrome-devtools/console/get-started) 62 | * [Using the Console](./) 63 | * [Interact from Command Line](https://developers.google.com/web/tools/chrome-devtools/console/command-line-reference) 64 | * [Console API Reference](https://developers.google.com/web/tools/chrome-devtools/console/console-reference) 65 | 66 | #### Sources panel 67 | 68 | Debug JavaScript, persist changes made in DevTools across page reloads, save and run snippets of JavaScript, and save changes that you make in DevTools to disk. 
69 | 70 | * [Get Started With Debugging JavaScript](https://developers.google.com/web/tools/chrome-devtools/javascript) 71 | * [Pause Your Code With Breakpoints](https://developers.google.com/web/tools/chrome-devtools/javascript/breakpoints) 72 | * [Save Changes to Disk with Workspaces](https://developers.google.com/web/tools/setup/setup-workflow) 73 | * [Run Snippets Of Code From Any Page](https://developers.google.com/web/tools/chrome-devtools/snippets) 74 | * [JavaScript Debugging Reference](https://developers.google.com/web/tools/chrome-devtools/javascript/reference) 75 | * [Persist Changes Across Page Reloads with Local Overrides](https://developers.google.com/web/updates/2018/01/devtools#overrides) 76 | * [Find Unused JavaScript](./) 77 | 78 | #### Network panel 79 | 80 | View and debug network activity. 81 | 82 | * [Get Started](https://developers.google.com/web/tools/chrome-devtools/network-performance) 83 | * [Network Issues Guide](https://developers.google.com/web/tools/chrome-devtools/network-performance/issues) 84 | * [Network Panel Reference](https://developers.google.com/web/tools/chrome-devtools/network-performance/reference) 85 | 86 | #### Performance panel 87 | 88 | Find ways to improve load and runtime performance. 89 | 90 | * [Optimize Website Speed](https://developers.google.com/web/tools/chrome-devtools/speed/get-started) 91 | * [Get Started With Analyzing Runtime Performance](https://developers.google.com/web/tools/chrome-devtools/evaluate-performance) 92 | * [Performance Analysis Reference](https://developers.google.com/web/tools/chrome-devtools/evaluate-performance/reference) 93 | * [Analyze runtime performance](https://developers.google.com/web/tools/chrome-devtools/rendering-tools) 94 | * [Diagnose Forced Synchronous Layouts](https://developers.google.com/web/tools/chrome-devtools/rendering-tools/forced-synchronous-layouts) 95 | 96 | #### Memory panel 97 | 98 | Profile memory usage and track down leaks. 99 | 100 | * [Fix Memory Problems](https://developers.google.com/web/tools/chrome-devtools/memory-problems) 101 | * [JavaScript CPU Profiler](https://developers.google.com/web/tools/chrome-devtools/rendering-tools/js-execution) 102 | 103 | #### Application panel 104 | 105 | Inspect all resources that are loaded, including IndexedDB or Web SQL databases, local and session storage, cookies, Application Cache, images, fonts, and stylesheets. 106 | 107 | * [Debug Progressive Web Apps](https://developers.google.com/web/tools/chrome-devtools/progressive-web-apps) 108 | * [Inspect and Manage Storage, Databases, and Caches](https://developers.google.com/web/tools/chrome-devtools/manage-data/local-storage) 109 | * [Inspect and Delete Cookies](https://developers.google.com/web/tools/chrome-devtools/manage-data/cookies) 110 | * [Inspect Resources](https://developers.google.com/web/tools/chrome-devtools/manage-data/page-resources) 111 | 112 | #### Security panel 113 | 114 | Debug mixed content issues, certificate problems, and more. 115 | 116 | * [Understand Security Issues](https://developers.google.com/web/tools/chrome-devtools/security) 117 | 118 | The best place to file feature requests for Chrome DevTools is the mailing list. The team needs to understand use cases, gauge community interest, and discuss feasibility before implementing any new features. 
119 | 
120 | ## Additional tips
121 | 
122 | ### Comparing raw HTML and rendered HTML
123 | 
124 | Easily compare the raw HTML of a page with the rendered code using [this plugin](https://chrome.google.com/webstore/detail/view-rendered-source/ejgngohbdedoabanmclafpkoogegdpob).
125 | It also provides a _diff_ feature to see, line by line, what JavaScript has modified in the DOM.
126 | 
127 | 
--------------------------------------------------------------------------------
/javascript/chrome-devtools/chrome-seo-without-a-plugin.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: >-
3 |   This post tries to cover many SEO items that can be handled via plain vanilla
4 |   Google Chrome without any tools
5 | ---
6 | 
7 | # Google Chrome SEO Without A Plugin
8 | 
9 | There are numerous add-ons for Chrome that handle many SEO data gathering tasks, but those are not covered here. For those who are unsure about the security of those add-ons, or whether they are mining Litecoin in the background and eating up your CPU cycles, this post is for you. But moreover, it is just a reason to play around with Google searches, Chrome Developer Tools, and the Console.
10 | 
11 | ## Google Searches
12 | 
13 | In this section we cover a few common Google search operators used in SEO. Please let me know what I missed.
14 | 
15 | ### Canonical content check
16 | 
17 | info:{url}
18 | 
19 | ```text
20 | info:https://www.codeseo.io
21 | ```
22 | 
23 | ### On-page internal link candidates
24 | 
25 | site:{domain} {keyword}
26 | 
27 | ```text
28 | site:codeseo.io "googlebot"
29 | ```
30 | 
31 | Bonus Points: This is also the way to see whether the correct page is ranking for a particular query.
32 | 
33 | ### Pages that haven't moved to https
34 | 
35 | site:{domain} -inurl:https
36 | 
37 | ```text
38 | site:amazon.com -inurl:https
39 | ```
40 | 
41 | ### If your page is cached by Google
42 | 
43 | cache:{url}
44 | 
45 | ```text
46 | cache:https://codeseo.io
47 | ```
48 | 
49 | ### Useful parameters in Google searches
50 | 
51 | * **nearby={city}**: Filters search results to a nearby city.
52 | * **filter=0**: Removes personalization from Google results.
53 | * **num=100**: Displays 100 Google results.
54 | 
55 | Most of the above was provided by [Victor Pan](https://twitter.com/victorpan). Thanks for the idea for this post, Victor. Please follow him on Twitter.
56 | 
57 | ## The Console
58 | 
59 | In this section we cover using the Console in Developer Tools to get data from Google and your pages in Chrome.
60 | 
61 | ### Scrape Google links
62 | 
63 | ```text
64 | $$('h3 a').join('\n')
65 | ```
66 | 
67 | ### Scrape Google Images
68 | 
69 | ```text
70 | var imgs=$$('a'); var out = [];
71 | for (var i in imgs){
72 |     if(imgs[i].href.indexOf("/imgres?imgurl=http")>0){
73 |         out.push(decodeURIComponent(imgs[i].href).split(/=|%|&/)[1].split("?imgref")[0]);
74 |     }
75 | }
76 | out.join('\n')
77 | ```
78 | 
79 | Hat tip to Peter Nikolow. Read more at his blog [here](http://peter.nikolow.me/kak-da-smaknem-kartinki-ot-google-images/) \(Bulgarian\).
80 | 
81 | ### Count links on a page
82 | 
83 | ```text
84 | $$('a').length
85 | ```
86 | 
87 | ### See the page title
88 | 
89 | ```text
90 | document.title
91 | ```
92 | 
93 | Bonus Points:
94 | 
95 | ```text
96 | document.title.length
97 | ```
98 | 
99 | ### See the page description
100 | 
101 | ```text
102 | document.all.description.content
103 | ```
104 | 
105 | Bonus Points:
106 | 
107 | ```text
108 | document.all.description.content.length
109 | ```
110 | 
111 | ### See the robots meta
112 | 
113 | ```text
114 | document.all.robots.content
115 | ```
116 | 
117 | ### See the canonical
118 | 
119 | ```text
120 | $$('link[rel="canonical"]')[0]
121 | ```
122 | 
123 | ### Easter eggs in Google Search
124 | 
125 | ```text
126 | document.all['easter-egg']
127 | ```
128 | 
129 | ### Edit a page live
130 | 
131 | ```text
132 | document.designMode = "on"
133 | ```
134 | 
135 | ### Get Important N-Grams from Google search results
136 | 
137 | ```text
138 | var stopwords = [
139 |     'about', 'after', 'all', 'also', 'am', 'an', 'and', 'another', 'any', 'are', 'as', 'at', 'be',
140 |     'because', 'been', 'before', 'being', 'between', 'both', 'but', 'by', 'came', 'can',
141 |     'come', 'could', 'did', 'do', 'each', 'for', 'from', 'get', 'got', 'has', 'had',
142 |     'he', 'have', 'her', 'here', 'him', 'himself', 'his', 'how', 'if', 'in', 'into',
143 |     'is', 'it', 'like', 'make', 'many', 'me', 'might', 'more', 'most', 'much', 'must',
144 |     'my', 'never', 'now', 'of', 'on', 'only', 'or', 'other', 'our', 'out', 'over',
145 |     'said', 'same', 'see', 'should', 'since', 'some', 'still', 'such', 'take', 'than',
146 |     'that', 'the', 'their', 'them', 'then', 'there', 'these', 'they', 'this', 'those',
147 |     'through', 'to', 'too', 'under', 'up', 'very', 'was', 'way', 'we', 'well', 'were',
148 |     'what', 'where', 'which', 'while', 'who', 'with', 'would', 'you', 'your', 'a', 'i', 's']
149 | 
150 | function nGrams(sentence, limit) {
151 | 
152 |     var ns = [1,2,3,4]; var grams = {};
153 |     var words = sentence.replace(/(?:https?|ftp):\/\/[\n\S]+/g, '').toLowerCase().split(/\W+/).filter(function (value) {return stopwords.indexOf(value.toLowerCase()) === -1})
154 |     for (var n of ns){
155 |         var total = words.length - n;
156 |         for(var i = 0; i <= total; i++) {
157 |             var seq = '';
158 |             for (var j = i; j < i + n; j++) { seq += words[j] + ' ';}
159 |             if (seq.trim().length < 3) {continue;}else{seq = seq.trim()}
160 |             grams[seq] = seq in grams ? grams[seq] + 1 : 1;
161 |         }
162 |     }
163 |     var sort = Object.keys(grams).sort(function(a,b){return grams[b]-grams[a]});
164 |     for (var s of sort){ if (grams[s] < limit){break;} console.log(s, ':', grams[s]);}
165 | }
166 | 
167 | var gtext = document.all.search.innerText
168 | var ng = nGrams(gtext, 3)
169 | ```
170 | 
171 | ### Get Google Analytics info
172 | 
173 | ```text
174 | for (const [key, value] of Object.entries(ga.getAll()[0].b.data.values) ) {
175 |     if (typeof value === 'string'){
176 |         console.log('%s: %s', key.replace(':',''), value);
177 |     }
178 | }
179 | ```
180 | 
181 | Bonus Points: Check the hit count for your profiles
182 | 
183 | ```text
184 | gaData
185 | ```
186 | 
187 | ### See what Google is storing to the google object on search result pages
188 | 
189 | ```text
190 | for (var k of Object.keys(google)){
191 |     if (typeof google[k] !== 'function') {console.log(k,google[k])}
192 | }
193 | ```
194 | 
195 | ### Get load timings from PerformanceTiming
196 | 
197 | ```text
198 | var tAll = window.performance.timing; var t0 = tAll['navigationStart'];
199 | for (var t in tAll){
200 |     if (typeof tAll[t] === 'number' && (tAll[t] - t0) > 10){
201 |         console.log(t, ':', (tAll[t] - t0)/1000, 'secs')
202 |     }
203 | }
204 | ```
205 | 
206 | ## Developer Tools
207 | 
208 | This section covers some of the most useful Developer Tools tabs.
209 | 
210 | ### Security
211 | 
212 | In Developer Tools, look for the Security tab to ensure your page and its loaded resources are secure.
213 | 
214 | ### Audits
215 | 
216 | Use the built-in Lighthouse audits to test your webpage for:
217 | 
218 | * Speed
219 | * PWA implementation
220 | * Accessibility
221 | * Best Practices
222 | * SEO
223 | 
224 | You have to get the Chrome [canary build](https://www.google.com/chrome/browser/canary.html) for the SEO audit portion to be available through Developer Tools. Otherwise, use the Chrome plugin for Lighthouse.
225 | 
226 | ### Network
227 | 
228 | I most often find myself using the Network view for a couple of things. First, it is great for verifying redirect chaining, as long as you have "Preserve log" checked.
229 | 
230 | It is also great for looking at time-to-first-byte \(TTFB\), the time it took a server to respond to you, as well as content download timings for image assets, scripts, and other resources. There is also a nice filmstrip view in the latest canary version which quickly shows rendering progression over time.
231 | 
232 | ### Responsive
233 | 
234 | In responsive view in Developer Tools, it is easy to add additional devices for Googlebot using the user-agent strings found [here](https://support.google.com/webmasters/answer/1061943?hl=en) and the window sizes found [here](https://codeseo.io/console-log-hacking-for-googlebot/). This gives you the ability to browse as Googlebot and can uncover strange issues where a particular website either serves bots from different servers or handles those sessions differently. Generally, not a good thing.
235 | 
236 | ### Sensors
237 | 
238 | In addition to the above, you can use the More Tools > Sensors portion of Developer Tools to set the latitude and longitude your Google searches appear to come from. There are many services that give you the lat/lon information, but I generally rely on Google \([Google Maps Api](https://maps.googleapis.com/maps/api/geocode/json?address=boston+ma)\). This option has been glitchy for me in the past.
This tip was brought up by [Victor Pan](https://twitter.com/victorpan) and confirmed working by [Dan Hinckley](https://twitter.com/dhinckley).
239 | 
240 | 
241 | ### Shortcuts: Full Size Screenshots
242 | 
243 | This is a great tip from [Anthony Nelson](https://twitter.com/anthonydnelson).
244 | 
245 | > One of my fave Chrome tips: on Console tab, hit Command+Shift+P to bring up shortcuts. Then type in "screenshot" and select "Capture Full Size Screenshot" - super easy way to get full sized png of a long page.
246 | 
247 | ## More Amazing Chrome SEO Tips
248 | 
249 | * Aleyda Solis has a wonderful writeup on Search Engine Land: [Chrome’s DevTools for SEO: 10 ways to use these browser features for your SEO audits](https://searchengineland.com/chromes-devtools-seo-10-ways-use-seo-audits-266433)
250 | * Maria Cieślak goes into more depth with Chrome Developer Tools: [Overview of Chrome Developer Tools’ Most Useful Options for SEO](https://www.elephate.com/blog/chrome-developer-tools-overview/)
251 | 
252 | That's all I have. If you have anything to add, please comment below or hit me up on [Twitter](https://twitter.com/jroakes).
253 | 
--------------------------------------------------------------------------------
/python/helpful-code.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Helpful Python Code Snippets for Technical SEOs.
3 | ---
4 | 
5 | # Helpful Code
6 | 
7 | ## CTR Curves
8 | 
9 | Useful for gathering simple Share-of-Traffic information from ranking position. Data can be found at: [https://www.advancedwebranking.com/ctrstudy/](https://www.advancedwebranking.com/ctrstudy/). Most major tool providers have individualized CTR curves for keywords or keyword sets.
10 | 
11 | ```python
12 | def add_ctr(x):
13 | 
14 |     vals = {"1":0.2759,
15 |             "2":0.14415,
16 |             "3":0.09255,
17 |             "4":0.06265,
18 |             "5":0.0639,
19 |             "6":0.03285,
20 |             "7":0.02305,
21 |             "8":0.0173,
22 |             "9":0.0135,
23 |             "10":0.0108,
24 |             "11":0.00925,
25 |             "12":0.00925,
26 |             "13":0.0099,
27 |             "14":0.0095,
28 |             "15":0.0096,
29 |             "16":0.0097,
30 |             "17":0.01025,
31 |             "18":0.01125,
32 |             "19":0.01185,
33 |             "20":0.0129}
34 |     # Positions beyond 20 fall outside the study data, so treat them as ~0 CTR.
35 |     return vals.get(str(x), 0.0)
36 | ```
37 | 
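A hypothetical usage sketch, applying the curve to a small keyword report (the DataFrame and its column names are made up for the example):

```python
import pandas as pd

df = pd.DataFrame({
    'query': ['blue widgets', 'red widgets', 'widget reviews'],
    'position': [1, 4, 9],
    'search_volume': [12000, 5400, 880],
})

# Estimated traffic from rank alone, using add_ctr() defined above.
df['est_traffic'] = df['position'].apply(add_ctr) * df['search_volume']
print(df)
```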
38 | ## Random User-Agent Strings
39 | 
40 | Helpful for crawling pages that don't like the default Python Requests library User-Agent.
41 | 
42 | ```python
43 | import random
44 | 
45 | def getUA():
46 | 
47 |     uastrings = ["Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",
48 |                  "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36",
49 |                  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10) AppleWebKit/600.1.25 (KHTML, like Gecko) Version/8.0 Safari/600.1.25",
50 |                  "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0",
51 |                  "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",
52 |                  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",
53 |                  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.1.17 (KHTML, like Gecko) Version/7.1 Safari/537.85.10",
54 |                  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko",
55 |                  "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0",
56 |                  "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"]
57 | 
58 |     return random.choice(uastrings)
59 | ```
60 | 
61 | ## Pull Title and Description from URL
62 | 
63 | Simple example of using the requests and BeautifulSoup libraries to pull the `<title>` and `<meta name="description" content="..." />` from live URLs.
64 | 
65 | ```python
66 | from bs4 import BeautifulSoup
67 | import requests
68 | 
69 | # Use requests and BeautifulSoup to fetch the page title and meta description.
70 | def fetch_meta(url):
71 |     # Use a desktop Chrome User-Agent, since some servers block the default Requests UA.
72 |     headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36'}
73 |     result = {'title':'', 'description':'', 'error':''}
74 | 
75 |     try:
76 |         # Send a GET request to grab the page
77 |         response = requests.get(url, headers=headers, timeout=3)
78 | 
79 |         # Parse the response
80 |         soup = BeautifulSoup(response.text, 'html.parser')
81 | 
82 |         # Extract the title and meta description
83 |         result['title'] = soup.title.string
84 |         description_tag = soup.find('meta', attrs={'name':'description'})
85 |         if description_tag is not None:
86 |             result['description'] = description_tag.get('content')
87 | 
88 |     except Exception as e:
89 |         result['error'] = str(e)
90 | 
91 |     return result
92 | ```
93 | 
94 | Example:
95 | 
96 | ```python
97 | out = fetch_meta('https://i.codeseo.dev')
98 | out
99 | 
100 | [1] {'title': 'Development Resources for Technical SEOs - iCodeSEO',
101 |  'description': 'i.codeseo.dev is a repository of code and reference library for Technical SEOs who are interested in development and data science.',
102 |  'error': ''}
103 | ```
104 | 
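The two helpers above combine naturally. A minimal sketch that rotates the User-Agent on each request (the URL is a placeholder):

```python
import requests

# Placeholder URL; getUA() comes from the Random User-Agent Strings section above.
url = 'https://www.example.com/'
response = requests.get(url, headers={'User-Agent': getUA()}, timeout=3)
print(response.status_code, response.headers.get('Content-Type'))
```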
105 | 
106 | ## Import Multiple Google SERPs on a Large Scale
107 | 
108 | **Setup:**
109 | 
110 | 1. [Create a custom search engine](https://cse.google.com/cse/). At first, you might be asked to enter a site to search. Enter any domain, then go to the control panel and remove it. Make sure you enable "Search the entire web" and image search. You will also need to get your search engine ID, which you can find on the control panel page.
111 | 2. [Enable the custom search API](https://console.cloud.google.com/apis/library/customsearch.googleapis.com). The service will allow you to retrieve and display search results from your custom search engine programmatically. You will need to create a project for this first.
112 | 3. [Create credentials for this project](https://console.developers.google.com/apis/api/customsearch.googleapis.com/credentials) so you can get your key.
113 | 4. [Enable billing for your project](https://console.cloud.google.com/billing/projects) if you want to run more than 100 queries per day. The first 100 queries are free; then for each additional 1,000 queries, you pay USD $5.
114 | 
115 | ```bash
116 | $ pip install advertools
117 | ```
118 | 
119 | ```python
120 | import advertools as adv
121 | 
122 | cx = 'YOUR_GOOGLE_CUSTOM_SEARCH_ENGINE_ID'
123 | key = 'YOUR_GOOGLE_DEVELOPER_KEY'
124 | 
125 | serp = adv.serp_goog(cx=cx, key=key,
126 |                      q=['first query', 'second query', 'third query'],
127 |                      gl=['us', 'ca', 'uk'])
128 | 
129 | # Many other parameters and combinations are available; check the documentation for details.
130 | ```
131 | 
132 | ## Crawl a Website by Specifying a Sitemap URL
133 | 
134 | This spider will crawl the pages specified in one or more XML sitemaps and extract standard SEO elements \(title, h1, h2, page size, etc.\). The output of the crawl will be saved to a CSV file.
135 | 
136 | How to run:
137 | 
138 | 1. Modify the `sitemap_urls` attribute by adding one or more sitemap URLs. A sitemap index page is fine, and the spider will go through all sub-sitemaps. Or you can specify regular sitemaps.
139 | 2. \(optional\) Modify any elements that you want to scrape \(publishing date, author name, etc.\) through CSS or XPath selectors.
140 | 3. Save the file with any name, e.g. `my_spider.py`.
141 | 4. From your terminal, run the following line:
142 | 
143 | `scrapy runspider path/to/my_spider.py -o path/to/output/file.csv`
144 | 
145 | where `path/to/output/file.csv` is where you want the scrape result to be saved.
146 | 
147 | ```python
148 | from scrapy.spiders import SitemapSpider
149 | 
150 | user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'
151 | 
152 | 
153 | class SEOSitemapSpider(SitemapSpider):
154 |     name = 'seo_sitemap_spider'
155 |     sitemap_urls = [
156 |         'https://www.example.com/sitemap-index.xml',  # either will work, regular sitemaps or sitemap index files
157 |         'https://www.example.com/sitemap_1.xml',
158 |         'https://www.example.com/sitemap_2.xml',
159 |     ]
160 |     custom_settings = {
161 |         'USER_AGENT': user_agent,
162 |         'DOWNLOAD_DELAY': 0,  # you might need to make this 3 or 4 (seconds)
163 |                               # to be nice to the site's servers and
164 |                               # to prevent getting banned
165 |         'ROBOTSTXT_OBEY': True,  # SEOs are polite people, right? :)
166 |         'HTTPERROR_ALLOW_ALL': True
167 |     }
168 | 
169 |     def parse(self, response):
170 |         yield dict(
171 |             url=response.url,
172 |             title='@@'.join(response.css('title::text').getall()),
173 |             meta_desc=response.xpath("//meta[@name='description']/@content").get(),
174 |             h1='@@'.join(response.css('h1::text').getall()),
175 |             h2='@@'.join(response.css('h2::text').getall()),
176 |             h3='@@'.join(response.css('h3::text').getall()),
177 |             body_text='\n'.join(response.css('p::text').getall()),
178 |             size=len(response.body),
179 |             load_time=response.meta['download_latency'],
180 |             status=response.status,
181 |             links_href='@@'.join([link.attrib.get('href') or '' for link in response.css('a')]),
182 |             links_text='@@'.join([link.attrib.get('title') or '' for link in response.css('a')]),
183 |             img_src='@@'.join([im.attrib.get('src') or '' for im in response.css('img')]),
184 |             img_alt='@@'.join([im.attrib.get('alt') or '' for im in response.css('img')]),
185 |             page_depth=response.meta['depth'],
186 |         )
187 | ```
188 | 
189 | Items that contain multiple elements, like multiple H2 tags on one page, will be listed as one string separated by two @ signs, e.g. `first h2 tag@@second tag@@third tag`. Simply split on "@@" to get a list of elements per page.
190 | 
191 | ### Text Mining SEO Data - Word Counting
192 | 
193 | Many SEO reports come as a text list \(page titles, URLs, keywords, snippets, etc.\) together with one or more number lists \(pageviews, bounces, conversions, time on page, etc.\).
194 | 
195 | Looking only at the top pages with disproportionately higher numbers can be misleading if the text list has a significant long tail of topics. How do you uncover the hidden insights?
196 | Let's try to do that with the following report:
197 | 
198 | ```python
199 | import pandas as pd
200 | 
201 | page_titles = [
202 |     'Learn Python',
203 |     'Python Data Vizualization',
204 |     'Python Programming for Beginners',
205 |     'Data Science for Marketing People',
206 |     'Data Science for Business People',
207 |     'Python for SEO',
208 |     'SEO Text Analysis'
209 | ]
210 | 
211 | pageviews = [200, 150, 400, 300, 670, 120, 340]
212 | 
213 | 
214 | pd.DataFrame(zip(page_titles, pageviews),
215 |              columns=['page_titles', 'pageviews'])
216 | ```
217 | 
218 | ![](../.gitbook/assets/screen-shot-2020-03-17-at-4.50.33-am.png)
219 | 
220 | It's clear which is the most viewed article. But are there any hidden insights about the words and topics occurring in this report that we can't immediately see?
221 | 
222 | ```python
223 | import advertools as adv
224 | adv.word_frequency(page_titles, pageviews)
225 | ```
226 | 
227 | ![](../.gitbook/assets/screen-shot-2020-03-17-at-4.53.01-am.png)
228 | 
229 | By counting the words in the titles we get a better view. Counting word occurrences on an absolute basis \(simple counting\), we can see that "python" is the top topic, occurring four times.
230 | However, on a weighted-count basis \(taking into consideration the total pageviews generated by titles containing the word\), "data" is the winner: pages containing "data" generated a total of 1,120 pageviews.
231 | And on a per-occurrence basis, the word "business" is the winner because it generated the most pageviews while appearing in only one page title.
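To see where those numbers come from, here is a rough equivalent of the two counts using only the standard library. This is not the advertools implementation, just a hedged sketch, and the small stopword set is made up for this example (it reuses `page_titles` and `pageviews` from above):

```python
from collections import Counter

stopwords = {'for', 'the', 'a', 'an', 'and', 'of'}

abs_count = Counter()
wtd_count = Counter()
for title, views in zip(page_titles, pageviews):
    words = [w for w in title.lower().split() if w not in stopwords]
    abs_count.update(words)          # simple occurrence counting
    for word in set(words):          # weight each word by the title's pageviews
        wtd_count[word] += views

print(abs_count.most_common(3))   # 'python' leads with 4 occurrences
print(wtd_count.most_common(3))   # 'data' leads with 1,120 weighted pageviews
```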
232 | 
233 | To summarize:
234 | 
235 | * The topic we wrote about the most was "python"
236 | * The topic that generated the most pageviews was "data"
237 | * The most valuable topic on a per-occurrence basis was "business"
238 | 
239 | 
--------------------------------------------------------------------------------