├── javascript
│   ├── overview
│   │   ├── npm.md
│   │   ├── README.md
│   │   └── installation.md
│   ├── javascript-code-examples
│   │   ├── README.md
│   │   ├── seo-scraper.md
│   │   ├── page-speed.md
│   │   ├── indexing
│   │   │   ├── page-speed.md
│   │   │   └── README.md
│   │   └── indexing.md
│   └── chrome-devtools
│       ├── README.md
│       └── chrome-seo-without-a-plugin.md
├── python
│   ├── overview
│   │   ├── libraries
│   │   │   ├── README.md
│   │   │   └── pandas.md
│   │   ├── README.md
│   │   └── installation.md
│   ├── code-examples
│   │   ├── rendering.md
│   │   ├── knowledge-graph.md
│   │   ├── README.md
│   │   ├── data-extraction.md
│   │   ├── query-management.md
│   │   ├── reporting.md
│   │   ├── pagespeed.md
│   │   └── crawling.md
│   ├── data-science
│   │   ├── README.md
│   │   ├── time-series.md
│   │   └── machine-learning.md
│   └── helpful-code.md
├── .gitbook
│   └── assets
│       ├── image (1).png
│       ├── image (2).png
│       ├── image (3).png
│       ├── image (4).png
│       ├── image (6).png
│       ├── image (4) (1).png
│       ├── image (6) (1).png
│       ├── icodeseo192x192.png
│       ├── icodeseo512x512.png
│       ├── contentking-logo.png
│       ├── rendering-pipeline.png
│       ├── screen-shot-2020-03-17-at-4.50.33-am.png
│       └── screen-shot-2020-03-17-at-4.53.01-am.png
├── technical-seo
│   └── overview
│       ├── learning-center
│       │   ├── README.md
│       │   ├── 1.-what-is-technical-seo.md
│       │   ├── 3.-rendering.md
│       │   └── 2.-crawling.md
│       └── README.md
├── r-stats
│   ├── r-stats-code-examples
│   │   ├── README.md
│   │   ├── knowledge-graph.md
│   │   ├── crawling.md
│   │   └── query-management.md
│   └── r-stats
│       ├── README.md
│       └── intro-and-installation.md
├── seo-datasets-1
│   ├── sitemap-datasets.md
│   ├── apis.md
│   ├── crawl-datasets.md
│   └── serp-datasets.md
├── README.md
├── about
│   └── contributing-content.md
└── SUMMARY.md
/javascript/overview/npm.md:
--------------------------------------------------------------------------------
1 | # NPM
2 |
3 |
--------------------------------------------------------------------------------
/python/overview/libraries/README.md:
--------------------------------------------------------------------------------
1 | # Libraries
2 |
3 |
--------------------------------------------------------------------------------
/javascript/javascript-code-examples/README.md:
--------------------------------------------------------------------------------
1 | # JavaScript Code Examples
2 |
3 |
--------------------------------------------------------------------------------
/.gitbook/assets/image (1).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (1).png
--------------------------------------------------------------------------------
/.gitbook/assets/image (2).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (2).png
--------------------------------------------------------------------------------
/.gitbook/assets/image (3).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (3).png
--------------------------------------------------------------------------------
/.gitbook/assets/image (4).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (4).png
--------------------------------------------------------------------------------
/.gitbook/assets/image (6).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (6).png
--------------------------------------------------------------------------------
/.gitbook/assets/image (4) (1).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (4) (1).png
--------------------------------------------------------------------------------
/.gitbook/assets/image (6) (1).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/image (6) (1).png
--------------------------------------------------------------------------------
/.gitbook/assets/icodeseo192x192.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/icodeseo192x192.png
--------------------------------------------------------------------------------
/.gitbook/assets/icodeseo512x512.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/icodeseo512x512.png
--------------------------------------------------------------------------------
/.gitbook/assets/contentking-logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/contentking-logo.png
--------------------------------------------------------------------------------
/.gitbook/assets/rendering-pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/rendering-pipeline.png
--------------------------------------------------------------------------------
/technical-seo/overview/learning-center/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO Learning Center
3 | ---
4 |
5 | # Learning Center
6 |
7 |
--------------------------------------------------------------------------------
/python/code-examples/rendering.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Rendering.
3 | ---
4 |
5 | # Rendering
6 |
8 |
9 |
--------------------------------------------------------------------------------
/python/code-examples/knowledge-graph.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on the Knowledge Graph.
3 | ---
4 |
5 | # Knowledge Graph
6 |
7 |
--------------------------------------------------------------------------------
/.gitbook/assets/screen-shot-2020-03-17-at-4.50.33-am.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/screen-shot-2020-03-17-at-4.50.33-am.png
--------------------------------------------------------------------------------
/.gitbook/assets/screen-shot-2020-03-17-at-4.53.01-am.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jroakes/iCodeSEO/HEAD/.gitbook/assets/screen-shot-2020-03-17-at-4.53.01-am.png
--------------------------------------------------------------------------------
/python/code-examples/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Various code examples which are relevant to aspects of Technical SEO.
3 | ---
4 |
5 | # Python Code Examples
6 |
7 |
--------------------------------------------------------------------------------
/r-stats/r-stats-code-examples/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Various code examples which are relevant to aspects of Technical SEO.
3 | ---
4 |
5 | # R Stats Code Examples
6 |
7 |
--------------------------------------------------------------------------------
/python/overview/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Python coding resources for Technical SEO
3 | ---
4 |
5 | # Python Overview
6 |
7 | {% page-ref page="installation.md" %}
8 |
9 | {% page-ref page="libraries/pandas.md" %}
10 |
11 |
12 |
13 |
--------------------------------------------------------------------------------
/javascript/overview/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: JavaScript coding resources for SEO
3 | ---
4 |
5 | # JavaScript Overview
6 |
7 | {% page-ref page="installation.md" %}
8 |
9 | {% page-ref page="../chrome-devtools/" %}
10 |
11 |
12 |
13 |
--------------------------------------------------------------------------------
/python/data-science/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Data Science coding resources for Technical SEO
3 | ---
4 |
5 | # Data Science
6 |
7 | {% page-ref page="machine-learning.md" %}
8 |
9 | {% page-ref page="time-series.md" %}
10 |
11 |
12 |
13 |
--------------------------------------------------------------------------------
/r-stats/r-stats/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: '#rstats coding resources for Technical SEO'
3 | ---
4 |
5 | # R Stats Overview
6 |
7 | {% page-ref page="intro-and-installation.md" %}
8 |
9 | {% page-ref page="../r-stats-code-examples/" %}
10 |
11 |
12 |
13 |
--------------------------------------------------------------------------------
/technical-seo/overview/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO Coding Resources
3 | ---
4 |
5 | # Technical SEO Overview
6 |
7 | {% page-ref page="learning-center/" %}
8 |
9 | ## Code Examples
10 |
11 | {% page-ref page="../../python/code-examples/rendering.md" %}
12 |
13 | {% page-ref page="../../python/code-examples/crawling.md" %}
14 |
15 | {% page-ref page="../../python/code-examples/query-management.md" %}
16 |
17 | {% page-ref page="../../python/code-examples/pagespeed.md" %}
18 |
19 |
20 |
21 |
--------------------------------------------------------------------------------
/javascript/javascript-code-examples/seo-scraper.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Scraping.
3 | ---
4 | # SEO Scraper
5 | ### Scrape anything that you want with Node.js
6 | Scrape SEO elements by default, or anything else you need, with this easily customizable web scraper built in Node.js.
7 | | Author | Article | Source | Notebook |
8 | | :--- | :--- | :--- | :--- |
9 | | [Nacho Mascort](https://twitter.com/NachoMascort) | [Link](https://www.npmjs.com/package/seo-scraper) | [Link](https://github.com/NachoSEO/seo-scraper) | |
--------------------------------------------------------------------------------
/python/code-examples/data-extraction.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Data Extraction.
3 | ---
4 |
5 | # Data Extraction
6 |
7 | ### Scraping People Also Asked from Google Search
8 |
9 | This project uses Python and Selenium in order to extract data from the People Also Ask feature of a Google SERP.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | Alessio Nittoli | [Link](https://nitto.li/scraping-people-also-asked/) | [Link](https://github.com/nittolese/gquestions) | |
14 |
15 |
16 |
17 |
--------------------------------------------------------------------------------
/seo-datasets-1/sitemap-datasets.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: XML Sitemaps in DataFrame format for different industries
3 | ---
4 |
5 | # Sitemap Datasets
6 |
7 |
8 |
9 | ### News Websites' XML Sitemap
10 |
11 | A set of XML sitemaps for news websites to analyze publishing trends. Constantly being updated.
12 |
13 | | Author | Source | Notebook |
14 | | :--- | :--- | :--- |
15 | | [Elias Dabbas](https://github.com/eliasdabbas) | [Kaggle Dataset](https://www.kaggle.com/eliasdabbas/news-sitemaps) | [Kaggle Notebook](https://www.kaggle.com/eliasdabbas/bbc-com-sitemaps-analysis) |
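
If you want to assemble a similar dataset yourself, the `advertools` package \(by the same author\) can pull an XML sitemap into a DataFrame. A minimal sketch, assuming the package is installed and using a placeholder sitemap URL:

```python
import advertools as adv

# Fetch an XML sitemap (or a sitemap index, which is expanded recursively)
# into a pandas DataFrame, one row per <url> entry.
sitemap_df = adv.sitemap_to_df('https://www.example.com/sitemap.xml')

# Typical columns include loc and lastmod, handy for publishing-trend analysis.
print(sitemap_df.head())
```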
16 |
17 |
--------------------------------------------------------------------------------
/python/code-examples/query-management.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Query Management.
3 | ---
4 |
5 | # Query Management
6 |
7 | ### Auto-Categorizing Queries with Apriori and BERT
8 |
9 | Auto-categorize Google Search Console queries semantically in 6 different languages via BERT.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | [Vincent Terrasi](https://twitter.com/VincentTerrasi) | [Link](https://dataseolabs.com/en/google-search-console-clustering-2/) | | [Link](https://colab.research.google.com/drive/14JC2uQniiVDNAUpVEjdNTyK7rmepwjWB) |
14 |
15 |
16 |
17 |
--------------------------------------------------------------------------------
/javascript/overview/installation.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Some really good guides to installing Node.js and various other related tools.
3 | ---
4 |
5 | # Installation
6 |
7 | ## How to Install Node.js and npm on Windows
8 |
9 | Overview of installing Node.js and npm on Windows using the official installer. Also describes the different versions and shows how to test whether installation was a success.
10 |
11 | **Link**: [https://www.freecodecamp.org/news/how-to-install-node-js-and-npm-on-windows/](https://www.freecodecamp.org/news/how-to-install-node-js-and-npm-on-windows/)
12 |
13 | **By**: Unknown on [www.freecodecamp.org](https://www.freecodecamp.org/)
14 |
15 |
--------------------------------------------------------------------------------
/javascript/javascript-code-examples/page-speed.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Page Speed.
3 | ---
4 |
5 | # Page Speed
6 |
7 | ### Aggregate & Automate Performance Reporting With Lighthouse & Google Data Studio
8 |
9 | This tool automates Google Lighthouse reporting using Node.js + Puppeteer and leverages Cloud SQL to allow visualization in Google Data Studio.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | Dan Leibson | [Link](https://www.localseoguide.com/aggregate-automate-performance-reporting-with-lighthouse-google-data-studio/) | [Link](https://github.com/LocalSEOGuide/lighthouse-reporter) | |
14 |
15 |
--------------------------------------------------------------------------------
/javascript/javascript-code-examples/indexing/page-speed.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Page Speed.
3 | ---
4 |
5 | # Page Speed
6 |
7 | ### Aggregate & Automate Performance Reporting With Lighthouse & Google Data Studio
8 |
9 | This tool automates Google Lighthouse reporting using Node.js + Puppeteer and leverages Cloud SQL to allow visualization in Google Data Studio.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | Dan Leibson | [Link](https://www.localseoguide.com/aggregate-automate-performance-reporting-with-lighthouse-google-data-studio/) | | [Link](https://github.com/LocalSEOGuide/lighthouse-reporter) |
14 |
15 |
--------------------------------------------------------------------------------
/javascript/javascript-code-examples/indexing.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Indexing.
3 | ---
4 |
5 | # Indexing
6 |
7 | ### Scaling Google indexation checks with Node.js
8 |
9 | New techniques we have developed internally at Builtvisible to fetch Google indexation status data, particularly for large-scale sites. Requires a free [ScraperAPI](https://www.scraperapi.com/) account.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | [Jose Hernando](https://twitter.com/jlhernando) | [Link](https://builtvisible.com/scaling-google-indexation-checks-with-node-js/) | [Link](https://github.com/alvaro-escalante/google-index-checker) | |
14 |
15 |
16 |
17 |
--------------------------------------------------------------------------------
/javascript/javascript-code-examples/indexing/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Indexing.
3 | ---
4 |
5 | # Indexing
6 |
7 | ### Scaling Google indexation checks with Node.js
8 |
9 | New techniques we have developed internally at Builtvisible to fetch Google indexation status data, particularly for large-scale sites. Requires a free [ScraperAPI](https://www.scraperapi.com/) account.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | [Jose Hernando](https://twitter.com/jlhernando) | [Link](https://builtvisible.com/scaling-google-indexation-checks-with-node-js/) | [Link](https://github.com/alvaro-escalante/google-index-checker) | |
14 |
15 |
16 |
17 |
--------------------------------------------------------------------------------
/python/code-examples/reporting.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO Code focusing on reporting
3 | ---
4 |
5 | # Reporting
6 |
7 | ### Using Python to integrate Google Trends to Google Data Studio
8 |
9 | By using the code in this article, you can connect Google Spreadsheets and Jupyter Notebook to import data into Google Data Studio and easily share the analysis with your team.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | Hülya Çoban | [Link](https://searchengineland.com/learn-how-to-chart-and-track-google-trends-in-data-studio-using-python-329119) | [Link](https://github.com/hulyacobans/google-trends-to-sheets/blob/master/pytrends-to-sheets.ipynb) | |
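
As an illustration of the first step \(pulling Google Trends data into Python\), the `pytrends` library exposes Trends data as a DataFrame that can then be pushed to Google Sheets. A minimal sketch, assuming `pytrends` is installed; the keyword and timeframe are placeholders:

```python
from pytrends.request import TrendReq

# Connect to Google Trends via the unofficial pytrends wrapper.
pytrends = TrendReq(hl='en-US', tz=360)

# Build a request payload for the queries to track.
pytrends.build_payload(['technical seo'], timeframe='today 12-m')

# Interest over time as a pandas DataFrame, ready to export to Sheets.
df = pytrends.interest_over_time()
print(df.head())
```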
14 |
15 |
16 |
17 |
--------------------------------------------------------------------------------
/seo-datasets-1/apis.md:
--------------------------------------------------------------------------------
1 | # APIs documentation
2 |
3 | ### SEO Tools APIs
4 |
5 | | Tool | Link | Notes |
6 | | :--- | :--- | :--- |
7 | | SEMrush | [www](https://www.semrush.com/api-documentation/) | generic |
8 | | Ahrefs | [www](https://ahrefs.com/api) | generic |
9 | | Dataforseo | [www](https://dataforseo.com/) | generic |
10 | | OPR | [www](https://www.domcop.com/openpagerank/documentation) | Alternative to Page Rank |
11 | | OnCrawl | [www](http://developer.oncrawl.com/) | Get data from your account |
12 | | SEOmoz | [www](https://moz.com/api) | generic |
13 |
14 | ### Google APIs
15 |
16 | | API | Link |
17 | | :--- | :--- |
18 | | Google Analytics | [www](https://developers.google.com/analytics/devguides/reporting/core/v4) |
19 | | Google Search Console | [www](https://developers.google.com/webmaster-tools) |
20 |
21 |
22 |
23 |
--------------------------------------------------------------------------------
/python/code-examples/pagespeed.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Page Speed.
3 | ---
4 |
5 | # Page Speed
6 |
7 | ### Automating PageSpeed Tests with Python
8 |
9 | Google’s PageSpeed Insights is a super useful tool to view a summary of a web page’s performance, and it uses both field and lab data to generate results. It is a great way to gain an overview of a handful of URLs, as it is used on a page-per-page basis. However, if you are working on a large site and wish to gain insights at scale, the API can be used to analyse a number of pages at a time, without needing to plug in the URLs individually.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | [Ruth Everett](https://twitter.com/rvtheverett) | [Link](https://dev.to/rvtheverett/python-and-google-s-page-speed-api-4dbi) | | [Link](https://colab.research.google.com/drive/1Oe1VTocg21KIVDqROXSt15H6CoO905D0) |
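
For a sense of how the API approach works, each URL can be tested with a single GET request to the PageSpeed Insights v5 endpoint. A minimal sketch using the `requests` library; the URL list is a placeholder, and an API key is recommended for larger batches:

```python
import requests

PSI_ENDPOINT = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'

urls = ['https://www.example.com/']  # placeholder list of pages to test

for url in urls:
    resp = requests.get(PSI_ENDPOINT, params={'url': url, 'strategy': 'mobile'})
    data = resp.json()
    # Lighthouse reports the performance score on a 0-1 scale.
    score = data['lighthouseResult']['categories']['performance']['score']
    print(url, score)
```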
14 |
15 |
16 |
17 |
--------------------------------------------------------------------------------
/r-stats/r-stats/intro-and-installation.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Some really good guides to installing R Stats and various other related tools.
3 | ---
4 |
5 | # Intro & Installation
6 |
7 | ### Official links
8 |
9 | [The R Foundation Official Website](https://www.r-project.org/)
10 | R is an official part of the Free Software Foundation's GNU project.
11 |
12 | [RStudio](https://rstudio.com/)
13 | The most popular development environment \(IDE\) for R. It can also run Python. The desktop version is open source.
14 | Download link: [https://rstudio.com/products/rstudio/](https://rstudio.com/products/rstudio/)
15 |
16 |
17 |
18 | ### R for SEO: Survival Guide
19 |
20 | Introduction to the use of R for SEO, how to install, what are the most interesting packages.
21 |
22 | A great collection of the simplest and most useful commands.
23 |
24 | **Link**: [https://remibacha.com/en/r-seo-guide/](https://remibacha.com/en/r-seo-guide/)
25 |
26 | **By**: [Remi Bacha](https://twitter.com/remibacha) on [remibacha.com](https://remibacha.com/)
27 |
28 |
--------------------------------------------------------------------------------
/seo-datasets-1/crawl-datasets.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Datasets for SEO dealing with crawled content.
3 | ---
4 |
5 | # Crawl Datasets
6 |
7 | ### Common Crawl
8 |
9 | Open repository of web crawl data that can be accessed and analyzed by anyone.
10 |
11 | | Author | Source |
12 | | :--- | :--- |
13 | | [Common Crawl Team](https://commoncrawl.org/about/team/) | [www](https://commoncrawl.org/) \(direct download\) |
14 |
15 | ### Crawl of the top ranking pages for airline tickets keywords
16 |
17 | This dataset contains SERPs of the 100 most popular tourist destinations, two variations each, and for two countries \(400 queries\). The landing pages that ranked for those keywords were scraped and the two tables were merged into one big table.
18 |
19 | | Author | Source | Notebook |
20 | | :--- | :--- | :--- |
21 | | [Elias Dabbas](https://github.com/eliasdabbas) | [Kaggle Dataset](https://www.kaggle.com/eliasdabbas/flights-serps-and-landing-pages) | [Kaggle Notebook](https://www.kaggle.com/eliasdabbas/airline-tickets-serps-and-landing-pages) |
22 |
23 |
--------------------------------------------------------------------------------
/technical-seo/overview/learning-center/1.-what-is-technical-seo.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Overview of Technical SEO.
3 | ---
4 |
5 | # 1. What is Technical SEO?
6 |
7 | Historically, Technical SEO \(Search Engine Optimization\) has included work by specialists dealing with client and server-side technologies that focus on search engine crawling, rendering, and indexing.
8 |
9 | Merkle defines Technical SEO as:
10 |
11 | > Technical SEO is defined by configurations that can be implemented to the website and server \(e.g. page elements, HTTP header responses, XML Sitemaps, redirects, meta data, etc.\). Technical SEO work has either a direct or indirect impact on search engine crawling, indexing and ultimately ranking. As such, Technical SEO doesn't include analytics, keyword research, backlink profile development or social media strategies. \([source](https://technicalseo.com/)\)
12 |
13 | In 2017, [Russ Jones](https://twitter.com/rjonesx) broadened the term, in a nod to [Arthur C. Clarke](https://en.wikipedia.org/wiki/Arthur_C._Clarke), by defining Technical SEO as:
14 |
15 | > Any **sufficiently technical** action undertaken with the **intent to improve search results**.
16 |
17 |
18 |
19 |
20 |
21 |
--------------------------------------------------------------------------------
/python/code-examples/crawling.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Crawling.
3 | ---
4 |
5 | # Crawling
6 |
7 | ### Comparing two crawls using Google Colab and Screaming Frog
8 |
9 | Combining Screaming Frog, Google Colab and Google Sheets to generate change detection reports without any limit.
10 |
11 | | Author | Article | Source | Notebook |
12 | | :--- | :--- | :--- | :--- |
13 | | [Alessio Nittoli](https://twitter.com/nittolese) | [Link](https://nitto.li/screaming-frog-colab/) | | [Link](https://colab.research.google.com/drive/1-CwO0GkC7RizVoZVxOE6a2ldK4d7TRHR) |
14 |
15 | ### Leverage Python and Google Cloud to extract meaningful SEO insights from server log data
16 |
17 | This is the first of a two-part series about how to scale your analyses to larger data sets from your server logs.
18 |
19 | | Author | Article | Source | Notebook |
20 | | :--- | :--- | :--- | :--- |
21 | | [Charly Wargnier](https://twitter.com/DataChaz) | [Link](https://searchengineland.com/leverage-python-and-google-cloud-to-extract-meaningful-seo-insights-from-server-log-data-329199) | [Link](https://github.com/CharlyWargnier/Server_Log_Analyser_for_SEO) | [Link](https://colab.research.google.com/drive/1h3IdoDucFg7tIEiSGTjqksuNprgkcced) |
22 |
23 |
--------------------------------------------------------------------------------
/python/overview/installation.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Some really good guides to installing Python and various other related tools.
3 | ---
4 |
5 | # Installation
6 |
7 | ### Installation of Python for Machine Learning
8 |
9 | In this step-by-step tutorial, you’ll cover the basics of setting up a Python numerical computation environment for machine learning on a Windows machine using the Anaconda Python distribution.
10 |
11 | **Link**: [https://realpython.com/python-windows-machine-learning-setup/](https://realpython.com/python-windows-machine-learning-setup/)
12 |
13 | **By**: [Renato Candido](https://realpython.com/team/rcandido/) on [realpython.com](https://realpython.com/)
14 |
15 |
16 |
17 | ### Installation and Overview of Jupyter Lab
18 |
19 | An overview of JupyterLab, the next generation of the Jupyter Notebook.
20 |
21 | **Link**: [https://towardsdatascience.com/jupyter-lab-evolution-of-the-jupyter-notebook-5297cacde6b](https://towardsdatascience.com/jupyter-lab-evolution-of-the-jupyter-notebook-5297cacde6b)
22 |
23 | **By**: [Parul Pandey](https://towardsdatascience.com/@parulnith) on [towardsdatascience.com](https://towardsdatascience.com/).
24 |
25 |
26 |
27 |
28 |
29 |
--------------------------------------------------------------------------------
/r-stats/r-stats-code-examples/knowledge-graph.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code to extract data from APIs
3 | ---
4 |
5 | # APIs
6 |
7 | ### GoogleKnowledgeGraphR
8 |
9 | An R 📦 to retrieve information from the Google Knowledge Graph API.
10 |
11 | **Code Link**: [https://github.com/dschmeh/GoogleKnowledgeGraphR](https://github.com/dschmeh/GoogleKnowledgeGraphR)
12 |
13 | **By**: [Daniel Schmeh](https://twitter.com/dschmeh)
14 |
15 | ### How to Connect Google Analytics with R
16 |
17 | A step-by-step guide to getting your GA data into R and doing some basic manipulation with that data.
18 |
19 | **Tutorial link:** [https://www.adswerve.com/blog/ga-r-heatmap-tutorial/](https://www.adswerve.com/blog/ga-r-heatmap-tutorial/)
20 | **Code link:** [https://github.com/analytics-pros/R-GA-Heatmap](https://github.com/analytics-pros/R-GA-Heatmap)
21 |
22 | **By:** [Luka Cempre](https://twitter.com/lukaslo)
23 |
24 | ### Google Search Console API R: Guide to Get Started
25 |
26 | Shows you how to set up a daily automated pull of Google Search Console data using R.
27 |
28 | **Tutorial link:** [https://www.ryanpraski.com/google-search-console-api-r-guide-to-get-started/](https://www.ryanpraski.com/google-search-console-api-r-guide-to-get-started/)
29 |
30 | **By:** [Ryan Praskievicz](https://twitter.com/ryanpraski)
31 |
32 |
--------------------------------------------------------------------------------
/python/data-science/time-series.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Resources for Time Series Prediction in Python
3 | ---
4 |
5 | # Time Series
6 |
7 | ## [AtsPy](https://github.com/firmai/atspy)
8 |
9 | Easily develop state-of-the-art time series models to forecast univariate data series. Simply load your data and select which models you want to test. This is the largest repository of automated structural and machine learning time series models. Please get in contact if you want to contribute a model.
10 |
11 | ### Install
12 |
13 | ```bash
14 | pip install atspy
15 | ```
16 |
17 | ### Automated Models
18 |
19 | 1. `ARIMA` - Automated ARIMA Modelling
20 | 2. `Prophet` - Modeling Multiple Seasonality With Linear or Non-linear Growth
21 | 3. `HWAAS` - Exponential Smoothing With Additive Trend and Additive Seasonality
22 | 4. `HWAMS` - Exponential Smoothing with Additive Trend and Multiplicative Seasonality
23 | 5. `NBEATS` - Neural basis expansion analysis \(now fixed at 20 Epochs\)
24 | 6. `Gluonts` - RNN-based Model \(now fixed at 20 Epochs\)
25 | 7. `TATS` - Seasonal and Trend no Box Cox
26 | 8. `TBAT` - Trend and Box Cox
27 | 9. `TBATS1` - Trend, Seasonal \(one\), and Box Cox
28 | 10. `TBATP1` - TBATS1 but Seasonal Inference is Hardcoded by Periodicity
29 | 11. `TBATS2` - TBATS1 With Two Seasonal Periods
30 |
31 | {% hint style="info" %}
32 | See full details at: [https://github.com/firmai/atspy](https://github.com/firmai/atspy)
33 | {% endhint %}
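
Based on the project's README, usage looks roughly like the sketch below. It assumes a pandas DataFrame with a DatetimeIndex and a single value column; the file name is a placeholder:

```python
import pandas as pd
from atspy import AutomatedModel

# Univariate series: a DataFrame with a date index and one value column.
df = pd.read_csv('series.csv', index_col=0, parse_dates=True)  # placeholder file

# Pick the models to test and the forecast horizon.
am = AutomatedModel(df=df, model_list=['Prophet', 'HWAMS'], forecast_len=20)

# In-sample validation plus out-of-sample forecasts.
forecast_in, performance = am.forecast_insample()
forecast_out = am.forecast_outsample()
```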
34 |
35 |
36 |
37 |
--------------------------------------------------------------------------------
/python/data-science/machine-learning.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Machine Learning Resources for SEO
3 | ---
4 |
5 | # Machine Learning
6 |
7 | ## Pytorch
8 |
9 | ### [HuggingFace Transformers](https://github.com/huggingface/transformers)
10 |
11 | Transformers \(formerly known as `pytorch-transformers` and `pytorch-pretrained-bert`\) provides state-of-the-art general-purpose architectures \(BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet, CTRL...\) for Natural Language Understanding \(NLU\) and Natural Language Generation \(NLG\) with over 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.
12 |
13 | {% hint style="info" %}
14 | See full details at: [https://github.com/huggingface/transformers](https://github.com/huggingface/transformers)
15 | {% endhint %}
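
For a quick feel of the library, the `pipeline` helper wraps a pretrained model behind a one-line API. A minimal sketch; the example text is arbitrary:

```python
from transformers import pipeline

# Downloads a default pretrained sentiment model on first run.
classifier = pipeline('sentiment-analysis')

result = classifier('This title tag is clear and descriptive.')
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```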
16 |
17 |
18 |
19 | ## TensorFlow
20 |
21 | ### [Uber Ludwig](https://github.com/uber/ludwig)
22 |
23 | Ludwig is a toolbox built on top of TensorFlow that allows users to train and test deep learning models without the need to write code.
24 |
25 | All you need to provide is a CSV file containing your data, a list of columns to use as inputs, and a list of columns to use as outputs; Ludwig will do the rest. Simple commands can be used to train models both locally and in a distributed way, and to use them to predict new data.
26 |
27 | {% hint style="info" %}
28 | See full details at: [https://github.com/uber/ludwig](https://github.com/uber/ludwig)
29 | {% endhint %}
30 |
31 |
32 |
33 |
34 |
35 | ## Other
36 |
37 |
38 |
39 |
40 |
41 |
--------------------------------------------------------------------------------
/r-stats/r-stats-code-examples/crawling.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Crawling.
3 | ---
4 |
5 | # Crawling
6 |
7 | ### Log File Analysis for SEO – Working with data visually
8 |
9 | **Link**: [https://canonicalized.com/log-file-analysis-seo/](https://canonicalized.com/log-file-analysis-seo/)
10 |
11 | **By**: [Dorian Banutoiu](https://twitter.com/canonicalizedco) on [Canonicalized](https://canonicalized.com/)
12 |
13 |
14 |
15 | ### Improve internal linking for SEO: Calculate Internal PageRank
16 |
17 | Determine what pages on your site might be seen as authoritative by search engines by computing your Internal PageRank.
18 |
19 | **Link:** [https://searchengineland.com/improve-internal-linking-calculate-internal-pagerank-r-246883](https://searchengineland.com/improve-internal-linking-calculate-internal-pagerank-r-246883)
20 |
21 | **By**: [Paul Shapiro](https://twitter.com/fighto) on [https://searchengineland.com/](https://searchengineland.com/)
22 |
23 | ### Crawling & metadata extraction with Rcrawler
24 |
25 | How to crawl a website using the Rcrawler package.
26 |
27 | **Link:** [https://www.gokam.co.uk/seo-crawling-metadata-extraction-with-r-rcrawler/](https://www.gokam.co.uk/seo-crawling-metadata-extraction-with-r-rcrawler/)
28 |
29 | **By**: [François JOLY](https://twitter.com/tuf)
30 |
31 | ## Classic Packages for Web Crawling
32 |
33 | ### rvest
34 |
35 | Perfect for crawling a small number of URLs; easy to use CSS-style or XPath selectors.
36 |
37 | **Link:** [https://cran.r-project.org/web/packages/rvest/rvest.pdf](https://cran.r-project.org/web/packages/rvest/rvest.pdf)
38 |
39 | **By**: [Hadley Wickham](https://twitter.com/hadleywickham)
40 |
41 | ### Rcrawler
42 |
43 | Good for crawling a website and extracting metadata.
44 |
45 | **Link:** [https://cran.r-project.org/web/packages/Rcrawler/Rcrawler.pdf](https://cran.r-project.org/web/packages/Rcrawler/Rcrawler.pdf)
46 |
47 | **By**: [Salim Khalil](https://orcid.org/0000-0002-7804-4041)
48 |
49 |
--------------------------------------------------------------------------------
/r-stats/r-stats-code-examples/query-management.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Technical SEO code focusing on Query Management.
3 | ---
4 |
5 | # Query Management
6 |
7 | ### SEO keyword research using searchConsoleR and googleAnalyticsR
8 |
9 | Method to estimate where to prioritise your SEO resources, estimating which keywords will give the greatest increase in revenue if you could improve their Google rank.
10 |
11 | **Link**: [https://code.markedmondson.me/search-console-google-analytics-r-keyword-research/](https://code.markedmondson.me/search-console-google-analytics-r-keyword-research/)
12 |
13 | **By**: [Mark Edmondson](https://twitter.com/holomarked) on [Coding in digital analytics](https://code.markedmondson.me/)
14 |
15 |
16 |
17 | ### R and Keyword Planner
18 |
19 | Using R and Google's Keyword Planner to evaluate the size and competitiveness of international markets.
20 |
21 | **Link**: [https://keyword-hero.com/markets-go-using-r-googles-keyword-planner-evaluate-size-competitiveness-international-markets](https://keyword-hero.com/markets-go-using-r-googles-keyword-planner-evaluate-size-competitiveness-international-markets)
22 |
23 | **By:** Max on [Keyword Hero](https://keyword-hero.com/)
24 |
25 | ### Automate Google Search Console data downloads with R
26 |
27 | A guide on scheduling searchConsoleR to save your data to a database
28 |
29 | **Link**: [https://www.rubenvezzoli.online/automate-google-search-console-downloads/](https://www.rubenvezzoli.online/automate-google-search-console-downloads/)
30 |
31 | **By:** [Ruben Vezzoli](https://twitter.com/rubenvezzoli/) on [https://www.rubenvezzoli.online/](https://www.rubenvezzoli.online/)
32 |
33 | ### How to enhance keyword exploration in R with the googleSuggestQueriesR package
34 |
35 | A guide to researching similar search queries at scale with R, covering:
36 |
37 | * broad keywords exploration
38 | * customizing recommendations
39 | * specific topic exploration
40 |
41 | **Link**: [https://leszeksieminski.me/r/keyword-exploration-googlesuggestqueriesr/](https://leszeksieminski.me/r/keyword-exploration-googlesuggestqueriesr/)
42 |
43 | **By**: [Leszek Siemiński](https://www.linkedin.com/in/leszek-sieminski/)
44 |
45 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: >-
3 | i.codeseo.dev is a repository of code and reference library for Technical SEOs
4 | who are interested in development and data science.
5 | ---
6 |
7 | # Resources for Technical SEOs
8 |
9 | ## About
10 |
11 | We started this journey by wanting to create a Wikipedia page for "Technical Search Engine Optimization". What we learned is that there is an incredible amount of content about the subject, but that it is very difficult to find authoritative sources \(or at least ones that would be viewed that way from outside the SEO world\). Most of the content is in article form and deals with only certain aspects or only the key benefits of Technical SEO.
12 |
13 | The goals of this documentation are:
14 |
15 | * Create Wikipedia-style \(authoritatively sourced\) information about Technical SEO.
16 | * List open-source projects that members of the SEO community have contributed that include a description \(article\) as well as clean, documented code that others may use.
17 | * Provide documentation for popular workflows and code examples in Python and JavaScript \(and others?\)
18 |
19 | ## Contributing
20 |
21 | We have made a connected repo [here](https://github.com/jroakes/iCodeSEO) that is public. We strongly encourage pull requests to grow this resource. Also, with [Gitbook](https://www.gitbook.com/) \(the SaaS used for this documentation\), we may need to upgrade to a paid plan \($40/mo\) at some point. That includes 5 seats. We are looking for a couple of permanent contributors to own the Technical SEO and JavaScript sections. Knowledge of Git and Wikipedia-style writing is required. In addition, we would love to have sponsors. Please email me at [jroakes@gmail.com](mailto:jroakes@gmail.com) if you are interested.
22 |
23 | ## Inclusions of Code
24 |
25 | To be included in the Code Examples sections, the following conditions must be true:
26 |
27 | 1. The code must be explained, in an article or well-documented in the source.
28 | 2. The code must be related to a function of [Technical SEO](technical-seo/overview/learning-center/1.-what-is-technical-seo.md).
29 | 3. The code should be novel or not commonly available already.
30 | 4. The code should be repeatable, easily, by others, with reasonable cost or work.
31 |
32 |
33 |
34 |
35 |
36 |
37 |
38 | 
39 |
40 | {% hint style="info" %}
41 | For more information visit: [ContentKing](https://www.contentkingapp.com/)
42 | {% endhint %}
43 |
44 |
45 |
46 |
--------------------------------------------------------------------------------
/about/contributing-content.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: How to contribute content to iCodeSEO.dev.
3 | ---
4 |
5 | # Contributing Content
6 |
7 | ### Overview
8 |
9 | This documentation is organized into the following areas:
10 |
11 | 1. **Technical SEO**: This is to be an organized and Wikipedia-style area covering the major informational areas of Technical SEO. We will go with the Russ Jones definition of Technical SEO and extend the realm into: Crawling, Rendering, Indexing, Data Science, and Machine Learning. Initial areas to construct are the first three.
12 | 2. **JavaScript**: Resources for Beginning with JavaScript, Code Examples, and Chrome DevTools information. It would be good to separate Node and Google Apps Script code.
13 | 3. **Python**: Resources for Beginning with Python, Code Examples, Data Science Resources, and other helpful code snippets.
14 | 4. **R Stats**: Resources for Beginning with R Stats, Code Examples, etc.
15 |
16 |
17 |
18 | ### Working with Pages
19 |
20 | Here are a few points about pages:
21 |
22 | #### If the parent page is left blank, it will include a section listing of children pages.
23 |
24 | 
25 |
26 | ####
27 |
28 | #### Key page information should be filled just like a normal web page
29 |
30 | 
31 |
32 | #### Use the available content types to make the page as clear as possible.
33 |
34 | With this said, it is critical to keep consistency of formatting between sections. For a specific content type, if a convention is used in one section, it should be used in all. Users should expect the same experience and understanding of where to find things across sections.
35 |
36 | 
37 |
38 | ###
39 |
40 | ### Training Resources
41 |
42 | **<Language> Overview** sections are areas of training for the language. This should include information on Installation, libraries, etc. Major thematic areas within a language can be broken out into separate sections. In many cases it is a good idea to link to really well-done resources online covering different aspects of a language. If we are referencing a resource online, below is the correct format:
43 |
44 | 
45 |
46 | Where the resources online are narrowly relevant to Technical SEO needs, we should provide the content tailored to typical workflows.
47 |
48 |
49 |
50 | ### Code Examples
51 |
52 | All code examples sections should be organized into consistent major thematic areas of Technical SEO. Currently:
53 |
54 | * Knowledge Graph
55 | * Page Speed
56 | * Query Management
57 | * Crawling
58 | * Rendering
59 | * Indexing
60 |
61 | #### Format for code examples:
62 |
63 | 
64 |
65 |
66 |
67 |
--------------------------------------------------------------------------------
/SUMMARY.md:
--------------------------------------------------------------------------------
1 | # Table of contents
2 |
3 | * [Resources for Technical SEOs](README.md)
4 |
5 | ## Technical SEO
6 |
7 | * [Technical SEO Overview](technical-seo/overview/README.md)
8 | * [Learning Center](technical-seo/overview/learning-center/README.md)
9 | * [1. What is Technical SEO?](technical-seo/overview/learning-center/1.-what-is-technical-seo.md)
10 | * [2. Crawling](technical-seo/overview/learning-center/2.-crawling.md)
11 | * [3. Rendering](technical-seo/overview/learning-center/3.-rendering.md)
12 |
13 | ## JavaScript
14 |
15 | * [JavaScript Overview](javascript/overview/README.md)
16 | * [Installation](javascript/overview/installation.md)
17 | * [NPM](javascript/overview/npm.md)
18 | * [Chrome DevTools](javascript/chrome-devtools/README.md)
19 | * [Google Chrome SEO Without A Plugin](javascript/chrome-devtools/chrome-seo-without-a-plugin.md)
20 | * [JavaScript Code Examples](javascript/javascript-code-examples/README.md)
21 | * [Page Speed](javascript/javascript-code-examples/page-speed.md)
22 | * [Indexing](javascript/javascript-code-examples/indexing.md)
23 |
24 | ## Python
25 |
26 | * [Python Overview](python/overview/README.md)
27 | * [Installation](python/overview/installation.md)
28 | * [Libraries](python/overview/libraries/README.md)
29 | * [Pandas](python/overview/libraries/pandas.md)
30 | * [Python Code Examples](python/code-examples/README.md)
31 | * [Reporting](python/code-examples/reporting.md)
32 | * [Data Extraction](python/code-examples/data-extraction.md)
33 | * [Knowledge Graph](python/code-examples/knowledge-graph.md)
34 | * [Page Speed](python/code-examples/pagespeed.md)
35 | * [Query Management](python/code-examples/query-management.md)
36 | * [Crawling](python/code-examples/crawling.md)
37 | * [Rendering](python/code-examples/rendering.md)
38 | * [Data Science](python/data-science/README.md)
39 | * [Machine Learning](python/data-science/machine-learning.md)
40 | * [Time Series](python/data-science/time-series.md)
41 | * [Helpful Code](python/helpful-code.md)
42 |
43 | ## R Stats
44 |
45 | * [R Stats Overview](r-stats/r-stats/README.md)
46 | * [Intro & Installation](r-stats/r-stats/intro-and-installation.md)
47 | * [R Stats Code Examples](r-stats/r-stats-code-examples/README.md)
48 | * [Crawling](r-stats/r-stats-code-examples/crawling.md)
49 | * [Query Management](r-stats/r-stats-code-examples/query-management.md)
50 | * [APIs](r-stats/r-stats-code-examples/knowledge-graph.md)
51 |
52 | ## SEO Datasets
53 |
54 | * [SERP Datasets](seo-datasets-1/serp-datasets.md)
55 | * [Crawl Datasets](seo-datasets-1/crawl-datasets.md)
56 | * [Sitemap Datasets](seo-datasets-1/sitemap-datasets.md)
57 | * [APIs documentation](seo-datasets-1/apis.md)
58 |
59 | ## About
60 |
61 | * [Contributing Content](about/contributing-content.md)
62 |
63 |
--------------------------------------------------------------------------------
/seo-datasets-1/serp-datasets.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Datasets for SEO dealing with search engine results pages (SERPs)
3 | ---
4 |
5 | # SERP Datasets
6 |
7 | ### Flights and airline ticket SERPs
8 |
9 | An export of Google SERPs for `flights to ` and `tickets to ` for the top 100 travel destinations \(cities\).
10 | 100 destinations x 2 countries x 2 keyword variations x 10 results each = 4,000 rows.
11 | The dataset is updated once every two weeks to check progress and changes.
12 |
13 | | Author | Article | Source | Notebook |
14 | | :--- | :--- | :--- | :--- |
15 | | [Elias Dabbas](https://github.com/eliasdabbas/) | [Tutorial on SEMrush](https://www.semrush.com/blog/analyzing-search-engine-results-pages/) | [Dataset on Kaggle](https://www.kaggle.com/eliasdabbas/search-engine-results-flights-tickets-keywords) | [Binder link](https://mybinder.org/v2/gh/eliasdabbas/SEMRush_serp_tutorial/master?urlpath=lab/tree/semrush_serp_analysis.ipynb) |
16 |
17 | ### Cars Google SERPs
18 |
19 | Google SERPs for ` for sale` and ` price` keywords for fifty of the top cars, and for two countries.
20 |
21 | | Author | Article | Source | Notebook |
22 | | :--- | :--- | :--- | :--- |
23 | | [Elias Dabbas](https://github.com/eliasdabbas) | | [Dataset on Kaggle](https://www.kaggle.com/eliasdabbas/google-search-results-pages-used-cars-us) | [Kaggle Tutorial](https://www.kaggle.com/eliasdabbas/search-engine-results-pages-serps-research) |
24 |
25 | ### 2017 Local SEO Ranking Factors Data
26 |
27 | ~150k rows of data \(each row representing a different business listing on Google My Business\). The data was scraped by Places Scout and joined with a bunch of their own data, as well as link API data from Ahrefs and Majestic. All in all there are ~150 data points per listing/business.
28 |
29 | | Author | Article | Source |
30 | | :--- | :--- | :--- |
31 | | [Dan Leibson](https://twitter.com/DanLeibson) | [Link](https://www.localseoguide.com/open-sourcing-2017-local-seo-ranking-factors-data/) | [Dataset on Google Drive](https://drive.google.com/file/d/12vCCNOs_HrLOK4VC2fpeD_eYFIB3pIs7/view?usp=sharing) |
32 |
33 | ### Recipes of Popular Dishes: YouTube and Google SERPs
34 |
35 | 243 national dishes, with two keyword variations each ` recipe` and `how to make `.
36 | YouTube provides a much richer dataset than Google, as it contains video and channel statistics to provide context and metadata for the videos \(and channels\).
37 | Both Google and YouTube SERPs are provided for the same keywords.
38 |
39 |
40 | | Author | Article | Source | Notebook |
41 | | :--- | :--- | :--- | :--- |
42 | | [Elias Dabbas](https://github.com/eliasdabbas) | | [Dataset on Kaggle](https://www.kaggle.com/eliasdabbas/recipes-search-engine-results-data) | [Kaggle Tutorial](https://www.kaggle.com/eliasdabbas/recipes-keywords-ranking-on-google-and-youtube) |
43 |
44 |
--------------------------------------------------------------------------------
/technical-seo/overview/learning-center/3.-rendering.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: >-
3 | Rendering is the process of taking crawled content and using a WRS (Web
4 | Rendering Service) to build the DOM from the HTML and embedded assets like
5 | JavaScript.
6 | ---
7 |
8 | # 3. Rendering
9 |
10 | 
11 |
12 | Key items we know about Google's Web Rendering Service \(WRS\):
13 |
14 | * It uses a near-up-to-date version of Google Chrome.
15 | * Google doesn't use cookies \(they do seem to persist cookies across 30X requests\).
16 | * Service workers are not allowed by Googlebot.
17 | * Google overrides random number functions to ensure a predictable state.
18 | * The Live Test \(Fetch and Render\) in Google Search Console is different from Googlebot. Live Test is time sensitive; Googlebot is not.
19 | * Googlebot will try to wait until there is no longer network activity from the headless browser.
20 | * The date and time of crawling and indexing may differ. Google may index what it has if the crawl and render fails.
21 | * Googlebot does not need to paint pixels as the service runs Chromium headless, so it omits that step of the render process.
22 |
23 | Source: [Martin Splitt's talk at TechSEO Boost, 2019](https://www.catalystdigital.com/techseoboost-livestream-2019/).
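
One practical way to see what a WRS-style renderer adds is to compare a page's raw HTML with the DOM after a headless browser has executed its JavaScript. A minimal sketch, assuming the `playwright` package and its bundled Chromium are installed; the URL is a placeholder:

```python
import requests
from playwright.sync_api import sync_playwright

url = 'https://www.example.com/'  # placeholder

# Raw HTML, as a crawler sees the page before rendering.
raw_html = requests.get(url).text

# Rendered DOM, after headless Chromium has executed the page's JavaScript.
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto(url, wait_until='networkidle')  # wait for network activity to settle
    rendered_html = page.content()
    browser.close()

# A large difference suggests content is injected client-side by JavaScript.
print(len(raw_html), len(rendered_html))
```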
24 |
25 | ## Recent Articles on JavaScript Rendering
26 |
27 | Google's Martin Splitt has spent a tremendous amount of time over the last year educating developers and SEOs on the limitations of Google's ability to render JavaScript websites. He is represented in many of the articles below as the authoritative source.
28 |
29 |
30 |
31 | ### Understand the JavaScript SEO basics
32 |
33 | JavaScript is an important part of the web platform because it provides many features that turn the web into a powerful application platform. Making your JavaScript-powered web applications discoverable via Google Search can help you find new users and re-engage existing users as they search for the content your web app provides.
34 |
35 | **Link**: [https://developers.google.com/search/docs/guides/javascript-seo-basics](https://developers.google.com/search/docs/guides/javascript-seo-basics)
36 |
37 |
38 |
39 | ### Fix Search-related JavaScript problems
40 |
41 | This guide helps you identify and fix JavaScript issues that may be blocking your page, or specific content on JavaScript powered pages, from showing up in Google Search. While Googlebot does run JavaScript, there are some differences and limitations that you need to account for when designing your pages and applications to accommodate how crawlers access and render your content.
42 |
43 | **Link**: [https://developers.google.com/search/docs/guides/fix-search-javascript](https://developers.google.com/search/docs/guides/fix-search-javascript)
44 |
45 |
46 |
47 | ### Making JavaScript and Google Search work together
48 |
49 | We introduced a new Googlebot at Google I/O and took the opportunity to discuss improvements and best practices for making JavaScript web apps work well with Google Search.
50 |
51 | **Link**: [https://web.dev/javascript-and-google-search-io-2019/](https://web.dev/javascript-and-google-search-io-2019/)
52 |
53 | **By**: [Martin Splitt](https://twitter.com/g33konaut) and [Lizzi Harvey](https://twitter.com/HarveyLizzi)
54 |
55 |
56 |
57 | ### Making JavaScript Work for Search with Martin Splitt & Ashley Berman Hale
58 |
59 | Read our recap of the Q&A webinar we hosted with Google's Martin Splitt on how to make JavaScript-powered websites discoverable and indexable in search.
60 |
61 | **Link**: [https://www.deepcrawl.com/blog/webinars/making-javascript-work-for-search-martin-splitt/](https://www.deepcrawl.com/blog/webinars/making-javascript-work-for-search-martin-splitt/)
62 |
63 | **By**: [Rachel Costello](https://twitter.com/rachellcostello)
64 |
65 |
66 |
67 |
--------------------------------------------------------------------------------
/python/overview/libraries/pandas.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: >-
3 | Parse Excel and CSVs files quickly and easily with the pandas library for
4 | Python.
5 | ---
6 |
7 | # Pandas
8 |
9 | ## Overview
10 |
11 | **pandas** is a [Python](https://www.python.org) package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, **real world** data analysis in Python. Additionally, it has the broader goal of becoming **the most powerful and flexible open source data analysis / manipulation tool available in any language**. It is already well on its way toward this goal. \([source](https://pandas.pydata.org/pandas-docs/stable/getting_started/overview.html)\)
12 |
13 | ## Installing Pandas
14 |
15 | Using Anaconda
16 |
17 | ```bash
18 | $ conda install pandas
19 | ```
20 |
21 | Using PyPI:
22 |
23 | ```bash
24 | $ pip install pandas
25 | ```
26 |
27 | {% hint style="info" %}
28 | Full details found here: [https://pandas.pydata.org/pandas-docs/stable/getting\_started/install.html](https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html)
29 | {% endhint %}
30 |
31 | ## Reading Data
32 |
33 | Pandas comes with functionality to read data from many file types including:
34 |
35 | * CSV
36 | * Excel
37 | * HTML \(tables\)
38 | * SQL
39 | * BigQuery
40 | * etc.
41 |
42 | For most file-based reading, Pandas will accept as the first argument a local file \(e.g. `pd.read_csv('data.csv')` \), or a web URL \(e.g. `pd.read_csv('https://domain.com/data.csv')` \).
43 |
44 | ### CSV
45 |
46 | For CSVs, Pandas handles the datatype setting for you. The most common named parameter passed to the `read_csv` method is `skiprows=1`, used when the input CSV has non-heading rows prior to the first row.
47 |
48 | {% hint style="info" %}
49 | Pandas API Reference for read\_csv function: [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read\_csv.html](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)
50 | {% endhint %}
51 |
52 | ```python
53 | import pandas as pd
54 | df = pd.read_csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/airline-safety/airline-safety.csv')
55 | df.head()
56 | ```
57 |
58 | | | airline | avail\_seat\_km\_per\_week | incidents\_85\_99 | fatal\_accidents\_85\_99 | fatalities\_85\_99 | incidents\_00\_14 | fatal\_accidents\_00\_14 | fatalities\_00\_14 |
59 | | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
60 | | 0 | Aer Lingus | 320906734 | 2 | 0 | 0 | 0 | 0 | 0 |
61 | | 1 | Aeroflot\* | 1197672318 | 76 | 14 | 128 | 6 | 1 | 88 |
62 | | 2 | Aerolineas Argentinas | 385803648 | 6 | 0 | 0 | 1 | 0 | 0 |
63 | | 3 | Aeromexico\* | 596871813 | 3 | 1 | 64 | 5 | 0 | 0 |
64 | | 4 | Air Canada | 1865253802 | 2 | 0 | 0 | 2 | 0 | 0 |
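
If a CSV has banner or title rows above the header, the `skiprows` parameter mentioned above handles it. A small sketch with a hypothetical local file:

```python
import pandas as pd

# 'data.csv' is a hypothetical file with one banner row above the header row.
df = pd.read_csv('data.csv', skiprows=1)
df.head()
```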
65 |
66 | ### Excel
67 |
68 | I use the `read_excel` method much less frequently than `read_csv`, mostly due to habit and the assumption that the data is cleaner with a CSV. Common named parameters would be `sheet_name="Sheet1"` or `skiprows=[1]`.
69 |
70 | {% hint style="info" %}
71 | Pandas API Reference for read\_excel function: [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read\_excel.html](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html)
72 | {% endhint %}
73 |
74 | ```python
75 | import pandas as pd
76 | df = pd.read_excel('https://www2.census.gov/programs-surveys/popest/tables/2010-2019/state/totals/nst-est2019-01.xlsx', skiprows=3)
77 | df.head()
78 | ```
79 |
80 | | | Unnamed: 0 | Census | Estimates Base | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
81 | | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
82 | | 0 | United States | 308745538.0 | 308758105.0 | 309321666.0 | 311556874.0 | 313830990.0 | 315993715.0 | 318301008.0 | 320635163.0 | 322941311.0 | 324985539.0 | 326687501.0 | 328239523.0 |
83 | | 1 | Northeast | 55317240.0 | 55318443.0 | 55380134.0 | 55604223.0 | 55775216.0 | 55901806.0 | 56006011.0 | 56034684.0 | 56042330.0 | 56059240.0 | 56046620.0 | 55982803.0 |
84 | | 2 | Midwest | 66927001.0 | 66929725.0 | 66974416.0 | 67157800.0 | 67336743.0 | 67560379.0 | 67745167.0 | 67860583.0 | 67987540.0 | 68126781.0 | 68236628.0 | 68329004.0 |
85 | | 3 | South | 114555744.0 | 114563030.0 | 114866680.0 | 116006522.0 | 117241208.0 | 118364400.0 | 119624037.0 | 120997341.0 | 122351760.0 | 123542189.0 | 124569433.0 | 125580448.0 |
86 | | 4 | West | 71945553.0 | 71946907.0 | 72100436.0 | 72788329.0 | 73477823.0 | 74167130.0 | 74925793.0 | 75742555.0 | 76559681.0 | 77257329.0 | 77834820.0 | 78347268.0 |
87 |
88 | ### HTML Tables
89 |
90 | This is a very handy function for when there is a table on a web page whose data you want to grab.
91 |
92 | {% hint style="info" %}
93 | Pandas API Reference for read\_html function: [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read\_html.html](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_html.html)
94 | {% endhint %}
95 |
96 | ```python
97 | import pandas as pd
98 | # read_html returns a list of DataFrames, one per matching table
99 | df = pd.read_html('https://countrycode.org/', attrs={'class': 'main-table'})
100 | df[0].head()
101 | ```
101 |
102 | | | COUNTRY | COUNTRY CODE | ISO CODES | POPULATION | AREA KM2 | GDP $USD |
103 | | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
104 | | 0 | Afghanistan | 93 | AF / AFG | 29121286 | 647500 | 20.65 Billion |
105 | | 1 | Albania | 355 | AL / ALB | 2986952 | 28748 | 12.8 Billion |
106 | | 2 | Algeria | 213 | DZ / DZA | 34586184 | 2381740 | 215.7 Billion |
107 | | 3 | American Samoa | 1-684 | AS / ASM | 57881 | 199 | 462.2 Million |
108 | | 4 | Andorra | 376 | AD / AND | 84000 | 468 | 4.8 Billion |
109 |
110 |
111 |
112 |
--------------------------------------------------------------------------------
/technical-seo/overview/learning-center/2.-crawling.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: >-
3 |   Crawling is how search engines discover new links and process the pages they
4 |   find on the internet. Here we provide a good overview based only on reliable
5 |   sources.
6 | ---
7 |
8 | # 2. Crawling
9 |
10 | Google's Search Index contains [hundreds of billions of webpages and is over 100,000,000 gigabytes \(100k terabytes\) in size](https://www.google.com/search/howsearchworks/crawling-indexing/). In order to feed that index, Google uses [Googlebot](https://support.google.com/webmasters/answer/182072?hl=en), the generic name for Google's web crawling infrastructure, to search for new pages and add them to its list of known pages. Bing uses [Bingbot](https://www.bing.com/webmaster/help/which-crawlers-does-bing-use-8c184ec0) as its standard crawler. Both Google and Bing use desktop and mobile User-Agents to crawl the web.
11 |
12 | The web has grown into an enormous resource of available information. It is important to note that Google \(and Bing\) only know about a portion of the web, called the **surface web**: the portion that is publicly accessible to crawlers via web pages. Other parts of the web are the **deep web** and the **hidden web**. The deep web is estimated to be 4,000 to 5,000 times larger than the surface web. This [article by the University of Washington](https://guides.lib.uw.edu/c.php?g=342031&p=2300191) details the differences.
13 |
14 | Google uses a distributed crawling infrastructure that distributes load over many machines. It is also an [incremental crawler](https://www.ijarcce.com/upload/2016/january-16/IJARCCE%2052.pdf), in that it continues to refresh its collection of pages with up-to-date versions based on the perceived importance of the page to users. Crawling is distinct from rendering, although the process of crawling and rendering is often attributed to Googlebot.
15 |
16 |
17 |
18 | ### Crawl Budget
19 |
20 | In 2017, Google provided [some guidance](https://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html) on how to think about crawl budget. Essentially, "crawl budget" is a term coined by SEOs to describe the amount of crawling resources Google will allocate to a given website. It is a bit of a misnomer, in that there are several areas that can potentially affect the rate at which Google crawls your website's URLs.
21 |
22 | > First, we'd like to emphasize that crawl budget, as described below, is not something most publishers have to worry about. If new pages tend to be crawled the same day they're published, crawl budget is not something webmasters need to focus on. Likewise, if a site has fewer than a few thousand URLs, most of the time it will be crawled efficiently. \([source](https://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html)\)
23 |
24 | Essentially, the goal of Technical SEOs is to ensure that, for larger websites, Google spends its available resources crawling and rendering the pages that are important from a search perspective, and little to no time crawling non-valuable content. Google indicates the following categories as non-value-add URLs \(in order of significance\):
25 |
26 | * [Faceted navigation](https://webmasters.googleblog.com/2014/02/faceted-navigation-best-and-5-of-worst.html) and [session identifiers](https://webmasters.googleblog.com/2007/09/google-duplicate-content-caused-by-url.html)
27 | * [On-site duplicate content](https://webmasters.googleblog.com/2007/09/google-duplicate-content-caused-by-url.html)
28 | * [Soft error pages](https://webmasters.googleblog.com/2010/06/crawl-errors-now-reports-soft-404s.html)
29 | * Hacked pages
30 | * [Infinite spaces](https://webmasters.googleblog.com/2008/08/to-infinity-and-beyond-no.html) and proxies
31 | * Low quality and spam content
32 |
33 | Major areas that can affect the rate at which Google crawls your URLs are:
34 |
35 | * Blocked content in robots.txt \(see the sketch after this list\)
36 | * Server crawl health. Google tries to be a good citizen and is especially tuned to feedback from your server such as slowing response times and server errors.
37 | * [Crawl rate limits set in Google Search Console](https://support.google.com/webmasters/answer/48620).
38 | * Site-wide events like site moves or major content changes.
39 | * Popularity of your URLs to users.
40 | * Staleness of the URLs in Google's index. Google tries to keep their index fresh.
41 | * Reliance on many embedded assets to render pages, like JS, CSS, XHR calls.
42 | * The category of website \(e.g. news, or other frequently changing unique content\).
43 |
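44 |
45 | On the first item, you can check programmatically whether a URL is blocked for a given crawler using Python's standard `urllib.robotparser`. A minimal sketch \(example.com is a placeholder\):
46 |
47 | ```python
48 | from urllib.robotparser import RobotFileParser
49 |
50 | # Fetch and parse the site's robots.txt, then test a URL against it
51 | rp = RobotFileParser('https://www.example.com/robots.txt')
52 | rp.read()
53 | print(rp.can_fetch('Googlebot', 'https://www.example.com/some-page'))
54 | ```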
44 |
45 |
46 | ## Cloaking
47 |
48 | {% hint style="info" %}
49 | Google defines cloaking as the practice of presenting different content or URLs to human users and search engines. Cloaking is considered a violation of Google’s [Webmaster Guidelines](https://support.google.com/webmasters/answer/answer.py?answer=35769) because it provides our users with different results than they expected. \([source](https://support.google.com/webmasters/answer/66355?hl=en)\)
50 | {% endhint %}
51 |
52 | In 2016, Google trained a classification model that was able to accurately detect cloaking 95.5% of the time with a false positive rate of 0.9%. JavaScript redirection was one of the strongest features, predictive of a positive classification. \([source](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45365.pdf)\)
53 |
54 | ## Research Articles on Crawling
55 |
56 | * [http://www.ijera.com/papers/vol8no11/p1/A0811010108.pdf](http://www.ijera.com/papers/vol8no11/p1/A0811010108.pdf)
57 | * [https://www.ijarcce.com/upload/2016/january-16/IJARCCE%2052.pdf](https://www.ijarcce.com/upload/2016/january-16/IJARCCE%2052.pdf)
58 | * [http://infolab.stanford.edu/~olston/publications/crawling\_survey.pdf](http://infolab.stanford.edu/~olston/publications/crawling_survey.pdf)
59 | * [https://web.stanford.edu/class/cs276/handouts/lecture16-crawling-jun4-6per.pdf](https://web.stanford.edu/class/cs276/handouts/lecture16-crawling-jun4-6per.pdf)
60 | * [http://web.eecs.umich.edu/~mihalcea/498IR/Lectures/WebCrawling.pdf](http://web.eecs.umich.edu/~mihalcea/498IR/Lectures/WebCrawling.pdf)
61 |
62 |
63 |
--------------------------------------------------------------------------------
/javascript/chrome-devtools/README.md:
--------------------------------------------------------------------------------
1 | # Chrome DevTools
2 |
3 | {% hint style="danger" %}
4 | This page is a mirror of [https://developers.google.com/web/tools/chrome-devtools](https://developers.google.com/web/tools/chrome-devtools) and needs to be updated.
5 | {% endhint %}
6 |
7 | ## Chrome DevTools
8 |
9 | Chrome DevTools is a set of web developer tools built directly into the [Google Chrome](https://www.google.com/chrome/) browser. DevTools can help you edit pages on-the-fly and diagnose problems quickly, which ultimately helps you build better websites, faster.
10 |
11 | Check out the video for live demonstrations of core DevTools workflows, including debugging CSS, prototyping CSS, debugging JavaScript, and analyzing load performance.
12 |
13 | ### Open DevTools
14 |
15 | There are many ways to open DevTools, because different users want quick access to different parts of the DevTools UI.
16 |
17 | * When you want to work with the DOM or CSS, right-click an element on the page and select **Inspect** to jump into the **Elements** panel. Or press Command+Option+C \(Mac\) or Control+Shift+C \(Windows, Linux, Chrome OS\).
18 | * When you want to see logged messages or run JavaScript, press Command+Option+J \(Mac\) or Control+Shift+J \(Windows, Linux, Chrome OS\) to jump straight into the **Console** panel.
19 |
20 | See [Open Chrome DevTools](./) for more details and workflows.
21 |
22 | ### Get started
23 |
24 | If you're a more experienced web developer, here are the recommended starting points for learning how DevTools can improve your productivity:
25 |
26 | * [View and Change the DOM](https://developers.google.com/web/tools/chrome-devtools/dom)
27 | * [View and Change a Page's Styles \(CSS\)](./)
28 | * [Debug JavaScript](https://developers.google.com/web/tools/chrome-devtools/javascript)
29 | * [View Messages and Run JavaScript in the Console](https://developers.google.com/web/tools/chrome-devtools/console/get-started)
30 | * [Optimize Website Speed](https://developers.google.com/web/tools/chrome-devtools/speed/get-started)
31 | * [Inspect Network Activity](./)
32 |
33 | ### Discover DevTools
34 |
35 | The DevTools UI can be a little overwhelming... there are so many tabs! But if you take some time to get familiar with each tab and understand what's possible, you may discover that DevTools can seriously boost your productivity.
36 |
37 | #### Device Mode
38 |
39 | Simulate mobile devices.
40 |
41 | * [Device Mode](https://developers.google.com/web/tools/chrome-devtools/device-mode)
42 | * [Test Responsive and Device-specific Viewports](https://developers.google.com/web/tools/chrome-devtools/device-mode/emulate-mobile-viewports)
43 | * [Emulate Sensors: Geolocation & Accelerometer](https://developers.google.com/web/tools/chrome-devtools/device-mode/device-input-and-sensors)
44 |
45 | #### Elements panel
46 |
47 | View and change the DOM and CSS.
48 |
49 | * [Get Started With Viewing And Changing The DOM](https://developers.google.com/web/tools/chrome-devtools/dom)
50 | * [Get Started With Viewing And Changing CSS](./)
51 | * [Inspect and Tweak Your Pages](https://developers.google.com/web/tools/chrome-devtools/inspect-styles)
52 | * [Edit Styles](./)
53 | * [Edit the DOM](https://developers.google.com/web/tools/chrome-devtools/inspect-styles/edit-dom)
54 | * [Inspect Animations](./)
55 | * [Find Unused CSS](./)
56 |
57 | #### Console panel
58 |
59 | View messages and run JavaScript from the Console.
60 |
61 | * [Get Started With The Console](https://developers.google.com/web/tools/chrome-devtools/console/get-started)
62 | * [Using the Console](./)
63 | * [Interact from Command Line](https://developers.google.com/web/tools/chrome-devtools/console/command-line-reference)
64 | * [Console API Reference](https://developers.google.com/web/tools/chrome-devtools/console/console-reference)
65 |
66 | #### Sources panel
67 |
68 | Debug JavaScript, persist changes made in DevTools across page reloads, save and run snippets of JavaScript, and save changes that you make in DevTools to disk.
69 |
70 | * [Get Started With Debugging JavaScript](https://developers.google.com/web/tools/chrome-devtools/javascript)
71 | * [Pause Your Code With Breakpoints](https://developers.google.com/web/tools/chrome-devtools/javascript/breakpoints)
72 | * [Save Changes to Disk with Workspaces](https://developers.google.com/web/tools/setup/setup-workflow)
73 | * [Run Snippets Of Code From Any Page](https://developers.google.com/web/tools/chrome-devtools/snippets)
74 | * [JavaScript Debugging Reference](https://developers.google.com/web/tools/chrome-devtools/javascript/reference)
75 | * [Persist Changes Across Page Reloads with Local Overrides](https://developers.google.com/web/updates/2018/01/devtools#overrides)
76 | * [Find Unused JavaScript](./)
77 |
78 | #### Network panel
79 |
80 | View and debug network activity.
81 |
82 | * [Get Started](https://developers.google.com/web/tools/chrome-devtools/network-performance)
83 | * [Network Issues Guide](https://developers.google.com/web/tools/chrome-devtools/network-performance/issues)
84 | * [Network Panel Reference](https://developers.google.com/web/tools/chrome-devtools/network-performance/reference)
85 |
86 | #### Performance panel
87 |
88 | Find ways to improve load and runtime performance.
89 |
90 | * [Optimize Website Speed](https://developers.google.com/web/tools/chrome-devtools/speed/get-started)
91 | * [Get Started With Analyzing Runtime Performance](https://developers.google.com/web/tools/chrome-devtools/evaluate-performance)
92 | * [Performance Analysis Reference](https://developers.google.com/web/tools/chrome-devtools/evaluate-performance/reference)
93 | * [Analyze runtime performance](https://developers.google.com/web/tools/chrome-devtools/rendering-tools)
94 | * [Diagnose Forced Synchronous Layouts](https://developers.google.com/web/tools/chrome-devtools/rendering-tools/forced-synchronous-layouts)
95 |
96 | #### Memory panel
97 |
98 | Profile memory usage and track down leaks.
99 |
100 | * [Fix Memory Problems](https://developers.google.com/web/tools/chrome-devtools/memory-problems)
101 | * [JavaScript CPU Profiler](https://developers.google.com/web/tools/chrome-devtools/rendering-tools/js-execution)
102 |
103 | #### Application panel
104 |
105 | Inspect all resources that are loaded, including IndexedDB or Web SQL databases, local and session storage, cookies, Application Cache, images, fonts, and stylesheets.
106 |
107 | * [Debug Progressive Web Apps](https://developers.google.com/web/tools/chrome-devtools/progressive-web-apps)
108 | * [Inspect and Manage Storage, Databases, and Caches](https://developers.google.com/web/tools/chrome-devtools/manage-data/local-storage)
109 | * [Inspect and Delete Cookies](https://developers.google.com/web/tools/chrome-devtools/manage-data/cookies)
110 | * [Inspect Resources](https://developers.google.com/web/tools/chrome-devtools/manage-data/page-resources)
111 |
112 | #### Security panel
113 |
114 | Debug mixed content issues, certificate problems, and more.
115 |
116 | * [Understand Security Issues](https://developers.google.com/web/tools/chrome-devtools/security)
117 |
118 | The best place to file feature requests for Chrome DevTools is the mailing list. The team needs to understand use cases, gauge community interest, and discuss feasibility before implementing any new features.
119 |
120 | ## Additional tips
121 |
122 | ### Comparing raw HTML and rendered HTML
123 |
124 | Easily compare the raw HTML of a page and the rendered code with [this plugin](https://chrome.google.com/webstore/detail/view-rendered-source/ejgngohbdedoabanmclafpkoogegdpob).
125 | It also provides a _diff_ feature to see line by line what has been modified in the DOM by JavaScript.
126 |
127 |
--------------------------------------------------------------------------------
/javascript/chrome-devtools/chrome-seo-without-a-plugin.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: >-
3 | This post tries to cover many SEO items that can be handled via plain vanilla
4 | Google Chrome without any tools
5 | ---
6 |
7 | # Google Chrome SEO Without A Plugin
8 |
9 | There are numerous add-ons for Chrome that handle many SEO data-gathering tasks, but those are not covered here. For those who are unsure about the security of those add-ons, or whether they are mining Litecoin in the background and eating up your CPU cycles, this post is for you. More than anything, though, it is an excuse to play around with Google searches, Chrome Developer Tools, and the Console.
10 |
11 | ## Google Searches
12 |
13 | In this section we cover a few common Google search operators used in SEO. Please let me know what I missed.
14 |
15 | ### Canonical content check
16 |
17 | info:{url}
18 |
19 | ```text
20 | info:https://www.codeseo.io
21 | ```
22 |
23 | ### On-page internal link candidates
24 |
25 | site:{domain} {keyword}
26 |
27 | ```text
28 | site:codeseo.io "googlebot"
29 | ```
30 |
31 |
32 | Bonus Points: This is also the way to see if the correct page is ranking for a particular query.
33 |
34 | ### Pages that haven't moved to https
35 |
36 | site:{domain} -inurl:https
37 |
38 | ```text
39 | site:amazon.com -inurl:https
40 | ```
41 |
42 | ### If your page is cached by Google
43 |
44 | cache:{url}
45 |
46 | ```text
47 | cache:https://codeseo.io
48 | ```
49 |
50 | ### Useful parameters in Google searches
51 |
52 | * **nearby={city}**: Filters search results to nearby city.
53 | * **filter=0**: Remove personalization from Google results.
54 | * **num=100**: Display 100 Google results.
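55 |
56 | These are query-string parameters appended to the Google search results URL, for example:
57 |
58 | ```text
59 | https://www.google.com/search?q=seo+tools&num=100&filter=0
60 | ```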
55 |
56 | Most of the above were provided by [Victor Pan](https://twitter.com/victorpan). Thanks for the idea for this post, Victor. Please follow him on Twitter.
57 |
58 | ## Chrome Developer Tools Console
59 |
60 | In this section we cover using the Console in Developer Tools to get data from Google and your pages in Chrome.
59 |
60 | ### Scrape Google links
61 |
62 | ```text
63 | // $$ returns an array of elements; anchor elements stringify to their href
64 | $$('h3 a').join('\n')
64 | ```
65 |
66 | ### Scrape Google Images
67 |
68 | ```text
69 |
70 | var imgs=$$('a'); var out = [];
71 | for (i in imgs){
72 | if(imgs[i].href.indexOf("/imgres?imgurl=http")>0){
73 | out.push(decodeURIComponent(imgs[i].href).split(/=|%|&/)[1].split("?imgref")[0]);
74 | }
75 | }
76 | out.join('\n')
77 | ```
78 |
79 |
80 | Hat tip to Peter Nikolow. Read more at his blog [here](http://peter.nikolow.me/kak-da-smaknem-kartinki-ot-google-images/) \(Bulgarian\).
81 |
82 | ### Count links on a page
83 |
84 | ```text
85 | $$('a').length
86 | ```
87 |
88 | ### See the page title
89 |
90 | ```text
91 | document.title
92 | ```
93 |
94 |
95 | Bonus Points:
96 |
97 | ```text
98 | document.title.length
99 | ```
100 |
101 | ### See the page description
102 |
103 | ```text
104 | document.all.description.content
105 | ```
106 |
107 |
108 | Bonus Points:
109 |
110 | ```text
111 | document.all.description.content.length
112 | ```
113 |
114 | ### See the robots meta
115 |
116 | ```text
117 | document.all.robots.content
118 | ```
119 |
120 | ### See the canonical
121 |
122 | ```text
123 | $('link[rel="canonical"]').href
124 | ```
125 |
126 | ### Easter eggs in Google Search
127 |
128 | ```text
129 | document.all['easter-egg']
130 | ```
131 |
132 | ### Edit a page live
133 |
134 | ```text
135 | document.designMode = "on"
136 | ```
137 |
138 | ### Get Important N-Grams from Google search results
139 |
140 | ```text
141 |
142 | var stopwords = [
143 | 'about', 'after', 'all', 'also', 'am', 'an', 'and', 'another', 'any', 'are', 'as', 'at', 'be',
144 | 'because', 'been', 'before', 'being', 'between', 'both', 'but', 'by', 'came', 'can',
145 | 'come', 'could', 'did', 'do', 'each', 'for', 'from', 'get', 'got', 'has', 'had',
146 | 'he', 'have', 'her', 'here', 'him', 'himself', 'his', 'how', 'if', 'in', 'into',
147 | 'is', 'it', 'like', 'make', 'many', 'me', 'might', 'more', 'most', 'much', 'must',
148 | 'my', 'never', 'now', 'of', 'on', 'only', 'or', 'other', 'our', 'out', 'over',
149 | 'said', 'same', 'see', 'should', 'since', 'some', 'still', 'such', 'take', 'than',
150 | 'that', 'the', 'their', 'them', 'then', 'there', 'these', 'they', 'this', 'those',
151 | 'through', 'to', 'too', 'under', 'up', 'very', 'was', 'way', 'we', 'well', 'were',
152 | 'what', 'where', 'which', 'while', 'who', 'with', 'would', 'you', 'your', 'a', 'i', 's']
153 |
154 | function nGrams(sentence, limit) {
155 |
156 | ns = [1,2,3,4]; var grams = {};
157 | var words = sentence.replace(/(?:https?|ftp):\/\/[\n\S]+/g, '').toLowerCase().split(/\W+/).filter(function (value) {return stopwords.indexOf(value.toLowerCase()) === -1})
158 | for (n of ns){
159 | var total = words.length - n;
160 | for(var i = 0; i <= total; i++) {
161 | var seq = '';
162 | for (var j = i; j < i + n; j++) { seq += words[j] + ' ';}
163 | if (seq.trim().length < 3) {continue;}else{seq = seq.trim()}
164 | grams[seq] = seq in grams ? grams[seq] + 1 : 1;
165 | }
166 | }
167 | var sort = Object.keys(grams).sort(function(a,b){return grams[b]-grams[a]});
168 | for (s of sort){ if (grams[s] < limit){break;} console.log(s, ':', grams[s]);}
169 | }
170 |
171 | var gtext = document.all.search.innerText
172 | var ng = nGrams(gtext, 3)
173 | ```
174 |
175 | ### Get Google Analytics info
176 |
177 | ```text
178 |
179 | // Note: 'b' is a minified internal property of the tracker object and may change between analytics.js versions
180 | for (const [key, value] of Object.entries(ga.getAll()[0].b.data.values) ) {
180 | if (typeof value === 'string'){
181 | console.log('%s: %s', key.replace(':',''), value);
182 | }
183 | }
184 | ```
185 |
186 |
187 | Bonus Points: Check the hit count for your profiles
188 |
189 | ```text
190 | gaData
191 | ```
192 |
193 | ### See what Google is storing to the google object on search result pages
194 |
195 | ```text
196 |
197 | for (k of Object.keys(google)){
198 | if (typeof google[k] !== 'function') {console.log(k,google[k])}
199 | }
200 | ```
201 |
202 | ### Get load timings from PerformanceTiming
203 |
204 | ```text
205 |
206 | // Log each PerformanceTiming milestone as an offset from navigationStart
207 | var tAll = window.performance.timing; var t0 = tAll['navigationStart'];
208 | for (t in tAll){
209 |     if (typeof tAll[t] === 'number' && (tAll[t] - t0) > 10){
210 |         console.log(t, ':', (tAll[t] - t0)/1000, 'secs')
211 |     }
212 | }
212 | ```
213 |
214 | ## Developer Tools
215 |
216 | This section covers some of the best Developer Tools tabs.
215 |
216 | ### Security
217 |
218 | In Developer Tools, look for the Security tab to ensure your page and its loaded resources are secure.
219 |
220 |
221 | ### Audits
222 |
223 | Use the built-in Lighthouse audits to test your webpage for:
224 |
225 | * Speed
226 | * PWA implementation
227 | * Accessibility
228 | * Best Practices
229 | * SEO
230 |
231 | You have to get the Chrome [canary build](https://www.google.com/chrome/browser/canary.html) for the SEO audit portion to be available through Developer Tools. Otherwise, use the Chrome plugin for Lighthouse.
232 |
233 |
234 | ### Network
235 |
236 | I most often find myself using the Network view for a couple of things. First, it is great for verifying redirect chains, as long as you have "Preserve log" checked.
237 |
238 |
239 | It is also great for looking at time-to-first-byte \(TTFB\), the time it took the server to respond, as well as content download timings for image assets, scripts, etc. There is also a nice filmstrip view in the latest canary version which quickly shows rendering progression over time.
240 |
241 | ### Responsive
242 |
243 | In the responsive view in Developer Tools, it is easy to add additional devices for Googlebot using the user-agent strings found [here](https://support.google.com/webmasters/answer/1061943?hl=en) and the window sizes found [here](https://codeseo.io/console-log-hacking-for-googlebot/). This gives you the ability to browse as Googlebot, which can uncover strange issues where a particular website either serves bots from different servers or handles those sessions differently. Generally not a good thing.
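244 |
245 | For reference, the Googlebot smartphone user-agent string looked like this at the time of writing \(Google updates these periodically, so check the link above for the current versions\):
246 |
247 | ```text
248 | Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
249 | ```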
244 |
245 |
246 | ### Sensors
247 |
248 | In addition to the above, you can use the More Tools > Sensors portion of Developer Tools to set the latitude and longitude your searches should appear to come from in Google. There are many services that give you the lat/lon information, but I generally rely on Google \([Google Maps API](https://maps.googleapis.com/maps/api/geocode/json?address=boston+ma)\). This option has been glitchy for me in the past; it was brought up by [Victor Pan](https://twitter.com/victorpan) and confirmed working by [Dan Hinckley](https://twitter.com/dhinckley).
249 |
250 |
251 | ### Shortcuts: Full Size Screenshots
252 |
253 | This is a great tip from [Anthony Nelson](https://twitter.com/anthonydnelson).
254 |
255 | > One of my fave Chrome tips: on Console tab, hit Command+Shift+P to bring up shortcuts. Then type in "screenshot" and select "Capture Full Size Screenshot" - super easy way to get full sized png of a long page.
256 |
257 | ## More Amazing Chrome SEO Tips
258 |
259 | * Aleyda Solis has a wonderful writeup on Search Engine Land: [Chrome’s DevTools for SEO: 10 ways to use these browser features for your SEO audits](https://searchengineland.com/chromes-devtools-seo-10-ways-use-seo-audits-266433)
260 | * Maria Cieślak goes more in-depth with Chrome Developer Tools: [Overview of Chrome Developer Tools’ Most Useful Options for SEO](https://www.elephate.com/blog/chrome-developer-tools-overview/)
261 |
262 | That's all I have. If you have anything to add, please comment below or hit me up on [Twitter](https://twitter.com/jroakes).
263 |
264 |
--------------------------------------------------------------------------------
/python/helpful-code.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Helpful Python Code Snippets for Technical SEOs.
3 | ---
4 |
5 | # Helpful Code
6 |
7 | ## CTR Curves
8 |
9 | Useful for gathering simple share-of-traffic information from ranking position. Data can be found at: [https://www.advancedwebranking.com/ctrstudy/](https://www.advancedwebranking.com/ctrstudy/). Most major tool providers have individualized CTR curves for keywords or keyword sets.
10 |
11 | ```python
12 | def add_ctr(x):
13 |     """Return the estimated CTR for a ranking position (1-20)."""
14 |     vals = {"1":0.2759,
15 | "2":0.14415,
16 | "3":0.09255,
17 | "4":0.06265,
18 | "5":0.0639,
19 | "6":0.03285,
20 | "7":0.02305,
21 | "8":0.0173,
22 | "9":0.0135,
23 | "10":0.0108,
24 | "11":0.00925,
25 | "12":0.00925,
26 | "13":0.0099,
27 | "14":0.0095,
28 | "15":0.0096,
29 | "16":0.0097,
30 | "17":0.01025,
31 | "18":0.01125,
32 | "19":0.01185,
33 | "20":0.0129}
34 | return vals[str(x)]
35 | ```
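36 |
37 | A quick usage sketch, assuming a hypothetical keyword report with `position` and `search_volume` columns:
38 |
39 | ```python
40 | import pandas as pd
41 |
42 | df = pd.DataFrame({'keyword': ['seo tools', 'crawl budget'],
43 |                    'position': [3, 7],
44 |                    'search_volume': [1000, 5000]})
45 |
46 | # Estimated traffic = CTR at ranking position x search volume
47 | df['est_traffic'] = df['position'].apply(add_ctr) * df['search_volume']
48 | ```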
36 |
37 | ## Random User-Agent Strings
38 |
39 | Helpful for crawling pages that don't like the default Python Requests library User-Agent.
40 |
41 | ```python
42 | import random
43 |
44 | def getUA():
45 |
46 | uastrings = ["Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",\
47 | "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36",\
48 | "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10) AppleWebKit/600.1.25 (KHTML, like Gecko) Version/8.0 Safari/600.1.25",\
49 | "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0",\
50 | "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",\
51 | "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",\
52 | "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.1.17 (KHTML, like Gecko) Version/7.1 Safari/537.85.10",\
53 | "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko",\
54 | "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0",\
55 | "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"\
56 | ]
57 |
58 | return random.choice(uastrings)
59 | ```
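60 |
61 | A usage sketch with the requests library \(example.com is a placeholder\):
62 |
63 | ```python
64 | import requests
65 |
66 | # Send each request with a randomly chosen browser User-Agent
67 | headers = {'User-Agent': getUA()}
68 | response = requests.get('https://www.example.com', headers=headers)
69 | print(response.status_code)
70 | ```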
60 |
61 | ## Pull Title and Description from URL
62 |
63 | Simple example of using the requests and BeautifulSoup libraries to pull the `title` and meta `description` from live URLs.
64 |
65 | ```python
66 | from bs4 import BeautifulSoup
67 | import requests
68 |
69 | # Use beautifulsoup to fetch page titles.
70 | def fetch_meta(url):
71 |     # Set a browser User-Agent, as some servers block the default Python UA
72 |     headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36'}
73 | result = {'title':'', 'description':'', 'error':''}
74 |
75 | try:
76 | #Send a GET request to grab the file
77 | response = requests.get(url, headers=headers, timeout=3)
78 |
79 | #Parse the response
80 |         soup = BeautifulSoup(response.text, 'html.parser')
81 |
82 | #Extract the title
83 | result['title'] = soup.title.string
84 | description_tag = soup.find('meta', attrs={'name':'description'})
85 | if description_tag is not None:
86 | result['description'] = description_tag.get('content')
87 |
88 | except Exception as e:
89 | result['error'] = str(e)
90 |
91 | return result
92 |
93 |
94 | ```
95 |
96 | Example:
97 |
98 | ```text
99 | out = fetch_meta('https://i.codeseo.dev')
100 | out
101 |
102 | [1] {'title': 'Development Resources for Technical SEOs - iCodeSEO',
103 |  'description': 'i.codeseo.dev is a repository of code and reference library for Technical SEOs who are interested in development and data science.',
104 |  'error': ''}
105 | ```
104 |
105 | ## Import Multiple Google SERPs on a Large Scale
106 |
107 | **Setup:**
108 |
109 | 1. [Create a custom search engine](https://cse.google.com/cse/). At first, you might be asked to enter a site to search. Enter any domain, then go to the control panel and remove it. Make sure you enable "Search the entire web" and image search. You will also need to get your search engine ID, which you can find on the control panel page.
110 | 2. [Enable the custom search API](https://console.cloud.google.com/apis/library/customsearch.googleapis.com). The service will allow you to retrieve and display search results from your custom search engine programmatically. You will need to create a project for this first.
111 | 3. [Create credentials for this project](https://console.developers.google.com/apis/api/customsearch.googleapis.com/credentials) so you can get your key.
112 | 4. [Enable billing for your project](https://console.cloud.google.com/billing/projects) if you want to run more than 100 queries per day. The first 100 queries are free; then for each additional 1,000 queries, you pay USD $5.
113 |
114 | ```python
115 | # pip install advertools
116 |
117 | import advertools as adv
118 | cx = 'YOUR_GOOGLE_CUSTOM_SEARCH_ENGINE_ID'
119 | key = 'YOUR_GOOGLE_DEVELOPER_KEY'
120 |
121 | serp = adv.serp_goog(cx=cx, key=key,
122 | q=['first query', 'second query', 'third query'],
123 | gl=['us', 'ca', 'uk'])
124 |
125 | # many other parameters and combinations are available, check the documentation for details
126 | ```
127 |
128 | ## Crawl a Website by Specifying a Sitemap URL
129 |
130 | This spider will crawl the pages specified in one or more XML sitemaps, and extract standard SEO elements \(title, h1, h2, page size, etc.\). The output of this crawl will be saved to a CSV file.
131 |
132 | How to run:
133 |
134 | 1. Modify the `sitemap_urls` attribute by adding one or more sitemap URLs. A sitemap index page is fine, and will go through all sub sitemaps. Or you can specify normal sitemaps.
135 | 2. \(optional\): Modify any elements that you want to scrape \(publishing date, author name, etc.\) through CSS or Xpath selectors.
136 | 3. Save the file with any name e.g. `my_spider.py`
137 | 4. From your terminal, run the following line:
138 |
139 | `scrapy runspider path/to/my_spider.py -o path/to/output/file.csv`
140 |
141 | where `path/to/output/file.csv` is where you want the scrape result to be saved.
142 |
143 | ```python
144 | from scrapy.spiders import SitemapSpider
145 |
146 | user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'
147 |
148 |
149 | class SEOSitemapSpider(SitemapSpider):
150 | name = 'seo_sitemap_spider'
151 | sitemap_urls = [
152 | 'https://www.example.com/sitemap-index.xml', # either will work, regular sitemaps or sitemap index files
153 | 'https://www.example.com/sitemap_1.xml',
154 | 'https://www.example.com/sitemap_2.xml',
155 | ]
156 | custom_settings = {
157 | 'USER_AGENT': user_agent,
158 | 'DOWNLOAD_DELAY': 0, # you might need to make this a 3 or 4 (seconds)
159 | # to be nice to the site's servers and
160 | # to prevent getting banned,
161 | 'ROBOTSTXT_OBEY': True, # SEOs are polite people, right? :)
162 | 'HTTPERROR_ALLOW_ALL': True
163 | }
164 |
165 | def parse(self, response):
166 | yield dict(
167 | url=response.url,
168 | title='@@'.join(response.css('title::text').getall()),
169 | meta_desc=response.xpath("//meta[@name='description']/@content").get(),
170 | h1='@@'.join(response.css('h1::text').getall()),
171 | h2='@@'.join(response.css('h2::text').getall()),
172 | h3='@@'.join(response.css('h3::text').getall()),
173 | body_text='\n'.join(response.css('p::text').getall()),
174 | size=len(response.body),
175 | load_time=response.meta['download_latency'],
176 | status=response.status,
177 | links_href='@@'.join([link.attrib.get('href') or '' for link in response.css('a')]),
178 | links_text='@@'.join([link.attrib.get('title') or '' for link in response.css('a')]),
179 | img_src='@@'.join([im.attrib.get('src') or '' for im in response.css('img')]),
180 | img_alt='@@'.join([im.attrib.get('alt') or '' for im in response.css('img')]),
181 | page_depth=response.meta['depth'],
182 | )
183 | ```
184 |
185 | Items that contain multiple elements, like multiple H2 tags on one page, will be listed as one string separated by two @ signs, e.g. `first h2 tag@@second tag@@third tag`. You simply have to split by "@@" to get a list of elements per page.
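186 |
187 | For example, when post-processing the output CSV with pandas \(a hypothetical sketch\):
188 |
189 | ```python
190 | import pandas as pd
191 |
192 | df = pd.read_csv('path/to/output/file.csv')
193 | # Split the combined h2 column back into a list of headings per page
194 | df['h2_list'] = df['h2'].str.split('@@')
195 | ```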
186 |
187 | ## Text Mining SEO Data - Word Counting
188 |
189 | Many SEO reports typically come as a text list \(page titles, URLs, keywords, snippets, etc.\) together with one or more number lists \(pageviews, bounces, conversions, time on page, etc.\).
190 |
191 | Looking at the top pages with disproportionately higher numbers might be misleading if you have a significant long tail of topics in the text list. How do you uncover hidden insights?
192 | Let's try to do that with the following report:
193 |
194 | ```python
195 | import pandas as pd
196 |
197 | page_titles = [
198 |     'Learn Python',
199 |     'Python Data Visualization',
198 | 'Python Programming for Beginners',
199 | 'Data Science for Marketing People',
200 | 'Data Science for Business People',
201 | 'Python for SEO',
202 | 'SEO Text Analysis'
203 | ]
204 |
205 | pageviews = [200, 150, 400, 300, 670, 120, 340]
206 |
207 |
208 | pd.DataFrame(zip(page_titles, pageviews),
209 | columns=['page_titles', 'pageviews'])
210 | ```
211 |
212 | 
213 |
214 | It's clear which is the most viewed article. But are there any hidden insights about the words, and topics occurring in this report that we can't immediately see?
215 |
216 | ```python
217 | import advertools as adv
218 | adv.word_frequency(page_titles, pageviews)
219 | ```
220 |
221 | 
222 |
223 | By counting the words in the titles we can get a better view. Counting word occurrences on an absolute basis \(simple counting\), we can see that "python" is the top topic, occurring four times.
224 | However, on a weighted-count basis \(taking into consideration the total pageviews generated by titles where the word appears\), "data" is the winning word. Pages containing "data" generated a total of 1,120 pageviews.
225 | And on a per-occurrence basis, the word "business" is the winner, because it generated the most pageviews while appearing in only one page title.
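226 |
227 | For intuition, the weighted count of a word works out to something like this rough sketch \(not advertools' actual implementation\):
228 |
229 | ```python
230 | def weighted_count(word, titles, weights):
231 |     # Sum the weight (pageviews) of every title containing the word
232 |     return sum(w for t, w in zip(titles, weights)
233 |                if word in t.lower().split())
234 |
235 | weighted_count('data', page_titles, pageviews)  # 1120
236 | ```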
226 |
227 | To summarize:
228 |
229 | * The topic that we wrote the most about was "python"
230 | * The topic that generated the most pageviews was "data"
231 | * The relatively most valuable topic was "business"
231 |
232 |
233 |
--------------------------------------------------------------------------------