├── django_check_seo
    ├── conf
    │   ├── __init__.py
    │   └── settings.py
    ├── checks
    │   ├── __init__.py
    │   ├── custom_list.py
    │   └── site.py
    ├── migrations
    │   ├── __init__.py
    │   └── 0001_initial.py
    ├── __init__.py
    ├── locale
    │   └── fr
    │   │   └── LC_MESSAGES
    │   │       └── django.mo
    ├── static
    │   └── django-check-seo
    │   │   ├── logo.png
    │   │   ├── logo-small.png
    │   │   ├── fonts
    │   │       ├── Roboto-Black.ttf
    │   │       ├── Roboto-Bold.ttf
    │   │       ├── Roboto-Light.ttf
    │   │       ├── Roboto-Thin.ttf
    │   │       ├── Roboto-Regular.ttf
    │   │       ├── NOTICE
    │   │       └── LICENCE
    │   │   └── design.css
    ├── apps.py
    ├── checks_list
    │   ├── __init__.py
    │   ├── check_keywords.py
    │   ├── content_words_number.py
    │   ├── launch_checks.py
    │   ├── keyword_present_first_paragraph.py
    │   ├── check_images.py
    │   ├── check_url.py
    │   ├── check_h2.py
    │   ├── check_keyword_url.py
    │   ├── check_h1.py
    │   ├── check_links.py
    │   ├── check_title.py
    │   └── check_description.py
    ├── models.py
    ├── urls.py
    ├── cms_toolbars.py
    ├── templates
    │   └── django_check_seo
    │   │   ├── element.html
    │   │   └── default.html
    └── views.py
├── setup.py
├── .flake8
├── AUTHORS.md
├── pytest.ini
├── .bumpversion.cfg
├── .isort.cfg
├── .gitignore
├── MANIFEST.in
├── .coveragerc
├── .github
    └── ISSUE_TEMPLATE
    │   ├── feature_request.md
    │   └── bug_report.md
├── tests_settings.py
├── .pre-commit-config.yaml
├── setup.cfg
├── CONTRIBUTING.md
├── tests
    ├── test_keywords.py
    ├── test_url.py
    ├── test_images.py
    ├── test_keyword_present_first_paragraph.py
    ├── test_keyword_url.py
    ├── test_h1.py
    ├── test_links.py
    ├── test_h2.py
    ├── test_title.py
    ├── test_description.py
    └── test_content_words_number.py
├── launch_tests.sh
├── README.md
├── .gitchangelog.rc
└── CHANGELOG.rst

/django_check_seo/conf/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/django_check_seo/checks/__init__.py:
--------------------------------------------------------------------------------
1 |
-------------------------------------------------------------------------------- /django_check_seo/migrations/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | # Third party 2 | from setuptools import setup 3 | 4 | setup() 5 | -------------------------------------------------------------------------------- /.flake8: -------------------------------------------------------------------------------- 1 | [flake8] 2 | ignore = E501, E203, W503 3 | exclude = .git,__pycache__,build,dist, 4 | -------------------------------------------------------------------------------- /AUTHORS.md: -------------------------------------------------------------------------------- 1 | # Credits 2 | 3 | * Corentin Bettiol 4 | * Adrien Delhorme 5 | -------------------------------------------------------------------------------- /pytest.ini: -------------------------------------------------------------------------------- 1 | [pytest] 2 | python_files = tests.py test_*.py *_tests.py 3 | DJANGO_SETTINGS_MODULE = tests_settings 4 | -------------------------------------------------------------------------------- /.bumpversion.cfg: -------------------------------------------------------------------------------- 1 | [bumpversion] 2 | current_version = 1.0.1 3 | commit = True 4 | tag = True 5 | 6 | [bumpversion:file:setup.cfg] 7 | -------------------------------------------------------------------------------- /.isort.cfg: -------------------------------------------------------------------------------- 1 | [settings] 2 | multi_line_output=3 3 | include_trailing_comma=True 4 | force_grid_wrap=0 5 | use_parentheses=True 6 | line_length=88 7 | -------------------------------------------------------------------------------- /django_check_seo/__init__.py: 
-------------------------------------------------------------------------------- 1 | import django 2 | 3 | if django.VERSION <= (3, 2): 4 | default_app_config = "django_check_seo.apps.DjangoCheckSEOConfig" 5 | -------------------------------------------------------------------------------- /django_check_seo/locale/fr/LC_MESSAGES/django.mo: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kapt-labs/django-check-seo/HEAD/django_check_seo/locale/fr/LC_MESSAGES/django.mo -------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kapt-labs/django-check-seo/HEAD/django_check_seo/static/django-check-seo/logo.png -------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/logo-small.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kapt-labs/django-check-seo/HEAD/django_check_seo/static/django-check-seo/logo-small.png -------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/fonts/Roboto-Black.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kapt-labs/django-check-seo/HEAD/django_check_seo/static/django-check-seo/fonts/Roboto-Black.ttf -------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/fonts/Roboto-Bold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kapt-labs/django-check-seo/HEAD/django_check_seo/static/django-check-seo/fonts/Roboto-Bold.ttf 
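A note on the version guard in `django_check_seo/__init__.py`: `django.VERSION` is a 5-tuple such as `(3, 2, 0, "final", 0)`, and Python compares tuples element by element, treating the longer tuple as greater when the shared prefix is equal. The guard `django.VERSION <= (3, 2)` is therefore true only for releases before 3.2, which are the ones that still need `default_app_config` (Django 3.2 and later discover the `AppConfig` automatically). The sketch below illustrates this with hand-written tuples so it runs without Django installed; the helper name `needs_default_app_config` is hypothetical and not part of the package.

```python
# Sketch only: mirrors the comparison used in django_check_seo/__init__.py.
# The version tuples are written by hand, following the shape of django.VERSION.


def needs_default_app_config(version):
    """Return True when `version <= (3, 2)`, i.e. for releases before 3.2."""
    return version <= (3, 2)


# 3.1.x: the second element (1) is smaller than 2, so the guard is true.
print(needs_default_app_config((3, 1, 14, "final", 0)))  # True

# 3.2.0: the shared prefix (3, 2) is equal and the 5-tuple is longer,
# so it compares as greater than (3, 2) and the guard is false.
print(needs_default_app_config((3, 2, 0, "final", 0)))  # False

# 4.2.9: the first element already decides the comparison.
print(needs_default_app_config((4, 2, 9, "final", 0)))  # False
```

Writing the guard as `django.VERSION < (3, 2)` would behave the same for real version tuples and states the intent more directly.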
-------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/fonts/Roboto-Light.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kapt-labs/django-check-seo/HEAD/django_check_seo/static/django-check-seo/fonts/Roboto-Light.ttf -------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/fonts/Roboto-Thin.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kapt-labs/django-check-seo/HEAD/django_check_seo/static/django-check-seo/fonts/Roboto-Thin.ttf -------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/fonts/Roboto-Regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kapt-labs/django-check-seo/HEAD/django_check_seo/static/django-check-seo/fonts/Roboto-Regular.ttf -------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/fonts/NOTICE: -------------------------------------------------------------------------------- 1 | I don't really know what to include here, see why: https://github.com/google/roboto/issues/247 2 | You can find a LICENCE file in this folder. 
3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | Makefile 2 | build/ 3 | dist/ 4 | django_check_seo.egg-info/ 5 | .venv/ 6 | Pipfile 7 | Pipfile.lock 8 | .vscode/ 9 | *.pyc 10 | venv2 11 | venv3 12 | .coverage 13 | .python-version 14 | -------------------------------------------------------------------------------- /django_check_seo/apps.py: -------------------------------------------------------------------------------- 1 | # -*- coding:utf-8 -*- 2 | 3 | # Third party 4 | from django.apps import AppConfig 5 | 6 | 7 | class DjangoCheckSEOConfig(AppConfig): 8 | name = "django_check_seo" 9 | verbose_name = "Django Check SEO" 10 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include LICENSE 2 | include README.md 3 | include CONTRIBUTING.md 4 | include AUTHORS.md 5 | recursive-include django_check_seo/static/ * 6 | recursive-include django_check_seo/templates/ *.html 7 | recursive-include django_check_seo/locale/ * 8 | -------------------------------------------------------------------------------- /django_check_seo/checks/custom_list.py: -------------------------------------------------------------------------------- 1 | class CustomList: 2 | def __init__(self, *args, **kwargs): 3 | self.name = kwargs.get("name", None) 4 | self.settings = kwargs.get("settings", None) 5 | self.found = kwargs.get("found", None) 6 | self.searched_in = kwargs.get("searched_in", []) 7 | self.description = kwargs.get("description", None) 8 | -------------------------------------------------------------------------------- /.coveragerc: -------------------------------------------------------------------------------- 1 | [run] 2 | omit = django_check_seo/checks/* 3 | django_check_seo/conf/* 4 | */__init__.py 5 | 
django_check_seo/urls.py 6 | django_check_seo/urls.py 7 | django_check_seo/views.py 8 | django_check_seo/check_list/launch_checks.py 9 | django_check_seo/apps.py 10 | django_check_seo/cms_toolbars.py 11 | django_check_seo/checks_list/launch_checks.py 12 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/__init__.py: -------------------------------------------------------------------------------- 1 | # Standard Library 2 | import glob 3 | from os.path import basename, dirname, isfile, join 4 | 5 | # list files 6 | modules = glob.glob(join(dirname(__file__), "*.py")) 7 | 8 | __all__ = [] 9 | 10 | # add them to __all__ so they can be imported 11 | for module in modules: 12 | if ( 13 | isfile(module) 14 | and not module.endswith("__init__.py") 15 | and not module.endswith("launch_checks.py") 16 | ): 17 | __all__.append(basename(module)[:-3]) 18 | -------------------------------------------------------------------------------- /django_check_seo/models.py: -------------------------------------------------------------------------------- 1 | from django.db import models 2 | 3 | try: 4 | from django.utils.translation import ugettext_lazy as _ 5 | except ImportError: 6 | from django.utils.translation import gettext_lazy as _ 7 | 8 | 9 | class DjangoCheckSEOPermissions(models.Model): 10 | class Meta: 11 | managed = False 12 | default_permissions = () 13 | permissions = ( 14 | ( 15 | "use_django_check_seo", 16 | _( 17 | "View the Check SEO button (if using Django CMS) and use Django Check SEO" 18 | ), 19 | ), 20 | ) 21 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? 
Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 21 | -------------------------------------------------------------------------------- /django_check_seo/urls.py: -------------------------------------------------------------------------------- 1 | # Third party 2 | # hacky trick to add python2 compatibility to a python3 project after python2 eol 3 | import django 4 | from django.contrib.admin.views.decorators import staff_member_required 5 | 6 | # Local application / specific library imports 7 | from . import views 8 | 9 | version = django.get_version() 10 | if int(version[0]) > 1: 11 | from django.urls import path 12 | 13 | urlpatterns = [ 14 | path("", staff_member_required(views.IndexView.as_view()), name="Index") 15 | ] 16 | 17 | else: 18 | from django.conf.urls import url 19 | 20 | urlpatterns = [ 21 | url("^.*$", staff_member_required(views.IndexView.as_view()), name="Index"), 22 | ] 23 | -------------------------------------------------------------------------------- /tests_settings.py: -------------------------------------------------------------------------------- 1 | DATABASES = { 2 | "default": { 3 | "ENGINE": "django.db.backends.sqlite3", 4 | } 5 | } 6 | 7 | INSTALLED_APPS = ( 8 | "django.contrib.auth", 9 | "django.contrib.contenttypes", 10 | "django.contrib.sessions", 11 | "django.contrib.admin", 12 | "django.contrib.sites", 13 | "django.contrib.sitemaps", 14 | "django.contrib.staticfiles", 15 | "django.contrib.messages", 16 | ) 17 | 18 | ROOT_URLCONF = "base_urls" 19 | 20 | TEMPLATES = [ 21 | { 22 | "BACKEND": 
"django.template.backends.django.DjangoTemplates", 23 | "OPTIONS": { 24 | "context_processors": [], 25 | "loaders": [], 26 | }, 27 | }, 28 | ] 29 | 30 | MIDDLEWARE = [] 31 | 32 | SECRET_KEY = "a_key" 33 | -------------------------------------------------------------------------------- /django_check_seo/cms_toolbars.py: -------------------------------------------------------------------------------- 1 | # see https://github.com/kapt-labs/django-check-seo/wiki/Toolbar-shortcut#cms_toolbarspy 2 | 3 | # Third party 4 | from cms.toolbar_base import CMSToolbar 5 | from cms.toolbar_pool import toolbar_pool 6 | 7 | try: 8 | from django.utils.translation import ugettext_lazy as _ 9 | except ImportError: 10 | from django.utils.translation import gettext_lazy as _ 11 | 12 | 13 | class DjangoSeoToolbar(CMSToolbar): 14 | def populate(self): 15 | 16 | self.toolbar.add_sideframe_item( 17 | _("Check SEO"), # text 18 | "/django-check-seo/?page=" 19 | + self.request.path, # url (+ current page passed as a GET parameter) 20 | ) 21 | 22 | 23 | # register the toolbar 24 | toolbar_pool.register(DjangoSeoToolbar) 25 | -------------------------------------------------------------------------------- /django_check_seo/templates/django_check_seo/element.html: -------------------------------------------------------------------------------- 1 | {% load i18n %} 2 | {% autoescape off %} 3 |
4 |
5 | {{ element.name }} ({% trans "settings" %}: {% autoescape off %}{{ element.settings }}, {% trans "found" context "checkseo" %}: {{ element.found }})
6 |
7 |
8 | {{ element.description }}{% endautoescape %}
9 |
10 |
11 | {% trans "Searched in:" %}
12 | {% for searched_in in element.searched_in %}
13 | {{ searched_in }}
14 | {% empty %}
15 | {% trans "No data" %}
16 | {% endfor %}
17 |
18 |
19 |
20 |
  • 21 | {% endautoescape %} 22 | -------------------------------------------------------------------------------- /django_check_seo/conf/settings.py: -------------------------------------------------------------------------------- 1 | # Third party 2 | from django.conf import settings 3 | 4 | # define basic SEO settings, see 5 | DJANGO_CHECK_SEO_SETTINGS = { 6 | "content_words_number": [300, 600], 7 | "internal_links": 1, 8 | "external_links": 1, 9 | "meta_title_length": [30, 60], 10 | "meta_description_length": [50, 160], 11 | "keywords_in_first_words": 50, 12 | "max_link_depth": 4, 13 | "max_url_length": 70, 14 | } 15 | # update settings redefined in projectname/settings.py 16 | DJANGO_CHECK_SEO_SETTINGS.update(getattr(settings, "DJANGO_CHECK_SEO_SETTINGS", {})) 17 | 18 | # define css selector to search content into (used for retrieving main content of the page) 19 | DJANGO_CHECK_SEO_SEARCH_IN = { 20 | "type": "exclude", 21 | "selectors": ["header", ".cover-section", "#footer"], 22 | } 23 | 24 | # 25 | DJANGO_CHECK_SEO_EXCLUDE_CONTENT = getattr( 26 | settings, "DJANGO_CHECK_SEO_EXCLUDE_CONTENT", "" 27 | ) 28 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Screenshots** 24 | If applicable, add screenshots to help explain your problem. 25 | 26 | **Desktop (please complete the following information):** 27 | - OS: [e.g. 
iOS] 28 | - Browser [e.g. chrome, safari] 29 | - Version [e.g. 22] 30 | 31 | **Smartphone (please complete the following information):** 32 | - Device: [e.g. iPhone6] 33 | - OS: [e.g. iOS8.1] 34 | - Browser [e.g. stock browser, safari] 35 | - Version [e.g. 22] 36 | 37 | **Additional context** 38 | Add any other context about the problem here. 39 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | repos: 2 | - repo: https://github.com/ambv/black 3 | rev: 23.11.0 4 | hooks: 5 | - id: black 6 | files: ^.*\.py$ 7 | 8 | - repo: https://github.com/pre-commit/pre-commit-hooks 9 | rev: v4.5.0 10 | hooks: 11 | - id: check-added-large-files 12 | - id: check-merge-conflict 13 | - id: debug-statements 14 | 15 | - repo: https://github.com/PyCQA/isort 16 | rev: 5.12.0 17 | hooks: 18 | - id: isort 19 | files: ^.*\.py$ 20 | args: ["--profile", "black"] 21 | 22 | - repo: https://github.com/PyCQA/flake8 23 | rev: 6.1.0 24 | hooks: 25 | - id: flake8 26 | files: ^.*\.py$ 27 | 28 | - repo: https://github.com/rtts/djhtml 29 | rev: 3.0.6 30 | hooks: 31 | - id: djhtml 32 | args: [--tabwidth=2] 33 | files: ^.*\.html$ 34 | - id: djjs 35 | - id: djcss 36 | 37 | - repo: https://gitlab.com/kapt/open-source/git-hooks 38 | rev: v1.2.0 39 | hooks: 40 | - id: commit-msg 41 | always_run: true 42 | 43 | -------------------------------------------------------------------------------- /django_check_seo/migrations/0001_initial.py: -------------------------------------------------------------------------------- 1 | # Generated by Django 4.2.9 on 2024-01-25 12:38 2 | 3 | from django.db import migrations, models 4 | 5 | 6 | class Migration(migrations.Migration): 7 | initial = True 8 | 9 | dependencies = [] 10 | 11 | operations = [ 12 | migrations.CreateModel( 13 | name="DjangoCheckSEOPermissions", 14 | fields=[ 15 | ( 16 | "id", 17 | models.AutoField( 18 | 
                         auto_created=True,
19 |                         primary_key=True,
20 |                         serialize=False,
21 |                         verbose_name="ID",
22 |                     ),
23 |                 ),
24 |             ],
25 |             options={
26 |                 "permissions": (
27 |                     (
28 |                         "use_django_check_seo",
29 |                         "View the Check SEO button (if using Django CMS) and use Django Check SEO",
30 |                     ),
31 |                 ),
32 |                 "managed": False,
33 |                 "default_permissions": (),
34 |             },
35 |         ),
36 |     ]
37 |
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [metadata]
2 | name = django-check-seo
3 | version = 1.0.1
4 | description = Django Check SEO will check the SEO aspects of your site for you, and will provide advice in case of problems.
5 | long_description = file: README.md
6 | long_description_content_type = text/markdown
7 | url = https://github.com/kapt-labs/django-check-seo
8 | author = Dev Kapt
9 | author_email = dev@kapt.mobi
10 | classifiers =
11 |     Environment :: Web Environment
12 |     Framework :: Django
13 |     Intended Audience :: Developers
14 |     License :: OSI Approved :: GNU General Public License v3 (GPLv3)
15 |     Programming Language :: Python
16 |     Programming Language :: Python :: 2
17 |     Programming Language :: Python :: 2.7
18 |     Programming Language :: Python :: 3
19 |     Programming Language :: Python :: 3.7
20 |     Programming Language :: Python :: 3.8
21 |     Topic :: Internet :: WWW/HTTP
22 |     Topic :: Internet :: WWW/HTTP :: Dynamic Content
23 |
24 | [options]
25 | include_package_data = true
26 | packages = find:
27 | install_requires =
28 |     djangocms-page-meta
29 |     requests
30 |     beautifulsoup4
31 |     lxml
32 |     unidecode
33 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 |
3 | How can you contribute to this project?
4 |
5 | ## Add issue
6 |
7 | You can help by finding bugs and reporting them as [new issues](https://github.com/kapt-labs/django-check-seo/issues).
8 |
9 | ## Add code
10 |
11 | You can help by picking an issue, or choosing to add a new test/feature (create an issue before you start coding).
12 |
13 | 0. Create a new issue, receive positive feedback.
14 |
15 | 1. Fork the repo, clone it.
16 |
17 | 2. Install pre-commit and the unit test dependencies.
18 |     ```bash
19 |     python3 -m pip install pre-commit python3-venv
20 |     python2 -m pip install virtualenv
21 |     ```
22 |
23 | 3. Install pre-commit hooks.
24 |     ```bash
25 |     pre-commit install
26 |     ```
27 |
28 | 4. Create a new branch.
29 |     ```bash
30 |     git checkout -b mybranch
31 |     ```
32 |
33 | 5. Add your code.
34 |
35 | 6. (*Optional*) Add tests.
36 |
37 | 7. Add yourself to [AUTHORS.md](AUTHORS.md).
38 |
39 | 8. Commit, push.
40 |     *Make sure that pre-commit runs isort, black, flake8 & `launch_tests.sh`. [Example](https://github.com/kapt-labs/django-check-seo/commit/da1d0be5d3ebe6734585cd5dd7027186d432ccd0#commitcomment-38147459).*
41 |
42 | 9. Create a [Pull Request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request).
43 |
44 | 10. ![That's all folks!](https://i.imgur.com/o2Tcd2E.png)
45 |
46 | ----
47 |
48 | ### Commit description guidelines
49 |
50 | We're using bluejava's [git-commit-guide](https://github.com/bluejava/git-commit-guide) for our commit descriptions.
Here's a quick reference:
51 |
52 | ![Reference git-commit-guide](https://raw.githubusercontent.com/bluejava/git-commit-guide/master/gitCommitMsgGuideQuickReference.png)
53 |
--------------------------------------------------------------------------------
/django_check_seo/checks/site.py:
--------------------------------------------------------------------------------
1 | # Standard Library
2 | import re
3 |
4 | # Local application / specific library imports
5 | from ..conf import settings
6 |
7 |
8 | class Site:
9 |     """Structure containing a good amount of resources from the targeted webpage:
10 |     - the settings
11 |     - the soup (from beautifulsoup)
12 |     - the content (all html except header & menu)
13 |     - the full url
14 |     - the keywords
15 |     - the problems & warnings
16 |     """
17 |
18 |     def __init__(self, soup, full_url):
19 |         """Populate some vars.
20 |
21 |         Arguments:
22 |             soup {bs4.element} -- beautiful soup content (html)
23 |             full_url {str} -- full url
24 |         """
25 |         self.settings = settings
26 |
27 |         self.soup = soup
28 |
29 |         # take all content
30 |         self.content = self.soup.find_all("body")
31 |
32 |         # if we have settings to remove unwanted blocks
33 |         if settings.DJANGO_CHECK_SEO_EXCLUDE_CONTENT != "":
34 |             # iterate through each body (should be only 1)
35 |             for body in self.content:
36 |                 # and remove selected blocks
37 |                 for node in body.select(settings.DJANGO_CHECK_SEO_EXCLUDE_CONTENT):
38 |                     node.extract()
39 |
40 |         # get content without doublewords thx to custom separator ("<h1>Title</h1><p>Content</p>
    " -> TitleContent) 41 | self.content_text = "" 42 | for c in self.content: 43 | self.content_text += c.get_text(separator=" ") 44 | 45 | # strip multiple carriage return (with optional space) to only one 46 | self.content_text = re.sub(r"(\n( ?))+", "\n", self.content_text) 47 | # strip multiples spaces (>3) to only 2 (for title readability) 48 | self.content_text = re.sub(r" +", " ", self.content_text) 49 | 50 | self.full_url = full_url 51 | self.keywords = [] 52 | self.problems = [] 53 | self.warnings = [] 54 | self.success = [] 55 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/check_keywords.py: -------------------------------------------------------------------------------- 1 | # Third party 2 | from django.utils.translation import gettext as _ 3 | from django.utils.translation import pgettext 4 | 5 | # Local application / specific library imports 6 | from ..checks import custom_list 7 | 8 | 9 | def importance(): 10 | """Scripts with higher importance will be executed first. 11 | 12 | Returns: 13 | int -- Importance of the script. 14 | """ 15 | return 5 16 | 17 | 18 | def run(site): 19 | """Checks that meta tag exists and contain at least one keyword. 20 | Populate site.keywords list with keywords found. 21 | 22 | Arguments: 23 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 24 | """ 25 | 26 | no_keywords = custom_list.CustomList( 27 | name=_("No keywords in meta keywords field"), 28 | settings=pgettext("masculin", "at least one"), 29 | found=pgettext("masculin", "none"), 30 | description=_( 31 | "Django-check-seo uses the keywords in the meta keywords field to check all other tests related to the keywords. A series of problems and warnings are related to keywords, and will therefore systematically be activated if the keywords are not filled in." 
32 | ), 33 | ) 34 | 35 | keywords_found = custom_list.CustomList( 36 | name=_("Keywords found in meta keywords field"), 37 | settings=pgettext("masculin", "at least one"), 38 | found="", 39 | description=no_keywords.description, 40 | ) 41 | 42 | meta = site.soup.find_all("meta") 43 | for tag in meta: 44 | if ( 45 | "name" in tag.attrs 46 | and tag.attrs["name"] == "keywords" 47 | and "content" in tag.attrs 48 | and tag.attrs["content"] != "" 49 | ): 50 | # get keywords for next checks 51 | site.keywords = list(map(str.strip, tag.attrs["content"].split(","))) 52 | 53 | keywords_found.found = len(site.keywords) 54 | keywords_found.searched_in = site.keywords 55 | site.success.append(keywords_found) 56 | 57 | return 58 | 59 | site.problems.append(no_keywords) 60 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/content_words_number.py: -------------------------------------------------------------------------------- 1 | # Third party 2 | from django.utils.translation import gettext as _ 3 | 4 | # Local application / specific library imports 5 | from ..checks import custom_list 6 | 7 | 8 | def importance(): 9 | """Scripts with higher importance will be executed first. 10 | 11 | Returns: 12 | int -- Importance of the script. 13 | """ 14 | return 1 15 | 16 | 17 | def run(site): 18 | """Checks the number of words in the extracted content. 19 | 20 | Arguments: 21 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 22 | """ 23 | 24 | short_content = custom_list.CustomList( 25 | name=_("Content is too short"), 26 | settings=_("at least {min} words, more than {min2} if possible").format( 27 | min=site.settings.DJANGO_CHECK_SEO_SETTINGS["content_words_number"][0], 28 | min2=site.settings.DJANGO_CHECK_SEO_SETTINGS["content_words_number"][1], 29 | ), 30 | description=_( 31 | "Longer articles tend to be better ranked, but will require better writing skills than shorter articles." 
32 | ), 33 | ) 34 | 35 | nb_words = len(site.content_text.split()) 36 | short_content.found = nb_words 37 | short_content.searched_in = [site.content_text] 38 | 39 | # too few words 40 | if nb_words < site.settings.DJANGO_CHECK_SEO_SETTINGS["content_words_number"][0]: 41 | site.problems.append(short_content) 42 | 43 | elif nb_words < site.settings.DJANGO_CHECK_SEO_SETTINGS["content_words_number"][1]: 44 | site.warnings.append(short_content) 45 | 46 | else: 47 | short_content.name = _("Content length is right") 48 | short_content.searched_in = [ 49 | " ".join( 50 | site.content_text.split()[ 51 | : site.settings.DJANGO_CHECK_SEO_SETTINGS["content_words_number"][1] 52 | ] 53 | ) 54 | + '' 55 | + " ".join( 56 | site.content_text.split()[ 57 | site.settings.DJANGO_CHECK_SEO_SETTINGS["content_words_number"][1] : 58 | ] 59 | ) 60 | + "" 61 | ] 62 | site.success.append(short_content) 63 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/launch_checks.py: -------------------------------------------------------------------------------- 1 | # Standard Library 2 | import importlib 3 | import sys 4 | 5 | from . import * # noqa: F403,F401 6 | 7 | # hacky thing aiming to add python2 compatibility after eol of python2 8 | try: 9 | ModuleNotFoundError 10 | except NameError: 11 | ModuleNotFoundError = ImportError 12 | 13 | try: 14 | from checks_list import * # noqa: F403,F401 15 | 16 | nomodules = False 17 | except ModuleNotFoundError: 18 | nomodules = True 19 | 20 | 21 | def launch_checks(site): 22 | """All the checks are performed here. Called in get_context_data(). 23 | All functions should do its test(s), then add a dict in site.problems or site.warnings. 24 | 25 | Arguments: 26 | site {Site} -- A set of useful vars that can be used by the functions (including problems & warnings, two lists of dict). 
27 | """ 28 | 29 | modules_order = [] 30 | 31 | # hacky trick to add python2 compatibility to a python3 project after python2 eol 32 | python_2_compatibility_array = [ 33 | "django_check_seo.checks_list.launch_checks", 34 | "django_check_seo.checks_list.glob", 35 | "django_check_seo.checks_list.re", 36 | "django_check_seo.checks_list.bs4", 37 | "django_check_seo.checks_list.sys", 38 | "django_check_seo.checks_list.os", 39 | "django_check_seo.checks_list.importlib", 40 | "django_check_seo.checks_list.urlparse", 41 | "django_check_seo.checks_list.django", 42 | "django_check_seo.checks_list.unidecode", 43 | "django_check_seo.checks_list.__future__", 44 | ] 45 | 46 | # only get modules in ...checks.* 47 | for module_name in sys.modules: 48 | if ( 49 | "django_check_seo.checks_list." in module_name 50 | and module_name not in python_2_compatibility_array 51 | ) or (module_name.startswith("checks_list.")): 52 | module = importlib.import_module(module_name) 53 | get_module_order = getattr(module, "importance") 54 | 55 | # get the importance 56 | modules_order.append([module, get_module_order()]) 57 | 58 | # execute modules with higher importance first from sorted list 59 | for module in sorted(modules_order, key=lambda x: x[1], reverse=True): 60 | getattr(module[0], "run")(site) 61 | -------------------------------------------------------------------------------- /tests/test_keywords.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Use ./launch_tests.sh to launch these tests. 
4 | 5 | from bs4 import BeautifulSoup 6 | from django_check_seo.checks import site 7 | 8 | html_content = """ 9 | 10 | 11 | 12 | 13 | 14 | 15 | """ 16 | 17 | 18 | class init: 19 | def __init__(self): 20 | self.keywords = [] 21 | self.problems = [] 22 | self.warnings = [] 23 | self.success = [] 24 | 25 | self.soup = BeautifulSoup(html_content, features="lxml") 26 | self.full_url = "https://localhost/fake-url/title-of-the-page/" 27 | # populate class with data 28 | self.page_stats = site.Site(self.soup, self.full_url) 29 | 30 | 31 | def test_keyword_importance(): 32 | from django_check_seo.checks_list import check_keywords 33 | 34 | assert check_keywords.importance() == 5 35 | 36 | 37 | def test_keyword_kw(): 38 | from django_check_seo.checks_list import check_keywords 39 | 40 | site = init() 41 | 42 | check_keywords.run(site) 43 | 44 | for success in site.success: 45 | if success.name == "Keywords found in meta keywords field": 46 | assert success.name == "Keywords found in meta keywords field" 47 | assert success.settings == "at least one" 48 | assert success.found == 2 49 | assert success.searched_in == ["description", "title"] 50 | assert ( 51 | success.description 52 | == "Django-check-seo uses the keywords in the meta keywords field to check all other tests related to the keywords. A series of problems and warnings are related to keywords, and will therefore systematically be activated if the keywords are not filled in." 
53 | ) 54 | 55 | 56 | def test_keyword_nokw(): 57 | from django_check_seo.checks_list import check_keywords 58 | 59 | site = init() 60 | site.soup.select('meta[name="keywords"]')[0]["content"] = "" 61 | 62 | check_keywords.run(site) 63 | 64 | for problem in site.problems: 65 | if problem.name == "No keywords in meta keywords field": 66 | assert problem.name == "No keywords in meta keywords field" 67 | assert problem.settings == "at least one" 68 | assert problem.found == "none" 69 | assert problem.searched_in == [] 70 | assert ( 71 | problem.description 72 | == "Django-check-seo uses the keywords in the meta keywords field to check all other tests related to the keywords. A series of problems and warnings are related to keywords, and will therefore systematically be activated if the keywords are not filled in." 73 | ) 74 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/keyword_present_first_paragraph.py: -------------------------------------------------------------------------------- 1 | import re 2 | 3 | # Third party 4 | from django.utils.translation import gettext as _ 5 | from django.utils.translation import pgettext 6 | 7 | # Local application / specific library imports 8 | from ..checks import custom_list 9 | 10 | 11 | def importance(): 12 | """Scripts with higher importance will be executed first. 13 | 14 | Returns: 15 | int -- Importance of the script. 16 | """ 17 | return 1 18 | 19 | 20 | def run(site): 21 | """Checks that at least one keyword is included in the first paragraph. 22 | 23 | Arguments: 24 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 
25 | """ 26 | 27 | no_keywords = custom_list.CustomList( 28 | name=_("No keyword in first paragraph"), 29 | settings=_("before {settings} words").format( 30 | settings=site.settings.DJANGO_CHECK_SEO_SETTINGS["keywords_in_first_words"] 31 | ), 32 | found="", 33 | description=_( 34 | "The reader will be relieved to find one of his keywords in the first paragraph of your page, and the same logic applies to Google, which will consider the content more relevant." 35 | ), 36 | ) 37 | 38 | first_words = site.content_text.lower().split()[ 39 | : site.settings.DJANGO_CHECK_SEO_SETTINGS["keywords_in_first_words"] 40 | ] 41 | # check text and not list of words in order to find keywords that looks like "this is a keyword" in text "words this is a keyword words" 42 | first_words = " ".join(first_words) 43 | first_words_text = first_words.lower() 44 | 45 | occurrence = [] 46 | first_words_kw = [] 47 | 48 | for keyword in site.keywords: 49 | keyword = keyword.lower() 50 | nb_occurrences = len( 51 | re.findall( 52 | r"(^| |\n|,|\.|!|\?)" + keyword + r"s?($| |\n|,|\.|!|\?)", 53 | first_words, 54 | ) 55 | ) 56 | occurrence.append(nb_occurrences) 57 | 58 | if nb_occurrences > 0: 59 | first_words_text = first_words_text.replace( 60 | keyword, '' + keyword + "" 61 | ) 62 | if no_keywords.found != "": 63 | no_keywords.found += ", " 64 | no_keywords.found += keyword 65 | first_words_kw.append(first_words_text) 66 | 67 | no_keywords.searched_in = first_words_kw 68 | 69 | # no keyword was found in first paragraph 70 | if not any(i > 0 for i in occurrence): 71 | no_keywords.found = pgettext("masculin", "none") 72 | site.problems.append(no_keywords) 73 | 74 | else: 75 | no_keywords.name = _("Keywords found in first paragraph") 76 | site.success.append(no_keywords) 77 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/check_images.py: -------------------------------------------------------------------------------- 1 | # Third party 
2 | import bs4 3 | from django.utils.translation import gettext as _ 4 | 5 | # Local application / specific library imports 6 | from ..checks import custom_list 7 | 8 | 9 | def importance(): 10 | """Scripts with higher importance will be executed first. 11 | 12 | Returns: 13 | int -- Importance of the script. 14 | """ 15 | return 1 16 | 17 | 18 | def run(site): 19 | """Checks that each image has a filled alt tag. 20 | 21 | Arguments: 22 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 23 | """ 24 | 25 | lack_alt = custom_list.CustomList( 26 | name=_("Images lack alt tag"), 27 | settings=_("all images"), 28 | found="", 29 | description=_( 30 | 'Your images should have an alt tag, because it improves accessibility for visually impaired people.
    But "sometimes there is non-text content that really is not meant to be seen or understood by the user" (WCAG). For this kind of non-text content, you can leave your alt tag empty.
    The name of the file is important too, because it helps Google understand what your image is about. For example, you could rename a file named "IMG0001.jpg" to "tree_with_a_bird.jpg".' 31 | ), 32 | ) 33 | 34 | enough_alt = custom_list.CustomList( 35 | name=_("Images have alt tag"), 36 | settings=_("all images"), 37 | found="", 38 | description=lack_alt.description, 39 | ) 40 | 41 | images = bs4.element.ResultSet(None) 42 | 43 | for c in site.content: 44 | images += c.find_all("img") 45 | 46 | warning = 0 47 | imgs = [] 48 | 49 | for image in images: 50 | img_str = image.attrs["src"].split("/")[-1] 51 | if ( 52 | "alt" not in image.attrs 53 | or image.attrs["alt"] == "None" 54 | or image.attrs["alt"] == "" 55 | ): 56 | warning += 1 57 | 58 | # bold without alt tag content 59 | if image.attrs["src"] != "": 60 | imgs.append( 61 | '' 64 | + img_str 65 | + "" 66 | ) 67 | # bold without alt tag content & without src content (dead img ?) 68 | else: 69 | imgs.append("unknown image") 70 | 71 | # normal with alt tag content 72 | else: 73 | imgs.append( 74 | '' 77 | + img_str 78 | + " (" 79 | + image.attrs["alt"] 80 | + ")" 81 | ) 82 | 83 | if warning > 0: 84 | lack_alt.found = warning 85 | lack_alt.searched_in = imgs 86 | site.warnings.append(lack_alt) 87 | else: 88 | if len(images) > 0: 89 | enough_alt.found = len(images) 90 | enough_alt.searched_in = imgs 91 | site.success.append(enough_alt) 92 | -------------------------------------------------------------------------------- /django_check_seo/views.py: -------------------------------------------------------------------------------- 1 | import json 2 | 3 | from bs4 import BeautifulSoup 4 | from django.contrib.auth.mixins import PermissionRequiredMixin 5 | from django.test import Client 6 | from django.utils.translation import gettext as _ 7 | from django.utils.translation import ngettext 8 | from django.views import generic 9 | 10 | from .checks import site 11 | from .checks_list import launch_checks 12 | from .conf 
import settings 13 | 14 | 15 | class IndexView(PermissionRequiredMixin, generic.base.TemplateView): 16 | template_name = "django_check_seo/default.html" 17 | permission_required = "django_check_seo.use_django_check_seo" 18 | 19 | def get_context_data(self, *args, **kwargs): 20 | context = super(IndexView, self).get_context_data( 21 | *args, **kwargs 22 | ) 23 | 24 | client = Client() 25 | page = self.request.GET.get("page") 26 | response = client.get(page, follow=True) 27 | 28 | soup = BeautifulSoup(response.content, features="lxml") 29 | 30 | # populate class with data 31 | page_stats = site.Site(soup, page) 32 | 33 | # magic happens here! 34 | launch_checks.launch_checks(page_stats) 35 | 36 | # end of magic, get collected problems/warnings/success and put them inside the context 37 | (context["problems"], context["warnings"], context["success"]) = ( 38 | page_stats.problems, 39 | page_stats.warnings, 40 | page_stats.success, 41 | ) 42 | 43 | context["settings"] = json.dumps(settings.DJANGO_CHECK_SEO_SETTINGS, indent=4) 44 | context["html"] = page_stats.content 45 | context["text"] = page_stats.content_text 46 | context["keywords"] = page_stats.keywords 47 | 48 | nb_problems = len(context["problems"]) 49 | nb_warnings = len(context["warnings"]) 50 | 51 | # build the summary text here because it would be way too messy with all the {% trans %} tags in the template 52 | if nb_problems == 0 and nb_warnings == 0: 53 | context["nb_problems_warnings"] = _( 54 | 'No problem was found on the page!' 
55 | ) 56 | else: 57 | if nb_problems == 0: 58 | context["nb_problems_warnings"] = _( 59 | 'No problem found, and ' 60 | ) 61 | else: 62 | context["nb_problems_warnings"] = '' 63 | context["nb_problems_warnings"] += ngettext( 64 | "{nb_problems} problem found, and ", 65 | "{nb_problems} problems found, and ", 66 | nb_problems, 67 | ).format(nb_problems=nb_problems) 68 | 69 | if nb_warnings == 0: 70 | context["nb_problems_warnings"] += _( 71 | 'no warning raised' 72 | ) 73 | else: 74 | context["nb_problems_warnings"] += '' 75 | context["nb_problems_warnings"] += ngettext( 76 | "{nb_warnings} warning raised", 77 | "{nb_warnings} warnings raised", 78 | nb_warnings, 79 | ).format(nb_warnings=nb_warnings) 80 | 81 | return context 82 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/check_url.py: -------------------------------------------------------------------------------- 1 | # Third party 2 | from django.utils.translation import gettext as _ 3 | 4 | # Local application / specific library imports 5 | from ..checks import custom_list 6 | 7 | 8 | def importance(): 9 | """Scripts with higher importance will be executed first. 10 | 11 | Returns: 12 | int -- Importance of the script. 13 | """ 14 | return 1 15 | 16 | 17 | def run(site): 18 | """Check the length of the url and the maximum depth. 19 | 20 | Arguments: 21 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 22 | """ 23 | 24 | deep_url = custom_list.CustomList( 25 | name=_("Too many levels in path"), 26 | settings=_("less than {}").format( 27 | site.settings.DJANGO_CHECK_SEO_SETTINGS["max_link_depth"] 28 | ), 29 | description=_( 30 | "Google recommand to organize your content by adding depth in your url, but advises against putting too much depth." 
31 | ), 32 | ) 33 | 34 | long_url = custom_list.CustomList( 35 | name=_("URL is too long"), 36 | settings=_("less than {}").format( 37 | site.settings.DJANGO_CHECK_SEO_SETTINGS["max_url_length"] 38 | ), 39 | description=_("Shorter URLs tend to rank better than long URLs."), 40 | ) 41 | 42 | # check url depth 43 | # do not count the trailing slash at the end of the url, nor the // in the "http://" 44 | url_without_two_points_slash_slash = site.full_url.replace("://", ":..") 45 | number_of_slashes = url_without_two_points_slash_slash[:-1].count("/") 46 | 47 | deep_url.found = number_of_slashes 48 | 49 | # replace each counted / with an underlined /, and give only the excess slashes a red background 50 | deep_url.searched_in = [ 51 | url_without_two_points_slash_slash.replace( 52 | "/", '/', number_of_slashes 53 | ) 54 | .replace(":..", "://") 55 | .replace( 56 | '', 57 | "", 58 | site.settings.DJANGO_CHECK_SEO_SETTINGS["max_link_depth"], 59 | ) 60 | ] 61 | 62 | if number_of_slashes > site.settings.DJANGO_CHECK_SEO_SETTINGS["max_link_depth"]: 63 | site.problems.append(deep_url) 64 | else: 65 | deep_url.name = _("Right amount of level in path") 66 | deep_url.searched_in = [ 67 | url_without_two_points_slash_slash.replace( 68 | "/", '/', number_of_slashes 69 | ) 70 | .replace(":..", "://") 71 | .replace( 72 | '', 73 | "", 74 | site.settings.DJANGO_CHECK_SEO_SETTINGS["max_link_depth"], 75 | ) 76 | ] 77 | site.success.append(deep_url) 78 | 79 | # check url length 80 | url_without_protocol = site.full_url.replace("http://", "").replace("https://", "") 81 | long_url.found = len(url_without_protocol) 82 | long_url.searched_in = [url_without_protocol] 83 | 84 | if ( 85 | len(url_without_protocol) 86 | > site.settings.DJANGO_CHECK_SEO_SETTINGS["max_url_length"] 87 | ): 88 | site.warnings.append(long_url) 89 | long_url.searched_in = [ 90 | url_without_protocol[ 91 | : site.settings.DJANGO_CHECK_SEO_SETTINGS["max_url_length"] 92 | ] 93 | + '' 94 | + 
url_without_protocol[ 95 | site.settings.DJANGO_CHECK_SEO_SETTINGS["max_url_length"] : 96 | ] 97 | + "" 98 | ] 99 | 100 | else: 101 | long_url.name = _("URL length is great") 102 | site.success.append(long_url) 103 | -------------------------------------------------------------------------------- /django_check_seo/templates/django_check_seo/default.html: -------------------------------------------------------------------------------- 1 | {% load i18n static %} 2 | 3 | 4 | 5 | 6 | Django check SEO 7 | 8 | 9 | 10 |
    11 | 12 |
    13 |

    Django Check SEO

    14 |
    15 | 16 |
      17 |
    • {% trans "Keywords:" %}
    • 18 | {% if keywords|length > 0 %} 19 | {{ keywords|unordered_list }} 20 | {% else %} 21 |
    • {% trans "no keywords!" %}
    • 22 | {% endif %} 23 |
    24 | 25 |

    26 | {% autoescape off %}{{ nb_problems_warnings }}{% endautoescape %} 27 |

    28 | 29 |

    {% trans "on the public page" %}

    30 | 31 |
      32 | {% for problem in problems %} 33 | {% include "django_check_seo/element.html" with element=problem %} 34 | {% endfor %} 35 |
    36 | 37 |
      38 | {% for warning in warnings %} 39 | {% include "django_check_seo/element.html" with element=warning %} 40 | {% endfor %} 41 |
    42 | 43 | {% if success|length > 0 %} 44 |
    45 |
      46 |

      {% trans "Successful checks" %}

      47 | {% for successful_check in success %} 48 | {% include "django_check_seo/element.html" with element=successful_check %} 49 | {% endfor %} 50 |
    51 | {% endif %} 52 | 53 |
    54 | 55 |
    56 | 57 | {% block seo_aside %} 58 | 108 | {% endblock seo_aside %} 109 | 110 |
    111 | 112 | 113 | 114 | -------------------------------------------------------------------------------- /tests/test_url.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Use ./launch_tests.sh to launch these tests. 4 | 5 | from bs4 import BeautifulSoup 6 | from django_check_seo.checks import site 7 | 8 | html_content = """ 9 | 10 | 11 | 12 | 13 | Title of the page 14 | 15 | 16 | """ 17 | 18 | 19 | class settings: 20 | def __init__(self): 21 | self.DJANGO_CHECK_SEO_SETTINGS = { 22 | "max_link_depth": 4, 23 | "max_url_length": 70, 24 | } 25 | 26 | 27 | class init: 28 | def __init__(self): 29 | self.keywords = [] 30 | self.problems = [] 31 | self.warnings = [] 32 | self.success = [] 33 | self.settings = settings() 34 | self.soup = BeautifulSoup(html_content, features="lxml") 35 | self.full_url = "https://localhost/fake-url/title-of-the-page/" 36 | # populate class with data 37 | self.page_stats = site.Site(self.soup, self.full_url) 38 | 39 | 40 | def test_url_importance(): 41 | from django_check_seo.checks_list import check_url 42 | 43 | assert check_url.importance() == 1 44 | 45 | 46 | def test_url_okay(): 47 | from django_check_seo.checks_list import check_url 48 | 49 | site = init() 50 | check_url.run(site) 51 | 52 | for success in site.success: 53 | if success.name == "Right amount of level in path": 54 | assert success.name == "Right amount of level in path" 55 | assert success.settings == "less than 4" 56 | assert success.found == 2 57 | assert success.searched_in == [ 58 | 'https://localhost/fake-url/title-of-the-page/' 59 | ] 60 | assert ( 61 | success.description 62 | == "Google recommand to organize your content by adding depth in your url, but advises against putting too much depth." 
63 | ) 64 | 65 | if success.name == "URL length is great": 66 | assert success.name == "URL length is great" 67 | assert success.settings == "less than 70" 68 | assert success.found == 37 69 | assert success.searched_in == ["localhost/fake-url/title-of-the-page/"] 70 | assert ( 71 | success.description 72 | == "Shorter URLs tend to rank better than long URLs." 73 | ) 74 | 75 | 76 | def test_url_too_deep(): 77 | from django_check_seo.checks_list import check_url 78 | 79 | site = init() 80 | site.full_url = "https://localhost/really/too/much/levels/in/path/" 81 | check_url.run(site) 82 | 83 | for problem in site.problems: 84 | if problem.name == "Too many levels in path": 85 | assert problem.name == "Too many levels in path" 86 | assert problem.settings == "less than 4" 87 | assert problem.found == 6 88 | assert problem.searched_in == [ 89 | 'https://localhost/really/too/much/levels/in/path/' 90 | ] 91 | 92 | 93 | def test_url_too_long(): 94 | from django_check_seo.checks_list import check_url 95 | 96 | site = init() 97 | site.full_url = "https://localhost/this-is-a-veeeeeeery-long-url-which-will-trigger-a-warning-lorem-ipsum/" 98 | check_url.run(site) 99 | 100 | for warning in site.warnings: 101 | if warning.name == "URL is too long": 102 | assert warning.name == "URL is too long" 103 | assert warning.settings == "less than 70" 104 | assert warning.found == 81 105 | assert warning.searched_in == [ 106 | 'localhost/this-is-a-veeeeeeery-long-url-which-will-trigger-a-warning-lorem-ipsum/' 107 | ] 108 | assert ( 109 | warning.description 110 | == "Shorter URLs tend to rank better than long URLs." 
111 | ) 112 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/check_h2.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import unicode_literals 3 | 4 | # Standard Library 5 | import re 6 | 7 | # Third party 8 | from django.utils.translation import gettext as _ 9 | from django.utils.translation import pgettext 10 | 11 | # Local application / specific library imports 12 | from ..checks import custom_list 13 | 14 | 15 | def importance(): 16 | """Scripts with higher importance will be executed first. 17 | 18 | Returns: 19 | int -- Importance of the script. 20 | """ 21 | return 1 22 | 23 | 24 | def run(site): 25 | """Checks that h2 tags exist, and that at least one h2 contains at least one keyword. 26 | 27 | Arguments: 28 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 29 | """ 30 | 31 | no_h2 = custom_list.CustomList( 32 | name=_("No h2 tag"), 33 | settings=_("at least one"), 34 | found=pgettext("masculin", "none"), 35 | description=_( 36 | "H2 tags are useful because they are explored by search engines and can help them understand the subject of your page." 37 | ), 38 | ) 39 | 40 | enough_h2 = custom_list.CustomList( 41 | name=_("H2 tags were found"), 42 | settings=pgettext("feminin", "at least one"), 43 | description=no_h2.description, 44 | ) 45 | 46 | no_keywords = custom_list.CustomList( 47 | name=_("No keyword in h2 tags"), 48 | settings=pgettext("masculin", "at least one"), 49 | found=pgettext("masculin", "none"), 50 | description=_( 51 | "Google uses h2 tags to better understand the subjects of your page." 
52 | ), 53 | ) 54 | 55 | enough_keywords = custom_list.CustomList( 56 | name=_("Keyword found in h2 tags"), 57 | settings=pgettext("masculin", "at least one"), 58 | found="", 59 | description=no_keywords.description, 60 | ) 61 | 62 | h2 = site.soup.find_all("h2") 63 | if not h2: 64 | site.warnings.append(no_h2) 65 | else: 66 | enough_h2.found = len(h2) 67 | enough_h2.searched_in = [get_h2_text(t) for t in h2] 68 | site.success.append(enough_h2) 69 | 70 | occurrence = [] 71 | h2_kw = [] 72 | 73 | # for each h2... 74 | for single_h2 in h2: 75 | single_h2 = get_h2_text(single_h2).lower() 76 | # check if it contains at least 1 keyword 77 | for keyword in site.keywords: 78 | keyword_lower = keyword.lower() 79 | 80 | # standardize apostrophes 81 | keyword_lower = keyword_lower.replace("'", "’") 82 | single_h2 = single_h2.replace("'", "’") 83 | 84 | nb_occurrences = len( 85 | re.findall( 86 | r"(^| |\n|,|\.|!|\?)" 87 | + keyword_lower 88 | + r"s?($| |\n|,|\.|!|\?)", 89 | single_h2, 90 | ) 91 | ) 92 | occurrence.append(nb_occurrences) 93 | 94 | # add kw in found 95 | if nb_occurrences > 0: 96 | # and add bold in found keywords 97 | single_h2 = single_h2.replace( 98 | keyword_lower, '' + keyword_lower + "" 99 | ) 100 | if enough_keywords.found != "": 101 | enough_keywords.found += ", " 102 | enough_keywords.found += keyword 103 | 104 | h2_kw.append(single_h2) 105 | # if no keyword is found in h2 106 | if not any(i > 0 for i in occurrence): 107 | no_keywords.searched_in = [t.text for t in h2] 108 | no_keywords.found = pgettext("masculin", "none") 109 | site.warnings.append(no_keywords) 110 | else: 111 | enough_keywords.searched_in = h2_kw 112 | site.success.append(enough_keywords) 113 | 114 | 115 | def get_h2_text(h2): 116 | # h2 text can be the content of the alt attribute of an img 117 | if not h2.text and h2.find("img", {"alt": True}): 118 | return h2.find("img")["alt"] 119 | # or it can be the text in h2 120 | else: 121 | return h2.text 122 | 
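The h1, h2 and first-paragraph checks all count keyword hits with the same boundary pattern: the keyword, an optional trailing "s", and a delimiter (start or end of text, space, newline, or one of `, . ! ?`) on each side. A minimal sketch of that idea follows. The helper name `count_keyword_occurrences` and the two constants are hypothetical and not part of django-check-seo. Unlike the checks, the sketch passes the keyword through `re.escape`, which is an assumption about how a keyword containing regex metacharacters (for example "c++") might be handled.

```python
import re

# Delimiters accepted before and after a keyword (same sets as in the checks).
BOUNDARY_BEFORE = r"(^| |\n|,|\.|!|\?)"
BOUNDARY_AFTER = r"($| |\n|,|\.|!|\?)"


def count_keyword_occurrences(keyword, text):
    """Count delimited matches of `keyword` (optionally followed by "s") in `text`.

    Hypothetical helper for illustration only. The keyword is escaped so that
    characters such as "+" or "(" are matched literally (an assumption, the
    checks concatenate the raw keyword into the pattern).
    """
    pattern = BOUNDARY_BEFORE + re.escape(keyword.lower()) + r"s?" + BOUNDARY_AFTER
    return len(re.findall(pattern, text.lower()))


if __name__ == "__main__":
    print(count_keyword_occurrences("seo", "Django check SEO"))    # 1
    print(count_keyword_occurrences("seo", "seoul travel guide"))  # 0
    print(count_keyword_occurrences("c++", "learn c++ today"))     # 1
```

Because each match consumes its surrounding delimiters, two hits separated by a single space may be counted once; that is enough for the "at least one keyword" tests, which only check whether the count is greater than zero.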
-------------------------------------------------------------------------------- /django_check_seo/checks_list/check_keyword_url.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import unicode_literals 3 | 4 | # Standard Library 5 | import re 6 | import sys 7 | 8 | import unidecode 9 | from django.conf import settings 10 | from django.conf.global_settings import LANGUAGES 11 | from django.utils.translation import gettext as _ 12 | from django.utils.translation import pgettext 13 | 14 | # Local application / specific library imports 15 | from ..checks import custom_list 16 | 17 | # hacky trick to add python2 compatibility to a python3 project after python2 eol 18 | if sys.version_info.major == 2: 19 | from urlparse import urlparse # pragma: no cover 20 | else: 21 | from urllib.parse import urlparse # pragma: no cover 22 | 23 | 24 | def importance(): 25 | """Scripts with higher importance will be executed first. 26 | 27 | Returns: 28 | int -- Importance of the script. 29 | """ 30 | return 1 31 | 32 | 33 | def run(site): 34 | """Check presence of keywords in url. 35 | 36 | Arguments: 37 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 38 | """ 39 | 40 | no_keyword = custom_list.CustomList( 41 | name=_("No keyword in URL"), 42 | settings=pgettext("masculin", "at least one"), 43 | found=pgettext("masculin", "none"), 44 | description=_( 45 | 'Keywords in URL will help your users understand the organisation of your website, and are a small ranking factor for Google. On the other hand, Bing guidelines advises to "keep [your URL] clean and keyword rich when possible".' 
46 | ), 47 | ) 48 | 49 | enough_keyword = custom_list.CustomList( 50 | name=_("Keywords found in URL"), 51 | settings=pgettext("masculin", "at least one"), 52 | found="", 53 | description=no_keyword.description, 54 | ) 55 | 56 | # root url may contain str like "/fr/" or "/en/" if i18n is activated 57 | url_path = urlparse(site.full_url, "/").path 58 | 59 | # list of languages from django LANGUAGES list: ['fr', 'en', 'br', 'ia', ...] 60 | languages_list = [i[0] for i in LANGUAGES] 61 | 62 | # do not check keywords in url for root URL 63 | if ( 64 | (settings.USE_I18N and url_path.replace("/", "") in languages_list) 65 | or url_path == "/" 66 | or not url_path 67 | ): 68 | return 69 | 70 | full_url = site.full_url.lower() 71 | occurrence = [] 72 | accented_occurrences = 0 73 | 74 | for keyword in site.keywords: 75 | keyword = keyword.lower().replace(" ", "-") 76 | 77 | # remove apostrophes as they are generally removed from URLs 78 | keyword = keyword.replace("'", "").replace("’", "") 79 | 80 | keyword_unnaccented = unidecode.unidecode(keyword) # pragma: no cover 81 | 82 | nb_occurrences = len( 83 | re.findall( 84 | r"(^| |\n|,|\.|!|\?|/|-)" + keyword + r"s?($| |\n|,|\.|!|\?|/|-)", 85 | full_url, 86 | ) 87 | ) 88 | if nb_occurrences == 0: 89 | # retry with unnaccented kw 90 | accented_occurrences = len( 91 | re.findall( 92 | r"(^| |\n|,|\.|!|\?|/|-)" 93 | + keyword_unnaccented 94 | + r"s?($| |\n|,|\.|!|\?|/|-)", 95 | full_url, 96 | ) 97 | ) 98 | occurrence.append(nb_occurrences + accented_occurrences) 99 | 100 | if nb_occurrences > 0: 101 | full_url = full_url.replace(keyword, '' + keyword + "") 102 | if enough_keyword.found != "": 103 | enough_keyword.found += ", " 104 | enough_keyword.found += keyword 105 | nb_occurrences = 0 106 | 107 | if accented_occurrences > 0: 108 | full_url = full_url.replace( 109 | keyword_unnaccented, '' + keyword_unnaccented + "" 110 | ) 111 | if enough_keyword.found != "": 112 | enough_keyword.found += ", " 113 | enough_keyword.found += 
keyword 114 | accented_occurrences = 0 115 | 116 | if not any(i > 0 for i in occurrence): 117 | no_keyword.searched_in = [full_url] 118 | site.problems.append(no_keyword) 119 | else: 120 | enough_keyword.searched_in = [full_url] 121 | site.success.append(enough_keyword) 122 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/check_h1.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import unicode_literals 3 | 4 | # Standard Library 5 | import re 6 | 7 | # Third party 8 | from django.utils.translation import gettext as _ 9 | from django.utils.translation import pgettext 10 | 11 | # Local application / specific library imports 12 | from ..checks import custom_list 13 | 14 | 15 | def importance(): 16 | """Scripts with higher importance will be executed first. 17 | 18 | Returns: 19 | int -- Importance of the script. 20 | """ 21 | return 1 22 | 23 | 24 | def run(site): 25 | """Verifies that only one h1 tag is present, and that it contains at least one keyword. 26 | 27 | Arguments: 28 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 29 | """ 30 | too_much_h1 = custom_list.CustomList( 31 | name=_("Too much h1 tags"), 32 | settings=_("exactly 1"), 33 | description=_( 34 | "Google is not really concerned about the number of h1 tags on your page, but Bing clearly indicates in its guidelines for webmasters to use only one h1 tag per page." 
35 | ), 36 | ) 37 | 38 | right_number_h1 = custom_list.CustomList( 39 | name=_("H1 tag found"), 40 | settings=_("exactly 1"), 41 | description=too_much_h1.description, 42 | ) 43 | 44 | not_enough_h1 = custom_list.CustomList( 45 | name=_("No h1 tag"), 46 | settings=_("exactly 1"), 47 | description=too_much_h1.description, 48 | ) 49 | 50 | no_keywords = custom_list.CustomList( 51 | name=_("No keyword in h1"), 52 | settings=_("at least one"), 53 | description=_( 54 | "The h1 tag represent the main title of your page, and you may populate it with appropriate content in order to ensure that users (and search engines!) will understand correctly your page." 55 | ), 56 | ) 57 | 58 | enough_keywords = custom_list.CustomList( 59 | name=_("Keyword found in h1"), 60 | settings=_("at least one"), 61 | description=no_keywords.description, 62 | ) 63 | 64 | h1_all = site.soup.find_all("h1") 65 | 66 | if len(h1_all) > 1: 67 | too_much_h1.found = len(h1_all) 68 | too_much_h1.searched_in = [get_h1_text(t) for t in h1_all] 69 | site.problems.append(too_much_h1) 70 | 71 | elif not h1_all: 72 | not_enough_h1.found = pgettext("masculin", "none") 73 | site.problems.append(not_enough_h1) 74 | 75 | else: 76 | right_number_h1.found = len(h1_all) 77 | right_number_h1.searched_in = [get_h1_text(t) for t in h1_all] 78 | site.success.append(right_number_h1) 79 | 80 | enough_keywords.found = "" 81 | h1_text_kw = [] 82 | occurrence = [] 83 | for h1 in h1_all: 84 | h1_text = get_h1_text(h1).lower() 85 | 86 | for keyword in site.keywords: 87 | keyword = keyword.lower() 88 | 89 | # standardize apostrophes 90 | keyword = keyword.replace("'", "’") 91 | h1_text = h1_text.replace("'", "’") 92 | 93 | # ugly regex ? 
see example at https://github.com/kapt-labs/django-check-seo/issues/38#issuecomment-603108275 94 | nb_occurrences = len( 95 | re.findall( 96 | r"(^| |\n|,|\.|!|\?)" + keyword + r"s?($| |\n|,|\.|!|\?)", 97 | h1_text, 98 | ) 99 | ) 100 | occurrence.append(nb_occurrences) 101 | 102 | if nb_occurrences > 0: 103 | h1_text = h1_text.replace( 104 | keyword, '' + keyword + "" 105 | ) 106 | if enough_keywords.found != "": 107 | enough_keywords.found += ", " 108 | enough_keywords.found += keyword 109 | h1_text_kw.append(h1_text) 110 | 111 | # if no keyword is found in h1 112 | if not any(i > 0 for i in occurrence): 113 | no_keywords.found = pgettext("masculin", "none") 114 | no_keywords.searched_in = [t.text for t in h1_all] 115 | site.problems.append(no_keywords) 116 | else: 117 | enough_keywords.searched_in = h1_text_kw 118 | site.success.append(enough_keywords) 119 | 120 | 121 | def get_h1_text(h1): 122 | # h1 text can be the content of the alt attribute of an img 123 | if not h1.text and h1.find("img", {"alt": True}): 124 | return h1.find("img")["alt"] 125 | # or it can be the text in h1 126 | else: 127 | return h1.text 128 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/check_links.py: -------------------------------------------------------------------------------- 1 | # Third party 2 | import bs4 3 | from django.contrib.sites.models import Site 4 | from django.utils.translation import gettext as _ 5 | 6 | # Local application / specific library imports 7 | from ..checks import custom_list 8 | 9 | 10 | def importance(): 11 | """Scripts with higher importance will be executed first. 12 | 13 | Returns: 14 | int -- Importance of the script. 15 | """ 16 | return 1 17 | 18 | 19 | def run(site): 20 | """Counts the number of internal and external links in the extracted content. 21 | 22 | Arguments: 23 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 
24 | """ 25 | 26 | not_enough_internal = custom_list.CustomList( 27 | name=_("Not enough internal links"), 28 | settings=_("at least {}").format( 29 | site.settings.DJANGO_CHECK_SEO_SETTINGS["internal_links"] 30 | ), 31 | description=_( 32 | "Internal links are useful because they can give the structure of your website to search engines, so they can create a hierarchy of your pages." 33 | ), 34 | ) 35 | 36 | enough_internal = custom_list.CustomList( 37 | name=_("Internal links were found"), 38 | settings=_("at least {}").format( 39 | site.settings.DJANGO_CHECK_SEO_SETTINGS["internal_links"] 40 | ), 41 | description=not_enough_internal.description, 42 | ) 43 | 44 | not_enough_external = custom_list.CustomList( 45 | name=_("Not enough external links"), 46 | settings=_("at least {}").format( 47 | site.settings.DJANGO_CHECK_SEO_SETTINGS["external_links"] 48 | ), 49 | description=_( 50 | "External links help your users to check your topic and can save them from having to do additional research." 
51 | ), 52 | ) 53 | 54 | enough_external = custom_list.CustomList( 55 | name=_("External links were found"), 56 | settings=_("at least {}").format( 57 | site.settings.DJANGO_CHECK_SEO_SETTINGS["external_links"] 58 | ), 59 | description=not_enough_external.description, 60 | ) 61 | 62 | links = bs4.element.ResultSet(None) 63 | 64 | # only get links with href 65 | for c in site.content: 66 | links += c.find_all("a", href=True) 67 | 68 | internal_links = 0 69 | internal_links_list = [] 70 | external_links_list = [] 71 | external_links = 0 72 | 73 | for link in links: 74 | # use the link text if there is any, otherwise fall back to the first child element 75 | if link.text.strip() != "": 76 | text = link.text 77 | else: 78 | childs = link.find_all() 79 | if len(childs) > 0: 80 | if childs[0].get("alt", False): 81 | text = childs[0]["alt"] + " (&lt;" + childs[0].name + "&gt;)" 82 | else: 83 | text = str(childs[0]).replace("<", "&lt;").replace(">", "&gt;") 84 | else: 85 | text = _("[no content]") 86 | # internal links = absolute links that contain the domain name, or relative links 87 | if Site.objects.get_current().domain in link["href"] or not link[ 88 | "href" 89 | ].startswith("http"): 90 | internal_links += 1 91 | internal_links_list.append( 92 | '' + text + "" 93 | ) 94 | else: 95 | external_links += 1 96 | external_links_list.append( 97 | '' + text + "" 98 | ) 99 | 100 | # not enough internal links 101 | if internal_links < site.settings.DJANGO_CHECK_SEO_SETTINGS["internal_links"]: 102 | not_enough_internal.found = internal_links 103 | not_enough_internal.searched_in = internal_links_list 104 | site.warnings.append(not_enough_internal) 105 | else: 106 | enough_internal.found = internal_links 107 | enough_internal.searched_in = internal_links_list 108 | site.success.append(enough_internal) 109 | 110 | # not enough external links 111 | if external_links < site.settings.DJANGO_CHECK_SEO_SETTINGS["external_links"]: 112 | not_enough_external.found = external_links 113 | not_enough_external.searched_in = external_links_list 114 | 
site.warnings.append(not_enough_external) 115 | else: 116 | enough_external.found = external_links 117 | enough_external.searched_in = external_links_list 118 | site.success.append(enough_external) 119 | -------------------------------------------------------------------------------- /tests/test_images.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | # Use ./launch_tests.sh to launch these tests. 4 | 5 | from bs4 import BeautifulSoup 6 | from django_check_seo.checks import site 7 | 8 | html_content = """ 9 | 10 | 11 | 12 | heyy 13 | 14 | 15 | """ 16 | 17 | 18 | class init: 19 | def __init__(self): 20 | self.keywords = [] 21 | self.problems = [] 22 | self.warnings = [] 23 | self.success = [] 24 | 25 | self.soup = BeautifulSoup(html_content, features="lxml") 26 | self.content = self.soup.find_all("body") 27 | self.full_url = "https://localhost/fake-url/title-of-the-page/" 28 | # populate class with data 29 | self.page_stats = site.Site(self.soup, self.full_url) 30 | 31 | 32 | def test_images_importance(): 33 | from django_check_seo.checks_list import check_images 34 | 35 | assert check_images.importance() == 1 36 | 37 | 38 | def test_image_okay(): 39 | from django_check_seo.checks_list import check_images 40 | 41 | site = init() 42 | 43 | check_images.run(site) 44 | 45 | for success in site.success: 46 | if success.name == "Img have alt tag": 47 | assert success.name == "Img have alt tag" 48 | assert success.settings == "all images" 49 | assert success.found == 1 50 | assert success.searched_in == [ 51 | 'image (heyy)' 52 | ] 53 | assert ( 54 | success.description 55 | == 'Your images should always have an alt tag, because it improves accessibility for visually impaired people. The name of the file is important too, because it helps Google understand what your image is about. For example, you could rename a file named "IMG0001.jpg" to "tree_with_a_bird.jpg".' 
56 | ) 57 | 58 | 59 | def test_images_okay(): 60 | import copy 61 | from django_check_seo.checks_list import check_images 62 | 63 | site = init() 64 | 65 | site.soup.body.append(copy.copy(site.soup.find("img"))) 66 | check_images.run(site) 67 | 68 | for success in site.success: 69 | if success.name == "Img have alt tag": 70 | assert success.name == "Img have alt tag" 71 | assert success.settings == "all images" 72 | assert success.found == 2 73 | assert success.searched_in == [ 74 | 'image (heyy)', 75 | 'image (heyy)', 76 | ] 77 | assert ( 78 | success.description 79 | == 'Your images should always have an alt tag, because it improves accessibility for visually impaired people. The name of the file is important too, because it helps Google understand what your image is about. For example, you could rename a file named "IMG0001.jpg" to "tree_with_a_bird.jpg".' 80 | ) 81 | 82 | 83 | def test_image_missing(): 84 | from django_check_seo.checks_list import check_images 85 | 86 | site = init() 87 | check_images.run(site) 88 | 89 | for problem in site.problems: 90 | if problem.name == "Img lack alt tag": 91 | assert problem.name == "Img lack alt tag" 92 | assert problem.settings == "all images" 93 | assert problem.found == 1 94 | assert problem.searched_in == [ 95 | 'image' 96 | ] 97 | 98 | 99 | def test_images_missing(): 100 | import copy 101 | from django_check_seo.checks_list import check_images 102 | 103 | site = init() 104 | site.soup.find("img")["alt"] = "" 105 | site.soup.body.append(copy.copy(site.soup.find("img"))) 106 | check_images.run(site) 107 | 108 | for problem in site.problems: 109 | if problem.name == "Img lack alt tag": 110 | assert problem.name == "Img lack alt tag" 111 | assert problem.settings == "all images" 112 | assert problem.found == 2 113 | assert problem.searched_in == [ 114 | 'image', 115 | 'image', 116 | ] 117 | 118 | 119 | def test_image_unknown(): 120 | from django_check_seo.checks_list import check_images 121 | 122 | site = init() 123 | 
site.soup.find("img")["src"] = "" 124 | site.soup.find("img")["alt"] = "" 125 | check_images.run(site) 126 | 127 | for problem in site.problems: 128 | if problem.name == "Img lack alt tag": 129 | assert problem.name == "Img lack alt tag" 130 | assert problem.settings == "all images" 131 | assert problem.found == 1 132 | assert problem.searched_in == [ 133 | "unknown image", 134 | ] 135 | -------------------------------------------------------------------------------- /launch_tests.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # django-check-seo super test-launching script 4 | # 5 | # requires python3-venv & python2 virtualenv 6 | # 7 | # Usage: 8 | # ./launch_tests.sh activate venv2 & venv3 (or create them & activate them), 9 | # launch tests, deactivate venv2 & venv3 10 | # ./launch_tests.sh 2 activate venv2 (or create it & activate it), 11 | # launch tests, deactivate venv2 12 | # ./launch_tests.sh 3 activate venv3 (or create it & activate it), 13 | # launch tests, deactivate venv3 14 | # ./launch_tests.sh remove remove folders "venv2" & "venv3" 15 | # ./launch_tests.sh help display this text 16 | 17 | function remove { 18 | 19 | if [[ -d "venv2" ]]; then 20 | 21 | echo -e "\e[1;32m✅ test venv2 found\e[0m" 22 | echo -e "\e[2m➡ removing venv2...\e[0m" 23 | rm -r venv2 24 | echo -e "\e[1;32m✅ test venv2 removed successfully\e[0m" 25 | else 26 | echo -e "\e[1;31m❌ test venv2 not found\e[0m" 27 | echo -e "\e[1;32m✅ nothing to remove\e[0m" 28 | fi 29 | 30 | if [[ -d "venv3" ]]; then 31 | 32 | echo -e "\e[1;32m✅ test venv3 found\e[0m" 33 | echo -e "\e[2m➡ removing venv3...\e[0m" 34 | rm -r venv3 35 | echo -e "\e[1;32m✅ test venv3 removed successfully\e[0m" 36 | else 37 | echo -e "\e[1;31m❌ test venv3 not found\e[0m" 38 | echo -e "\e[1;32m✅ nothing to remove\e[0m" 39 | fi 40 | 41 | } 42 | 43 | function display_help { 44 | 45 | echo -e "django-check-seo super test-launching script" 46 | echo 47 | echo -e 
"Usage:" 48 | echo -e " ./launch_tests.sh \e[2mactivate venv2 & venv3 (or create them & activate them), launch tests, deactivate venv2 & venv3\e[0m" 49 | echo -e " ./launch_tests.sh 2 \e[2mactivate venv2 (or create it & activate it), launch tests, deactivate venv2\e[0m" 50 | echo -e " ./launch_tests.sh 3 \e[2mactivate venv3 (or create it & activate it), launch tests, deactivate venv3\e[0m" 51 | echo -e " ./launch_tests.sh remove \e[2mremove folders \"venv2\" & \"venv3\"\e[0m" 52 | echo -e " ./launch_tests.sh help \e[2mdisplay this text\e[0m" 53 | 54 | } 55 | 56 | function create_and_launch_venv2 { 57 | 58 | if [[ -d "venv2" ]]; then 59 | launch_venv2_tests 60 | else 61 | create_venv2 62 | fi 63 | 64 | } 65 | 66 | function create_and_launch_venv3 { 67 | 68 | if [[ -d "venv3" ]]; then 69 | launch_venv3_tests 70 | else 71 | create_venv3 72 | fi 73 | 74 | } 75 | 76 | function launch_venv2_tests { 77 | 78 | echo -e "\e[1;32m✅ test venv2 found\e[0m" 79 | echo -e "\e[2m➡ launching tests...\e[0m" 80 | . venv2/bin/activate && python2 -m pytest -s --cov-config=.coveragerc --cov=django_check_seo --cov-report term-missing && deactivate 81 | 82 | } 83 | 84 | function launch_venv3_tests { 85 | 86 | echo -e "\e[1;32m✅ test venv3 found\e[0m" 87 | echo -e "\e[2m➡ launching tests...\e[0m" 88 | . venv3/bin/activate && python3 -m pytest -s --cov-config=.coveragerc --cov=django_check_seo --cov-report term-missing && deactivate 89 | 90 | } 91 | 92 | function create_venv2 { 93 | 94 | echo -e "\e[1;31m❌ test venv2 not found\e[0m" 95 | echo -e "\e[2m➡ creating venv2...\e[0m" 96 | python2 -m virtualenv -p python2.7 venv2 1>/dev/null && . venv2/bin/activate && python2 -m pip install "django<3" bs4 lxml "easy-thumbnails==2.3" "djangocms-page-meta==0.8.5" requests pytest pytest-django pytest-cov 1>/dev/null && deactivate 97 | echo -e "\e[1;32m✅ venv2 created successfully\e[0m" 98 | echo -e "\e[2m➡ launching tests...\e[0m" 99 | . 
venv2/bin/activate && python2 -m pytest -s --cov-config=.coveragerc --cov=django_check_seo --cov-report term-missing && deactivate 100 | 101 | } 102 | 103 | function create_venv3 { 104 | 105 | echo -e "\e[1;31m❌ test venv3 not found\e[0m" 106 | echo -e "\e[2m➡ creating venv3...\e[0m" 107 | python3 -m venv venv3 1>/dev/null && . venv3/bin/activate && python3 -m pip install django bs4 lxml djangocms-page-meta requests pytest pytest-django pytest-cov 1>/dev/null && deactivate 108 | echo -e "\e[1;32m✅ venv3 created successfully\e[0m" 109 | echo -e "\e[2m➡ launching tests...\e[0m" 110 | . venv3/bin/activate && python3 -m pytest -s --cov-config=.coveragerc --cov=django_check_seo --cov-report term-missing && deactivate 111 | 112 | } 113 | 114 | 115 | if [[ $1 == "remove" ]]; then 116 | 117 | remove 118 | 119 | exit 1 120 | 121 | elif [[ $1 == "help" ]]; then 122 | 123 | display_help 124 | 125 | exit 1 126 | 127 | elif [[ $1 == "2" ]]; then 128 | 129 | create_and_launch_venv2 130 | 131 | exit $? 132 | 133 | elif [[ $1 == "3" ]]; then 134 | 135 | create_and_launch_venv3 136 | 137 | exit $? 138 | 139 | elif [[ "$#" -gt 0 ]]; then 140 | 141 | if [[ $1 != ".pre-commit-config.yaml" || $2 != "launch_tests.sh" ]] 142 | then 143 | echo "Wrong args:" 144 | echo $@ 145 | echo "" 146 | display_help 147 | exit 1 148 | fi 149 | 150 | fi 151 | 152 | create_and_launch_venv2 153 | 154 | exit1=$? 155 | 156 | create_and_launch_venv3 157 | 158 | exit2=$? 159 | 160 | if [ "$exit1" -eq "0" ] && [ "$exit2" -eq "0" ] 161 | then 162 | exit 0 163 | else 164 | exit 1 165 | fi 166 | -------------------------------------------------------------------------------- /tests/test_keyword_present_first_paragraph.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Use ./launch_tests.sh to launch these tests. 
4 | 5 | import re 6 | 7 | from bs4 import BeautifulSoup 8 | from django_check_seo.checks import site 9 | 10 | html_content = """ 11 | 12 | 13 | 14 | 15 | Title of the page 16 | 17 |

    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi pharetra metus ut tellus interdum consectetur. In vehicula orci vel fermentum pretium. Vestibulum ante ipsum primis in faucibus orci title luctus et ultrices posuere cubilia Curae; Cras consequat nunc arcu, ut tristique nulla molestie vel. Morbi vitae diam nibh. Proin ex nutella.

    18 | 19 | """ 20 | 21 | 22 | class settings: 23 | def __init__(self): 24 | self.DJANGO_CHECK_SEO_SETTINGS = { 25 | "keywords_in_first_words": 50, 26 | } 27 | 28 | 29 | class init: 30 | def __init__(self): 31 | self.keywords = [] 32 | self.problems = [] 33 | self.warnings = [] 34 | self.success = [] 35 | self.settings = settings() 36 | self.soup = BeautifulSoup(html_content, features="lxml") 37 | 38 | self.content_text = get_content_text(self.soup.find_all("body")) 39 | 40 | self.full_url = "https://localhost/fake-url/title-of-the-page/" 41 | # populate class with data 42 | self.page_stats = site.Site(self.soup, self.full_url) 43 | 44 | 45 | def get_content_text(soup): 46 | content_text = "" 47 | for c in soup: 48 | content_text += c.get_text(separator=" ") 49 | 50 | # strip multiple carriage return (with optional space) to only one 51 | content_text = re.sub(r"(\n( ?))+", "\n", content_text) 52 | # strip multiples spaces (>3) to only 2 (for title readability) 53 | content_text = re.sub(r" +", " ", content_text) 54 | return content_text 55 | 56 | 57 | def test_url_importance(): 58 | from django_check_seo.checks_list import keyword_present_first_paragraph 59 | 60 | assert keyword_present_first_paragraph.importance() == 1 61 | 62 | 63 | def test_keyword_present_first_paragraph_kw(): 64 | from django_check_seo.checks_list import check_keywords 65 | from django_check_seo.checks_list import keyword_present_first_paragraph 66 | 67 | site = init() 68 | check_keywords.run(site) 69 | keyword_present_first_paragraph.run(site) 70 | 71 | for success in site.success: 72 | if success.name == "Keywords found in first paragraph": 73 | assert success.name == "Keywords found in first paragraph" 74 | assert success.settings == "before 50 words" 75 | assert success.found == "title" 76 | assert success.searched_in == [ 77 | 'lorem ipsum dolor sit amet, consectetur adipiscing elit. morbi pharetra metus ut tellus interdum consectetur. in vehicula orci vel fermentum pretium. 
vestibulum ante ipsum primis in faucibus orci title luctus et ultrices posuere cubilia curae; cras consequat nunc arcu, ut tristique nulla molestie vel. morbi vitae diam nibh. proin ex' 78 | ] 79 | assert ( 80 | success.description 81 | == "The reader will be relieved to find one of his keywords in the first paragraph of your page, and the same logic applies to Google, which will consider the content more relevant." 82 | ) 83 | 84 | 85 | def test_keyword_present_first_paragraph_nokw(): 86 | from django_check_seo.checks_list import check_keywords 87 | from django_check_seo.checks_list import keyword_present_first_paragraph 88 | 89 | site = init() 90 | site.soup.find("p").string = "There is no keyword inside first 50 words." 91 | site.content_text = get_content_text(site.soup.find_all("body")) 92 | 93 | check_keywords.run(site) 94 | keyword_present_first_paragraph.run(site) 95 | 96 | for problem in site.problems: 97 | if problem.name == "No keyword in first paragraph": 98 | assert problem.name == "No keyword in first paragraph" 99 | assert problem.settings == "before 50 words" 100 | assert problem.found == "none" 101 | assert problem.searched_in == ["there is no keyword inside first 50 words."] 102 | 103 | 104 | def test_keyword_present_first_paragraph_kws(): 105 | from django_check_seo.checks_list import check_keywords 106 | from django_check_seo.checks_list import keyword_present_first_paragraph 107 | 108 | site = init() 109 | site.soup.find( 110 | "p" 111 | ).string = "There are two kw inside the first paragraph: title and description." 
112 | site.content_text = get_content_text(site.soup.find_all("body")) 113 | 114 | check_keywords.run(site) 115 | keyword_present_first_paragraph.run(site) 116 | 117 | for success in site.success: 118 | if success.name == "Keywords found in first paragraph": 119 | assert success.name == "Keywords found in first paragraph" 120 | assert success.settings == "before 50 words" 121 | assert success.found == "description, title" 122 | assert success.searched_in == [ 123 | 'there are two kw inside the first paragraph: title and description.' 124 | ] 125 | assert ( 126 | success.description 127 | == "The reader will be relieved to find one of his keywords in the first paragraph of your page, and the same logic applies to Google, which will consider the content more relevant." 128 | ) 129 | -------------------------------------------------------------------------------- /django_check_seo/static/django-check-seo/design.css: -------------------------------------------------------------------------------- 1 | @font-face{ 2 | font-family: 'Roboto'; 3 | font-weight: 100; 4 | src: url('fonts/Roboto-Thin.ttf') format('truetype'); 5 | } 6 | 7 | @font-face{ 8 | font-family: 'Roboto'; 9 | font-weight: 400; 10 | src: url('fonts/Roboto-Light.ttf') format('truetype'); 11 | } 12 | 13 | @font-face{ 14 | font-family: 'Roboto'; 15 | font-weight: 500; 16 | src: url('fonts/Roboto-Regular.ttf') format('truetype'); 17 | } 18 | 19 | @font-face{ 20 | font-family: 'Roboto'; 21 | font-weight: 700; 22 | src: url('fonts/Roboto-Bold.ttf') format('truetype'); 23 | } 24 | 25 | @font-face{ 26 | font-family: 'Roboto'; 27 | font-weight: 900; 28 | src: url('fonts/Roboto-Black.ttf') format('truetype'); 29 | } 30 | 31 | *{ 32 | box-sizing: border-box; 33 | } 34 | 35 | html, body{ 36 | height: 100%; 37 | padding: 0; 38 | margin: 0; 39 | } 40 | 41 | body{ 42 | padding-top: 48px; 43 | font-family: 'Roboto'; 44 | background: linear-gradient(#aaa, #fefefe 60px) no-repeat; 45 | } 46 | 47 | #content{ 48 | display: 
flex; 49 | flex-wrap: wrap; 50 | flex-direction: row; 51 | justify-content: space-evenly; 52 | } 53 | 54 | main{ 55 | width: 70%; 56 | max-width: 800px; 57 | } 58 | 59 | aside{ 60 | width: 30%; 61 | margin-top: 2em; 62 | max-width: 800px; 63 | margin-bottom: 10em; 64 | } 65 | 66 | article{ 67 | padding: 10px; 68 | } 69 | 70 | h1{ 71 | text-align: center; 72 | } 73 | 74 | h2 { 75 | font-size : 1.7em; 76 | margin-bottom: 0; 77 | } 78 | 79 | h2, 80 | h3 { 81 | margin-top : 0; 82 | font-weight: 100; 83 | text-align : center; 84 | } 85 | 86 | .keywords, .searched-in{ 87 | margin: 0; 88 | padding: 0; 89 | list-style-type: none; 90 | } 91 | 92 | .searched-in{ 93 | display: inline-block; 94 | margin-right: 10px; 95 | } 96 | 97 | .searched-in .problem{ 98 | background-color: #fdd; 99 | } 100 | 101 | .searched-in .good{ 102 | background-color: #dfd; 103 | } 104 | 105 | .keywords{ 106 | margin-bottom: 3em; 107 | text-align: center; 108 | } 109 | 110 | .keywords .title{ 111 | color: #888; 112 | font-weight: 500; 113 | } 114 | 115 | .keywords li, .searched-in li{ 116 | margin: 3px; 117 | padding: 5px; 118 | color: #666; 119 | border-radius: 3px; 120 | display: inline-block; 121 | } 122 | 123 | .keywords li:not(:first-child):not(.no-keywords), .searched-in li:not(:first-child):not(.no-data){ 124 | font-family: monospace; 125 | border: 1px solid #aaa; 126 | box-shadow: 0 0 10px 1px #ddd; 127 | } 128 | 129 | .keywords .no-keywords, .searched-in .no-data{ 130 | font-style: italic; 131 | } 132 | 133 | strong{ 134 | font-weight: 500; 135 | } 136 | 137 | details i{ 138 | font-weight: 400; 139 | } 140 | 141 | hr{ 142 | width: 100%; 143 | height: 1px; 144 | border: none; 145 | margin: 30px 0; 146 | background: radial-gradient(#ccc, #ccc 50%, #fefefe 70%); 147 | } 148 | 149 | a{ 150 | color: royalblue; 151 | text-decoration: none; 152 | transition: text-shadow 0.3s; 153 | } 154 | 155 | a:hover{ 156 | text-shadow: 0 0 2px aqua; 157 | transition: text-shadow 0.3s; 158 | } 159 | 160 | 
summary{ 161 | cursor: pointer; 162 | } 163 | 164 | .red-bg, .yellow-bg, .green-bg{ 165 | padding: 3px; 166 | margin: 3px 0; 167 | color: white; 168 | border-radius: 3px; 169 | display: inline-block; 170 | } 171 | 172 | .red-bg{ 173 | background-color: #e12; 174 | } 175 | 176 | .yellow-bg{ 177 | background-color: #db0; 178 | } 179 | 180 | .green-bg{ 181 | background-color: #2a0; 182 | } 183 | 184 | .list{ 185 | list-style-type: none; 186 | padding: 0; 187 | } 188 | 189 | details > summary::-webkit-details-marker, details > summary::marker{ 190 | color: #999; 191 | } 192 | 193 | .list summary::before{ 194 | content: ""; 195 | width: 0.75em; 196 | height: 0.75em; 197 | margin-right: 0.5em; 198 | display: inline-block; 199 | vertical-align: center; 200 | } 201 | 202 | .list summary{ 203 | transition: background-color 0.3s; 204 | } 205 | 206 | .list summary::before{ 207 | transition: background-color 0.3s; 208 | } 209 | 210 | .red-list summary::before{ 211 | background-color: tomato; 212 | } 213 | 214 | .yellow-list summary::before{ 215 | background-color: #fb3; 216 | } 217 | 218 | .green-list summary::before{ 219 | background-color: #3a1; 220 | } 221 | 222 | .grey-list summary::before{ 223 | background-color: #999; 224 | } 225 | 226 | .list summary:hover, .list summary:hover::before{ 227 | transition: background-color 0.3s; 228 | } 229 | 230 | .red-list summary:hover, .red-list summary:hover::before{ 231 | background-color: #f87; 232 | } 233 | 234 | .yellow-list summary:hover, .yellow-list summary:hover::before{ 235 | background-color: #fd7; 236 | } 237 | 238 | .green-list summary:hover, .green-list summary:hover::before{ 239 | background-color: #af7; 240 | } 241 | 242 | .green-list strong{ 243 | color: darkslategray; 244 | } 245 | 246 | .grey-list summary:hover, .grey-list summary:hover::before{ 247 | background-color: #ccc; 248 | } 249 | 250 | .not-important{ 251 | color: #777; 252 | font-size: 0.8em; 253 | } 254 | 255 | details > summary{ 256 | padding: 5px; 257 | 
} 258 | 259 | details > div{ 260 | margin: 0; 261 | color: #444; 262 | padding: 1px 10px 10px 1.5em; 263 | } 264 | 265 | details[open] > div{ 266 | background-color: white; 267 | box-shadow: 0 0 3px 1px #aaa inset; 268 | } 269 | 270 | details{ 271 | margin: 5px 0; 272 | transition: border 0.3s; 273 | border: 1px solid #fefefe; 274 | } 275 | 276 | details[open] > summary{ 277 | background-color: #ddd; 278 | } 279 | 280 | details[open]{ 281 | transition: border 0.3s; 282 | border: 1px solid #aaa; 283 | border-radius: 0 0 5px 5px; 284 | } 285 | 286 | pre{ 287 | margin: 0; 288 | padding: 3px; 289 | font-weight: 100; 290 | color: #073642; 291 | white-space: pre-wrap; 292 | border-radius: 0 0 5px 5px; 293 | background-color: #fdf6e3; 294 | } 295 | 296 | /* 297 | ** Media queries : small screens 298 | */ 299 | 300 | @media screen and (max-width: 961px){ 301 | main, aside{ 302 | width: 100%; 303 | } 304 | 305 | aside{ 306 | margin-top: 0; 307 | } 308 | 309 | aside::before{ 310 | content: ""; 311 | height: 1px; 312 | width: 100%; 313 | display: block; 314 | margin: 30px 0; 315 | background: radial-gradient(#ccc, #ccc 50%, #fefefe 70%); 316 | } 317 | 318 | article{ 319 | padding: 10px; 320 | } 321 | } 322 | -------------------------------------------------------------------------------- /django_check_seo/checks_list/check_title.py: -------------------------------------------------------------------------------- 1 | # Standard Library 2 | import re 3 | 4 | # Third party 5 | from django.utils.translation import gettext as _ 6 | from django.utils.translation import pgettext 7 | 8 | # Local application / specific library imports 9 | from ..checks import custom_list 10 | 11 | 12 | def importance(): 13 | """Scripts with higher importance will be executed first. 14 | 15 | Returns: 16 | int -- Importance of the script. 
17 | """ 18 | return 1 19 | 20 | 21 | def run(site): 22 | """Checks that only one title tag is present, that its content is within a certain range, and that it contains at least one keyword. 23 | 24 | Arguments: 25 | site {Site} -- Structure containing a good amount of resources from the targeted webpage. 26 | """ 27 | 28 | no_title = custom_list.CustomList( 29 | name=_("No meta title tag"), 30 | settings=pgettext("feminin", "one"), 31 | found=_("none"), 32 | description=_( 33 | "Titles tags are ones of the most important things to add to your pages, sinces they are the main text displayed on result search pages." 34 | ), 35 | ) 36 | 37 | too_much = custom_list.CustomList( 38 | name=_("Too much meta title tags"), 39 | settings=pgettext("feminin", "only one"), 40 | description=_( 41 | "Only the first meta title tag will be displayed on the tab space on your browser, and only one meta title tag will be displayed on the search results pages." 42 | ), 43 | ) 44 | 45 | short_title = custom_list.CustomList( 46 | name=_("Meta title tag is too short"), 47 | settings=_("more than {}").format( 48 | site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][0] 49 | ), 50 | description=_( 51 | "Meta titles tags need to describe the content of the page, and need to contain at least a few words." 52 | ), 53 | ) 54 | 55 | title_okay = custom_list.CustomList( 56 | name=_("Meta title tag have a good length"), 57 | settings=_("more than {}").format( 58 | site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][0] 59 | ), 60 | description=_("Meta titles tags need to describe the content of the page."), 61 | ) 62 | 63 | long_title = custom_list.CustomList( 64 | name=_("Meta title tag is too long"), 65 | settings=_("less than {}").format( 66 | site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][1] 67 | ), 68 | description=_( 69 | "Only the first ~55-60 chars are displayed on modern search engines results. 
Writing a longer meta title is not really required and can lead to make the user miss informations." 70 | ), 71 | ) 72 | 73 | no_keyword = custom_list.CustomList( 74 | name=_("Meta title tag do not contain any keyword"), 75 | settings=_("at least one"), 76 | found=_("none"), 77 | description=_( 78 | "Meta titles tags need to contain at least one keyword, since they are one of the most important content of the page for search engines." 79 | ), 80 | ) 81 | 82 | keyword = custom_list.CustomList( 83 | name=_("Keywords found in meta title tag"), 84 | settings=_("at least one"), 85 | description=no_keyword.description, 86 | ) 87 | 88 | # title presence 89 | titles = site.soup.head.find_all("title") 90 | if len(titles) < 1 or titles[0] is None or titles == "None": 91 | site.problems.append(no_title) 92 | return 93 | 94 | # multiple titles 95 | elif titles and len(titles) > 1: 96 | too_much.found = len(titles) 97 | too_much.searched_in = [t.string for t in titles] 98 | site.problems.append(too_much) 99 | 100 | # title length too short 101 | if ( 102 | len(titles[0].text) 103 | < site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][0] 104 | ): 105 | short_title.found = len(titles[0].text) 106 | short_title.searched_in = [ 107 | titles[0].text if len(titles[0].text) > 0 else _("[no content]") 108 | ] 109 | site.problems.append(short_title) 110 | 111 | # title length too long 112 | elif ( 113 | len(titles[0].string) 114 | > site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][1] 115 | ): 116 | long_title.found = len(titles[0].text) 117 | long_title.searched_in = [ 118 | titles[0].text[ 119 | : site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][1] 120 | ] 121 | + '' 122 | + titles[0].text[ 123 | site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][1] : 124 | ] 125 | + "" 126 | ] 127 | site.warnings.append(long_title) 128 | else: 129 | title_okay.found = len(titles[0].text) 130 | title_okay.searched_in = [ 131 | titles[0].string[ 132 | : 
site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][0] 133 | ] 134 | + '' 135 | + titles[0].string[ 136 | site.settings.DJANGO_CHECK_SEO_SETTINGS["meta_title_length"][0] : 137 | ] 138 | + "" 139 | ] 140 | site.success.append(title_okay) 141 | 142 | keyword.found = "" 143 | occurrence = [] 144 | title_text = titles[0].text.lower() 145 | title_text_kw = [] 146 | 147 | for kw in site.keywords: 148 | kw = kw.lower() 149 | nb_occurrences = len( 150 | re.findall( 151 | r"(^| |\n|,|\.|!|\?)" + kw + r"s?($| |\n|,|\.|!|\?)", 152 | title_text, 153 | ) 154 | ) 155 | occurrence.append(nb_occurrences) 156 | 157 | if nb_occurrences > 0: 158 | title_text = title_text.replace( 159 | kw, '' + kw.lower() + "" 160 | ) 161 | if keyword.found != "": 162 | keyword.found += ", " 163 | keyword.found += kw 164 | title_text_kw.append(title_text) 165 | 166 | # title do not contain any keyword 167 | if keyword.found == "": 168 | no_keyword.searched_in = title_text_kw 169 | site.problems.append(no_keyword) 170 | else: 171 | keyword.searched_in = title_text_kw 172 | site.success.append(keyword) 173 | -------------------------------------------------------------------------------- /tests/test_keyword_url.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import unicode_literals 3 | 4 | from bs4 import BeautifulSoup 5 | 6 | from django_check_seo.checks import site 7 | 8 | # Use ./launch_tests.sh to launch these tests. 
9 | 10 | 11 | html_content = """ 12 | 13 | 14 | 15 | 16 | 17 | 18 | """ 19 | 20 | 21 | class init: 22 | def __init__(self): 23 | self.keywords = [] 24 | self.problems = [] 25 | self.warnings = [] 26 | self.success = [] 27 | 28 | self.soup = BeautifulSoup(html_content, features="lxml") 29 | self.full_url = "https://localhost/fake-url/title-of-the-page/" 30 | # populate class with data 31 | self.page_stats = site.Site(self.soup, self.full_url) 32 | 33 | 34 | def test_keyword_url_importance(): 35 | from django_check_seo.checks_list import check_keyword_url 36 | 37 | assert check_keyword_url.importance() == 1 38 | 39 | 40 | def test_keyword_url_kw(): 41 | from django_check_seo.checks_list import check_keyword_url, check_keywords 42 | 43 | site = init() 44 | 45 | check_keywords.run(site) 46 | check_keyword_url.run(site) 47 | 48 | for success in site.success: 49 | if success.name == "Keywords found in URL": 50 | assert success.name == "Keywords found in URL" 51 | assert success.settings == "at least one" 52 | assert success.found == "title" 53 | assert success.searched_in == [ 54 | 'https://localhost/fake-url/title-of-the-page/' 55 | ] 56 | assert ( 57 | success.description 58 | == 'Keywords in URL will help your users understand the organisation of your website, and are a small ranking factor for Google. On the other hand, Bing guidelines advises to "keep [your URL] clean and keyword rich when possible".' 
59 | ) 60 | 61 | 62 | def test_keyword_url_nokw(): 63 | from django_check_seo.checks_list import check_keyword_url, check_keywords 64 | 65 | site = init() 66 | check_keywords.run(site) 67 | site.full_url = "https://localhost/fake-url/notitle-of-the-page" 68 | check_keyword_url.run(site) 69 | 70 | for problem in site.problems: 71 | if problem.name == "No keyword in URL": 72 | assert problem.name == "No keyword in URL" 73 | assert problem.settings == "at least one" 74 | assert problem.found == "none" 75 | assert problem.searched_in == [ 76 | "https://localhost/fake-url/notitle-of-the-page" 77 | ] 78 | 79 | 80 | def test_keyword_url_nokw_root(): 81 | from django_check_seo.checks_list import check_keyword_url, check_keywords 82 | 83 | site = init() 84 | check_keywords.run(site) 85 | site.full_url = "https://localhost/" 86 | check_keyword_url.run(site) 87 | 88 | for success in site.success: 89 | if success.name == "Keywords found in URL": 90 | raise ValueError("We don't expect kw to be found in root URL") 91 | 92 | for problem in site.problems: 93 | if problem.name == "No keyword in URL": 94 | raise ValueError( 95 | "We shouldn't return a problem if a keyword is not found in root URL" 96 | ) 97 | 98 | 99 | def test_keyword_url_kws(): 100 | from django_check_seo.checks_list import check_keyword_url, check_keywords 101 | 102 | site = init() 103 | site.soup.select('meta[name="keywords"]')[0]["content"] = "title, page" 104 | 105 | check_keywords.run(site) 106 | site.full_url = "https://localhost/fake-url/title-of-the-page" 107 | 108 | check_keyword_url.run(site) 109 | 110 | for success in site.success: 111 | if success.name == "Keywords found in URL": 112 | assert success.name == "Keywords found in URL" 113 | assert success.settings == "at least one" 114 | assert success.found == "title, page" 115 | assert success.searched_in == [ 116 | 'https://localhost/fake-url/title-of-the-page' 117 | ] 118 | 119 | 120 | def test_keyword_url_kw_accented(): 121 | from 
django_check_seo.checks_list import check_keyword_url, check_keywords 122 | 123 | site = init() 124 | site.soup.select('meta[name="keywords"]')[0]["content"] = "énergie" 125 | 126 | check_keywords.run(site) 127 | site.full_url = "https://localhost/fake-url/title-of-energie" 128 | 129 | check_keyword_url.run(site) 130 | 131 | for success in site.success: 132 | if success.name == "Keywords found in URL": 133 | assert success.name == "Keywords found in URL" 134 | assert success.settings == "at least one" 135 | assert success.found == "énergie" 136 | assert success.searched_in == [ 137 | 'https://localhost/fake-url/title-of-energie' 138 | ] 139 | 140 | 141 | def test_keyword_url_kws_accented(): 142 | from django_check_seo.checks_list import check_keyword_url, check_keywords 143 | 144 | site = init() 145 | site.soup.select('meta[name="keywords"]')[0]["content"] = "énergie, éééé" 146 | 147 | check_keywords.run(site) 148 | site.full_url = "https://localhost/fake-url/title-eeee-energie" 149 | 150 | check_keyword_url.run(site) 151 | 152 | for success in site.success: 153 | if success.name == "Keywords found in URL": 154 | assert success.name == "Keywords found in URL" 155 | assert success.settings == "at least one" 156 | assert success.found == "énergie, éééé" 157 | assert success.searched_in == [ 158 | 'https://localhost/fake-url/title-eeee-energie' 159 | ] 160 | 161 | 162 | def test_keyword_url_kws_accented_unaccented(): 163 | from django_check_seo.checks_list import check_keyword_url, check_keywords 164 | 165 | site = init() 166 | site.soup.select('meta[name="keywords"]')[0]["content"] = "énergie, title" 167 | 168 | check_keywords.run(site) 169 | site.full_url = "https://localhost/fake-url/title-of-energie" 170 | 171 | check_keyword_url.run(site) 172 | 173 | for success in site.success: 174 | if success.name == "Keywords found in URL": 175 | assert success.name == "Keywords found in URL" 176 | assert success.settings == "at least one" 177 | assert success.found == 
"énergie, title" 178 | assert success.searched_in == [ 179 | 'https://localhost/fake-url/title-of-energie' 180 | ] 181 | -------------------------------------------------------------------------------- /tests/test_h1.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Use ./launch_tests.sh to launch these tests. 4 | 5 | from bs4 import BeautifulSoup 6 | from django_check_seo.checks import site 7 | 8 | html_content = """ 9 | 10 | 11 | 12 | 13 | 14 | 15 |

    Title of the page

"""


class init:
    def __init__(self):
        self.keywords = []
        self.problems = []
        self.warnings = []
        self.success = []

        self.soup = BeautifulSoup(html_content, features="lxml")
        self.full_url = "https://localhost/fake-url/title-of-the-page/"
        # populate class with data
        self.page_stats = site.Site(self.soup, self.full_url)


def test_h1_importance():
    from django_check_seo.checks_list import check_h1

    assert check_h1.importance() == 1


def test_h1_1_nokw():
    from django_check_seo.checks_list import check_h1

    site = init()

    check_h1.run(site)

    for success in site.success:
        assert success.name == "H1 tag found"
        assert success.settings == "exactly 1"
        assert success.found == 1
        assert success.searched_in == ["Title of the page"]
        assert (
            success.description
            == "Google is not really concerned about the number of h1 tags on your page, but Bing clearly indicates in its guidelines for webmasters to use only one h1 tag per page."
        )


def test_h1_2_nokw():
    import copy

    from django_check_seo.checks_list import check_h1

    site = init()
    site.soup.body.append(copy.copy(site.soup.find("h1")))

    check_h1.run(site)

    for problem in site.problems:
        if problem.name == "Too much h1 tags":
            assert problem.name == "Too much h1 tags"
            assert problem.settings == "exactly 1"
            assert problem.found == 2
            assert problem.searched_in == ["Title of the page", "Title of the page"]
            assert (
                problem.description
                == "Google is not really concerned about the number of h1 tags on your page, but Bing clearly indicates in its guidelines for webmasters to use only one h1 tag per page."
            )


def test_h1_0_nokw():
    from django_check_seo.checks_list import check_h1

    site = init()
    site.soup.find("h1").decompose()
    check_h1.run(site)

    for problem in site.problems:
        if problem.name == "No h1 tag":
            assert problem.name == "No h1 tag"
            assert problem.settings == "exactly 1"
            assert problem.found == "none"
            assert problem.searched_in == []
            assert (
                problem.description
                == "Google is not really concerned about the number of h1 tags on your page, but Bing clearly indicates in its guidelines for webmasters to use only one h1 tag per page."
            )


def test_h1_1_nokw_image():
    from django_check_seo.checks_list import check_h1

    site = init()
    site.soup.find("h1").decompose()
    site.soup.body.append(
        (
            BeautifulSoup(
                "

    Title of the page

    ", features="lxml"
            )
        )
    )
    check_h1.run(site)

    for success in site.success:
        assert success.name == "H1 tag found"
        assert success.settings == "exactly 1"
        assert success.found == 1
        assert success.searched_in == ["Title of the page"]
        assert (
            success.description
            == "Google is not really concerned about the number of h1 tags on your page, but Bing clearly indicates in its guidelines for webmasters to use only one h1 tag per page."
        )


def test_h1_1_kw():
    from django_check_seo.checks_list import check_h1
    from django_check_seo.checks_list import check_keywords

    site = init()

    check_keywords.run(site)

    check_h1.run(site)

    for success in site.success:
        if success.name == "Keyword found in h1":
            assert success.name == "Keyword found in h1"
            assert success.settings == "at least one"
            assert success.found == "title"
            assert success.searched_in == ["title of the page"]
            assert (
                success.description
                == "The h1 tag represent the main title of your page, and you may populate it with appropriate content in order to ensure that users (and search engines!) will understand correctly your page."
            )


def test_h1_1_kws():
    from django_check_seo.checks_list import check_h1
    from django_check_seo.checks_list import check_keywords

    site = init()
    site.soup.find("h1").string = "Title of the page description"

    check_keywords.run(site)

    check_h1.run(site)

    for success in site.success:
        if success.name == "Keyword found in h1":
            assert success.name == "Keyword found in h1"
            assert success.settings == "at least one"
            assert success.found == "description, title"
            assert success.searched_in == [
                "title of the page description"
            ]
            assert (
                success.description
                == "The h1 tag represent the main title of your page, and you may populate it with appropriate content in order to ensure that users (and search engines!) will understand correctly your page."
            )


def test_h1_1_kw_strange1():
    from django_check_seo.checks_list import check_h1
    from django_check_seo.checks_list import check_keywords

    site = init()
    site.soup.select('meta[name="keywords"]')[0]["content"] = "@letics"

    site.soup.find("h1").string = "word @letics another-word"

    check_keywords.run(site)

    check_h1.run(site)

    for success in site.success:
        if success.name == "Keyword found in h1":
            assert success.found == "@letics"
            assert success.searched_in == [
                "word @letics another-word"
            ]

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
![Django Check SEO](https://user-images.githubusercontent.com/45763865/114545606-72178380-9c5c-11eb-99dd-1088bb2a0bd9.png)

*Replacing some features of Yoast or SEMrush for Django & Django-CMS users.*

In other words, django-check-seo will tell you if you have problems concerning a broad
range of SEO aspects of your pages.

----

[![PyPI](https://img.shields.io/pypi/v/django-check-seo?color=%232a2)](https://pypi.org/project/django-check-seo/) [![PyPI - Downloads](https://img.shields.io/pypi/dm/django-check-seo?color=%232a2)](https://pypi.org/project/django-check-seo/) [![GitHub last commit](https://img.shields.io/github/last-commit/kapt-labs/django-check-seo)](https://github.com/kapt-labs/django-check-seo)

----

# Install

*Only for django >= 2.2 & python >= 3.7, see [here](https://github.com/kapt-labs/django-check-seo/tree/python2) for a python2/django 1.8-1.11 version (tl;dr: install version <0.6).*

1. Install the module from [PyPI](https://pypi.org/project/django-check-seo/):
    ```
    python3 -m pip install django-check-seo
    ```

2. Add it to your `INSTALLED_APPS`:
    ```
    "django_check_seo",
    ```

3. Add this to your `urls.py` *(if you're using django-cms, put it before the `cms.urls` line or it will not work)*:
    ```
    path("django-check-seo/", include("django_check_seo.urls")),
    ```

4. Update your Django [Site](https://i.imgur.com/pNRsKs7.png) object parameters with a working url (here's an [example](https://i.imgur.com/IedF3xE.png) for dev environment).

5. Add `testserver` (and maybe `www.testserver`) to your `ALLOWED_HOSTS` list in your settings.py (django-check-seo uses the Test Framework to get page content instead of making an HTTP request).

6. ![**new in 1.0.0**](https://img.shields.io/badge/new_in-1.0.0-green) Add the permission (`use_django_check_seo`) to the users/groups you want to give access to.

7. *(optional) Configure the settings (see [config](#config)).*

8. ![that's all folks!](https://i.imgur.com/o2Tcd2E.png)

----

# Misc

This application needs `beautifulsoup4` (>=4.7.0) and `djangocms_page_meta` *(==0.8.5 if using django < 1.11)*.
It may be used with or without `django-cms` (a django-check-seo button will appear in the topbar if you're using django-cms).

If you're not using Django CMS (only Django), here is the link format to access your page reports:

```
https://example.com/django-check-seo/?page=/example-page/
  -> will check https://example.com/example-page/

https://example.com/fr/django-check-seo/?page=/example-page/
  -> will check https://example.com/example-page/
  (using localized url (if you add django-check-seo in i18n_patterns))
```

----

# Config

## Basic settings

The basic config (used by default) is located in [`django-check-seo/conf/settings.py`](https://github.com/kapt-labs/django-check-seo/blob/master/django_check_seo/conf/settings.py#L5-L15) and looks like this:
```python
DJANGO_CHECK_SEO_SETTINGS = {
    "content_words_number": [300, 600],
    "internal_links": 1,
    "external_links": 1,
    "meta_title_length": [30, 60],
    "meta_description_length": [50, 160],
    "keywords_in_first_words": 50,
    "max_link_depth": 3,
    "max_url_length": 70,
}
```

If you need to change something, just define a dict named `DJANGO_CHECK_SEO_SETTINGS` in your settings.py.
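
The override behaviour described above can be pictured as a shallow merge of two dicts. The snippet below is only a minimal sketch, not the package's actual code: `DEFAULT_SETTINGS` and `merge_settings` are hypothetical names, and it assumes that a user-supplied key replaces the default value wholesale (list values are not merged element by element).

```python
# Hypothetical illustration of layering user overrides on top of the defaults.
# DEFAULT_SETTINGS and merge_settings are assumed names, not part of the package.
DEFAULT_SETTINGS = {
    "content_words_number": [300, 600],
    "internal_links": 1,
    "external_links": 1,
    "meta_title_length": [30, 60],
    "meta_description_length": [50, 160],
    "keywords_in_first_words": 50,
    "max_link_depth": 3,
    "max_url_length": 70,
}


def merge_settings(overrides=None):
    """Return a new dict: the defaults, updated with any user-supplied keys."""
    merged = dict(DEFAULT_SETTINGS)  # copy, so the defaults are never mutated
    merged.update(overrides or {})   # keys given by the user win
    return merged


effective = merge_settings({"max_url_length": 100})
print(effective["max_url_length"])  # overridden value
print(effective["internal_links"])  # untouched default
```

Under this assumption, only the keys you define need to appear in your own dict; everything else falls back to the defaults.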

### *Custom settings example:*

If you put this in your `settings.py` file:

```python
DJANGO_CHECK_SEO_SETTINGS = {
    "internal_links": 25,
    "meta_title_length": [15,30],
}
```

Then these will be the settings used by the application:

```python
DJANGO_CHECK_SEO_SETTINGS = {
    "content_words_number": [300, 600],
    "internal_links": 25,  # 1 if using default settings
    "external_links": 1,
    "meta_title_length": [15,30],  # [30, 60] if using default settings
    "meta_description_length": [50, 160],
    "keywords_in_first_words": 50,
    "max_link_depth": 3,
    "max_url_length": 70,
}
```

*Want to know more? See the wiki page [Settings explained](https://github.com/kapt-labs/django-check-seo/wiki/Settings-explained).*

## Templates

The `django_check_seo/default.html` template has an `