├── requirements.txt ├── .github ├── ISSUE_TEMPLATE │ ├── config.yml │ ├── question.yml │ ├── feature-request.yml │ └── bug-report.yml ├── dependabot.yml └── workflows │ ├── download_websites.yml │ ├── format.yml │ ├── stale.yml │ ├── tag.yml │ ├── links_local.yml │ ├── check_domains.yml │ └── sitemaps.yml ├── .gitignore ├── utils └── check_image_sizes.py ├── docs └── en │ └── compare │ ├── rtdetr-vs-yolov7.md │ ├── yolov7-vs-yolov8.md │ ├── yolov10-vs-yolov7.md │ ├── damo-yolo-vs-yolov8.md │ ├── yolov7-vs-rtdetr.md │ ├── yolov8-vs-yolo11.md │ ├── yolov5-vs-rtdetr.md │ ├── yolox-vs-yolov7.md │ ├── yolov7-vs-yolox.md │ ├── yolov5-vs-yolov7.md │ ├── yolov8-vs-yolox.md │ ├── yolov7-vs-yolov9.md │ ├── rtdetr-vs-damo-yolo.md │ ├── yolox-vs-yolov5.md │ └── pp-yoloe-vs-yolov9.md └── README.md /requirements.txt: -------------------------------------------------------------------------------- 1 | beautifulsoup4 2 | requests 3 | pandas 4 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/config.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | blank_issues_enabled: true 4 | contact_links: 5 | - name: 📄 Ultralytics Docs 6 | url: https://docs.ultralytics.com/ 7 | about: Published docs site powered by MkDocs 8 | - name: 💬 Forum 9 | url: https://community.ultralytics.com/ 10 | about: Ask the Ultralytics community for workflow help 11 | - name: 🎧 Discord 12 | url: https://ultralytics.com/discord 13 | about: Chat with the Ultralytics team and other builders 14 | - name: ⌨️ Reddit 15 | url: https://reddit.com/r/ultralytics 16 | about: Discuss Ultralytics projects on Reddit 17 | -------------------------------------------------------------------------------- /.github/dependabot.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | # Dependabot for package version updates 4 | # https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates 5 | 6 | version: 2 7 | updates: 8 | - package-ecosystem: pip 9 | directory: "/" 10 | schedule: 11 | interval: weekly 12 | time: "04:00" 13 | open-pull-requests-limit: 10 14 | labels: 15 | - dependencies 16 | 17 | - package-ecosystem: github-actions 18 | directory: "/.github/workflows" 19 | schedule: 20 | interval: weekly 21 | time: "04:00" 22 | open-pull-requests-limit: 5 23 | labels: 24 | - dependencies 25 | -------------------------------------------------------------------------------- /.github/workflows/download_websites.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | # Action to download Ultralytics website and docs in parallel 4 | 5 | name: Download Websites 6 | 7 | permissions: 8 | contents: none 9 | 10 | on: 11 | workflow_dispatch: 12 | schedule: 13 | - cron: "0 0 * * *" # runs at 00:00 UTC every day 14 | push: 15 | branches: 16 | - gh-pages 17 | 18 | jobs: 19 | Download: 20 | runs-on: ubuntu-latest 21 | # continue-on-error: true 22 | strategy: 23 | matrix: 24 | url: 25 | - https://www.ultralytics.com/ 26 | - https://docs.ultralytics.com/ 27 | - https://handbook.ultralytics.com/ 28 | fail-fast: false 29 | 30 | steps: 31 | - name: Download ${{ matrix.url }} 32 | run: | 33 | mkdir website 34 | wget -P website \ 35 | --recursive \ 36 | 
--no-parent \ 37 | --adjust-extension \ 38 | --reject "*.jpg*,*.jpeg*,*.png*,*.gif*,*.webp*,*.svg*,*.avif*,*.txt,*.ico*,*.bmp*,*.tiff*,*.tif*,*.psd*,*.raw*,*.heic*,*.jfif*,*.webm*,*.mp4*,*.mov*,*.wmv*,*.flv*,*.avi*,*.mkv*" \ 39 | --wait=0.5 \ 40 | --random-wait \ 41 | --reject-regex '/(zh|ko|ja|ru|de|fr|es|pt|ar|tr|vi|it)/.*|(embedly\.com|youtube\.com)' \ 42 | ${{ matrix.url }} 43 | -------------------------------------------------------------------------------- /.github/workflows/format.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | # Ultralytics Actions https://github.com/ultralytics/actions 4 | # This workflow formats code and documentation in PRs to Ultralytics standards 5 | 6 | name: Ultralytics Actions 7 | 8 | on: 9 | issues: 10 | types: [opened] 11 | pull_request: 12 | types: [opened, closed, synchronize, review_requested] 13 | 14 | permissions: 15 | contents: write # Modify code in PRs 16 | pull-requests: write # Add comments and labels to PRs 17 | issues: write # Add comments and labels to issues 18 | 19 | jobs: 20 | actions: 21 | runs-on: ubuntu-latest 22 | steps: 23 | - name: Run Ultralytics Actions 24 | uses: ultralytics/actions@main 25 | with: 26 | token: ${{ secrets._GITHUB_TOKEN || secrets.GITHUB_TOKEN }} # Auto-generated token 27 | labels: true # Auto-label issues/PRs using AI 28 | python: true # Format Python with Ruff and docformatter 29 | prettier: true # Format YAML, JSON, Markdown, CSS 30 | spelling: true # Check spelling with codespell 31 | links: false # Check broken links with Lychee 32 | summary: true # Generate AI-powered PR summaries 33 | openai_api_key: ${{ secrets.OPENAI_API_KEY }} # Powers PR summaries, labels and comments 34 | brave_api_key: ${{ secrets.BRAVE_API_KEY }} # Used for broken link resolution 35 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/question.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | name: ❓ Question 4 | description: "Ask an Ultralytics Docs question" 5 | labels: [question] 6 | body: 7 | - type: markdown 8 | attributes: 9 | value: | 10 | Thank you for asking an Ultralytics Docs ❓ Question! 11 | 12 | - type: checkboxes 13 | attributes: 14 | label: Search before asking 15 | description: > 16 | Please search the Ultralytics Docs [docs](https://docs.ultralytics.com/), [issues](https://github.com/ultralytics/docs/issues), and [Ultralytics discussions](https://github.com/orgs/ultralytics/discussions) to see if a similar question already exists. 17 | options: 18 | - label: > 19 | I checked the docs, issues, and discussions and could not find an answer. 20 | required: true 21 | 22 | - type: textarea 23 | attributes: 24 | label: Question 25 | description: What is your question? Provide as much detail as possible so we can assist with Ultralytics Docs. Include code snippets, screenshots, logs, or links to notebooks/demos. 26 | placeholder: | 27 | 💡 ProTip! Include as much information as possible (logs, tracebacks, screenshots, etc.) to receive the most helpful response. 28 | validations: 29 | required: true 30 | 31 | - type: textarea 32 | attributes: 33 | label: Additional 34 | description: Anything else you would like to share? 
35 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature-request.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | name: 🚀 Feature Request 4 | description: "Suggest an Ultralytics Docs improvement" 5 | labels: [enhancement] 6 | type: "feature" 7 | body: 8 | - type: markdown 9 | attributes: 10 | value: | 11 | Thank you for submitting an Ultralytics Docs 🚀 Feature Request! 12 | 13 | - type: checkboxes 14 | attributes: 15 | label: Search before asking 16 | description: > 17 | Please search the Ultralytics Docs [docs](https://docs.ultralytics.com/) and [issues](https://github.com/ultralytics/docs/issues) to see if a similar feature request already exists. 18 | options: 19 | - label: > 20 | I have searched https://github.com/ultralytics/docs/issues and did not find a similar request. 21 | required: true 22 | 23 | - type: textarea 24 | attributes: 25 | label: Description 26 | description: Briefly describe the feature you would like to see added to Ultralytics Docs. 27 | placeholder: | 28 | What new capability or improvement are you proposing? 29 | validations: 30 | required: true 31 | 32 | - type: textarea 33 | attributes: 34 | label: Use case 35 | description: Explain how this feature would be used and who benefits from it. Screenshots or mockups are welcome. 36 | placeholder: | 37 | How would this feature improve your workflow? 38 | 39 | - type: textarea 40 | attributes: 41 | label: Additional 42 | description: Anything else you would like to share? 43 | 44 | - type: checkboxes 45 | attributes: 46 | label: Are you willing to submit a PR? 47 | description: > 48 | (Optional) We encourage you to submit a [Pull Request](https://github.com/ultralytics/docs/pulls) to help improve Ultralytics Docs. 49 | See the Ultralytics [Contributing Guide](https://docs.ultralytics.com/help/contributing/) to get started. 50 | options: 51 | - label: Yes I'd like to help by submitting a PR! 52 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | .idea/ 30 | .idea 31 | 32 | # PyInstaller 33 | # Usually these files are written by a python script from a template 34 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
35 | *.manifest 36 | *.spec 37 | 38 | # Installer logs 39 | pip-log.txt 40 | pip-delete-this-directory.txt 41 | 42 | # Unit test / coverage reports 43 | htmlcov/ 44 | .tox/ 45 | .nox/ 46 | .coverage 47 | .coverage.* 48 | .cache 49 | nosetests.xml 50 | coverage.xml 51 | *.cover 52 | *.py,cover 53 | .hypothesis/ 54 | .pytest_cache/ 55 | 56 | # Translations 57 | *.mo 58 | *.pot 59 | 60 | # Django stuff: 61 | *.log 62 | local_settings.py 63 | db.sqlite3 64 | db.sqlite3-journal 65 | 66 | # Flask stuff: 67 | instance/ 68 | .webassets-cache 69 | 70 | # Scrapy stuff: 71 | .scrapy 72 | 73 | # Sphinx documentation 74 | docs/_build/ 75 | 76 | # PyBuilder 77 | target/ 78 | 79 | # Jupyter Notebook 80 | .ipynb_checkpoints 81 | 82 | # IPython 83 | profile_default/ 84 | ipython_config.py 85 | 86 | # pyenv 87 | .python-version 88 | 89 | # pipenv 90 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 91 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 92 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 93 | # install all needed dependencies. 94 | #Pipfile.lock 95 | 96 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow 97 | __pypackages__/ 98 | 99 | # Celery stuff 100 | celerybeat-schedule 101 | celerybeat.pid 102 | 103 | # SageMath parsed files 104 | *.sage.py 105 | 106 | # Environments 107 | .env 108 | .venv 109 | env/ 110 | venv/ 111 | ENV/ 112 | env.bak/ 113 | venv.bak/ 114 | 115 | # Spyder project settings 116 | .spyderproject 117 | .spyproject 118 | 119 | # Rope project settings 120 | .ropeproject 121 | 122 | # mkdocs documentation 123 | /site 124 | 125 | # mypy 126 | .mypy_cache/ 127 | .dmypy.json 128 | dmypy.json 129 | 130 | # Pyre type checker 131 | .pyre/ 132 | -------------------------------------------------------------------------------- /.github/workflows/stale.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | name: Close stale issues 4 | 5 | permissions: 6 | contents: read 7 | issues: write 8 | pull-requests: write 9 | 10 | on: 11 | schedule: 12 | - cron: "0 0 * * *" # Runs at 00:00 UTC every day 13 | 14 | jobs: 15 | stale: 16 | runs-on: ubuntu-latest 17 | steps: 18 | - uses: actions/stale@v10 19 | with: 20 | repo-token: ${{ secrets.GITHUB_TOKEN }} 21 | 22 | stale-issue-message: | 23 | 👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help. 24 | 25 | For additional resources and information, please see the links below: 26 | 27 | - **Docs**: https://docs.ultralytics.com 28 | - **HUB**: https://hub.ultralytics.com 29 | - **Community**: https://community.ultralytics.com 30 | 31 | Feel free to inform us of any other **issues** you discover or **feature requests** that come to mind in the future. Pull Requests (PRs) are also always welcomed! 32 | 33 | Thank you for your contributions to YOLO 🚀 and Vision AI ⭐ 34 | 35 | stale-pr-message: | 36 | 👋 Hello there! We wanted to let you know that we've decided to close this pull request due to inactivity. We appreciate the effort you put into contributing to our project, but unfortunately, not all contributions are suitable or aligned with our product roadmap. 
37 | 38 | We hope you understand our decision, and please don't let it discourage you from contributing to open source projects in the future. We value all of our community members and their contributions, and we encourage you to keep exploring new projects and ways to get involved. 39 | 40 | For additional resources and information, please see the links below: 41 | 42 | - **Docs**: https://docs.ultralytics.com 43 | - **HUB**: https://hub.ultralytics.com 44 | - **Community**: https://community.ultralytics.com 45 | 46 | Thank you for your contributions to YOLO 🚀 and Vision AI ⭐ 47 | 48 | days-before-issue-stale: 30 49 | days-before-issue-close: 10 50 | days-before-pr-stale: 90 51 | days-before-pr-close: 30 52 | exempt-issue-labels: "documentation,tutorial,TODO" 53 | operations-per-run: 300 # The maximum number of operations per run, used to control rate limiting. 54 | -------------------------------------------------------------------------------- /.github/workflows/tag.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | # Ultralytics Actions https://github.com/ultralytics/actions 4 | # This workflow automatically publishes a new repository tag and release 5 | 6 | name: Tag and Release 7 | 8 | permissions: 9 | contents: write 10 | 11 | on: 12 | workflow_dispatch: 13 | inputs: 14 | tag_name: 15 | description: "Tag name (e.g., v0.0.0)" 16 | required: true 17 | type: string 18 | publish_tag: 19 | description: "Publish new tag" 20 | required: true 21 | type: boolean 22 | default: true 23 | publish_release: 24 | description: "Publish new release" 25 | required: true 26 | type: boolean 27 | default: true 28 | 29 | jobs: 30 | tag-and-release: 31 | if: github.repository == 'ultralytics/docs' && github.actor == 'glenn-jocher' 32 | name: Tag and Release 33 | runs-on: ubuntu-latest 34 | steps: 35 | - name: Checkout code 36 | uses: actions/checkout@v6 37 | with: 38 | fetch-depth: 0 39 | token: ${{ secrets._GITHUB_TOKEN }} 40 | 41 | - name: Git config 42 | run: | 43 | git config --global user.name "UltralyticsAssistant" 44 | git config --global user.email "web@ultralytics.com" 45 | 46 | - name: Check if tag exists 47 | id: check_tag 48 | run: | 49 | if git rev-parse ${{ github.event.inputs.tag_name }} >/dev/null 2>&1; then 50 | echo "Tag ${{ github.event.inputs.tag_name }} already exists" 51 | echo "tag_exists=true" >> $GITHUB_OUTPUT 52 | else 53 | echo "Tag ${{ github.event.inputs.tag_name }} does not exist" 54 | echo "tag_exists=false" >> $GITHUB_OUTPUT 55 | fi 56 | 57 | - name: Publish new tag 58 | if: steps.check_tag.outputs.tag_exists == 'false' 59 | run: | 60 | git tag -a "${{ github.event.inputs.tag_name }}" -m "$(git log -1 --pretty=%B)" 61 | git push origin "${{ github.event.inputs.tag_name }}" 62 | 63 | - name: Set up Python environment 64 | uses: actions/setup-python@v6 65 | with: 66 | python-version: "3.x" 67 | 68 | - uses: astral-sh/setup-uv@v7 69 | 70 | - name: Install dependencies 71 | run: uv pip install --system ultralytics-actions 72 | 73 | - name: Publish new release 74 | env: 75 | OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }} 76 | GITHUB_TOKEN: ${{ secrets._GITHUB_TOKEN }} 77 | CURRENT_TAG: ${{ github.event.inputs.tag_name }} 78 | run: ultralytics-actions-summarize-release 79 | shell: bash 80 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug-report.yml: 
-------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | name: 🐛 Bug Report 4 | description: "Problems with docs.ultralytics.com" 5 | labels: [bug, triage] 6 | type: "bug" 7 | body: 8 | - type: markdown 9 | attributes: 10 | value: | 11 | Thank you for submitting an Ultralytics Docs 🐛 Bug Report! 12 | 13 | - type: checkboxes 14 | attributes: 15 | label: Search before asking 16 | description: > 17 | Please search the Ultralytics Docs [docs](https://docs.ultralytics.com/) and [issues](https://github.com/ultralytics/docs/issues) to see if a similar bug report already exists. 18 | options: 19 | - label: > 20 | I have searched https://github.com/ultralytics/docs/issues and did not find a similar report. 21 | required: true 22 | 23 | - type: dropdown 24 | attributes: 25 | label: Project area 26 | description: | 27 | Help us route the report to the right maintainers. 28 | multiple: true 29 | options: 30 | - "Content accuracy" 31 | - "Navigation or search" 32 | - "Localization" 33 | - "Build tooling" 34 | - "Links/assets" 35 | - "Other" 36 | validations: 37 | required: false 38 | 39 | - type: textarea 40 | attributes: 41 | label: Bug 42 | description: Please describe the issue in detail so we can reproduce it in Ultralytics Docs. Include logs, screenshots, console output, and any context that helps explain the problem. 43 | placeholder: | 44 | 💡 ProTip! Include as much information as possible (logs, tracebacks, screenshots, etc.) to receive the most helpful response. 45 | validations: 46 | required: true 47 | 48 | - type: textarea 49 | attributes: 50 | label: Environment 51 | description: Share the platform and version information relevant to your report. 52 | placeholder: | 53 | Please include: 54 | - OS (e.g., Ubuntu 20.04, macOS 13.5, Windows 11) 55 | - Language or framework version (Python, Swift, Flutter, etc.) 56 | - Package or app version 57 | - Hardware (e.g., CPU, GPU model, device model) 58 | - Any other environment details 59 | validations: 60 | required: true 61 | 62 | - type: textarea 63 | attributes: 64 | label: Minimal Reproducible Example 65 | description: > 66 | Provide the smallest possible snippet, command, or steps required to reproduce the issue. This helps us pinpoint problems faster. 67 | placeholder: | 68 | ```python 69 | # Code or commands to reproduce your issue here 70 | ``` 71 | validations: 72 | required: false 73 | 74 | - type: textarea 75 | attributes: 76 | label: Additional 77 | description: Anything else you would like to share? 78 | 79 | - type: checkboxes 80 | attributes: 81 | label: Are you willing to submit a PR? 82 | description: > 83 | (Optional) We encourage you to submit a [Pull Request](https://github.com/ultralytics/docs/pulls) to help improve Ultralytics Docs, especially if you know how to fix the issue. 84 | See the Ultralytics [Contributing Guide](https://docs.ultralytics.com/help/contributing/) to get started. 85 | options: 86 | - label: Yes I'd like to help by submitting a PR! 
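The issue forms above follow GitHub's issue-forms YAML schema. As a quick local sanity check that every template parses before GitHub validates it server-side — a minimal sketch assuming PyYAML is installed (it is not listed in requirements.txt):

```python
from pathlib import Path

import yaml  # assumed dev dependency; not in requirements.txt

# Parse each issue form under .github/ISSUE_TEMPLATE to catch YAML syntax errors early.
for path in sorted(Path(".github/ISSUE_TEMPLATE").glob("*.yml")):
    data = yaml.safe_load(path.read_text(encoding="utf-8"))
    assert isinstance(data, dict), f"{path.name} did not parse to a mapping"
    # config.yml defines contact_links rather than body blocks, so default to [].
    print(f"{path.name}: OK ({len(data.get('body', []))} body blocks)")
```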
87 | -------------------------------------------------------------------------------- /.github/workflows/links_local.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | # Continuous Integration (CI) GitHub Actions tests broken link checker using https://github.com/lycheeverse/lychee 4 | # Ignores the following status codes to reduce false positives: 5 | # - 401(Vimeo, 'unauthorized') 6 | # - 403(OpenVINO, 'forbidden') 7 | # - 429(Instagram, 'too many requests') 8 | # - 500(Zenodo, 'cached') 9 | # - 502(Zenodo, 'bad gateway') 10 | # - 999(LinkedIn, 'unknown status code') 11 | 12 | name: Check Broken Repo links 13 | 14 | permissions: 15 | contents: read 16 | 17 | on: 18 | workflow_dispatch: 19 | schedule: 20 | - cron: "0 0 * * *" # runs at 00:00 UTC every day 21 | push: 22 | branches: 23 | - main 24 | - gh-pages 25 | pull_request: 26 | branches: 27 | - main 28 | - gh-pages 29 | 30 | jobs: 31 | Links: 32 | runs-on: ubuntu-latest 33 | steps: 34 | - name: Checkout code 35 | uses: actions/checkout@v6 36 | 37 | - name: Install lychee 38 | run: curl -sSfL "https://github.com/lycheeverse/lychee/releases/latest/download/lychee-x86_64-unknown-linux-gnu.tar.gz" | sudo tar xz -C /usr/local/bin 39 | 40 | - name: Test Markdown and HTML links with retry 41 | uses: ultralytics/actions/retry@main 42 | with: 43 | timeout_minutes: 60 44 | retry_delay_seconds: 3 45 | retries: 2 46 | run: | 47 | rm -rf .lycheecache 48 | lychee \ 49 | --scheme 'https' \ 50 | --timeout 60 \ 51 | --insecure \ 52 | --accept 100..=103,200..=299,401,403,429,500,502,999 \ 53 | --exclude-all-private \ 54 | --exclude 'https?://(www\.)?(linkedin\.com|twitter\.com|instagram\.com|kaggle\.com|tiktok\.com|fonts\.gstatic\.com|fonts\.googleapis\.com|url\.com|tesla\.com)' \ 55 | --exclude-path './**/ci.yml' \ 56 | --github-token ${{ secrets.GITHUB_TOKEN }} \ 57 | --header "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.6478.183 Safari/537.36" \ 58 | './**/*.md' \ 59 | './**/*.html' 60 | 61 | - name: Test Website, Markdown, HTML, YAML, Python and Notebook links with retry 62 | if: github.event_name == 'workflow_dispatch' 63 | uses: ultralytics/actions/retry@main 64 | with: 65 | timeout_minutes: 60 66 | retry_delay_seconds: 3 67 | retries: 2 68 | run: | 69 | rm -rf .lycheecache 70 | lychee \ 71 | --scheme 'https' \ 72 | --timeout 60 \ 73 | --insecure \ 74 | --accept 100..=103,200..=299,401,403,429,500,502,999 \ 75 | --exclude-all-private \ 76 | --exclude 'https?://(www\.)?(linkedin\.com|twitter\.com|instagram\.com|kaggle\.com|tiktok\.com|fonts\.gstatic\.com|fonts\.googleapis\.com|url\.com|tesla\.com)' \ 77 | --exclude-path './**/ci.yml' \ 78 | --github-token ${{ secrets.GITHUB_TOKEN }} \ 79 | --header "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.6478.183 Safari/537.36" \ 80 | './**/*.md' \ 81 | './**/*.html' \ 82 | './**/*.yml' \ 83 | './**/*.yaml' \ 84 | './**/*.py' \ 85 | './**/*.ipynb' 86 | -------------------------------------------------------------------------------- /.github/workflows/check_domains.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | # Ultralytics website domain checks 4 | 5 | name: Check Domains 6 | 7 | permissions: 8 | contents: read 9 | 10 | on: 11 | schedule: 12 | # Runs every day
at 05:00 UTC 13 | - cron: "0 5 * * *" 14 | workflow_dispatch: 15 | 16 | jobs: 17 | Test: 18 | runs-on: ubuntu-latest 19 | strategy: 20 | fail-fast: false 21 | matrix: 22 | domain: 23 | [ 24 | "ultralytics.com", 25 | "ultralytics.co", 26 | "ultralytics.ai", 27 | "ultralytics.app", 28 | "ultralytics.eu", 29 | "ultralytics.es", 30 | "ultralytics.us", 31 | "ultralytics.cn", 32 | "ultralytics.com.cn", 33 | "ultralytics.io", 34 | "ultralytics.net", 35 | "ultralytics.org", 36 | "ultralitics.com", 37 | "ultralytiks.com", 38 | "ultralyitcs.com", 39 | "ultralyics.com", 40 | "ultralytcs.com", 41 | "ultralytycs.com", 42 | "ultraltics.com", 43 | "ultralyctics.com", 44 | "ultralytix.com", 45 | "ultralytic.com", 46 | "ultrlaytics.com", 47 | "ultraltyics.com", 48 | "pjreddie.org", 49 | "pjreddie.net", 50 | "yolov5.com", 51 | "yolo11.com", 52 | "yolo11.ai", 53 | "yolo11.io", 54 | "yolo11.net", 55 | "yolo11.org", 56 | "yolo14.com", 57 | "yolo15.com", 58 | "yolo19.com", 59 | "yolo-vision.com", 60 | ] 61 | prefix: ["www.", ""] 62 | steps: 63 | - name: Set up Python 64 | uses: actions/setup-python@v6 65 | with: 66 | python-version: "3.x" 67 | - uses: astral-sh/setup-uv@v7 68 | - name: Install dependencies 69 | run: uv pip install --system requests 70 | - name: Check domain redirections 71 | shell: python 72 | run: | 73 | import requests 74 | import time 75 | 76 | def check_domain_redirection(domain, prefix, max_attempts=5): 77 | """Check if the given domain redirects correctly, with delays between retries.""" 78 | valid_destinations = ["ultralytics.com", "yolo11.com"] 79 | url = f"https://{prefix}{domain}" 80 | print(f"\nChecking {url}") 81 | 82 | for attempt in range(max_attempts): 83 | try: 84 | if attempt > 0: 85 | delay = 2 ** attempt # 2, 4, 8, 16, 32 seconds... 86 | time.sleep(delay) 87 | 88 | response = requests.get(url, allow_redirects=True, timeout=10) 89 | response.raise_for_status() 90 | 91 | # Check if the final URL contains any of the valid destinations 92 | if any(dest in response.url for dest in valid_destinations) and response.status_code == 200: 93 | print("Success ✅") 94 | return True 95 | 96 | except requests.RequestException as e: 97 | print(f"Error: {e}") 98 | if attempt == max_attempts - 1: 99 | print(f"Failed after {max_attempts} attempts ❌") 100 | return False 101 | 102 | return False 103 | 104 | success = check_domain_redirection('${{ matrix.domain }}', '${{ matrix.prefix }}') 105 | if not success: 106 | raise Exception(f"Domain check failed for ${{ matrix.domain }} with prefix '${{ matrix.prefix }}'") 107 | 108 | Summary: 109 | runs-on: ubuntu-latest 110 | needs: [Test] 111 | if: always() 112 | steps: 113 | - name: Check for failure and notify 114 | if: needs.Test.result == 'failure' && github.repository == 'ultralytics/docs' && (github.event_name == 'schedule' || github.event_name == 'push') && github.run_attempt == '1' 115 | uses: slackapi/slack-github-action@v2.1.1 116 | with: 117 | webhook-type: incoming-webhook 118 | webhook: ${{ secrets.SLACK_WEBHOOK_URL_WEBSITE }} 119 | payload: | 120 | text: " GitHub Actions error for ${{ github.workflow }} ❌\n\n\n*Repository:* https://github.com/${{ github.repository }}\n*Action:* https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}\n*Author:* ${{ github.actor }}\n*Event:* ${{ github.event_name }}\n" 121 | -------------------------------------------------------------------------------- /.github/workflows/sitemaps.yml: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 
AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | # Submit Sitemaps to Google Search Console after Pages Deployment 4 | 5 | name: Submit Sitemaps 6 | 7 | permissions: 8 | contents: read 9 | 10 | on: 11 | workflow_dispatch: 12 | inputs: 13 | submit_all_urls: 14 | type: boolean 15 | description: Submit all URLs to IndexNow (do not filter by changed) 16 | default: false 17 | workflow_run: 18 | workflows: ["pages-build-deployment"] 19 | types: 20 | - completed 21 | 22 | jobs: 23 | submit-sitemaps: 24 | runs-on: ubuntu-latest 25 | if: ${{ github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success' || github.event_name == 'workflow_dispatch' }} 26 | 27 | steps: 28 | - name: Checkout Repo 29 | uses: actions/checkout@v6 30 | with: 31 | ref: gh-pages # checkout gh-pages branch 32 | fetch-depth: 2 # fetch the current and previous commit 33 | 34 | - name: Set up Python 35 | uses: actions/setup-python@v6 36 | with: 37 | python-version: "3.x" 38 | 39 | - uses: astral-sh/setup-uv@v7 40 | 41 | - name: Install Dependencies 42 | run: uv pip install --system google-api-python-client oauth2client packaging 43 | 44 | - name: Get modified files 45 | id: modified_files 46 | run: | 47 | modified_files=$(git diff --name-only HEAD^ HEAD | tr '\n' ' ') 48 | echo "Modified files: $modified_files" 49 | echo "MODIFIED_FILES=$modified_files" >> $GITHUB_ENV 50 | 51 | - name: Submit Sitemaps to Google 52 | env: 53 | CREDENTIALS_JSON: ${{ secrets.GOOGLE_SEARCH_CONSOLE_API_JSON }} 54 | shell: python 55 | run: | 56 | import os 57 | import json 58 | from googleapiclient.discovery import build 59 | from oauth2client.service_account import ServiceAccountCredentials 60 | def submit_sitemap(site_url, sitemap_url, credentials_json): 61 | try: 62 | credentials = ServiceAccountCredentials.from_json_keyfile_dict(json.loads(credentials_json), ['https://www.googleapis.com/auth/webmasters']) 63 | webmasters_service = build('webmasters', 'v3', credentials=credentials) 64 | webmasters_service.sitemaps().submit(siteUrl=site_url, feedpath=sitemap_url).execute() 65 | print(f'Submitted {sitemap_url} ✅') 66 | except Exception as e: 67 | print(f'ERROR ❌: {sitemap_url} failed to submit {e}') 68 | credentials_json = os.environ['CREDENTIALS_JSON'] 69 | # Submit sitemaps for each language 70 | for host in ["www.ultralytics.com", "docs.ultralytics.com"]: 71 | for lang in ['', '/zh', '/ko', '/ja', '/ru', '/de', '/fr', '/es', '/pt', '/ar', '/tr', '/vi', '/it']: 72 | sitemap = f'https://{host}{lang}/sitemap.xml' 73 | submit_sitemap(f'https://{host}/', sitemap, credentials_json) 74 | 75 | - name: Submit URLs to IndexNow 76 | env: 77 | INDEXNOW_KEY: ${{ secrets.INDEXNOW_KEY_DOCS }} 78 | SUBMIT_ALL_URLS: ${{ github.event.inputs.submit_all_urls }} 79 | shell: python 80 | run: | 81 | import json 82 | import os 83 | import re 84 | import requests 85 | 86 | def submit_urls_to_indexnow(host, urls): 87 | key = os.environ['INDEXNOW_KEY'] 88 | endpoint = "https://api.indexnow.org/indexnow" # static API endpoint from https://www.indexnow.org/faq 89 | headers = {"Content-Type": "application/json; charset=utf-8"} 90 | payload = {"host": host, "key": key, "urlList": urls, "keyLocation": f"https://{host}/{key}.txt"} 91 | try: 92 | response = requests.post(endpoint, headers=headers, data=json.dumps(payload)) 93 | if response.status_code == 200: 94 | print(f"Submitted batch of {len(urls)} {host} URLs to IndexNow endpoint {endpoint} ✅") 95 | else: 96 | print(f"Failed to submit batch of URLs: Status code {response.status_code}, 
Response: {response.text}") 97 | except Exception as e: 98 | print(f"ERROR ❌: Failed to submit batch of URLs - {e}") 99 | 100 | def extract_urls_from_sitemap(sitemap_url): 101 | try: 102 | response = requests.get(sitemap_url) 103 | return re.findall(r"<loc>(.*?)</loc>", response.text) 104 | except Exception as e: 105 | print(f"ERROR ❌: Failed to extract URLs from {sitemap_url} - {e}") 106 | return [] 107 | 108 | def filter_modified_urls(urls, modified_files): 109 | # Filter URLs based on modified files 110 | modified_urls = [] 111 | for file in modified_files: 112 | # Convert file path to URL path, i.e. 'modes/index.html' -> 'https://docs.ultralytics.com/modes/' 113 | full_url = f'https://{host}/{file.replace('index.html', '')}' 114 | if full_url in urls: 115 | modified_urls.append(full_url) 116 | return modified_urls 117 | 118 | # Submit URLs from each sitemap to IndexNow 119 | host = "docs.ultralytics.com" 120 | all_urls = [] 121 | for lang in ['', '/zh', '/ko', '/ja', '/ru', '/de', '/fr', '/es', '/pt', '/ar', '/tr', '/vi', '/it']: 122 | sitemap = f'https://{host}{lang}/sitemap.xml' 123 | lang_urls = extract_urls_from_sitemap(sitemap) 124 | all_urls.extend(lang_urls) 125 | print(f'Found {len(lang_urls)} in {sitemap} ({len(all_urls)} total)') 126 | 127 | # Filter URLs based on modified files 128 | if os.getenv('SUBMIT_ALL_URLS', 'false').lower() == 'true': 129 | urls_to_submit = all_urls 130 | else: 131 | urls_to_submit = filter_modified_urls(all_urls, os.environ['MODIFIED_FILES'].split()) 132 | print(f'\nFound {len(urls_to_submit)} URLs updated in last commit to submit:\n{"\n".join(urls_to_submit)}\n') 133 | 134 | # Submit filtered URLs 135 | if urls_to_submit: 136 | submit_urls_to_indexnow(host, urls_to_submit) 137 | -------------------------------------------------------------------------------- /utils/check_image_sizes.py: -------------------------------------------------------------------------------- 1 | # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license 2 | 3 | import os 4 | import sys 5 | from collections import defaultdict 6 | from concurrent.futures import ThreadPoolExecutor 7 | from pathlib import Path 8 | from urllib.parse import urljoin, urlparse 9 | 10 | import requests 11 | from bs4 import BeautifulSoup 12 | 13 | HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"} 14 | 15 | # URLs to ignore when checking image sizes 16 | URL_IGNORE_LIST = { 17 | # Add image URLs here that should be excluded from size checks 18 | # Example: "https://example.com/large-banner.png", 19 | } 20 | 21 | 22 | def check_image_sizes(download_dir, website, threshold_kb=750, max_workers=32, ignore_gifs=False): 23 | """Check image sizes in downloaded HTML files and report large images.""" 24 | print(f"Scanning {download_dir} for images...") 25 | unique_images = defaultdict(set) 26 | 27 | # Scan downloaded HTML files for image URLs 28 | for html_file in Path(download_dir).rglob("*.html"): 29 | try: 30 | with open(html_file, encoding="utf-8") as f: 31 | soup = BeautifulSoup(f.read(), "html.parser") 32 | page_url = f"https://{website}/{html_file.relative_to(download_dir)}".replace( 33 | "/index.html", "/" 34 | ).removesuffix(".html") 35 | for img in soup.find_all("img", src=True): 36 | img_url = urljoin(f"https://{website}", img["src"]) 37 | if img_url not in URL_IGNORE_LIST and (not ignore_gifs or not img_url.lower().endswith(".gif")): 38 | unique_images[img_url].add(page_url) 39 | except Exception: 40 | pass 41 | 42 | print(f"Found {len(unique_images)} unique
images") 43 | 44 | # Check sizes 45 | def get_size(url): 46 | """Get file size and format for a URL.""" 47 | try: 48 | response = requests.head(url, allow_redirects=True, timeout=10, headers=HEADERS) 49 | size = int(response.headers.get("content-length", 0)) 50 | if size == 0: 51 | response = requests.get(url, allow_redirects=True, timeout=10, stream=True, headers=HEADERS) 52 | size = int(response.headers.get("content-length", 0)) 53 | response.close() 54 | 55 | # Get format from Content-Type header first (more reliable) 56 | content_type = response.headers.get("content-type", "").lower() 57 | format_map = { 58 | "image/jpeg": ".jpg", 59 | "image/jpg": ".jpg", 60 | "image/png": ".png", 61 | "image/gif": ".gif", 62 | "image/webp": ".webp", 63 | "image/svg+xml": ".svg", 64 | "image/avif": ".avif", 65 | "image/bmp": ".bmp", 66 | "image/tiff": ".tiff", 67 | } 68 | fmt = format_map.get(content_type.split(";")[0].strip()) 69 | 70 | # Fallback to URL parsing - check original URL first, then redirected URL 71 | if not fmt: 72 | fmt = Path(urlparse(url).path).suffix.lower() 73 | if not fmt: 74 | final_url = response.url if response.history else url 75 | fmt = Path(urlparse(final_url).path).suffix.lower() or ".unknown" 76 | 77 | return url, size, fmt 78 | except Exception: 79 | return url, None, None 80 | 81 | # Collect all image data 82 | all_images = [] 83 | with requests.Session() as session: 84 | session.headers.update(HEADERS) 85 | with ThreadPoolExecutor(max_workers=max_workers) as executor: 86 | for url, size, fmt in executor.map(get_size, unique_images.keys()): 87 | if size: 88 | all_images.append((size / 1024, fmt, len(unique_images[url]), url)) 89 | 90 | all_images.sort(reverse=True) 91 | 92 | # Print statistics 93 | if all_images: 94 | import pandas as pd 95 | 96 | df = pd.DataFrame(all_images, columns=["Size (KB)", "Format", "Pages", "URL"]) 97 | 98 | # Format statistics 99 | format_stats = ( 100 | df.groupby("Format") 101 | .agg({"URL": "count", "Size (KB)": ["min", "max", "mean", "sum"]}) 102 | .round(1) 103 | .reset_index() 104 | ) 105 | format_stats.columns = ["Format", "Count", "Min (KB)", "Max (KB)", "Mean (KB)", "Total (MB)"] 106 | format_stats["Total (MB)"] = (format_stats["Total (MB)"] / 1024).round(1) 107 | format_stats = format_stats.sort_values("Total (MB)", ascending=False) 108 | 109 | print("\nImage Format Statistics:") 110 | print(format_stats.to_string(index=False)) 111 | print(f"\nTotal images processed: {len(all_images)}") 112 | 113 | # Print top 50 largest 114 | print("\nTop 50 Largest Images:") 115 | top_50 = df.head(50).copy() 116 | top_50["Size (KB)"] = top_50["Size (KB)"].round(1) 117 | top_50["Example Page"] = top_50["URL"].apply(lambda url: next(iter(unique_images[url]))) 118 | top_50["URL"] = top_50["URL"].apply(lambda x: x if len(x) <= 120 else f"{x[:60]}...{x[-57:]}") 119 | print(top_50[["URL", "Pages", "Size (KB)", "Format", "Example Page"]].to_string(index=False)) 120 | 121 | # Check for large images above threshold 122 | large_images = [(size_kb, fmt, pages, url) for size_kb, fmt, pages, url in all_images if size_kb > threshold_kb] 123 | 124 | if large_images: 125 | print(f"\n⚠️ Found {len(large_images)} images > {threshold_kb} KB") 126 | output = [f"*{len(large_images)} images > {threshold_kb}KB*"] 127 | for size_kb, fmt, pages, url in large_images[:10]: 128 | # Extract filename from URL for concise display 129 | filename = Path(urlparse(url).path).name or "image" 130 | # Append format if filename doesn't have an extension 131 | if not 
Path(filename).suffix and fmt: 132 | filename = f"{filename}{fmt}" 133 | # Truncate from start if too long, keeping extension visible 134 | if len(filename) > 40: 135 | filename = f"...{filename[-37:]}" 136 | # Get first page URL for context 137 | page_url = next(iter(unique_images[url])) 138 | # Format as Slack hyperlink to avoid auto-expansion: 139 | output.append(f"• {size_kb:.0f}KB <{url}|{filename}> ➜ {page_url}") 140 | if len(large_images) > 10: 141 | repo = os.environ.get("GITHUB_REPOSITORY", "") 142 | run_id = os.environ.get("GITHUB_RUN_ID", "") 143 | if repo and run_id: 144 | output.append( 145 | f"...{len(large_images) - 10} more ➜ <https://github.com/{repo}/actions/runs/{run_id}|view all>" 146 | ) 147 | 148 | result = "\\n".join(output) 149 | with open(os.environ["GITHUB_ENV"], "a") as f: 150 | f.write(f"IMAGE_RESULTS<<EOF\n{result}\nEOF\n") 151 | return 1 152 | 153 | print(f"\n✅ No images > {threshold_kb} KB") 154 | return 0 155 | 156 | 157 | if __name__ == "__main__": 158 | if len(sys.argv) < 3: 159 | print("Usage: python check_image_sizes.py <download_dir> <website> [ignore_gifs]") 160 | sys.exit(1) 161 | 162 | download_dir = sys.argv[1] 163 | website = sys.argv[2] 164 | ignore_gifs = sys.argv[3].lower() in ("true", "1", "yes") if len(sys.argv) > 3 else False 165 | sys.exit(check_image_sizes(download_dir, website, ignore_gifs=ignore_gifs)) 166 | -------------------------------------------------------------------------------- /docs/en/compare/rtdetr-vs-yolov7.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Compare RTDETRv2 and YOLOv7 for object detection. Explore their architecture, performance, and use cases to choose the best model for your needs. 4 | keywords: RTDETRv2, YOLOv7, object detection, model comparison, computer vision, machine learning, performance metrics, real-time detection, transformer models, YOLO 5 | --- 6 | 7 | # RTDETRv2 vs YOLOv7: A Detailed Technical Comparison 8 | 9 | The landscape of real-time [object detection](https://docs.ultralytics.com/tasks/detect/) has witnessed a fierce competition between Convolutional Neural Networks (CNNs) and the emerging Vision Transformers (ViTs). Two significant milestones in this evolution are **RTDETRv2** (Real-Time Detection Transformer v2) and **YOLOv7** (You Only Look Once version 7). While YOLOv7 represents the pinnacle of efficient CNN architecture optimization, RTDETRv2 introduces the power of transformers to eliminate the need for post-processing steps like Non-Maximum Suppression (NMS). 10 | 11 | This comparison explores the technical specifications, architectural differences, and performance metrics of both models to help developers choose the right tool for their [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) applications. 12 | 13 | 14 | 15 | 16 | 17 | 18 | ## Performance Metrics: Accuracy vs. Speed 19 | 20 | The following table presents a direct comparison of key performance metrics. **RTDETRv2-x** demonstrates superior accuracy with a higher mAP, largely due to its transformer-based global context understanding. However, **YOLOv7** remains competitive, particularly in scenarios where lighter weight and balanced inference speeds on varying hardware are required. 21 | 22 | | Model | size
(pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs
(B) | 23 | | ---------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 24 | | RTDETRv2-s | 640 | 48.1 | - | **5.03** | **20** | **60** | 25 | | RTDETRv2-m | 640 | 51.9 | - | 7.51 | 36 | 100 | 26 | | RTDETRv2-l | 640 | 53.4 | - | 9.76 | 42 | 136 | 27 | | RTDETRv2-x | 640 | **54.3** | - | 15.03 | 76 | 259 | 28 | | | | | | | | | 29 | | YOLOv7l | 640 | 51.4 | - | 6.84 | 36.9 | 104.7 | 30 | | YOLOv7x | 640 | 53.1 | - | 11.57 | 71.3 | 189.9 | 31 | 32 | ## RTDETRv2: The Transformer Approach 33 | 34 | RTDETRv2 builds upon the success of the original RT-DETR, the first transformer-based detector to genuinely rival YOLO models in real-time speed. Developed by researchers at **Baidu**, it addresses the computational bottlenecks associated with multi-scale interaction in standard DETR architectures. 35 | 36 | - **Authors:** Wenyu Lv, Yian Zhao, Qinyao Chang, Kui Huang, Guanzhong Wang, and Yi Liu 37 | - **Organization:** [Baidu](https://www.baidu.com/) 38 | - **Date:** 2023-04-17 39 | - **Arxiv:** [https://arxiv.org/abs/2304.08069](https://arxiv.org/abs/2304.08069) 40 | - **GitHub:** [https://github.com/lyuwenyu/RT-DETR/tree/main/rtdetrv2_pytorch](https://github.com/lyuwenyu/RT-DETR/tree/main/rtdetrv2_pytorch) 41 | 42 | ### Key Architectural Features 43 | 44 | RTDETRv2 utilizes a **hybrid encoder** that efficiently processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion. This design significantly reduces computational costs compared to standard transformers. A standout feature is its **IoU-aware query selection**, which improves the initialization of object queries, leading to faster convergence and higher accuracy. Unlike CNN-based models, RTDETRv2 is **NMS-free**, meaning it does not require Non-Maximum Suppression post-processing, simplifying the deployment pipeline and reducing latency jitter. 45 | 46 | !!! info "Transformer Advantage" 47 | 48 | The primary advantage of the RTDETRv2 architecture is its ability to capture global context. While CNNs look at localized receptive fields, the self-attention mechanism in transformers allows the model to consider the entire image context when detecting objects, which is beneficial for resolving ambiguities in complex scenes with occlusion. 49 | 50 | [Learn more about RT-DETR](https://docs.ultralytics.com/models/rtdetr/){ .md-button } 51 | 52 | ## YOLOv7: The CNN Peak 53 | 54 | YOLOv7 pushes the boundaries of what is possible with Convolutional Neural Networks. It focuses on optimizing the training process and model architecture to achieve a "bag-of-freebies"—methods that increase accuracy without increasing inference cost. 55 | 56 | - **Authors:** Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao 57 | - **Organization:** Institute of Information Science, Academia Sinica 58 | - **Date:** 2022-07-06 59 | - **Arxiv:** [https://arxiv.org/abs/2207.02696](https://arxiv.org/abs/2207.02696) 60 | - **GitHub:** [https://github.com/WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7) 61 | 62 | ### Key Architectural Features 63 | 64 | YOLOv7 introduces **E-ELAN** (Extended Efficient Layer Aggregation Network), which enhances the network's learning capability by controlling the gradient path length. It also employs **model re-parameterization**, a technique where the model structure is complex during training for better learning but simplified during inference for speed. 
This allows YOLOv7 to maintain high performance on [GPU devices](https://www.ultralytics.com/glossary/gpu-graphics-processing-unit) while keeping parameters relatively low compared to transformer models. 65 | 66 | [Learn more about YOLOv7](https://docs.ultralytics.com/models/yolov7/){ .md-button } 67 | 68 | ## Comparison Analysis 69 | 70 | ### Architecture and Versatility 71 | 72 | The fundamental difference lies in the backbone and head design. YOLOv7 relies on deep CNN structures which are highly optimized for [CUDA](https://developer.nvidia.com/cuda) acceleration but may struggle with long-range dependencies in an image. RTDETRv2 leverages attention mechanisms to understand relationships between distant pixels, making it robust in cluttered environments. However, this comes at the cost of higher memory consumption during training. 73 | 74 | Ultralytics models like **YOLO11** bridge this gap by offering a CNN-based architecture that integrates modern attention-like modules, providing the speed of CNNs with the accuracy usually reserved for transformers. Furthermore, while RTDETRv2 is primarily an object detector, newer Ultralytics models support [instance segmentation](https://docs.ultralytics.com/tasks/segment/), [pose estimation](https://docs.ultralytics.com/tasks/pose/), and [classification](https://docs.ultralytics.com/tasks/classify/) natively. 75 | 76 | ### Training and Ease of Use 77 | 78 | Training transformer models like RTDETRv2 typically requires significant GPU memory and longer training epochs to converge compared to CNNs like YOLOv7. 79 | 80 | For developers seeking **Training Efficiency** and **Ease of Use**, the Ultralytics ecosystem offers a distinct advantage. With the `ultralytics` Python package, users can train, validate, and deploy models with just a few lines of code, accessing a suite of pre-trained weights for varying tasks. 81 | 82 | ```python 83 | from ultralytics import RTDETR, YOLO 84 | 85 | # Load the recommended Ultralytics YOLO11 model 86 | model_yolo = YOLO("yolo11n.pt") # Recommended for best performance 87 | model_yolo.train(data="coco8.yaml", epochs=10) 88 | 89 | # Load RT-DETR for comparison 90 | model_rtdetr = RTDETR("rtdetr-l.pt") 91 | model_rtdetr.predict("asset.jpg") 92 | ``` 93 | 94 | ### Deployment and Ecosystem 95 | 96 | YOLOv7 has widespread support due to its age, but integration into modern MLOps pipelines can be manual. RTDETRv2 is newer and has growing support. In contrast, **Ultralytics** models benefit from a **Well-Maintained Ecosystem**, including seamless export to [ONNX](https://docs.ultralytics.com/integrations/onnx/), TensorRT, and CoreML, and integration with tools like [Ultralytics HUB](https://docs.ultralytics.com/hub/) for cloud training and dataset management. 97 | 98 | ## Ideal Use Cases 99 | 100 | - **Choose RTDETRv2 if:** You have ample GPU memory and require high precision in scenes with heavy occlusion or crowding, where NMS traditionally fails. It is excellent for research and high-end surveillance systems. 101 | - **Choose YOLOv7 if:** You need a proven, legacy CNN architecture that runs efficiently on standard GPU hardware for general-purpose detection tasks. 102 | - **Choose Ultralytics YOLO11 if:** You need the best **Performance Balance** of speed and accuracy, lower **Memory requirements**, and a versatile model capable of detection, segmentation, and pose estimation.
It is the ideal choice for developers who value a streamlined workflow and extensive [documentation](https://docs.ultralytics.com/). 103 | 104 | !!! tip "Why Upgrade to YOLO11?" 105 | 106 | While YOLOv7 and RTDETRv2 are powerful, **YOLO11** represents the latest evolution in vision AI. It requires less CUDA memory than transformers, trains faster, and offers state-of-the-art accuracy across a wider range of hardware, from edge devices to cloud servers. 107 | 108 | ## Conclusion 109 | 110 | Both RTDETRv2 and YOLOv7 have shaped the direction of computer vision. RTDETRv2 successfully challenged the notion that transformers are too slow for real-time applications, while YOLOv7 demonstrated the enduring efficiency of CNNs. However, for most real-world applications today, the **Ultralytics YOLO11** model offers a superior developer experience, combining the best attributes of these predecessors with a modern, supportive ecosystem. 111 | 112 | ## Explore Other Comparisons 113 | 114 | To further understand the model landscape, explore these comparisons: 115 | 116 | - [YOLO11 vs. RT-DETR](https://docs.ultralytics.com/compare/yolo11-vs-rtdetr/) 117 | - [YOLOv8 vs. RT-DETR](https://docs.ultralytics.com/compare/rtdetr-vs-yolov8/) 118 | - [YOLOv7 vs. YOLOv8](https://docs.ultralytics.com/compare/yolov7-vs-yolov8/) 119 | - [YOLOv10 vs. RT-DETR](https://docs.ultralytics.com/compare/yolov10-vs-rtdetr/) 120 | - [YOLOv9 vs. YOLOv7](https://docs.ultralytics.com/compare/yolov7-vs-yolov9/) 121 | -------------------------------------------------------------------------------- /docs/en/compare/yolov7-vs-yolov8.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Compare YOLOv7 and YOLOv8 for object detection. Explore performance, architecture, and use cases to choose the best model for your vision tasks. 4 | keywords: YOLOv7, YOLOv8, object detection, model comparison, computer vision, real-time detection, performance benchmarks, deep learning, Ultralytics 5 | --- 6 | 7 | # Model Comparison: YOLOv7 vs. YOLOv8 for Object Detection 8 | 9 | In the rapidly evolving landscape of computer vision, the "You Only Look Once" (YOLO) family of models has consistently set the standard for real-time object detection. Two significant milestones in this lineage are YOLOv7 and Ultralytics YOLOv8. While both models pushed the boundaries of accuracy and speed upon their release, they represent different design philosophies and ecosystem maturities. 10 | 11 | This guide provides a detailed technical comparison to help developers and researchers choose the right tool for their specific needs, ranging from academic research to production-grade deployment. 12 | 13 | 14 | 15 | 16 | 17 | 18 | ## Performance Metrics Comparison 19 | 20 | The following table presents a direct comparison of performance metrics between key YOLOv7 and YOLOv8 models. YOLOv8 demonstrates a significant advantage in inference speed and a favorable parameter count, particularly in the smaller model variants which are critical for [edge AI](https://www.ultralytics.com/glossary/edge-ai) applications. 21 | 22 | | Model | size
(pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs
(B) | 23 | | ------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 24 | | YOLOv7l | 640 | 51.4 | - | 6.84 | 36.9 | 104.7 | 25 | | YOLOv7x | 640 | 53.1 | - | 11.57 | 71.3 | 189.9 | 26 | | | | | | | | | 27 | | YOLOv8n | 640 | 37.3 | **80.4** | **1.47** | **3.2** | **8.7** | 28 | | YOLOv8s | 640 | 44.9 | 128.4 | 2.66 | 11.2 | 28.6 | 29 | | YOLOv8m | 640 | 50.2 | 234.7 | 5.86 | 25.9 | 78.9 | 30 | | YOLOv8l | 640 | 52.9 | 375.2 | 9.06 | 43.7 | 165.2 | 31 | | YOLOv8x | 640 | **53.9** | 479.1 | 14.37 | 68.2 | 257.8 | 32 | 33 | ## YOLOv7: The "Bag-of-Freebies" Evolution 34 | 35 | Released in July 2022, YOLOv7 was developed primarily by the authors of YOLOv4 and YOLOR. It introduced several architectural innovations aimed at optimizing the training process without increasing inference costs, a concept referred to as a "trainable bag-of-freebies." 36 | 37 | - **Authors:** Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao 38 | - **Organization:** Institute of Information Science, Academia Sinica, Taiwan 39 | - **Date:** 2022-07-06 40 | - **Links:** [Arxiv Paper](https://arxiv.org/abs/2207.02696) | [GitHub Repository](https://github.com/WongKinYiu/yolov7) 41 | 42 | ### Key Architectural Features 43 | 44 | YOLOv7 introduced the Extended Efficient Layer Aggregation Network (E-ELAN). This architecture controls the shortest and longest gradient paths to allow the network to learn more diverse features. Furthermore, it utilized model scaling techniques that modify the architecture's depth and width simultaneously, ensuring optimal performance across different sizes. 45 | 46 | Despite its impressive benchmarks at launch, YOLOv7 primarily focuses on [object detection](https://docs.ultralytics.com/tasks/detect/), with less integrated support for other tasks compared to newer frameworks. 47 | 48 | [Learn more about YOLOv7](https://docs.ultralytics.com/models/yolov7/){ .md-button } 49 | 50 | ## Ultralytics YOLOv8: Unified Framework and Modern Architecture 51 | 52 | Launched in early 2023 by Ultralytics, YOLOv8 represented a major overhaul of the YOLO architecture. It was designed not just as a model, but as a unified framework capable of performing detection, [instance segmentation](https://docs.ultralytics.com/tasks/segment/), pose estimation, and classification seamlessly. 53 | 54 | - **Authors:** Glenn Jocher, Ayush Chaurasia, and Jing Qiu 55 | - **Organization:** Ultralytics 56 | - **Date:** 2023-01-10 57 | - **Links:** [Ultralytics Docs](https://docs.ultralytics.com/models/yolov8/) | [GitHub Repository](https://github.com/ultralytics/ultralytics) 58 | 59 | ### Architectural Innovations 60 | 61 | YOLOv8 moved away from the anchor-based detection used in previous versions (including YOLOv7) to an [anchor-free detector](https://www.ultralytics.com/glossary/anchor-free-detectors) mechanism. This shift simplifies the training process by eliminating the need to calculate anchor boxes, making the model more robust to variations in object shape and size. 62 | 63 | The backbone was upgraded to use C2f modules (Cross-Stage Partial Bottleneck with two convolutions), which replace the C3 modules of [YOLOv5](https://docs.ultralytics.com/models/yolov5/). This change improves gradient flow and allows the model to remain lightweight while capturing richer feature information. 
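The C2f change is easy to inspect directly from a checkpoint. A minimal sketch using the `ultralytics` package (the `C2f` class name follows the package's internal module naming, which may change between releases):

```python
from ultralytics import YOLO

# Load a pre-trained YOLOv8 nano checkpoint and print a layer/parameter summary
model = YOLO("yolov8n.pt")
model.info()

# Count the C2f blocks that replaced the C3 blocks used in YOLOv5
c2f_blocks = [m for m in model.model.modules() if type(m).__name__ == "C2f"]
print(f"C2f blocks found: {len(c2f_blocks)}")
```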
64 | 65 | [Learn more about YOLOv8](https://docs.ultralytics.com/models/yolov8/){ .md-button } 66 | 67 | ## Detailed Technical Comparison 68 | 69 | ### Anchor-Based vs. Anchor-Free 70 | 71 | One of the most defining differences is the detection head. YOLOv7 relies on anchor boxes—pre-defined shapes that the model tries to match to objects. While effective, this requires hyperparameter tuning for custom datasets. 72 | 73 | In contrast, YOLOv8 utilizes an anchor-free approach, predicting the center of an object directly. This reduces the number of box predictions, speeding up Non-Maximum Suppression (NMS) and making the model easier to train on diverse data without manual anchor configuration. 74 | 75 | ### Training Efficiency and Memory Usage 76 | 77 | Ultralytics models are renowned for their engineering efficiency. YOLOv8 utilizes a smart data augmentation strategy that disables Mosaic augmentation during the final epochs of training. This technique stabilizes the training loss and improves precision. 78 | 79 | !!! tip "Memory Efficiency" 80 | 81 | A significant advantage of Ultralytics YOLOv8 over complex architectures like transformers (e.g., [RT-DETR](https://docs.ultralytics.com/models/rtdetr/)) is its lower CUDA memory requirement. This allows users to train larger batch sizes on consumer-grade GPUs, democratizing access to state-of-the-art model training. 82 | 83 | ### Ecosystem and Ease of Use 84 | 85 | While YOLOv7 is a powerful research repository, Ultralytics YOLOv8 offers a polished product experience. The Ultralytics ecosystem provides: 86 | 87 | 1. **Streamlined API:** A consistent Python interface for all tasks. 88 | 2. **Deployment:** One-click export to formats like ONNX, TensorRT, CoreML, and TFLite via the [Export mode](https://docs.ultralytics.com/modes/export/). 89 | 3. **Community Support:** An active [Discord community](https://discord.com/invite/ultralytics) and frequent updates ensuring compatibility with the latest PyTorch versions. 90 | 91 | ## Code Comparison 92 | 93 | The usability gap is evident when comparing the code required to run inference. Ultralytics prioritizes a low-code approach, allowing developers to integrate vision AI into applications with minimal overhead. 94 | 95 | ### Running YOLOv8 with Python 96 | 97 | ```python 98 | from ultralytics import YOLO 99 | 100 | # Load a pre-trained YOLOv8 model 101 | model = YOLO("yolov8n.pt") 102 | 103 | # Run inference on an image 104 | results = model("https://ultralytics.com/images/bus.jpg") 105 | 106 | # Display the results 107 | for result in results: 108 | result.show() 109 | ``` 110 | 111 | ### CLI Implementation 112 | 113 | YOLOv8 can also be executed directly from the command line, a feature that simplifies pipeline integration and quick testing. 114 | 115 | ```bash 116 | # Detect objects in an image using the nano model 117 | yolo predict model=yolov8n.pt source='https://ultralytics.com/images/zidane.jpg' imgsz=640 118 | ``` 119 | 120 | ## Ideal Use Cases 121 | 122 | ### When to use YOLOv7 123 | 124 | YOLOv7 remains a viable choice for researchers benchmarking against 2022/2023 standards or maintaining legacy systems built specifically around the Darknet-style architecture. Its "bag-of-freebies" approach offers interesting insights for those studying neural network optimization strategies. 
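When benchmarking YOLOv7 against the figures above, it helps to reproduce the YOLOv8 baseline on your own hardware first. A minimal validation sketch with the `ultralytics` package (the COCO dataset YAML is resolved and downloaded automatically; expect a sizable download on first run):

```python
from ultralytics import YOLO

# Validate a pre-trained checkpoint on COCO val2017 to reproduce the mAPval 50-95 column
model = YOLO("yolov8l.pt")
metrics = model.val(data="coco.yaml", imgsz=640)
print(f"mAP50-95: {metrics.box.map:.3f}")  # reported on a 0-1 scale; the table lists 52.9 for YOLOv8l
```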
125 | 126 | ### When to use YOLOv8 127 | 128 | YOLOv8 is the recommended choice for the vast majority of new projects, including: 129 | 130 | - **Real-Time Applications:** The YOLOv8n (nano) model offers incredible speeds (approx. 80ms on CPU), making it perfect for mobile apps and [embedded systems](https://docs.ultralytics.com/guides/raspberry-pi/). 131 | - **Multi-Task Pipelines:** Projects requiring [pose estimation](https://docs.ultralytics.com/tasks/pose/) or [segmentation](https://docs.ultralytics.com/tasks/segment/) alongside detection can use a single API. 132 | - **Commercial Deployment:** The robust export compatibility ensures that models trained in PyTorch can be deployed efficiently to production environments using TensorRT or OpenVINO. 133 | 134 | ## Conclusion 135 | 136 | While YOLOv7 made significant contributions to the field of [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) by optimizing trainable parameters, **Ultralytics YOLOv8** represents the modern standard for practical AI development. 137 | 138 | YOLOv8's superior balance of speed and accuracy, combined with an anchor-free design and the extensive Ultralytics support ecosystem, makes it more accessible for beginners and more powerful for experts. For developers looking to build scalable, maintainable, and high-performance vision applications, YOLOv8—and its successors like [YOLO11](https://docs.ultralytics.com/models/yolo11/)—offer the most compelling path forward. 139 | 140 | ### Further Reading 141 | 142 | For those interested in exploring the latest advancements in object detection, consider reviewing these related models: 143 | 144 | - **[YOLO11](https://docs.ultralytics.com/models/yolo11/):** The latest iteration from Ultralytics, refining the architecture for even greater efficiency. 145 | - **[YOLOv6](https://docs.ultralytics.com/models/yolov6/):** Another anchor-free model focusing on industrial applications. 146 | - **[YOLOv9](https://docs.ultralytics.com/models/yolov9/):** Focuses on Programmable Gradient Information (PGI) for deep network training. 147 | -------------------------------------------------------------------------------- /docs/en/compare/yolov10-vs-yolov7.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Compare YOLOv10 and YOLOv7 object detection models. Analyze performance, architecture, and use cases to choose the best fit for your AI project. 4 | keywords: YOLOv10, YOLOv7, object detection, model comparison, AI, deep learning, computer vision, performance metrics, architecture, edge AI, robotics, autonomous systems 5 | --- 6 | 7 | # YOLOv10 vs YOLOv7: Advancing Real-Time Object Detection Architecture 8 | 9 | The evolution of the YOLO (You Only Look Once) family has consistently pushed the boundaries of computer vision, balancing speed and accuracy for real-time applications. This comparison explores the architectural shifts and performance differences between **YOLOv10**, a state-of-the-art model released by researchers from Tsinghua University, and **YOLOv7**, a highly influential model developed by Academia Sinica. While both models have made significant contributions to the field of [object detection](https://www.ultralytics.com/glossary/object-detection), they employ distinct strategies to achieve their performance goals. 
10 | 11 | 12 | 13 | 14 | 15 | 16 | ## Evolution of Model Architectures 17 | 18 | The transition from YOLOv7 to YOLOv10 marks a paradigm shift in how neural networks handle post-processing and feature integration. 19 | 20 | ### YOLOv10: The NMS-Free Revolution 21 | 22 | **YOLOv10**, released on May 23, 2024, by Ao Wang, Hui Chen, and others from [Tsinghua University](https://www.tsinghua.edu.cn/en/), introduces a groundbreaking NMS-free training strategy. Traditionally, object detectors rely on [Non-Maximum Suppression (NMS)](https://www.ultralytics.com/glossary/non-maximum-suppression-nms) to filter out duplicate bounding boxes, which can create a bottleneck in inference latency. 23 | 24 | YOLOv10 utilizes **Consistent Dual Assignments** for NMS-free training, allowing the model to predict unique object instances directly. Combined with a **holistic efficiency-accuracy driven model design**, it optimizes various components—including the lightweight classification head and spatial-channel decoupled downsampling—to reduce computational redundancy. 25 | 26 | [Learn more about YOLOv10](https://docs.ultralytics.com/models/yolov10/){ .md-button } 27 | 28 | ### YOLOv7: Optimized for Trainable Bag-of-Freebies 29 | 30 | **YOLOv7**, released on July 6, 2022, by Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao from Academia Sinica, focuses on optimizing the training process without increasing inference cost. It introduced the **Extended Efficient Layer Aggregation Network (E-ELAN)**, which enhances the learning capability of the network by controlling the gradient path. 31 | 32 | YOLOv7 heavily leverages "Bag-of-Freebies"—methods that improve accuracy during training without impacting inference speed—and model scaling techniques that compound parameters efficiently. While highly effective, its reliance on traditional NMS post-processing means its end-to-end latency is often higher than the newer NMS-free architectures. 33 | 34 | [Learn more about YOLOv7](https://docs.ultralytics.com/models/yolov7/){ .md-button } 35 | 36 | ## Technical Performance Comparison 37 | 38 | When evaluating these models, distinct patterns emerge regarding efficiency and raw detection capability. YOLOv10 generally offers superior efficiency, achieving similar or better [mAP (Mean Average Precision)](https://www.ultralytics.com/glossary/mean-average-precision-map) with significantly fewer parameters and faster inference times compared to YOLOv7. 39 | 40 | The table below outlines the key metrics on the [COCO dataset](https://docs.ultralytics.com/datasets/detect/coco/). 41 | 42 | | Model | size
(pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs
(B) | 43 | | -------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 44 | | YOLOv10n | 640 | 39.5 | - | **1.56** | **2.3** | **6.7** | 45 | | YOLOv10s | 640 | 46.7 | - | 2.66 | 7.2 | 21.6 | 46 | | YOLOv10m | 640 | 51.3 | - | 5.48 | 15.4 | 59.1 | 47 | | YOLOv10b | 640 | 52.7 | - | 6.54 | 24.4 | 92.0 | 48 | | YOLOv10l | 640 | 53.3 | - | 8.33 | 29.5 | 120.3 | 49 | | YOLOv10x | 640 | **54.4** | - | 12.2 | 56.9 | 160.4 | 50 | | | | | | | | | 51 | | YOLOv7l | 640 | 51.4 | - | 6.84 | 36.9 | 104.7 | 52 | | YOLOv7x | 640 | 53.1 | - | 11.57 | 71.3 | 189.9 | 53 | 54 | !!! tip "Efficiency Insight" 55 | 56 | The data highlights a critical advantage for YOLOv10 in resource-constrained environments. **YOLOv10m** achieves a nearly identical accuracy (51.3% mAP) to **YOLOv7l** (51.4% mAP) but does so with **less than half the parameters** (15.4M vs 36.9M) and significantly lower FLOPs (59.1B vs 104.7B). 57 | 58 | ### Latency and Throughput 59 | 60 | YOLOv10's removal of the NMS step drastically reduces the latency variance often seen in crowded scenes. In applications like [autonomous vehicles](https://www.ultralytics.com/glossary/autonomous-vehicles) or [drone surveillance](https://www.ultralytics.com/blog/computer-vision-applications-ai-drone-uav-operations), where every millisecond counts, the predictable inference time of YOLOv10 provides a safety-critical advantage. YOLOv7 remains competitive in throughput on high-end GPUs but consumes more memory and computation to achieve comparable results. 61 | 62 | ## Use Cases and Applications 63 | 64 | The architectural differences dictate the ideal deployment scenarios for each model. 65 | 66 | ### Ideal Scenarios for YOLOv10 67 | 68 | - **Edge AI:** Due to its low parameter count and [FLOPs](https://www.ultralytics.com/glossary/flops), YOLOv10 is perfect for devices like the [Raspberry Pi](https://docs.ultralytics.com/guides/raspberry-pi/) or [NVIDIA Jetson](https://docs.ultralytics.com/guides/nvidia-jetson/). 69 | - **Real-Time Video Analytics:** The high inference speed supports high-FPS processing for [traffic management](https://www.ultralytics.com/blog/ai-in-traffic-management-from-congestion-to-coordination) and retail analytics. 70 | - **Robotics:** Lower latency translates to faster reaction times for robot navigation and manipulation tasks. 71 | 72 | ### Ideal Scenarios for YOLOv7 73 | 74 | - **Legacy Systems:** Projects already integrated with the YOLOv7 codebase may find it stable enough to maintain without immediate refactoring. 75 | - **General Purpose Detection:** For server-side deployments where VRAM is abundant, YOLOv7's larger models still provide robust detection capabilities, though they are less efficient than newer alternatives like [YOLO11](https://docs.ultralytics.com/models/yolo11/). 76 | 77 | ## The Ultralytics Advantage 78 | 79 | While both models are powerful, leveraging the **Ultralytics ecosystem** offers distinct benefits for developers and researchers. The Ultralytics framework standardizes the interface for training, validation, and deployment, making it significantly easier to switch between models and benchmark performance. 80 | 81 | ### Ease of Use and Training Efficiency 82 | 83 | One of the primary barriers in deep learning is the complexity of training pipelines. 
Ultralytics models, including YOLOv10 and [YOLO11](https://docs.ultralytics.com/models/yolo11/), utilize a streamlined Python API that handles data augmentation, [hyperparameter tuning](https://docs.ultralytics.com/guides/hyperparameter-tuning/), and [exporting](https://docs.ultralytics.com/modes/export/) automatically. 84 | 85 | - **Simple API:** Train a model in a few lines of code. 86 | - **Memory Efficiency:** Ultralytics optimizations often result in lower CUDA memory usage during training compared to raw implementations. 87 | - **Pre-trained Weights:** Access to high-quality pre-trained models on [ImageNet](https://docs.ultralytics.com/datasets/classify/imagenet/) and COCO accelerates [transfer learning](https://www.ultralytics.com/glossary/transfer-learning). 88 | 89 | !!! info "Versatility Across Tasks" 90 | 91 | Modern Ultralytics models extend beyond simple bounding box detection. They support [Instance Segmentation](https://docs.ultralytics.com/tasks/segment/), [Pose Estimation](https://docs.ultralytics.com/tasks/pose/), [Oriented Object Detection (OBB)](https://docs.ultralytics.com/tasks/obb/), and [Classification](https://docs.ultralytics.com/tasks/classify/) within the same framework. This versatility is a key advantage over older standalone repositories. 92 | 93 | ### Code Example: Running YOLOv10 with Ultralytics 94 | 95 | The following example demonstrates the simplicity of using the Ultralytics API to load a pre-trained YOLOv10 model and run inference. This ease of use contrasts with the more manual setup often required for older architectures like YOLOv7. 96 | 97 | ```python 98 | from ultralytics import YOLO 99 | 100 | # Load a pre-trained YOLOv10n model 101 | model = YOLO("yolov10n.pt") 102 | 103 | # Run inference on an image 104 | results = model("path/to/image.jpg") 105 | 106 | # Display the results 107 | results[0].show() 108 | ``` 109 | 110 | ## Conclusion and Recommendation 111 | 112 | For new projects, **YOLOv10** or the even more advanced **[YOLO11](https://docs.ultralytics.com/models/yolo11/)** are the recommended choices. YOLOv10's NMS-free architecture delivers a superior balance of speed and accuracy, making it highly adaptable for modern [edge computing](https://www.ultralytics.com/glossary/edge-computing) needs. It addresses the latency bottlenecks of previous generations while reducing the computational footprint. 113 | 114 | Although **YOLOv7** remains a respected milestone in computer vision history, its architecture is less efficient by today's standards. Developers seeking the best performance, long-term maintenance, and ease of deployment will find the [Ultralytics ecosystem](https://www.ultralytics.com/)—with its continuous updates and broad tool support—to be the most productive environment for building vision AI solutions. 
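As one concrete illustration of that ease of deployment, exporting a trained checkpoint to a production format is a single call. This is a minimal sketch using the standard Ultralytics export mode; the checkpoint name is illustrative:

```python
from ultralytics import YOLO

# Load a trained checkpoint (any supported model name works the same way)
model = YOLO("yolov10n.pt")

# Export to ONNX; other format strings include "engine" (TensorRT),
# "coreml", and "tflite"
model.export(format="onnx")
```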
115 | 116 | ### Explore More 117 | 118 | - [YOLOv10 vs YOLOv8 Comparison](https://docs.ultralytics.com/compare/yolov10-vs-yolov8/) 119 | - [YOLOv10 vs YOLOv9 Comparison](https://docs.ultralytics.com/compare/yolov10-vs-yolov9/) 120 | - [YOLO11: The Latest in Real-Time Detection](https://docs.ultralytics.com/models/yolo11/) 121 | - [Guide to Exporting Models to TensorRT](https://docs.ultralytics.com/integrations/tensorrt/) 122 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Ultralytics logo 2 | 3 | # 📚 Ultralytics Docs 4 | 5 | Welcome to Ultralytics Docs, your comprehensive resource for understanding and utilizing our state-of-the-art [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) tools and models, including [Ultralytics YOLO](https://docs.ultralytics.com/models/yolov8/). These documents are actively maintained and deployed to [https://docs.ultralytics.com](https://docs.ultralytics.com/) for easy access. 6 | 7 | [![Ultralytics Actions](https://github.com/ultralytics/docs/actions/workflows/format.yml/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/format.yml) 8 | [![jsDelivr hits](https://data.jsdelivr.com/v1/package/gh/ultralytics/llm/badge?style=rounded)](https://www.jsdelivr.com/package/gh/ultralytics/llm) 9 | 10 | [![pages-build-deployment](https://github.com/ultralytics/docs/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/pages/pages-build-deployment) 11 | [![Check Broken links](https://github.com/ultralytics/docs/actions/workflows/links.yml/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/links.yml) 12 | [![Check Domains](https://github.com/ultralytics/docs/actions/workflows/check_domains.yml/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/check_domains.yml) 13 | [![Download Websites](https://github.com/ultralytics/docs/actions/workflows/download_websites.yml/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/download_websites.yml) 14 | 15 | [![Ultralytics Discord](https://img.shields.io/discord/1089800235347353640?logo=discord&logoColor=white&label=Discord&color=blue)](https://discord.com/invite/ultralytics) 16 | [![Ultralytics Forums](https://img.shields.io/discourse/users?server=https%3A%2F%2Fcommunity.ultralytics.com&logo=discourse&label=Forums&color=blue)](https://community.ultralytics.com/) 17 | [![Ultralytics Reddit](https://img.shields.io/reddit/subreddit-subscribers/ultralytics?style=flat&logo=reddit&logoColor=white&label=Reddit&color=blue)](https://reddit.com/r/ultralytics) 18 | 19 | ## 🛠️ Installation 20 | 21 | [![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/) 22 | [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://clickpy.clickhouse.com/dashboard/ultralytics) 23 | [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/) 24 | 25 | To install the `ultralytics` package in developer mode, which allows you to modify the source code directly, ensure you have [Git](https://git-scm.com/downloads/) and [Python](https://www.python.org/downloads/) 3.9 or later installed on your system. Then, follow these steps: 26 | 27 | 1. 
Clone the `ultralytics` repository to your local machine using Git: 28 | 29 | ```bash 30 | git clone https://github.com/ultralytics/ultralytics.git 31 | ``` 32 | 33 | 2. Navigate to the cloned repository's root directory: 34 | 35 | ```bash 36 | cd ultralytics 37 | ``` 38 | 39 | 3. Install the package in editable mode (`-e`) along with its development dependencies (`[dev]`) using [pip](https://pip.pypa.io/en/stable/installation/): 40 | 41 | ```bash 42 | pip install -e '.[dev]' 43 | ``` 44 | 45 | This command installs the `ultralytics` package such that changes to the source code are immediately reflected in your environment, ideal for development and contributing. 46 | 47 | ## 🧰 Technology Stack 48 | 49 | - **[MkDocs](https://www.mkdocs.org/)** - Static site generator for project documentation 50 | - **[Ultralytics Chat](https://github.com/ultralytics/llm)** - Realtime conversational AI with open-source [chat.js](https://github.com/ultralytics/llm) implementation 51 | - **[GitHub Pages](https://pages.github.com/)** - Hosting and deployment 52 | - **[GitHub Actions](https://github.com/features/actions)** - CI/CD automation 53 | 54 | ## 🚀 Building and Serving Locally 55 | 56 | The `mkdocs serve` command builds and serves a local version of your [MkDocs](https://www.mkdocs.org/) documentation. This is highly useful during development and testing to preview changes in real-time. 57 | 58 | ```bash 59 | mkdocs serve 60 | ``` 61 | 62 | - **Command Breakdown:** 63 | - `mkdocs`: The main MkDocs command-line interface tool. 64 | - `serve`: The subcommand used to build and locally serve your documentation site. 65 | - **Note:** 66 | - `mkdocs serve` includes live reloading, automatically updating the preview in your browser as you save changes to the documentation files. 67 | - To stop the local server, simply press `CTRL+C` in your terminal. 68 | 69 | ## 🌍 Building and Serving Multi-Language 70 | 71 | If your documentation supports multiple languages, follow these steps to build and preview all versions: 72 | 73 | 1. Stage all new or modified language Markdown (`.md`) files using Git: 74 | 75 | ```bash 76 | git add docs/**/*.md -f 77 | ``` 78 | 79 | 2. Build all language versions into the `/site` directory. This script ensures that relevant root-level files are included and clears the previous build: 80 | 81 | ```bash 82 | # Clear existing /site directory to prevent conflicts 83 | rm -rf site 84 | 85 | # Build the default language site using the primary config file 86 | mkdocs build -f docs/mkdocs.yml 87 | 88 | # Loop through each language-specific config file and build its site 89 | for file in docs/mkdocs_*.yml; do 90 | echo "Building MkDocs site with $file" 91 | mkdocs build -f "$file" 92 | done 93 | ``` 94 | 95 | 3. To preview the complete multi-language site locally, navigate into the build output directory and start a simple [Python HTTP server](https://docs.python.org/3/library/http.server.html): 96 | ```bash 97 | cd site 98 | python -m http.server 99 | # Open http://localhost:8000 in your preferred web browser 100 | ``` 101 | Access the live preview site at `http://localhost:8000`. 102 | 103 | ## 📤 Deploying Your Documentation Site 104 | 105 | To deploy your MkDocs documentation site, choose a hosting provider and configure your deployment method. Common options include [GitHub Pages](https://pages.github.com/), GitLab Pages, or other static site hosting services like [Netlify](https://www.netlify.com/) or [Vercel](https://vercel.com/). 
106 | 107 | - Configure deployment settings within your `mkdocs.yml` file. 108 | - Use the deployment command or workflow specific to your chosen provider to build and deploy your site. 109 | 110 | * **GitHub Pages Deployment Example:** 111 | If deploying to GitHub Pages, you can use the built-in command: 112 | 113 | ```bash 114 | mkdocs gh-deploy 115 | ``` 116 | 117 | After deployment, you might need to update the "Custom domain" settings in your repository's settings page if you wish to use a personalized URL. 118 | 119 | ![GitHub Pages Custom Domain Setting](https://user-images.githubusercontent.com/26833433/210150206-9e86dcd7-10af-43e4-9eb2-9518b3799eac.png) 120 | 121 | - For detailed instructions on various deployment methods, consult the official [MkDocs Deploying your docs guide](https://www.mkdocs.org/user-guide/deploying-your-docs/). 122 | 123 | ## 💡 Contribute 124 | 125 | We deeply value contributions from the open-source community to enhance Ultralytics projects. Your input helps drive innovation in [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) and [AI](https://www.ultralytics.com/glossary/artificial-intelligence-ai)! Please review our [Contributing Guide](https://docs.ultralytics.com/help/contributing/) for detailed information on how to get involved. You can also share your feedback and ideas through our quick [Survey](https://www.ultralytics.com/survey?utm_source=github&utm_medium=social&utm_campaign=Survey). A heartfelt thank you 🙏 to all our contributors for their dedication and support! 126 | 127 | [![Ultralytics open-source contributors](https://raw.githubusercontent.com/ultralytics/assets/main/im/image-contributors.png)](https://github.com/ultralytics/ultralytics/graphs/contributors) 128 | 129 | We look forward to your contributions! 130 | 131 | ## 📜 License 132 | 133 | Ultralytics Docs are available under two licensing options to accommodate different usage scenarios: 134 | 135 | - **AGPL-3.0 License**: Ideal for students, researchers, and enthusiasts involved in academic pursuits and open collaboration. See the [LICENSE](https://github.com/ultralytics/docs/blob/main/LICENSE) file for full details. This license promotes sharing improvements back with the community, fostering an open and collaborative environment. 136 | - **Enterprise License**: Designed for commercial applications, this license allows seamless integration of Ultralytics software and [AI models](https://docs.ultralytics.com/models/) into commercial products and services without the open-source requirements of AGPL-3.0. Visit [Ultralytics Licensing](https://www.ultralytics.com/license) for more information on obtaining an Enterprise License. 137 | 138 | ## ✉️ Contact 139 | 140 | For bug reports, feature requests, and other issues related to the documentation, please use [GitHub Issues](https://github.com/ultralytics/docs/issues). For discussions, questions, and community support regarding Ultralytics software, [Ultralytics HUB](https://docs.ultralytics.com/hub/), and more, join the conversation with peers and the Ultralytics team on our [Discord server](https://discord.com/invite/ultralytics)! 141 | 142 |
143 |
144 | Ultralytics GitHub · Ultralytics LinkedIn · Ultralytics Twitter · Ultralytics YouTube · Ultralytics TikTok · Ultralytics BiliBili · Ultralytics Discord 157 |
158 | -------------------------------------------------------------------------------- /docs/en/compare/damo-yolo-vs-yolov8.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Discover the key differences between DAMO-YOLO and YOLOv8. Compare accuracy, speed, architecture, and use cases to choose the best object detection model. 4 | keywords: DAMO-YOLO, YOLOv8, object detection, model comparison, accuracy, speed, AI, deep learning, computer vision, YOLO models 5 | --- 6 | 7 | # DAMO-YOLO vs. YOLOv8: A Technical Deep Dive 8 | 9 | The landscape of [object detection](https://docs.ultralytics.com/tasks/detect/) is constantly evolving, with researchers and engineers striving to balance the competing demands of speed, accuracy, and computational efficiency. Two prominent architectures that have made significant waves in the computer vision community are **DAMO-YOLO**, developed by Alibaba Group, and **YOLOv8**, created by [Ultralytics](https://www.ultralytics.com/). 10 | 11 | This technical comparison explores the architectural innovations, performance metrics, and practical usability of both models. While DAMO-YOLO introduces novel research concepts like Neural Architecture Search (NAS), Ultralytics YOLOv8 focuses on delivering a robust, [user-friendly ecosystem](https://docs.ultralytics.com/) that streamlines the workflow from training to deployment. 12 | 13 | ## Performance Analysis: Speed and Accuracy 14 | 15 | To understand how these models compare in real-world scenarios, we analyze their performance on the standard [COCO dataset](https://docs.ultralytics.com/datasets/detect/coco/). The metrics below highlight trade-offs between mean Average Precision (mAP), inference speed on different hardware, and model complexity. 16 | 17 | 18 | 19 | 20 | 21 | 22 | | Model | size
(pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs
(B) | 23 | | ---------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 24 | | DAMO-YOLOt | 640 | 42.0 | - | 2.32 | 8.5 | 18.1 | 25 | | DAMO-YOLOs | 640 | 46.0 | - | 3.45 | 16.3 | 37.8 | 26 | | DAMO-YOLOm | 640 | 49.2 | - | 5.09 | 28.2 | 61.8 | 27 | | DAMO-YOLOl | 640 | 50.8 | - | 7.18 | 42.1 | 97.3 | 28 | | | | | | | | | 29 | | YOLOv8n | 640 | 37.3 | **80.4** | **1.47** | **3.2** | **8.7** | 30 | | YOLOv8s | 640 | 44.9 | 128.4 | 2.66 | 11.2 | 28.6 | 31 | | YOLOv8m | 640 | 50.2 | 234.7 | 5.86 | 25.9 | 78.9 | 32 | | YOLOv8l | 640 | 52.9 | 375.2 | 9.06 | 43.7 | 165.2 | 33 | | YOLOv8x | 640 | **53.9** | 479.1 | 14.37 | 68.2 | 257.8 | 34 | 35 | ### Key Takeaways 36 | 37 | The data reveals distinct advantages depending on the deployment target: 38 | 39 | - **Edge Performance:** The **YOLOv8n** (Nano) model is the undisputed leader for resource-constrained environments. With only **3.2M parameters** and **8.7B FLOPs**, it achieves the fastest inference speeds on both CPU and GPU. This makes it ideal for [mobile applications](https://docs.ultralytics.com/modes/export/) or IoT devices where memory and power are scarce. 40 | - **Peak Accuracy:** For applications where precision is paramount, **YOLOv8x** achieves the highest mAP of **53.9%**. While DAMO-YOLO models perform well, the largest YOLOv8 variant pushes the boundary of detection accuracy further. 41 | - **Latency Trade-offs:** DAMO-YOLO demonstrates impressive throughput on dedicated GPUs (like the T4), driven by its NAS-optimized backbone. However, Ultralytics YOLOv8 maintains a superior balance across a wider variety of hardware, including CPUs, ensuring broader [deployment flexibility](https://docs.ultralytics.com/guides/model-deployment-options/). 42 | 43 | ## DAMO-YOLO: Research-Driven Innovation 44 | 45 | DAMO-YOLO is a product of the Alibaba Group's research initiatives. The name stands for "Discovery, Adventure, Momentum, and Outlook," reflecting a focus on exploring new architectural frontiers. 46 | 47 | **Authors:** Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yilun Huang, Yuan Zhang, and Xiuyu Sun 48 | **Organization:** [Alibaba Group](https://www.alibabagroup.com/) 49 | **Date:** 2022-11-23 50 | **Arxiv:** [2211.15444v2](https://arxiv.org/abs/2211.15444v2) 51 | **GitHub:** [tinyvision/DAMO-YOLO](https://github.com/tinyvision/DAMO-YOLO) 52 | 53 | ### Architectural Highlights 54 | 55 | DAMO-YOLO integrates several advanced technologies to optimize the trade-off between latency and accuracy: 56 | 57 | 1. **MAE-NAS Backbone:** It utilizes Neural Architecture Search (NAS) to automatically discover efficient network structures, specifically utilizing a method called MAE-NAS. 58 | 2. **RepGFPN Neck:** A heavily parameterized Generalized Feature Pyramid Network (GFPN) is used to maximize information flow between different scale levels, improving detection of objects at varying distances. 59 | 3. **ZeroHead:** To counterbalance the heavy neck, the model employs a lightweight "ZeroHead," reducing the computational burden at the final detection stage. 60 | 4. **AlignedOTA:** A dynamic label assignment strategy that aligns the classification and regression tasks during training, helping the model converge more effectively. 
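The alignment idea in point 4 can be illustrated with a simplified, TOOD-style assignment score. This is not DAMO-YOLO's exact AlignedOTA cost, just a sketch of the underlying intuition: candidates that are both confidently classified and well-localized receive the highest assignment weight.

```python
import torch

# Simplified aligned-assignment score, in the spirit of (but not identical to)
# AlignedOTA: t = cls_score**alpha * iou**beta. Candidates scoring highest on
# this combined metric are assigned as positives for a ground-truth box.
cls_score = torch.tensor([0.9, 0.7, 0.2])  # per-candidate classification confidence
iou = torch.tensor([0.5, 0.85, 0.9])  # per-candidate IoU with the ground-truth box
alpha, beta = 1.0, 6.0  # weighting exponents (values here are illustrative)

t = cls_score**alpha * iou**beta
print(t)  # the second candidate balances both terms and wins the assignment
```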
61 | 62 | [Learn more about DAMO-YOLO](https://github.com/tinyvision/DAMO-YOLO){ .md-button } 63 | 64 | ## Ultralytics YOLOv8: The Ecosystem Standard 65 | 66 | YOLOv8 represents a refinement of the YOLO architecture focusing on usability, versatility, and state-of-the-art performance. Unlike pure research models, YOLOv8 is designed as a product for developers, emphasizing a **well-maintained ecosystem** and ease of integration. 67 | 68 | **Authors:** Glenn Jocher, Ayush Chaurasia, and Jing Qiu 69 | **Organization:** [Ultralytics](https://www.ultralytics.com/) 70 | **Date:** 2023-01-10 71 | **Docs:** [Ultralytics YOLOv8](https://docs.ultralytics.com/models/yolov8/) 72 | 73 | ### Architectural Strengths 74 | 75 | - **Anchor-Free Detection:** YOLOv8 eliminates anchor boxes, reducing the number of hyperparameters developers need to tune and simplifying the training process. 76 | - **C2f Module:** The architecture replaces the C3 module with C2f, offering richer gradient flow information while maintaining a lightweight footprint. 77 | - **Decoupled Head:** By separating classification and regression tasks in the head, the model achieves higher localization accuracy. 78 | - **Unified Framework:** Perhaps its strongest architectural feature is its native support for multiple vision tasks—[instance segmentation](https://docs.ultralytics.com/tasks/segment/), [pose estimation](https://docs.ultralytics.com/tasks/pose/), [classification](https://docs.ultralytics.com/tasks/classify/), and [oriented object detection (OBB)](https://docs.ultralytics.com/tasks/obb/)—all within a single codebase. 79 | 80 | !!! tip "Did you know?" 81 | 82 | Ultralytics provides a seamless path to export models to optimized formats like **ONNX**, **TensorRT**, **CoreML**, and **OpenVINO**. This [export capability](https://docs.ultralytics.com/modes/export/) ensures that your trained models can run efficiently on almost any hardware platform. 83 | 84 | [Learn more about YOLOv8](https://docs.ultralytics.com/models/yolov8/){ .md-button } 85 | 86 | ## Usability and Developer Experience 87 | 88 | The most significant divergence between the two models lies in their ease of use and the surrounding ecosystem. 89 | 90 | **Ultralytics YOLO** models are famous for their "zero-to-hero" experience. With a simple PIP installation, developers gain access to a powerful CLI and Python API. This lowers the barrier to entry significantly compared to research repositories that often require complex environment setups. 91 | 92 | ### Training Efficiency 93 | 94 | Ultralytics models are engineered for **training efficiency**. They efficiently utilize CUDA memory, allowing for larger batch sizes or training on consumer-grade GPUs. Furthermore, the availability of high-quality [pre-trained weights](https://docs.ultralytics.com/models/) accelerates convergence, saving valuable compute time and energy. 
95 | 96 | Here is a complete, runnable example of how to load and predict with a YOLOv8 model in just three lines of Python: 97 | 98 | ```python 99 | from ultralytics import YOLO 100 | 101 | # Load a pre-trained YOLOv8n model 102 | model = YOLO("yolov8n.pt") 103 | 104 | # Run inference on an image (automatically downloads image if needed) 105 | results = model.predict("https://ultralytics.com/images/bus.jpg") 106 | 107 | # Show the results 108 | for result in results: 109 | result.show() 110 | ``` 111 | 112 | In contrast, while DAMO-YOLO offers strong performance, it generally requires more manual configuration and familiarity with research-oriented frameworks, making it less accessible for rapid prototyping or commercial integration. 113 | 114 | ## Conclusion: Choosing the Right Tool 115 | 116 | Both DAMO-YOLO and YOLOv8 are exceptional achievements in computer vision. 117 | 118 | **DAMO-YOLO** is an excellent choice for researchers interested in Neural Architecture Search and those deploying specifically on hardware where its custom backbone is fully optimized. 119 | 120 | However, for most developers, researchers, and enterprises, **Ultralytics YOLOv8** (and the newer **[YOLO11](https://docs.ultralytics.com/models/yolo11/)**) offers a superior value proposition: 121 | 122 | 1. **Versatility:** Capable of handling Detection, Segmentation, Pose, and OBB in one framework. 123 | 2. **Ease of Use:** Unmatched documentation, simple API, and robust [community support](https://community.ultralytics.com/). 124 | 3. **Deployment:** Extensive support for [export modes](https://docs.ultralytics.com/modes/export/) covers everything from mobile phones to cloud servers. 125 | 4. **Performance Balance:** Excellent accuracy-to-speed ratio, particularly on CPU and Edge devices. 126 | 127 | For those looking to stay on the absolute cutting edge, we also recommend checking out **[YOLO11](https://docs.ultralytics.com/models/yolo11/)**, which builds upon the strengths of YOLOv8 with even greater efficiency and accuracy. 128 | 129 | ## Explore Other Model Comparisons 130 | 131 | To help you make the most informed decision for your computer vision projects, explore these additional detailed comparisons: 132 | 133 | - [YOLO11 vs. DAMO-YOLO](https://docs.ultralytics.com/compare/yolo11-vs-damo-yolo/) 134 | - [RT-DETR vs. YOLOv8](https://docs.ultralytics.com/compare/rtdetr-vs-yolov8/) 135 | - [YOLOv8 vs. YOLOv9](https://docs.ultralytics.com/compare/yolov8-vs-yolov9/) 136 | - [YOLOv8 vs. YOLOv10](https://docs.ultralytics.com/compare/yolov8-vs-yolov10/) 137 | - [YOLOv5 vs. DAMO-YOLO](https://docs.ultralytics.com/compare/yolov5-vs-damo-yolo/) 138 | -------------------------------------------------------------------------------- /docs/en/compare/yolov7-vs-rtdetr.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Compare YOLOv7 and RTDETRv2 for object detection. Explore architecture, performance, and use cases to pick the best model for your project. 4 | keywords: YOLOv7, RTDETRv2, model comparison, object detection, computer vision, machine learning, real-time detection, AI models, Vision Transformers 5 | --- 6 | 7 | # YOLOv7 vs RTDETRv2: A Technical Comparison of Modern Object Detectors 8 | 9 | Selecting the optimal object detection architecture is a pivotal step in developing robust computer vision solutions. 
This decision often involves navigating the complex trade-offs between inference speed, detection accuracy, and computational resource requirements. This guide provides an in-depth technical comparison between **YOLOv7**, a highly optimized CNN-based detector known for its speed, and **RTDETRv2**, a state-of-the-art transformer-based model designed to bring global context understanding to real-time applications. 10 | 11 | 12 | 13 | 14 | 15 | 16 | ## YOLOv7: The Pinnacle of CNN Efficiency 17 | 18 | YOLOv7 represents a major evolution in the You Only Look Once (YOLO) family, released to push the boundaries of what [convolutional neural networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs) can achieve in real-time scenarios. By focusing on architectural refinements and advanced training strategies, it delivers impressive speed on GPU hardware. 19 | 20 | - **Authors:** Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao 21 | - **Organization:** Institute of Information Science, Academia Sinica, Taiwan 22 | - **Date:** 2022-07-06 23 | - **Arxiv:** [https://arxiv.org/abs/2207.02696](https://arxiv.org/abs/2207.02696) 24 | - **GitHub:** [https://github.com/WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7) 25 | - **Docs:** [https://docs.ultralytics.com/models/yolov7/](https://docs.ultralytics.com/models/yolov7/) 26 | 27 | ### Architectural Innovations 28 | 29 | YOLOv7 introduces the **Extended Efficient Layer Aggregation Network (E-ELAN)**, a novel backbone design that enhances the network's learning capability without destroying the gradient path. This allows for deeper networks that remain efficient to train. A defining feature of YOLOv7 is the "trainable bag-of-freebies," a collection of optimization methods—such as model re-parameterization and coarse-to-fine lead guided label assignment—that improve accuracy without increasing [inference latency](https://www.ultralytics.com/glossary/inference-latency). 30 | 31 | ### Strengths and Weaknesses 32 | 33 | YOLOv7 excels in environments where [real-time inference](https://www.ultralytics.com/glossary/real-time-inference) on standard GPUs is the priority. Its architecture is highly optimized for CUDA, delivering high FPS for video feeds. However, as a pure CNN, it may struggle with long-range dependencies compared to transformers. Additionally, customizing its complex architecture can be challenging for beginners. 34 | 35 | [Learn more about YOLOv7](https://docs.ultralytics.com/models/yolov7/){ .md-button } 36 | 37 | ## RTDETRv2: Transformers for Real-Time Detection 38 | 39 | RTDETRv2 builds upon the success of the Real-Time Detection Transformer (RT-DETR), leveraging the power of [Vision Transformers (ViT)](https://www.ultralytics.com/glossary/vision-transformer-vit) to capture global information across an image. Unlike CNNs, which process local neighborhoods of pixels, transformers use self-attention mechanisms to understand relationships between distant objects. 40 | 41 | - **Authors:** Wenyu Lv, Yian Zhao, Qinyao Chang, Kui Huang, Guanzhong Wang, and Yi Liu 42 | - **Organization:** Baidu 43 | - **Date:** 2023-04-17 (Original RT-DETR), 2024-07 (RTDETRv2) 44 | - **Arxiv:** [https://arxiv.org/abs/2304.08069](https://arxiv.org/abs/2304.08069) 45 | - **GitHub:** [https://github.com/lyuwenyu/RT-DETR/tree/main/rtdetrv2_pytorch](https://github.com/lyuwenyu/RT-DETR/tree/main/rtdetrv2_pytorch) 46 | 47 | ### Architectural Innovations 48 | 49 | RTDETRv2 employs a hybrid architecture. 
It uses a CNN backbone for efficient [feature extraction](https://www.ultralytics.com/glossary/feature-extraction) and a transformer encoder-decoder for the detection head. Crucially, it is **anchor-free**, eliminating the need for manually tuned [anchor boxes](https://www.ultralytics.com/glossary/anchor-boxes) and non-maximum suppression (NMS) post-processing in some configurations. The "v2" improvements focus on a flexible backbone and improved training strategies to further reduce latency while maintaining high [mean Average Precision (mAP)](https://www.ultralytics.com/glossary/mean-average-precision-map). 50 | 51 | ### Strengths and Weaknesses 52 | 53 | The primary advantage of RTDETRv2 is its accuracy in complex scenes with occlusions, thanks to its global context awareness. It often outperforms CNNs of similar scale in mAP. However, this comes at a cost: transformer models are notoriously memory-hungry during training and can be slower to converge. They generally require more powerful GPUs to train effectively compared to CNNs like YOLOv7. 54 | 55 | [Learn more about RT-DETR](https://docs.ultralytics.com/models/rtdetr/){ .md-button } 56 | 57 | ## Performance Comparison: Metrics and Analysis 58 | 59 | The following table presents a side-by-side comparison of key performance metrics. While **RTDETRv2-x** achieves superior accuracy, **YOLOv7** models often provide a competitive edge in pure inference speed on specific hardware configurations due to their CNN-native design. 60 | 61 | | Model | size
(pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs
(B) | 62 | | ---------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 63 | | YOLOv7l | 640 | 51.4 | - | 6.84 | 36.9 | 104.7 | 64 | | YOLOv7x | 640 | 53.1 | - | 11.57 | 71.3 | 189.9 | 65 | | | | | | | | | 66 | | RTDETRv2-s | 640 | 48.1 | - | 5.03 | 20 | 60 | 67 | | RTDETRv2-m | 640 | 51.9 | - | 7.51 | 36 | 100 | 68 | | RTDETRv2-l | 640 | 53.4 | - | 9.76 | 42 | 136 | 69 | | RTDETRv2-x | 640 | 54.3 | - | 15.03 | 76 | 259 | 70 | 71 | !!! tip "Understanding the Trade-offs" 72 | 73 | When choosing between these architectures, consider your deployment hardware. Transformers like RTDETRv2 often require specific TensorRT optimizations to reach their full speed potential on NVIDIA GPUs, whereas CNNs like YOLOv7 generally run efficiently on a wider range of hardware with less tuning. 74 | 75 | ### Training Methodology and Resources 76 | 77 | Training methodologies differ significantly between the two architectures. YOLOv7 utilizes standard [stochastic gradient descent (SGD)](https://www.ultralytics.com/glossary/stochastic-gradient-descent-sgd) or Adam optimizers with a focus on data augmentation pipelines like Mosaic. It is relatively memory-efficient, making it feasible to train on mid-range GPUs. 78 | 79 | In contrast, RTDETRv2 requires a more resource-intensive training regimen. The self-attention mechanisms in transformers scale quadratically with sequence length (image size), leading to higher VRAM usage. Users often need high-end [NVIDIA GPUs](https://www.ultralytics.com/glossary/gpu-graphics-processing-unit) with large memory capacities (e.g., A100s) to train larger RT-DETR variants effectively. Furthermore, transformers typically require longer training schedules (more epochs) to converge compared to CNNs. 80 | 81 | ## Why Ultralytics Models Are the Recommended Choice 82 | 83 | While YOLOv7 and RTDETRv2 are excellent models in their own right, the **Ultralytics ecosystem**—headed by the state-of-the-art [YOLO11](https://docs.ultralytics.com/models/yolo11/)—offers a more comprehensive solution for modern AI development. 84 | 85 | ### Superior Ease of Use and Ecosystem 86 | 87 | Ultralytics models are designed with developer experience as a priority. Unlike the complex configuration files and manual setup often required for YOLOv7 or the specific environment needs of RTDETRv2, Ultralytics provides a unified, simple Python API. This allows you to load, train, and deploy models in just a few lines of code. 88 | 89 | ```python 90 | from ultralytics import YOLO 91 | 92 | # Load a pre-trained YOLO11 model 93 | model = YOLO("yolo11n.pt") 94 | 95 | # Train the model on your custom dataset 96 | model.train(data="coco8.yaml", epochs=100, imgsz=640) 97 | 98 | # Run inference on an image 99 | results = model("path/to/image.jpg") 100 | ``` 101 | 102 | ### Balanced Performance and Versatility 103 | 104 | [YOLO11](https://docs.ultralytics.com/models/yolo11/) achieves an exceptional balance of speed and accuracy, often surpassing both YOLOv7 and RT-DETR in efficiency. Crucially, Ultralytics models are not limited to [object detection](https://docs.ultralytics.com/tasks/detect/). They natively support a wide array of computer vision tasks within the same framework: 105 | 106 | - **Instance Segmentation:** Precise object outlining. 107 | - **Pose Estimation:** Keypoint detection for human or animal pose. 108 | - **Classification:** Whole-image categorization. 
109 | - **Oriented Object Detection (OBB):** Detecting rotated objects (e.g., in aerial imagery). 110 | 111 | ### Efficiency and Training 112 | 113 | Ultralytics models are optimized for **memory efficiency**. They typically require significantly less CUDA memory during training than transformer-based alternatives like RTDETRv2, democratizing access to high-performance AI. With widely available [pre-trained weights](https://docs.ultralytics.com/models/) and efficient [transfer learning](https://www.ultralytics.com/glossary/transfer-learning) capabilities, you can achieve production-ready results in a fraction of the time. 114 | 115 | ## Conclusion 116 | 117 | **YOLOv7** remains a strong contender for legacy systems requiring strictly optimized CNN inference, while **RTDETRv2** offers cutting-edge accuracy for complex scenes where computational resources are abundant. However, for the majority of developers and researchers seeking a modern, versatile, and user-friendly solution, **Ultralytics YOLO11** is the superior choice. 118 | 119 | By choosing Ultralytics, you gain access to a thriving community, frequent updates, and a robust toolset that simplifies the entire [MLOps](https://www.ultralytics.com/glossary/machine-learning-operations-mlops) lifecycle—from data management to deployment. 120 | 121 | ## Explore Other Model Comparisons 122 | 123 | To further inform your decision, explore these additional technical comparisons: 124 | 125 | - [YOLO11 vs. YOLOv8](https://docs.ultralytics.com/compare/yolo11-vs-yolov8/) 126 | - [RT-DETR vs. YOLOv8](https://docs.ultralytics.com/compare/rtdetr-vs-yolov8/) 127 | - [YOLOv7 vs. YOLOv8](https://docs.ultralytics.com/compare/yolov7-vs-yolov8/) 128 | - [YOLO11 vs. EfficientDet](https://docs.ultralytics.com/compare/yolo11-vs-efficientdet/) 129 | - [YOLOv10 vs. RT-DETR](https://docs.ultralytics.com/compare/yolov10-vs-rtdetr/) 130 | -------------------------------------------------------------------------------- /docs/en/compare/yolov8-vs-yolo11.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Compare YOLOv8 and YOLO11 for object detection. Explore their performance, architecture, and best-use cases to find the right model for your needs. 4 | keywords: YOLOv8, YOLO11, object detection, Ultralytics, YOLO comparison, machine learning, computer vision, inference speed, model accuracy 5 | --- 6 | 7 | # YOLOv8 vs YOLO11: Evolution of Real-Time Object Detection 8 | 9 | Choosing the right computer vision architecture is a critical decision that impacts the speed, accuracy, and scalability of your AI projects. This guide provides an in-depth technical comparison between **Ultralytics YOLOv8**, a widely adopted industry standard released in 2023, and **Ultralytics YOLO11**, the latest evolution in the YOLO series designed for superior efficiency and performance. We will analyze their architectural differences, benchmark metrics, and ideal use cases to help you select the best model for your needs. 
10 | 11 | 12 | 13 | 14 | 15 | 16 | ## Ultralytics YOLOv8 17 | 18 | **Authors:** Glenn Jocher, Ayush Chaurasia, and Jing Qiu 19 | **Organization:** [Ultralytics](https://www.ultralytics.com/) 20 | **Date:** 2023-01-10 21 | **GitHub:** [https://github.com/ultralytics/ultralytics](https://github.com/ultralytics/ultralytics) 22 | **Docs:** [https://docs.ultralytics.com/models/yolov8/](https://docs.ultralytics.com/models/yolov8/) 23 | 24 | Released in early 2023, YOLOv8 marked a significant milestone in the history of [object detection](https://docs.ultralytics.com/tasks/detect/). It introduced a unified framework that supports multiple computer vision tasks—including detection, [instance segmentation](https://docs.ultralytics.com/tasks/segment/), [pose estimation](https://docs.ultralytics.com/tasks/pose/), and [image classification](https://docs.ultralytics.com/tasks/classify/)—within a single repository. YOLOv8 moved away from anchor-based detection to an **anchor-free** approach, which simplifies the design and improves generalization across different object shapes. 25 | 26 | ### Architecture and Key Features 27 | 28 | YOLOv8 replaced the C3 modules found in [YOLOv5](https://docs.ultralytics.com/models/yolov5/) with the **C2f module** (Cross-Stage Partial bottleneck with two convolutions). This change improved gradient flow and feature integration while maintaining a lightweight footprint. The architecture also features a decoupled head, separating objectness, classification, and regression tasks to increase accuracy. 29 | 30 | !!! tip "Legacy of Reliability" 31 | 32 | YOLOv8 has been tested in thousands of commercial applications, from [manufacturing automation](https://www.ultralytics.com/solutions/ai-in-manufacturing) to [autonomous vehicles](https://www.ultralytics.com/solutions/ai-in-automotive), establishing a reputation for stability and ease of deployment. 33 | 34 | ### Strengths and Weaknesses 35 | 36 | - **Strengths:** 37 | - **Mature Ecosystem:** Supported by a vast array of community tutorials, [integrations](https://docs.ultralytics.com/integrations/), and deployment guides. 38 | - **Versatility:** Natively supports OBB (Oriented Bounding Box) and classification alongside standard detection. 39 | - **Proven Stability:** A safe choice for production environments requiring a model with a long track record. 40 | - **Weaknesses:** 41 | - **Speed Efficiency:** While fast, it is outperformed by YOLO11 in CPU inference speeds and parameter efficiency. 42 | - **Compute Requirements:** Larger variants (L, X) demand more VRAM and FLOPs compared to the optimized YOLO11 equivalents. 43 | 44 | ```python 45 | from ultralytics import YOLO 46 | 47 | # Load a pretrained YOLOv8 model 48 | model = YOLO("yolov8n.pt") 49 | 50 | # Train the model on a custom dataset 51 | model.train(data="coco8.yaml", epochs=50, imgsz=640) 52 | ``` 53 | 54 | [Learn more about YOLOv8](https://docs.ultralytics.com/models/yolov8/){ .md-button } 55 | 56 | ## Ultralytics YOLO11 57 | 58 | **Authors:** Glenn Jocher and Jing Qiu 59 | **Organization:** [Ultralytics](https://www.ultralytics.com/) 60 | **Date:** 2024-09-27 61 | **GitHub:** [https://github.com/ultralytics/ultralytics](https://github.com/ultralytics/ultralytics) 62 | **Docs:** [https://docs.ultralytics.com/models/yolo11/](https://docs.ultralytics.com/models/yolo11/) 63 | 64 | **YOLO11** represents the cutting edge of the Ultralytics model family. 
Engineered to redefine [real-time inference](https://www.ultralytics.com/glossary/real-time-inference), it builds upon the successes of YOLOv8 but introduces substantial architectural refinements. YOLO11 focuses on maximizing accuracy while minimizing computational cost, making it the premier choice for modern AI applications ranging from edge devices to cloud servers. 65 | 66 | ### Architecture and Key Features 67 | 68 | YOLO11 introduces the **C3k2 block** and **C2PSA** (Cross-Stage Partial with Spatial Attention) module. These components enhance the model's ability to extract intricate features and handle occlusion more effectively than previous iterations. The architecture is optimized for speed, delivering significantly faster processing times on CPUs—a critical factor for [edge AI](https://www.ultralytics.com/glossary/edge-ai) deployments where GPU resources may be unavailable. 69 | 70 | The model maintains the unified interface characteristic of Ultralytics, ensuring that developers can switch between tasks like [OBB](https://docs.ultralytics.com/tasks/obb/) or segmentation without changing their workflow. 71 | 72 | ### Strengths and Weaknesses 73 | 74 | - **Strengths:** 75 | - **Superior Efficiency:** Achieves higher mAP with up to **22% fewer parameters** than YOLOv8, reducing model size and storage needs. 76 | - **Faster Inference:** Optimized specifically for modern hardware, offering faster speeds on both CPU and GPU backends. 77 | - **Enhanced Feature Extraction:** The new backbone improves detection of small objects and performance in cluttered scenes. 78 | - **Lower Memory Usage:** Requires less CUDA memory during training compared to transformer-based models like [RT-DETR](https://docs.ultralytics.com/models/rtdetr/), enabling training on more accessible hardware. 79 | - **Weaknesses:** 80 | - **Newer Release:** As a recent model, specific niche third-party tools may take time to fully update support, though the core Ultralytics ecosystem supports it day-one. 81 | 82 | ```python 83 | from ultralytics import YOLO 84 | 85 | # Load the latest YOLO11 model 86 | model = YOLO("yolo11n.pt") 87 | 88 | # Run inference on an image 89 | results = model("https://ultralytics.com/images/bus.jpg") 90 | results[0].show() 91 | ``` 92 | 93 | [Learn more about YOLO11](https://docs.ultralytics.com/models/yolo11/){ .md-button } 94 | 95 | ## Performance Head-to-Head 96 | 97 | The comparison below highlights the efficiency gains of YOLO11. While YOLOv8 remains a powerful contender, YOLO11 consistently delivers higher accuracy (mAP) with reduced computational complexity (FLOPs) and faster inference speeds. This is particularly noticeable in the "Nano" and "Small" models, where YOLO11n achieves a **39.5 mAP** compared to YOLOv8n's 37.3, all while running significantly faster on CPU. 98 | 99 | | Model | size
(pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs
(B) | 100 | | ------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 101 | | YOLOv8n | 640 | 37.3 | 80.4 | **1.47** | 3.2 | 8.7 | 102 | | YOLOv8s | 640 | 44.9 | 128.4 | 2.66 | 11.2 | 28.6 | 103 | | YOLOv8m | 640 | 50.2 | 234.7 | 5.86 | 25.9 | 78.9 | 104 | | YOLOv8l | 640 | 52.9 | 375.2 | 9.06 | 43.7 | 165.2 | 105 | | YOLOv8x | 640 | 53.9 | 479.1 | 14.37 | 68.2 | 257.8 | 106 | | | | | | | | | 107 | | YOLO11n | 640 | **39.5** | **56.1** | 1.5 | **2.6** | **6.5** | 108 | | YOLO11s | 640 | **47.0** | **90.0** | **2.5** | **9.4** | **21.5** | 109 | | YOLO11m | 640 | **51.5** | **183.2** | **4.7** | **20.1** | **68.0** | 110 | | YOLO11l | 640 | **53.4** | **238.6** | **6.2** | **25.3** | **86.9** | 111 | | YOLO11x | 640 | **54.7** | **462.8** | **11.3** | **56.9** | **194.9** | 112 | 113 | !!! note "Metric Analysis" 114 | 115 | YOLO11 demonstrates a clear advantage in the **speed-accuracy trade-off**. For example, the YOLO11l model surpasses the YOLOv8l in accuracy (+0.5 mAP) while using roughly **42% fewer parameters** and running **36% faster on CPU**. 116 | 117 | ## Ecosystem and Ease of Use 118 | 119 | Both models benefit from the robust [Ultralytics ecosystem](https://github.com/ultralytics/ultralytics), which is designed to democratize AI by making state-of-the-art technology accessible to everyone. 120 | 121 | - **Unified API:** Switching between YOLOv8 and YOLO11 is as simple as changing the model string from `yolov8n.pt` to `yolo11n.pt`. No code refactoring is required. 122 | - **Training Efficiency:** Ultralytics provides [auto-downloading datasets](https://docs.ultralytics.com/datasets/) and pre-trained weights, streamlining the pipeline from data collection to model training. 123 | - **Deployment Versatility:** Both models support one-click [export](https://docs.ultralytics.com/modes/export/) to formats like ONNX, TensorRT, CoreML, and TFLite, facilitating deployment on diverse hardware including Raspberry Pis, mobile phones, and cloud instances. 124 | - **Well-Maintained:** Frequent updates ensure compatibility with the latest versions of PyTorch and CUDA, backed by an active community on [Discord](https://discord.com/invite/ultralytics) and GitHub. 125 | 126 | ## Conclusion and Recommendations 127 | 128 | While **YOLOv8** remains a dependable and highly capable model suitable for maintaining legacy systems, **YOLO11** is the clear recommendation for all new development. 129 | 130 | - **Choose YOLO11 if:** You need the highest possible accuracy, faster inference speeds (especially on CPU), or are deploying to resource-constrained edge devices where memory and storage are premium. Its architectural improvements provide a future-proof foundation for commercial applications. 131 | - **Choose YOLOv8 if:** You have an existing pipeline heavily tuned for v8 specific behaviors or are constrained by strict project requirements that prevent updating to the latest architecture. 132 | 133 | For those interested in exploring other architectures, the Ultralytics docs also cover models like [YOLOv9](https://docs.ultralytics.com/models/yolov9/), [YOLOv10](https://docs.ultralytics.com/models/yolov10/), and [RT-DETR](https://docs.ultralytics.com/models/rtdetr/). You can view broader comparisons on our [model comparison page](https://docs.ultralytics.com/compare/). 
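As a closing illustration of the unified API described above, the sketch below runs an identical validation pipeline for both generations; only the weight string changes. It assumes the small `coco8.yaml` sample dataset that Ultralytics auto-downloads:

```python
from ultralytics import YOLO

# The same pipeline serves both generations; only the checkpoint name differs
for weights in ("yolov8n.pt", "yolo11n.pt"):
    model = YOLO(weights)
    metrics = model.val(data="coco8.yaml")  # identical validation call
    print(weights, metrics.box.map)  # mAP50-95 on the coco8 sample
```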
134 | -------------------------------------------------------------------------------- /docs/en/compare/yolov5-vs-rtdetr.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Compare YOLOv5 and RTDETRv2 for object detection. Explore their architectures, performance metrics, strengths, and best use cases in computer vision. 4 | keywords: YOLOv5, RTDETRv2, object detection, model comparison, Ultralytics, computer vision, machine learning, real-time detection, Vision Transformers, AI models 5 | --- 6 | 7 | # YOLOv5 vs. RTDETRv2: Balancing Real-Time Speed and Transformer Accuracy 8 | 9 | In the rapidly evolving landscape of [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv), selecting the right object detection model is critical for project success. This comprehensive technical comparison examines two distinct approaches: **YOLOv5**, the legendary CNN-based detector known for its versatility and speed, and **RTDETRv2**, a modern transformer-based model focusing on high accuracy. 10 | 11 | While RTDETRv2 leverages [Vision Transformers (ViT)](https://www.ultralytics.com/glossary/vision-transformer-vit) to capture global context, **Ultralytics YOLOv5** remains a top choice for developers requiring a robust, deployment-ready solution with low resource overhead. 12 | 13 | 14 | 15 | 16 | 17 | 18 | ## Model Specifications and Origins 19 | 20 | Before diving into performance metrics, it is essential to understand the background and architectural philosophy of each model. 21 | 22 | | Feature | Ultralytics YOLOv5 | RTDETRv2 | 23 | | :---------------- | :------------------------------------------ | :---------------------------------- | 24 | | **Architecture** | CNN-based (Anchor-based) | Hybrid (CNN Backbone + Transformer) | 25 | | **Primary Focus** | Real-time Speed, Versatility, Ease of Use | High Accuracy, Global Context | 26 | | **Authors** | Glenn Jocher | Wenyu Lv, Yian Zhao, et al. | 27 | | **Organization** | [Ultralytics](https://www.ultralytics.com/) | Baidu | 28 | | **Release Date** | 2020-06-26 | 2023-04-17 | 29 | | **Tasks** | Detect, Segment, Classify | Detection | 30 | 31 | [Learn more about YOLOv5](https://docs.ultralytics.com/models/yolov5/){ .md-button } 32 | 33 | ## Architecture and Design Philosophy 34 | 35 | The fundamental difference between these models lies in how they process visual data. 36 | 37 | ### Ultralytics YOLOv5 38 | 39 | YOLOv5 employs a highly optimized **Convolutional Neural Network (CNN)** architecture. It utilizes a modified CSPDarknet backbone and a Path Aggregation Network (PANet) neck to extract feature maps. 40 | 41 | - **Anchor-Based:** Relies on predefined [anchor boxes](https://www.ultralytics.com/glossary/anchor-boxes) to predict object locations, which simplifies the learning process for common object shapes. 42 | - **Efficiency:** Designed for maximum inference speed on a wide variety of hardware, from edge devices like the [NVIDIA Jetson](https://docs.ultralytics.com/guides/nvidia-jetson/) to standard CPUs. 43 | - **Versatility:** Supports multiple tasks including [instance segmentation](https://docs.ultralytics.com/tasks/segment/) and [image classification](https://docs.ultralytics.com/tasks/classify/) within a single unified framework. 44 | 45 | ### RTDETRv2 46 | 47 | RTDETRv2 (Real-Time Detection Transformer v2) represents a shift towards transformer architectures. 
48 | 49 | - **Hybrid Design:** Combines a CNN backbone with a transformer encoder-decoder, utilizing [self-attention mechanisms](https://www.ultralytics.com/glossary/self-attention) to process object relationships. 50 | - **Global Context:** The transformer component allows the model to "see" the entire image at once, improving performance in complex scenes with occlusion. 51 | - **Computational Cost:** This sophisticated architecture typically demands significantly more GPU memory and computational power (FLOPs) compared to purely CNN-based solutions. 52 | 53 | ## Performance Analysis 54 | 55 | The table below provides a direct comparison of key performance metrics. While RTDETRv2 shows impressive accuracy (mAP) on the [COCO dataset](https://docs.ultralytics.com/datasets/detect/coco/), YOLOv5 demonstrates superior inference speeds, particularly on CPU hardware where transformers often struggle. 56 | 57 | | Model | size
<br><sup>(pixels)</sup> | mAP<sup>val<br>50-95</sup> | Speed<br><sup>CPU ONNX<br>(ms)</sup> | Speed<br><sup>T4 TensorRT10<br>(ms)</sup> | params<br><sup>(M)</sup> | FLOPs<br><sup>
(B) | 58 | | ---------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 59 | | YOLOv5n | 640 | 28.0 | **73.6** | **1.12** | **2.6** | **7.7** | 60 | | YOLOv5s | 640 | 37.4 | 120.7 | 1.92 | 9.1 | 24.0 | 61 | | YOLOv5m | 640 | 45.4 | 233.9 | 4.03 | 25.1 | 64.2 | 62 | | YOLOv5l | 640 | 49.0 | 408.4 | 6.61 | 53.2 | 135.0 | 63 | | YOLOv5x | 640 | 50.7 | 763.2 | 11.89 | 97.2 | 246.4 | 64 | | | | | | | | | 65 | | RTDETRv2-s | 640 | 48.1 | - | 5.03 | 20 | 60 | 66 | | RTDETRv2-m | 640 | 51.9 | - | 7.51 | 36 | 100 | 67 | | RTDETRv2-l | 640 | 53.4 | - | 9.76 | 42 | 136 | 68 | | RTDETRv2-x | 640 | **54.3** | - | 15.03 | 76 | 259 | 69 | 70 | !!! note "Interpreting the Data" 71 | 72 | While RTDETRv2 achieves higher mAP numbers, notice the **Speed** and **FLOPs** columns. YOLOv5n runs at **73.6 ms** on a CPU, making it feasible for real-time applications on non-accelerated hardware. RTDETRv2 models are significantly heavier, requiring powerful GPUs to maintain real-time frame rates. 73 | 74 | ### Training Efficiency and Memory Usage 75 | 76 | A crucial advantage of **YOLOv5** is its training efficiency. Transformer-based models like RTDETRv2 are notorious for high VRAM consumption and slow convergence rates. 77 | 78 | - **Lower Memory Footprint:** YOLOv5 can be trained on consumer-grade GPUs with modest CUDA memory, democratizing access to AI development. 79 | - **Faster Convergence:** Users can often achieve usable results in fewer epochs, saving valuable time and cloud compute costs. 80 | 81 | ## Key Strengths of Ultralytics YOLOv5 82 | 83 | For most developers and commercial applications, YOLOv5 offers a more balanced and practical set of advantages: 84 | 85 | 1. **Unmatched Ease of Use:** The Ultralytics [Python API](https://docs.ultralytics.com/usage/python/) is the industry standard for simplicity. Loading a model, running inference, and training on custom data can be done with just a few lines of code. 86 | 2. **Rich Ecosystem:** Backed by a massive open-source community, YOLOv5 integrates seamlessly with [Ultralytics HUB](https://www.ultralytics.com/hub) for no-code training, [MLOps tools](https://www.ultralytics.com/glossary/machine-learning-operations-mlops) for tracking, and diverse export formats like [ONNX](https://docs.ultralytics.com/integrations/onnx/) and TensorRT. 87 | 3. **Deployment Flexibility:** From iOS and Android mobile apps to [Raspberry Pi](https://docs.ultralytics.com/guides/raspberry-pi/) and cloud servers, YOLOv5's lightweight architecture allows it to run where heavier transformer models cannot. 88 | 4. **Task Versatility:** Unlike RTDETRv2, which is primarily an object detector, YOLOv5 supports classification and segmentation, reducing the need to maintain multiple codebases for different vision tasks. 89 | 90 | !!! tip "Upgrade Path" 91 | 92 | If you need even higher accuracy than YOLOv5 while maintaining these ecosystem benefits, consider the new **[YOLO11](https://docs.ultralytics.com/models/yolo11/)**. It incorporates modern architectural improvements to rival or beat transformer accuracy with the efficiency you expect from YOLO. 93 | 94 | ## Code Comparison: ease of use 95 | 96 | The following example demonstrates the simplicity of using YOLOv5 with the Ultralytics package. 
97 | 98 | ```python 99 | from ultralytics import YOLO 100 | 101 | # Load a pre-trained YOLOv5 model 102 | model = YOLO("yolov5s.pt") 103 | 104 | # Run inference on an image 105 | results = model("https://ultralytics.com/images/bus.jpg") 106 | 107 | # Display results 108 | for result in results: 109 | result.show() # show to screen 110 | result.save(filename="result.jpg") # save to disk 111 | ``` 112 | 113 | ## Ideal Use Cases 114 | 115 | ### When to Choose Ultralytics YOLOv5 116 | 117 | - **Edge Computing:** Deploying on battery-powered or resource-constrained devices (drones, mobile phones, IoT). 118 | - **Real-Time Video Analytics:** Processing multiple video streams simultaneously for [traffic management](https://www.ultralytics.com/blog/optimizingtraffic-management-with-ultralytics-yolo11) or security. 119 | - **Rapid Prototyping:** When you need to go from dataset to deployed model in hours, not days. 120 | - **Multi-Task Requirements:** Projects needing both object detection and [image segmentation](https://docs.ultralytics.com/tasks/segment/). 121 | 122 | ### When to Choose RTDETRv2 123 | 124 | - **Academic Research:** Benchmarking against the absolute state-of-the-art on static datasets where speed is secondary. 125 | - **High-End GPU Availability:** Environments where dedicated server-grade GPUs (like NVIDIA A100s) are available for both training and inference. 126 | - **Complex Static Scenes:** Scenarios with dense occlusion where the [self-attention](https://www.ultralytics.com/glossary/self-attention) mechanism provides a critical edge in accuracy. 127 | 128 | ## Conclusion 129 | 130 | While **RTDETRv2** showcases the potential of transformers in computer vision with impressive accuracy figures, it comes with significant costs in terms of hardware resources and training complexity. For the vast majority of real-world applications, **Ultralytics YOLOv5** remains the superior choice. Its perfect blend of speed, accuracy, and low memory usage—combined with a supportive ecosystem and extensive [documentation](https://docs.ultralytics.com/models/yolov5/)—ensures that developers can build scalable, efficient, and effective AI solutions. 131 | 132 | For those seeking the absolute latest in performance without sacrificing the usability of the Ultralytics framework, we highly recommend exploring **[YOLO11](https://docs.ultralytics.com/models/yolo11/)**, which bridges the gap between CNN efficiency and transformer-level accuracy. 133 | 134 | ## Explore Other Models 135 | 136 | - [YOLOv5 vs YOLOv8](https://docs.ultralytics.com/compare/yolov5-vs-yolov8/) 137 | - [RT-DETR vs YOLO11](https://docs.ultralytics.com/compare/rtdetr-vs-yolo11/) 138 | - [YOLOv5 vs EfficientDet](https://docs.ultralytics.com/compare/efficientdet-vs-yolov5/) 139 | - [YOLOv8 vs RT-DETR](https://docs.ultralytics.com/compare/rtdetr-vs-yolov8/) 140 | - [YOLOv10 vs YOLOv5](https://docs.ultralytics.com/compare/yolov10-vs-yolov5/) 141 | -------------------------------------------------------------------------------- /docs/en/compare/yolox-vs-yolov7.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Discover the differences between YOLOX and YOLOv7, two top computer vision models. Learn about their architecture, performance, and ideal use cases. 4 | keywords: YOLOX, YOLOv7, object detection, computer vision, model comparison, anchor-free, YOLO models, machine learning, AI performance 5 | --- 6 | 7 | # YOLOX vs. 
YOLOv7: A Detailed Technical Comparison 8 | 9 | Navigating the landscape of object detection models requires a deep understanding of architectural nuances and performance trade-offs. This guide provides a comprehensive technical comparison between **YOLOX** and **YOLOv7**, two influential architectures that have significantly shaped the field of computer vision. We explore their structural innovations, benchmark metrics, and practical applications to help you determine the best fit for your projects. While both models represented state-of-the-art advancements at their respective launches, modern developers often look to the **Ultralytics ecosystem** for unified workflows and cutting-edge performance. 10 | 11 | 12 | 13 | 14 | 15 | 16 | ## Performance Head-to-Head 17 | 18 | When selecting a model, the balance between Mean Average Precision (mAP) and inference latency is often the deciding factor. YOLOX offers a highly scalable family of models ranging from Nano to X, emphasizing simplicity through its anchor-free design. Conversely, YOLOv7 focuses on maximizing the speed-accuracy trade-off for real-time applications using advanced architectural optimizations. 19 | 20 | | Model | size
<br><sup>(pixels)</sup> | mAP<sup>val<br>50-95</sup> | Speed<br><sup>CPU ONNX<br>(ms)</sup> | Speed<br><sup>T4 TensorRT10<br>(ms)</sup> | params<br><sup>(M)</sup> | FLOPs<br><sup>
(B) | 21 | | --------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 22 | | YOLOXnano | 416 | 25.8 | - | - | **0.91** | **1.08** | 23 | | YOLOXtiny | 416 | 32.8 | - | - | 5.06 | 6.45 | 24 | | YOLOXs | 640 | 40.5 | - | **2.56** | 9.0 | 26.8 | 25 | | YOLOXm | 640 | 46.9 | - | 5.43 | 25.3 | 73.8 | 26 | | YOLOXl | 640 | 49.7 | - | 9.04 | 54.2 | 155.6 | 27 | | YOLOXx | 640 | 51.1 | - | 16.1 | 99.1 | 281.9 | 28 | | | | | | | | | 29 | | YOLOv7l | 640 | 51.4 | - | 6.84 | 36.9 | 104.7 | 30 | | YOLOv7x | 640 | **53.1** | - | 11.57 | 71.3 | 189.9 | 31 | 32 | The data illustrates distinct strengths. **YOLOXnano** is incredibly lightweight, making it ideal for extremely resource-constrained environments. However, for high-performance scenarios, **YOLOv7x** demonstrates superior accuracy (53.1% mAP) and efficiency, delivering higher precision than YOLOXx with significantly fewer Floating Point Operations (FLOPs) and faster inference times on T4 GPUs. 33 | 34 | ## YOLOX: Simplicity via Anchor-Free Design 35 | 36 | YOLOX marked a paradigm shift in the YOLO series by discarding the anchor-based mechanism in favor of an anchor-free approach. This design choice simplifies the training process and eliminates the need for manual anchor box tuning, which often requires domain-specific heuristic optimization. 37 | 38 | - **Authors:** Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, and Jian Sun 39 | - **Organization:** [Megvii](https://www.megvii.com/) 40 | - **Date:** 2021-07-18 41 | - **Arxiv:** [https://arxiv.org/abs/2107.08430](https://arxiv.org/abs/2107.08430) 42 | - **GitHub:** [https://github.com/Megvii-BaseDetection/YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) 43 | 44 | ### Architecture and Key Innovations 45 | 46 | YOLOX integrates a **decoupled head** structure, separating the classification and regression tasks. This separation allows the model to learn distinct features for recognizing what an object is versus where it is located, leading to faster convergence and better accuracy. Additionally, YOLOX employs **SimOTA**, an advanced label assignment strategy that dynamically matches positive samples to ground truth objects, improving the model's robustness in crowded scenes. 47 | 48 | !!! info "Anchor-Free vs. Anchor-Based" 49 | 50 | Traditional YOLO models (prior to YOLOX) used predefined "anchor boxes" to predict object dimensions. YOLOX's **anchor-free** method predicts bounding boxes directly from pixel locations, reducing the number of hyperparameters and making the model more generalizable to diverse [datasets](https://docs.ultralytics.com/datasets/). 51 | 52 | ### Use Cases and Limitations 53 | 54 | YOLOX excels in scenarios where model deployment needs to be streamlined across various hardware platforms without extensive hyperparameter tuning. Its lightweight variants (Nano/Tiny) are popular for mobile applications. However, its peak performance on larger scales has been surpassed by newer architectures like YOLOv7 and [YOLO11](https://docs.ultralytics.com/models/yolo11/), which utilize more complex feature aggregation networks. 55 | 56 | [Learn more about YOLOX](https://yolox.readthedocs.io/en/latest/){ .md-button } 57 | 58 | ## YOLOv7: The "Bag-of-Freebies" Powerhouse 59 | 60 | Released a year after YOLOX, YOLOv7 introduced a suite of architectural reforms aimed at optimizing the training process to boost inference results purely through "trainable bag-of-freebies." 
61 | 62 | - **Authors:** Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao 63 | - **Organization:** Institute of Information Science, Academia Sinica 64 | - **Date:** 2022-07-06 65 | - **Arxiv:** [https://arxiv.org/abs/2207.02696](https://arxiv.org/abs/2207.02696) 66 | - **GitHub:** [https://github.com/WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7) 67 | 68 | ### Architecture and Key Innovations 69 | 70 | The core of YOLOv7 is the **Extended Efficient Layer Aggregation Network (E-ELAN)**. This architecture allows the network to learn more diverse features by controlling the shortest and longest gradient paths, ensuring effective convergence for very deep networks. Furthermore, YOLOv7 utilizes model scaling techniques specifically designed for concatenation-based models, ensuring that increasing model depth and width translates linearly to performance gains without diminishing returns. 71 | 72 | YOLOv7 also effectively employs auxiliary heads during training to provide coarse-to-fine supervision, a technique that improves the main detection head's accuracy without adding computational cost during deployment. 73 | 74 | ### Use Cases and Limitations 75 | 76 | With its exceptional speed-to-accuracy ratio, YOLOv7 is a top contender for real-time [video analytics](https://docs.ultralytics.com/guides/analytics/) and edge computing tasks where every millisecond counts. It pushed the boundaries of what was possible on standard GPU hardware (like the V100 and T4). However, the complexity of its architecture can make it challenging to modify or fine-tune for custom tasks outside of standard [object detection](https://docs.ultralytics.com/tasks/detect/). 77 | 78 | [Learn more about YOLOv7](https://docs.ultralytics.com/models/yolov7/){ .md-button } 79 | 80 | ## The Ultralytics Advantage: Why Modernize? 81 | 82 | While YOLOX and YOLOv7 remain capable tools, the field of computer vision moves rapidly. Modern developers and researchers increasingly prefer the **Ultralytics ecosystem** with models like **YOLO11** and **YOLOv8** due to their comprehensive support, unified design, and ease of use. 83 | 84 | ### Streamlined Developer Experience 85 | 86 | One of the biggest hurdles with older models is the fragmentation of codebases. Ultralytics solves this by providing a unified Python API and CLI that works consistently across all model versions. You can switch between detecting, segmenting, or classifying with a single line of code. 87 | 88 | ```python 89 | from ultralytics import YOLO 90 | 91 | # Load a model (YOLO11 or YOLOv8) 92 | model = YOLO("yolo11n.pt") # or "yolov8n.pt" 93 | 94 | # Run inference on an image 95 | results = model("path/to/image.jpg") 96 | 97 | # Export to ONNX for deployment 98 | model.export(format="onnx") 99 | ``` 100 | 101 | ### Key Benefits of Ultralytics Models 102 | 103 | - **Versatility:** Unlike YOLOX and YOLOv7, which focus primarily on detection, Ultralytics models support [instance segmentation](https://docs.ultralytics.com/tasks/segment/), [pose estimation](https://docs.ultralytics.com/tasks/pose/), [classification](https://docs.ultralytics.com/tasks/classify/), and [oriented object detection (OBB)](https://docs.ultralytics.com/tasks/obb/) out-of-the-box. 104 | - **Well-Maintained Ecosystem:** Frequent updates ensure compatibility with the latest versions of PyTorch, CUDA, and Python. The active community and detailed [documentation](https://docs.ultralytics.com/) reduce the time spent debugging environment issues. 
105 | - **Performance Balance:** Models like YOLO11 represent the latest state-of-the-art, offering superior accuracy and lower latency than both YOLOX and YOLOv7. They are optimized for [real-time inference](https://docs.ultralytics.com/modes/predict/) on diverse hardware, from edge devices to cloud servers. 106 | - **Training Efficiency:** Ultralytics models are designed to converge faster, saving valuable GPU hours. Pre-trained weights are readily available for a variety of tasks, making [transfer learning](https://docs.ultralytics.com/guides/model-training-tips/) straightforward. 107 | - **Memory Requirements:** These models are engineered for efficiency, typically requiring less VRAM during training and inference compared to transformer-based alternatives (like RT-DETR), making them accessible on consumer-grade hardware. 108 | 109 | [Learn more about YOLO11](https://docs.ultralytics.com/models/yolo11/){ .md-button } 110 | 111 | ## Conclusion 112 | 113 | Both YOLOX and YOLOv7 have earned their places in the history of computer vision. **YOLOX** democratized the anchor-free approach, offering a simplified pipeline that is easy to understand and deploy on small devices. **YOLOv7** pushed the envelope of performance, proving that efficient architectural design could yield massive gains in speed and accuracy. 114 | 115 | However, for those building production-grade AI systems today, the recommendation leans heavily towards the **Ultralytics YOLO** family. With **YOLO11**, you gain access to a versatile, robust, and user-friendly platform that handles the complexities of [MLOps](https://docs.ultralytics.com/guides/model-deployment-practices/), allowing you to focus on solving real-world problems. 116 | 117 | ## Explore Other Comparisons 118 | 119 | To further inform your model selection, consider exploring these related comparisons: 120 | 121 | - [YOLOX vs. YOLOv8](https://docs.ultralytics.com/compare/yolox-vs-yolov8/) 122 | - [YOLOv7 vs. YOLOv8](https://docs.ultralytics.com/compare/yolov7-vs-yolov8/) 123 | - [RT-DETR vs. YOLOv7](https://docs.ultralytics.com/compare/rtdetr-vs-yolov7/) 124 | - [YOLOv5 vs. YOLOX](https://docs.ultralytics.com/compare/yolov5-vs-yolox/) 125 | - [YOLOv6 vs. YOLOv7](https://docs.ultralytics.com/compare/yolov6-vs-yolov7/) 126 | -------------------------------------------------------------------------------- /docs/en/compare/yolov7-vs-yolox.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Explore YOLOv7 vs YOLOX in this detailed comparison. Learn their architectures, performance metrics, and best use cases for object detection. 4 | keywords: YOLOv7, YOLOX, object detection, YOLO comparison, YOLO models, computer vision, model benchmarks, real-time AI, machine learning 5 | --- 6 | 7 | # YOLOv7 vs. YOLOX: A Detailed Technical Comparison 8 | 9 | In the rapidly evolving landscape of computer vision, the YOLO (You Only Look Once) family of models has consistently set the standard for real-time object detection. Two significant milestones in this history are **YOLOv7** and **YOLOX**. While both models aim to balance speed and accuracy, they diverge significantly in their architectural philosophies—specifically regarding anchor-based versus anchor-free methodologies. 
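Before diving in, a toy sketch helps make the anchor-based versus anchor-free distinction concrete. The decodings below are simplified single-cell illustrations; the variable names and the anchor prior are placeholders, not values from either codebase.

```python
import torch

# One feature-map cell at grid position (gx, gy) with stride s
gx, gy, s = 10, 12, 16
tx, ty, tw, th = torch.rand(4)  # raw network outputs for this cell

# Anchor-based (YOLOv7 lineage): box size is decoded relative to a predefined prior
anchor_w, anchor_h = 46.0, 114.0  # illustrative anchor prior requiring manual tuning
cx, cy = (gx + tx.sigmoid()) * s, (gy + ty.sigmoid()) * s
w, h = anchor_w * tw.exp(), anchor_h * th.exp()

# Anchor-free (YOLOX): the box is regressed directly from the cell location,
# so no anchor priors or anchor hyperparameters are needed
cx_af, cy_af = (gx + tx) * s, (gy + ty) * s
w_af, h_af = tw.exp() * s, th.exp() * s
```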
10 | 11 | This guide provides an in-depth technical comparison to help researchers and engineers select the right tool for their specific [computer vision applications](https://www.ultralytics.com/blog/all-you-need-to-know-about-computer-vision-tasks). We will analyze their architectures, benchmark performance, and explore why modern alternatives like **Ultralytics YOLO11** often provide a superior developer experience. 12 | 13 | 14 | 15 | 16 | 17 | 18 | ## Performance Metrics: Speed and Accuracy 19 | 20 | When evaluating object detectors, the trade-off between inference latency and Mean Average Precision (mAP) is paramount. The table below presents a direct comparison between YOLOv7 and YOLOX variants on the [COCO dataset](https://docs.ultralytics.com/datasets/detect/coco/). 21 | 22 | | Model | size
<br><sup>(pixels)</sup> | mAP<sup>val<br>50-95</sup> | Speed<br><sup>CPU ONNX<br>(ms)</sup> | Speed<br><sup>T4 TensorRT10<br>(ms)</sup> | params<br><sup>(M)</sup> | FLOPs<br><sup>
(B) | 23 | | --------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 24 | | YOLOv7l | 640 | 51.4 | - | 6.84 | 36.9 | 104.7 | 25 | | YOLOv7x | 640 | 53.1 | - | 11.57 | 71.3 | 189.9 | 26 | | | | | | | | | 27 | | YOLOXnano | 416 | 25.8 | - | - | 0.91 | 1.08 | 28 | | YOLOXtiny | 416 | 32.8 | - | - | 5.06 | 6.45 | 29 | | YOLOXs | 640 | 40.5 | - | 2.56 | 9.0 | 26.8 | 30 | | YOLOXm | 640 | 46.9 | - | 5.43 | 25.3 | 73.8 | 31 | | YOLOXl | 640 | 49.7 | - | 9.04 | 54.2 | 155.6 | 32 | | YOLOXx | 640 | 51.1 | - | 16.1 | 99.1 | 281.9 | 33 | 34 | ### Analysis of Results 35 | 36 | The data highlights distinct advantages for each model family depending on the deployment constraints. **YOLOv7** demonstrates exceptional efficiency in the high-performance bracket. For instance, **YOLOv7l** achieves a **51.4% mAP** with only 36.9M parameters, outperforming **YOLOXx** (51.1% mAP, 99.1M parameters) while using significantly fewer computational resources. This makes YOLOv7 a strong candidate for scenarios where [GPU efficiency](https://docs.ultralytics.com/guides/nvidia-jetson/) is critical but memory is constrained. 37 | 38 | Conversely, **YOLOX** shines in the lightweight category. The **YOLOX-Nano** model (0.91M parameters) offers a viable solution for ultra-low-power edge devices where even the smallest standard YOLO models might be too heavy. Its scalable depth-width multipliers allow for fine-grained tuning across a wide range of hardware profiles. 39 | 40 | ## YOLOv7: Optimized Bag-of-Freebies 41 | 42 | Released in July 2022, YOLOv7 introduced several architectural innovations designed to optimize the training process without incurring inference costs. 43 | 44 | - **Authors:** Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao 45 | - **Organization:** Institute of Information Science, Academia Sinica, Taiwan 46 | - **Date:** 2022-07-06 47 | - **Paper:** [Arxiv Link](https://arxiv.org/abs/2207.02696) 48 | - **GitHub:** [YOLOv7 Repository](https://github.com/WongKinYiu/yolov7) 49 | 50 | [Learn more about YOLOv7](https://docs.ultralytics.com/models/yolov7/){ .md-button } 51 | 52 | ### Architectural Highlights 53 | 54 | YOLOv7 focuses on "trainable bag-of-freebies"—optimization methods that improve accuracy during training but are removed or merged during inference. Key features include: 55 | 56 | 1. **E-ELAN (Extended Efficient Layer Aggregation Network):** An improved backbone structure that enhances the model's ability to learn diverse features by controlling the shortest and longest gradient paths. 57 | 2. **Model Scaling:** Instead of simply scaling depth or width, YOLOv7 uses a compound scaling method for concatenation-based models, maintaining optimal structure during upscaling. 58 | 3. **Auxiliary Head Coarse-to-Fine:** An auxiliary loss head is used during training to assist supervision, which is then re-parameterized into the main head for inference. 59 | 60 | !!! tip "Re-parameterization" 61 | 62 | YOLOv7 utilizes planned re-parameterization, where distinct training modules are mathematically merged into a single convolutional layer for inference. This reduces the [inference latency](https://www.ultralytics.com/glossary/inference-latency) significantly without sacrificing the feature-learning capability gained during training. 
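As a minimal sketch of the algebra involved, the example below folds a BatchNorm layer into its preceding convolution so that inference executes a single fused layer. This illustrates the general fusion technique only; YOLOv7's planned re-parameterization (e.g., RepConv) applies the same principle to more elaborate multi-branch blocks.

```python
import torch
import torch.nn as nn


def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold BatchNorm statistics into the preceding convolution (inference only)."""
    fused = nn.Conv2d(
        conv.in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=True,
    )
    # scale = gamma / sqrt(running_var + eps), applied per output channel
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = bn.bias + (conv_bias - bn.running_mean) * scale
    return fused


# Verify: the fused layer reproduces conv -> bn in eval mode
conv, bn = nn.Conv2d(8, 16, 3, padding=1, bias=False), nn.BatchNorm2d(16)
bn.eval()
x = torch.randn(1, 8, 32, 32)
assert torch.allclose(bn(conv(x)), fuse_conv_bn(conv, bn)(x), atol=1e-5)
```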
63 | 64 | ## YOLOX: The Anchor-Free Evolution 65 | 66 | YOLOX, released in 2021, represented a shift in the YOLO paradigm by moving away from anchor boxes toward an anchor-free mechanism, similar to [semantic segmentation](https://docs.ultralytics.com/tasks/segment/) approaches. 67 | 68 | - **Authors:** Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, and Jian Sun 69 | - **Organization:** Megvii 70 | - **Date:** 2021-07-18 71 | - **Paper:** [Arxiv Link](https://arxiv.org/abs/2107.08430) 72 | - **GitHub:** [YOLOX Repository](https://github.com/Megvii-BaseDetection/YOLOX) 73 | 74 | [Learn more about YOLOX Comparison](https://docs.ultralytics.com/compare/yolov7-vs-yolox/){ .md-button } 75 | 76 | ### Architectural Highlights 77 | 78 | YOLOX simplified the detection pipeline by removing the need for manual anchor box tuning, which was a common pain point in previous versions like YOLOv4 and YOLOv5. 79 | 80 | 1. **Anchor-Free Mechanism:** By predicting the center of objects directly, YOLOX eliminates the complex hyperparameters associated with anchors, improving generalization on diverse datasets. 81 | 2. **Decoupled Head:** Unlike earlier YOLO versions that coupled classification and localization in one head, YOLOX separates them. This leads to faster convergence and better accuracy. 82 | 3. **SimOTA:** An advanced label assignment strategy that dynamically assigns positive samples to the ground truth with the lowest cost, balancing classification and regression losses effectively. 83 | 84 | ## Why Ultralytics Models Are the Preferred Choice 85 | 86 | While YOLOv7 and YOLOX differ in architecture, both are surpassed in usability and ecosystem support by modern [Ultralytics YOLO models](https://docs.ultralytics.com/models/). For developers seeking a robust, future-proof solution, transitioning to **YOLO11** offers distinct advantages. 87 | 88 | ### 1. Unified Ecosystem and Ease of Use 89 | 90 | YOLOv7 and YOLOX often require cloning specific GitHub repositories, managing complex dependency requirements, and utilizing disparate formats for data. In contrast, Ultralytics offers a pip-installable package that unifies all tasks. 91 | 92 | ```python 93 | from ultralytics import YOLO 94 | 95 | # Load a model (YOLO11n recommended for speed) 96 | model = YOLO("yolo11n.pt") 97 | 98 | # Train on a custom dataset with a single line 99 | results = model.train(data="coco8.yaml", epochs=100, imgsz=640) 100 | 101 | # Run inference on an image 102 | results = model("path/to/image.jpg") 103 | ``` 104 | 105 | ### 2. Superior Performance Balance 106 | 107 | As illustrated in the benchmarks, modern Ultralytics models achieve a better trade-off between speed and accuracy. **YOLO11** utilizes an optimized anchor-free architecture that learns from the advancements of both YOLOX (anchor-free design) and YOLOv7 (gradient path optimization). This results in models that are not only faster on [CPU inference](https://docs.ultralytics.com/guides/optimizing-openvino-latency-vs-throughput-modes/) but also require less CUDA memory during training, making them accessible on a wider range of hardware. 108 | 109 | ### 3. Versatility Across Tasks 110 | 111 | YOLOv7 and YOLOX are primarily designed for object detection. Ultralytics models extend this capability natively to a suite of computer vision tasks without changing the API: 112 | 113 | - **[Instance Segmentation](https://docs.ultralytics.com/tasks/segment/):** Pixel-level object understanding. 
114 | - **[Pose Estimation](https://docs.ultralytics.com/tasks/pose/):** Detecting keypoints on human bodies. 115 | - **[Oriented Object Detection (OBB)](https://docs.ultralytics.com/tasks/obb/):** Detecting rotated objects (e.g., aerial imagery). 116 | - **[Classification](https://docs.ultralytics.com/tasks/classify/):** Assigning a class label to an entire image. 117 | 118 | ### 4. Seamless Deployment and MLOps 119 | 120 | Taking a model from research to production is challenging with older frameworks. The Ultralytics ecosystem includes built-in export modes for ONNX, TensorRT, CoreML, and OpenVINO, simplifying [model deployment](https://docs.ultralytics.com/guides/model-deployment-practices/). Furthermore, integrations with [Ultralytics HUB](https://www.ultralytics.com/hub) allow for web-based dataset management, remote training, and one-click deployment to edge devices. 121 | 122 | [Learn more about YOLO11](https://docs.ultralytics.com/models/yolo11/){ .md-button } 123 | 124 | ## Conclusion 125 | 126 | Both YOLOv7 and YOLOX have made significant contributions to the field of computer vision. **YOLOv7** optimized the architecture for peak performance on GPU devices, maximizing the efficiency of the "bag-of-freebies" approach. **YOLOX** successfully demonstrated the viability of anchor-free detection, simplifying the pipeline and improving generalization. 127 | 128 | However, for modern development workflows, **Ultralytics YOLO11** stands out as the superior choice. It combines the architectural strengths of its predecessors with an unmatched [Python API](https://docs.ultralytics.com/usage/python/), lower memory requirements, and support for a comprehensive range of vision tasks. Whether you are deploying to an edge device or a cloud server, the active community and extensive documentation of the Ultralytics ecosystem ensure a smoother path to production. 129 | 130 | ## Explore Other Models 131 | 132 | If you are interested in further technical comparisons, explore these resources: 133 | 134 | - [YOLOv7 vs. YOLOv8](https://docs.ultralytics.com/compare/yolov7-vs-yolov8/): A look at the generational leap in performance. 135 | - [RT-DETR vs. YOLOv7](https://docs.ultralytics.com/compare/rtdetr-vs-yolov7/): Comparing Transformers with CNNs. 136 | - [YOLO11 vs. YOLOv10](https://docs.ultralytics.com/compare/yolo11-vs-yolov10/): The latest advancements in real-time detection. 137 | -------------------------------------------------------------------------------- /docs/en/compare/yolov5-vs-yolov7.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Discover the technical comparison between YOLOv5 and YOLOv7, covering architectures, benchmarks, strengths, and ideal use cases for object detection. 4 | keywords: YOLOv5, YOLOv7, object detection, model comparison, AI, deep learning, computer vision, benchmarks, accuracy, inference speed, Ultralytics 5 | --- 6 | 7 | # YOLOv5 vs YOLOv7: Balancing Ecosystem and Architecture 8 | 9 | Choosing the right object detection model is a critical decision for developers and researchers alike. In the evolution of the YOLO (You Only Look Once) family, **YOLOv5** and **YOLOv7** stand out as pivotal architectures that have shaped the landscape of [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv). 
While YOLOv7 introduced significant architectural innovations for accuracy, Ultralytics YOLOv5 revolutionized the developer experience with a focus on usability, deployment, and a robust ecosystem. 10 | 11 | This guide provides an in-depth technical comparison of these two models, analyzing their architectures, performance metrics on the [COCO dataset](https://docs.ultralytics.com/datasets/detect/coco/), and suitability for real-world applications. 12 | 13 | 14 | 15 | 16 | 17 | 18 | ## Ultralytics YOLOv5: The Engineering Standard 19 | 20 | Launched in 2020, YOLOv5 redefined the expectations for open-source object detection software. Unlike previous iterations that existed primarily as research code, YOLOv5 was engineered as a product-ready framework. It prioritized ease of use, exportability, and speed, making it the go-to choice for companies building [real-time inference](https://www.ultralytics.com/glossary/real-time-inference) applications. 21 | 22 | **Authors:** Glenn Jocher 23 | **Organization:** [Ultralytics](https://www.ultralytics.com) 24 | **Date:** 2020-06-26 25 | **GitHub:** [https://github.com/ultralytics/yolov5](https://github.com/ultralytics/yolov5) 26 | **Docs:** [https://docs.ultralytics.com/models/yolov5/](https://docs.ultralytics.com/models/yolov5/) 27 | 28 | ### Key Advantages of YOLOv5 29 | 30 | - **User-Centric Design:** YOLOv5 introduced a streamlined API and a seamless training workflow that lowered the barrier to entry for training custom [object detection](https://docs.ultralytics.com/tasks/detect/) models. 31 | - **Deployment Flexibility:** With native support for [export modes](https://docs.ultralytics.com/modes/export/), YOLOv5 models can be easily converted to formats like [ONNX](https://docs.ultralytics.com/integrations/onnx/), CoreML, TFLite, and [TensorRT](https://docs.ultralytics.com/integrations/tensorrt/) for deployment on diverse hardware. 32 | - **Efficient Resource Usage:** The architecture is optimized for low memory consumption, making it ideal for [edge AI](https://www.ultralytics.com/glossary/edge-ai) devices like the [NVIDIA Jetson](https://docs.ultralytics.com/guides/nvidia-jetson/) or Raspberry Pi. 33 | 34 | !!! tip "Ecosystem Support" 35 | 36 | YOLOv5 is backed by the comprehensive Ultralytics ecosystem. This includes seamless integration with experiment tracking tools like [Comet](https://docs.ultralytics.com/integrations/comet/) and [MLflow](https://docs.ultralytics.com/integrations/mlflow/), as well as dataset management platforms. 37 | 38 | [Learn more about YOLOv5](https://docs.ultralytics.com/models/yolov5/){ .md-button } 39 | 40 | ## YOLOv7: The "Bag-of-Freebies" Approach 41 | 42 | Released in 2022, YOLOv7 focused heavily on pushing the boundaries of accuracy through architectural optimization. The authors introduced several novel concepts aimed at improving feature learning without increasing the inference cost, a strategy they termed "trainable bag-of-freebies." 
43 | 44 | **Authors:** Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao 45 | **Organization:** Institute of Information Science, Academia Sinica, Taiwan 46 | **Date:** 2022-07-06 47 | **Arxiv:** [https://arxiv.org/abs/2207.02696](https://arxiv.org/abs/2207.02696) 48 | **GitHub:** [https://github.com/WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7) 49 | **Docs:** [https://docs.ultralytics.com/models/yolov7/](https://docs.ultralytics.com/models/yolov7/) 50 | 51 | ### Architectural Innovations 52 | 53 | YOLOv7 incorporates Extended Efficient Layer Aggregation Networks (E-ELAN) to enhance the network's learning capability. It also utilizes model scaling techniques that modify the architecture's depth and width simultaneously. While effective for raising [mAP scores](https://www.ultralytics.com/glossary/mean-average-precision-map), these complex architectural changes can sometimes make the model harder to modify or deploy compared to the more straightforward CSP-Darknet backbone found in YOLOv5. 54 | 55 | [Learn more about YOLOv7](https://docs.ultralytics.com/models/yolov7/){ .md-button } 56 | 57 | ## Technical Performance Comparison 58 | 59 | When comparing the two models, the trade-off usually lies between raw accuracy and practical deployment speed. YOLOv7 models (specifically the larger variants) generally achieve higher mAP on the COCO val2017 dataset. However, Ultralytics YOLOv5 maintains a dominance in inference speed and parameter efficiency, particularly with its smaller variants (Nano and Small), which are crucial for [mobile deployment](https://docs.ultralytics.com/guides/model-deployment-options/). 60 | 61 | The table below highlights the performance metrics. Note the exceptional speed of the **YOLOv5n**, which remains one of the fastest options for extremely resource-constrained environments. 62 | 63 | | Model | size
<br><sup>(pixels)</sup> | mAP<sup>val<br>50-95</sup> | Speed<br><sup>CPU ONNX<br>(ms)</sup> | Speed<br><sup>T4 TensorRT10<br>(ms)</sup> | params<br><sup>(M)</sup> | FLOPs<br><sup>
(B) | 64 | | ------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 65 | | YOLOv5n | 640 | 28.0 | **73.6** | **1.12** | **2.6** | **7.7** | 66 | | YOLOv5s | 640 | 37.4 | 120.7 | 1.92 | 9.1 | 24.0 | 67 | | YOLOv5m | 640 | 45.4 | 233.9 | 4.03 | 25.1 | 64.2 | 68 | | YOLOv5l | 640 | 49.0 | 408.4 | 6.61 | 53.2 | 135.0 | 69 | | YOLOv5x | 640 | 50.7 | 763.2 | 11.89 | 97.2 | 246.4 | 70 | | | | | | | | | 71 | | YOLOv7l | 640 | 51.4 | - | 6.84 | 36.9 | 104.7 | 72 | | YOLOv7x | 640 | **53.1** | - | 11.57 | 71.3 | 189.9 | 73 | 74 | ### Analysis of Metrics 75 | 76 | - **Speed vs. Accuracy:** YOLOv7x achieves a higher **53.1% mAP**, making it suitable for high-end security or medical analysis where every pixel counts. However, for applications like [video analytics](https://www.ultralytics.com/blog/a-look-at-real-time-queue-monitoring-enabled-by-computer-vision) or autonomous navigation, the **1.12ms** inference time of YOLOv5n on TensorRT offers a frame rate capability that heavier models cannot match. 77 | - **Training Efficiency:** Ultralytics YOLOv5 utilizes "AutoAnchor" strategies and advanced hyperparameter evolution, which often results in faster convergence during training compared to the complex re-parameterization schemes required by YOLOv7. 78 | - **Memory Footprint:** Training transformers or complex architectures like YOLOv7 often requires high-end GPUs (e.g., A100s). In contrast, YOLOv5's efficient design allows for training on consumer-grade hardware, democratizing access to [AI development](https://www.ultralytics.com/blog/a-quick-guide-for-beginners-on-how-to-train-an-ai-model). 79 | 80 | ## Code Implementation 81 | 82 | One of the strongest arguments for Ultralytics YOLOv5 is the simplicity of its Python API. Loading a pre-trained model and running inference requires only a few lines of code, a testament to the framework's maturity. 83 | 84 | ```python 85 | import torch 86 | 87 | # Load the YOLOv5s model from PyTorch Hub 88 | model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True) 89 | 90 | # Define an image (url, local path, or numpy array) 91 | img = "https://ultralytics.com/images/zidane.jpg" 92 | 93 | # Run inference 94 | results = model(img) 95 | 96 | # Print results and show the image with bounding boxes 97 | results.print() 98 | results.show() 99 | ``` 100 | 101 | This level of abstraction allows developers to focus on building their [business solutions](https://www.ultralytics.com/solutions) rather than debugging model architectures. 102 | 103 | ## Ideal Use Cases 104 | 105 | ### When to Choose YOLOv7 106 | 107 | YOLOv7 is an excellent choice for academic research and scenarios where hardware constraints are secondary to raw detection performance. 108 | 109 | - **Academic Research:** For benchmarking state-of-the-art detection techniques. 110 | - **High-Precision Inspection:** Such as [manufacturing quality control](https://www.ultralytics.com/solutions/ai-in-manufacturing) where detecting minute defects is critical and latency is less of a concern. 111 | 112 | ### When to Choose Ultralytics YOLOv5 113 | 114 | YOLOv5 remains the industry standard for rapid development and production deployment. 115 | 116 | - **Edge Deployment:** Perfect for running on [iOS and Android](https://docs.ultralytics.com/hub/app/) devices via TFLite or CoreML exports. 
117 | - **Robotics:** Its low latency is crucial for the feedback loops required in [autonomous robotics](https://www.ultralytics.com/solutions/ai-in-robotics). 118 | - **Versatility:** Beyond detection, the YOLOv5 repository supports [instance segmentation](https://docs.ultralytics.com/tasks/segment/) and [image classification](https://docs.ultralytics.com/tasks/classify/), providing a unified codebase for multiple vision tasks. 119 | 120 | ## Conclusion: The Modern Path Forward 121 | 122 | While YOLOv7 demonstrated the power of architectural tuning, **Ultralytics YOLOv5** remains the superior choice for developers needing a reliable, well-documented, and easy-to-deploy solution. Its balance of speed, accuracy, and ecosystem support ensures it remains relevant in production environments worldwide. 123 | 124 | However, the field of computer vision moves rapidly. For those seeking the absolute best performance, **[YOLO11](https://docs.ultralytics.com/models/yolo11/)** represents the latest evolution from Ultralytics. YOLO11 builds upon the usability of YOLOv5 but incorporates cutting-edge transformer-based modules and anchor-free designs, surpassing both YOLOv5 and YOLOv7 in accuracy and efficiency. 125 | 126 | For a future-proof solution that supports [Object Detection](https://docs.ultralytics.com/tasks/detect/), [Pose Estimation](https://docs.ultralytics.com/tasks/pose/), and [Oriented Bounding Boxes (OBB)](https://docs.ultralytics.com/tasks/obb/), migrating to the Ultralytics YOLO11 framework is highly recommended. 127 | 128 | ## Discover More Comparisons 129 | 130 | Explore how other models stack up against the Ultralytics YOLO family: 131 | 132 | - [YOLOv5 vs YOLOv8](https://docs.ultralytics.com/compare/yolov5-vs-yolov8/) 133 | - [YOLOv7 vs YOLOv8](https://docs.ultralytics.com/compare/yolov7-vs-yolov8/) 134 | - [YOLOv7 vs YOLO11](https://docs.ultralytics.com/compare/yolo11-vs-yolov7/) 135 | - [RT-DETR vs YOLOv7](https://docs.ultralytics.com/compare/rtdetr-vs-yolov7/) 136 | - [YOLOv6 vs YOLOv7](https://docs.ultralytics.com/compare/yolov6-vs-yolov7/) 137 | -------------------------------------------------------------------------------- /docs/en/compare/yolov8-vs-yolox.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Compare YOLOv8 and YOLOX models for object detection. Discover strengths, weaknesses, benchmarks, and choose the right model for your application. 4 | keywords: YOLOv8, YOLOX, object detection, model comparison, Ultralytics, computer vision, anchor-free models, AI benchmarks 5 | --- 6 | 7 | # YOLOv8 vs. YOLOX: A Comprehensive Technical Comparison 8 | 9 | In the rapidly evolving landscape of computer vision, selecting the right object detection model is critical for project success. This comparison explores the technical nuances between **Ultralytics YOLOv8** and **YOLOX**, two prominent anchor-free architectures. We analyze their structural differences, performance metrics, and suitability for real-world applications to help developers make informed decisions. 10 | 11 | 12 | 13 | 14 | 15 | 16 | ## Ultralytics YOLOv8: The State-of-the-Art Standard 17 | 18 | Introduced by Ultralytics in 2023, YOLOv8 represents a significant leap forward in the YOLO series. It was designed to unify high performance with an accessible user experience, supporting a wide range of computer vision tasks beyond just detection. 
19 | 20 | - **Authors:** Glenn Jocher, Ayush Chaurasia, and Jing Qiu 21 | - **Organization:** [Ultralytics](https://www.ultralytics.com/) 22 | - **Date:** 2023-01-10 23 | - **GitHub:** [https://github.com/ultralytics/ultralytics](https://github.com/ultralytics/ultralytics) 24 | - **Docs:** [https://docs.ultralytics.com/models/yolov8/](https://docs.ultralytics.com/models/yolov8/) 25 | 26 | ### Key Architecture and Features 27 | 28 | YOLOv8 employs an **anchor-free** detection mechanism, which simplifies the training process by eliminating the need to manually calculate anchor boxes. Its architecture features the C2f module, replacing the C3 module found in previous versions to improve gradient flow and feature extraction. 29 | 30 | A standout feature of YOLOv8 is its **multi-task versatility**. Unlike many competitors restricted to bounding boxes, YOLOv8 natively supports: 31 | 32 | - [Object Detection](https://docs.ultralytics.com/tasks/detect/) 33 | - [Instance Segmentation](https://docs.ultralytics.com/tasks/segment/) 34 | - [Image Classification](https://docs.ultralytics.com/tasks/classify/) 35 | - [Pose Estimation](https://docs.ultralytics.com/tasks/pose/) 36 | - [Oriented Bounding Box (OBB)](https://docs.ultralytics.com/tasks/obb/) 37 | 38 | ### Usage and Ecosystem 39 | 40 | One of the strongest advantages of YOLOv8 is its integration into the Ultralytics ecosystem. Developers can access the model via a streamlined [Python API](https://docs.ultralytics.com/usage/python/) or a powerful [Command Line Interface (CLI)](https://docs.ultralytics.com/usage/cli/). 41 | 42 | ```python 43 | from ultralytics import YOLO 44 | 45 | # Load a pretrained YOLOv8 model 46 | model = YOLO("yolov8n.pt") 47 | 48 | # Run inference on an image 49 | results = model("path/to/image.jpg") 50 | 51 | # View results 52 | for result in results: 53 | result.show() 54 | ``` 55 | 56 | !!! tip "Integrated Workflows" 57 | 58 | YOLOv8 integrates seamlessly with [Ultralytics HUB](https://www.ultralytics.com/hub), allowing teams to visualize datasets, train models in the cloud, and deploy to [edge devices](https://docs.ultralytics.com/guides/model-deployment-practices/) without writing complex boilerplate code. 59 | 60 | [Learn more about YOLOv8](https://docs.ultralytics.com/models/yolov8/){ .md-button } 61 | 62 | ## YOLOX: An Anchor-Free Pioneer 63 | 64 | Released in 2021 by Megvii, YOLOX was one of the first high-performance detectors to successfully decouple the prediction head and remove anchors, influencing subsequent designs in the field. 65 | 66 | - **Authors:** Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, and Jian Sun 67 | - **Organization:** [Megvii](https://www.megvii.com/) 68 | - **Date:** 2021-07-18 69 | - **Arxiv:** [https://arxiv.org/abs/2107.08430](https://arxiv.org/abs/2107.08430) 70 | - **GitHub:** [https://github.com/Megvii-BaseDetection/YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) 71 | - **Docs:** [https://yolox.readthedocs.io/en/latest/](https://yolox.readthedocs.io/en/latest/) 72 | 73 | ### Key Architecture and Features 74 | 75 | YOLOX introduced a **decoupled head** structure, separating classification and regression tasks into different branches. This approach helps the model converge faster and improves accuracy. Additionally, YOLOX utilizes **SimOTA** (Simplified Optimal Transport Assignment) for label assignment, a dynamic strategy that treats the training process as an optimal transport problem. 
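The sketch below illustrates the decoupled-head idea in PyTorch; the channel counts and layer choices are simplified placeholders rather than Megvii's exact implementation.

```python
import torch
import torch.nn as nn


class DecoupledHead(nn.Module):
    """Minimal sketch of a YOLOX-style decoupled head: separate branches
    for classification and box regression after a shared stem."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, in_channels, 1)
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(in_channels, num_classes, 1),  # per-cell class scores
        )
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(in_channels, 4 + 1, 1),  # box coords (4) + objectness (1)
        )

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        x = self.stem(x)
        return self.cls_branch(x), self.reg_branch(x)


head = DecoupledHead(in_channels=256, num_classes=80)
cls_out, reg_out = head(torch.randn(1, 256, 20, 20))
print(cls_out.shape, reg_out.shape)  # (1, 80, 20, 20) and (1, 5, 20, 20)
```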
76 | 77 | While innovative at launch, YOLOX focuses primarily on standard [object detection](https://www.ultralytics.com/glossary/object-detection) and does not natively support complex tasks like segmentation or pose estimation without significant customization. 78 | 79 | [Learn more about YOLOX](https://yolox.readthedocs.io/en/latest/){ .md-button } 80 | 81 | ## Comparative Performance Analysis 82 | 83 | When evaluating these models for production, the trade-off between speed and accuracy is paramount. The table below illustrates that **YOLOv8 consistently outperforms YOLOX** across comparable model sizes on the [COCO dataset](https://docs.ultralytics.com/datasets/detect/coco/). 84 | 85 | ### Accuracy and Speed Metrics 86 | 87 | YOLOv8 demonstrates superior [Mean Average Precision (mAP)](https://www.ultralytics.com/glossary/mean-average-precision-map), particularly in the larger variants. For example, **YOLOv8x** achieves a mAP of **53.9**, surpassing YOLOX-x at 51.1. Furthermore, Ultralytics provides transparent CPU inference benchmarks using [ONNX](https://docs.ultralytics.com/integrations/onnx/), highlighting YOLOv8's optimization for non-GPU environments. 88 | 89 | | Model | size
<br><sup>(pixels)</sup> | mAP<sup>val<br>50-95</sup> | Speed<br><sup>CPU ONNX<br>(ms)</sup> | Speed<br><sup>T4 TensorRT10<br>(ms)</sup> | params<br><sup>(M)</sup> | FLOPs<br><sup>
(B) | 90 | | --------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 91 | | YOLOv8n | 640 | 37.3 | 80.4 | 1.47 | 3.2 | 8.7 | 92 | | YOLOv8s | 640 | **44.9** | 128.4 | 2.66 | 11.2 | 28.6 | 93 | | YOLOv8m | 640 | **50.2** | 234.7 | 5.86 | 25.9 | 78.9 | 94 | | YOLOv8l | 640 | **52.9** | 375.2 | 9.06 | **43.7** | 165.2 | 95 | | YOLOv8x | 640 | **53.9** | 479.1 | **14.37** | **68.2** | **257.8** | 96 | | | | | | | | | 97 | | YOLOXnano | 416 | 25.8 | - | - | 0.91 | 1.08 | 98 | | YOLOXtiny | 416 | 32.8 | - | - | 5.06 | 6.45 | 99 | | YOLOXs | 640 | 40.5 | - | 2.56 | **9.0** | **26.8** | 100 | | YOLOXm | 640 | 46.9 | - | **5.43** | **25.3** | **73.8** | 101 | | YOLOXl | 640 | 49.7 | - | **9.04** | 54.2 | **155.6** | 102 | | YOLOXx | 640 | 51.1 | - | 16.1 | 99.1 | 281.9 | 103 | 104 | ### Architecture and Efficiency 105 | 106 | While YOLOX models (S/M/L) have slightly fewer parameters in some configurations, YOLOv8 offers a better **performance balance**. The efficiency of YOLOv8 is evident in its ability to deliver higher accuracy per parameter. Additionally, YOLOv8 is highly optimized for [training efficiency](https://docs.ultralytics.com/guides/model-training-tips/), often converging faster and requiring less [memory](https://docs.ultralytics.com/guides/yolo-performance-metrics/) than older architectures. This is a crucial factor when training on custom datasets where computational resources might be limited. 107 | 108 | ## Why Choose Ultralytics YOLOv8? 109 | 110 | For the vast majority of developers and researchers, YOLOv8 is the preferred choice due to its modern architecture, robust support, and ease of use. 111 | 112 | ### 1. Ease of Use and Documentation 113 | 114 | Ultralytics prioritizes the developer experience. The extensive [documentation](https://docs.ultralytics.com/) covers everything from installation to advanced [hyperparameter tuning](https://docs.ultralytics.com/guides/hyperparameter-tuning/). In contrast, older repositories like YOLOX often require more manual configuration and have steeper learning curves. 115 | 116 | ### 2. Well-Maintained Ecosystem 117 | 118 | YOLOv8 benefits from an active community and frequent updates. Issues are addressed quickly on [GitHub](https://github.com/ultralytics/ultralytics/issues), and the model integrates natively with [MLOps tools](https://docs.ultralytics.com/integrations/) such as MLflow, TensorBoard, and Weights & Biases. This level of support ensures long-term viability for commercial projects. 119 | 120 | ### 3. Deployment Flexibility 121 | 122 | Deploying models to production is streamlined with YOLOv8. It supports one-click [export](https://docs.ultralytics.com/modes/export/) to formats like TensorRT, OpenVINO, CoreML, and TFLite. This makes it ideal for running on diverse hardware, from cloud servers to [Raspberry Pi](https://docs.ultralytics.com/guides/raspberry-pi/) devices. 123 | 124 | !!! example "Real-World Application" 125 | 126 | A manufacturing plant using [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) for quality control can leverage YOLOv8's multi-task capabilities. A single model could detect defective parts (detection) and identify the exact boundaries of the flaw (segmentation), improving the precision of automated sorting systems. 127 | 128 | ## Conclusion 129 | 130 | Both architectures have contributed significantly to the field of computer vision. 
YOLOX helped popularize anchor-free detection and remains a respected baseline in academic research. However, **Ultralytics YOLOv8** represents the evolution of these concepts into a production-ready framework. 131 | 132 | With superior [mAP scores](https://docs.ultralytics.com/guides/model-evaluation-insights/), broader task support, and an unmatched ecosystem, YOLOv8 is the definitive solution for modern AI applications. Whether you are building [autonomous vehicles](https://www.ultralytics.com/solutions/ai-in-automotive), smart security systems, or agricultural monitors, YOLOv8 provides the tools and performance needed to succeed. 133 | 134 | ## Explore Other Models 135 | 136 | The field of object detection moves fast. To ensure you are using the best tool for your specific needs, consider exploring these other comparisons and newer models: 137 | 138 | - [YOLOv8 vs. YOLOv5](https://docs.ultralytics.com/compare/yolov5-vs-yolov8/) 139 | - [YOLOv8 vs. YOLOv7](https://docs.ultralytics.com/compare/yolov7-vs-yolov8/) 140 | - [YOLOv8 vs. RT-DETR](https://docs.ultralytics.com/compare/rtdetr-vs-yolov8/) 141 | - [YOLOv8 vs. YOLOv10](https://docs.ultralytics.com/compare/yolov8-vs-yolov10/) 142 | - Discover the latest [YOLO11](https://docs.ultralytics.com/models/yolo11/), which pushes efficiency and accuracy even further. 143 | -------------------------------------------------------------------------------- /docs/en/compare/yolov7-vs-yolov9.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Explore the differences between YOLOv7 and YOLOv9. Compare architecture, performance, and use cases to choose the best model for object detection. 4 | keywords: YOLOv7, YOLOv9, object detection, model comparison, YOLO architecture, AI models, computer vision, machine learning, Ultralytics 5 | --- 6 | 7 | # YOLOv7 vs. YOLOv9: A Comprehensive Technical Comparison 8 | 9 | The evolution of the YOLO (You Only Look Once) family has been marked by continuous innovation in neural network architecture, balancing the critical trade-offs between inference speed, accuracy, and computational efficiency. This comparison delves into **YOLOv7**, a milestone release from 2022 known for its trainable "bag-of-freebies," and **YOLOv9**, a 2024 architecture introducing Programmable Gradient Information (PGI) to overcome information bottlenecks in deep networks. 10 | 11 | 12 | 13 | 14 | 15 | 16 | ## Performance and Efficiency Analysis 17 | 18 | The transition from YOLOv7 to YOLOv9 represents a significant leap in parameter efficiency. While YOLOv7 was optimized to push the limits of real-time object detection using Extended Efficient Layer Aggregation Networks (E-ELAN), YOLOv9 introduces architectural changes that allow it to achieve higher Mean Average Precision (mAP) with fewer parameters and Floating Point Operations (FLOPs). 19 | 20 | For developers focused on [edge AI deployment](https://www.ultralytics.com/glossary/edge-ai), this efficiency is crucial. As illustrated in the table below, **YOLOv9e** achieves a dominant **55.6% mAP**, surpassing the larger **YOLOv7x** while maintaining a competitive computational footprint. Conversely, the smaller **YOLOv9t** offers a lightweight solution for highly constrained devices, a tier that YOLOv7 does not explicitly target with the same granularity. 21 | 22 | | Model | size
<br><sup>(pixels)</sup> | mAP<sup>val<br>50-95</sup> | Speed<br><sup>CPU ONNX<br>(ms)</sup> | Speed<br><sup>T4 TensorRT10<br>(ms)</sup> | params<br><sup>(M)</sup> | FLOPs<br><sup>
(B) | 23 | | ------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 24 | | YOLOv7l | 640 | 51.4 | - | 6.84 | 36.9 | 104.7 | 25 | | YOLOv7x | 640 | 53.1 | - | 11.57 | 71.3 | 189.9 | 26 | | | | | | | | | 27 | | YOLOv9t | 640 | 38.3 | - | **2.3** | **2.0** | **7.7** | 28 | | YOLOv9s | 640 | 46.8 | - | 3.54 | 7.1 | 26.4 | 29 | | YOLOv9m | 640 | 51.4 | - | 6.43 | 20.0 | 76.3 | 30 | | YOLOv9c | 640 | 53.0 | - | 7.16 | 25.3 | 102.1 | 31 | | YOLOv9e | 640 | **55.6** | - | 16.77 | 57.3 | 189.0 | 32 | 33 | ## YOLOv7: Optimizing the Trainable Bag-of-Freebies 34 | 35 | Released in July 2022, YOLOv7 introduced several structural reforms to the YOLO architecture, focusing on optimizing the training process without increasing inference cost. 36 | 37 | - **Authors:** Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao 38 | - **Organization:** [Institute of Information Science, Academia Sinica, Taiwan](https://en.wikipedia.org/wiki/Academia_Sinica) 39 | - **Date:** 2022-07-06 40 | - **Arxiv:** [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art](https://arxiv.org/abs/2207.02696) 41 | - **GitHub:** [WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7) 42 | 43 | ### Architecture Highlights 44 | 45 | YOLOv7 utilizes **E-ELAN (Extended Efficient Layer Aggregation Network)**, which controls the shortest and longest gradient paths to allow the network to learn more features effectively. It also popularized **model scaling** for concatenation-based models, allowing depth and width to be scaled simultaneously. A key innovation was the planned re-parameterized convolution, which streamlines the model architecture during inference to boost speed. 46 | 47 | !!! info "Legacy Status" 48 | 49 | While YOLOv7 remains a capable model, it lacks the native support for newer optimizations found in the [Ultralytics ecosystem](https://docs.ultralytics.com/). Developers may find integration with modern MLOps tools more challenging compared to newer iterations. 50 | 51 | [Learn more about YOLOv7](https://docs.ultralytics.com/models/yolov7/){ .md-button } 52 | 53 | ## YOLOv9: Solving the Information Bottleneck 54 | 55 | YOLOv9, introduced in early 2024, addresses a fundamental issue in deep learning: information loss as data passes through successive layers. 56 | 57 | - **Authors:** Chien-Yao Wang and Hong-Yuan Mark Liao 58 | - **Organization:** [Institute of Information Science, Academia Sinica, Taiwan](https://en.wikipedia.org/wiki/Academia_Sinica) 59 | - **Date:** 2024-02-21 60 | - **Arxiv:** [YOLOv9: Learning What You Want to Learn Using PGI](https://arxiv.org/abs/2402.13616) 61 | - **GitHub:** [WongKinYiu/yolov9](https://github.com/WongKinYiu/yolov9) 62 | 63 | ### Architecture Highlights 64 | 65 | The core innovation in YOLOv9 is **Programmable Gradient Information (PGI)**. In deep networks, useful information can be lost during the feedforward process, leading to unreliable gradients. PGI provides an auxiliary supervision framework that ensures key information is preserved for the loss function. Additionally, the **Generalized Efficient Layer Aggregation Network (GELAN)** extends the capabilities of ELAN by allowing for arbitrary blocking, maximizing the use of parameters and computational resources. 
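The schematic below sketches the auxiliary-supervision pattern that PGI builds on: an extra branch contributes a weighted loss during training so shared layers receive a direct gradient signal, and the branch is dropped at inference. The toy modules and the 0.25 weight are stand-ins; PGI's actual reversible auxiliary branch is considerably more sophisticated.

```python
import torch
import torch.nn as nn

# Toy stand-ins for a shared backbone, the main head, and an auxiliary head
backbone = nn.Linear(64, 32)
main_head = nn.Linear(32, 10)
aux_head = nn.Linear(32, 10)  # used only during training, removed for deployment

x, target = torch.randn(4, 64), torch.randint(0, 10, (4,))
features = backbone(x)

# The auxiliary loss supplies an additional, reliable gradient path to the backbone
criterion = nn.CrossEntropyLoss()
loss = criterion(main_head(features), target) + 0.25 * criterion(aux_head(features), target)
loss.backward()  # gradients from both heads update the shared backbone
```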
66 | 67 | This architecture makes YOLOv9 exceptionally strong for [complex detection tasks](https://docs.ultralytics.com/tasks/detect/), such as detecting small objects in cluttered environments or high-resolution [aerial imagery analysis](https://docs.ultralytics.com/datasets/detect/visdrone/). 68 | 69 | [Learn more about YOLOv9](https://docs.ultralytics.com/models/yolov9/){ .md-button } 70 | 71 | ## Why Ultralytics Models (YOLO11 & YOLOv8) Are the Preferred Choice 72 | 73 | While YOLOv7 and YOLOv9 are impressive academic achievements, the **Ultralytics YOLO** series—including [YOLOv8](https://docs.ultralytics.com/models/yolov8/) and the state-of-the-art **YOLO11**—is engineered specifically for practical, real-world application development. These models prioritize **ease of use**, **ecosystem integration**, and **operational efficiency**, making them the superior choice for most engineering teams. 74 | 75 | ### Streamlined User Experience 76 | 77 | Ultralytics models are wrapped in a unified [Python API](https://docs.ultralytics.com/usage/python/) that abstracts away the complexities of training pipelines. Switching between [object detection](https://docs.ultralytics.com/tasks/detect/), [instance segmentation](https://docs.ultralytics.com/tasks/segment/), [pose estimation](https://docs.ultralytics.com/tasks/pose/), and [oriented bounding box (OBB)](https://docs.ultralytics.com/tasks/obb/) tasks requires only a single argument change, a versatility lacking in standard YOLOv7 or YOLOv9 implementations. 78 | 79 | ```python 80 | from ultralytics import YOLO 81 | 82 | # Load a model (YOLO11 automatically handles architecture) 83 | model = YOLO("yolo11n.pt") # Load a pretrained model 84 | 85 | # Train the model with a single line of code 86 | results = model.train(data="coco8.yaml", epochs=100, imgsz=640) 87 | 88 | # Perform inference on an image 89 | results = model("path/to/image.jpg") 90 | ``` 91 | 92 | ### Well-Maintained Ecosystem 93 | 94 | Choosing an Ultralytics model grants access to a robust ecosystem. This includes seamless integration with [Ultralytics HUB](https://hub.ultralytics.com/) (and the upcoming Ultralytics Platform) for cloud training and dataset management. Furthermore, the active community and frequent updates ensure compatibility with the latest hardware, such as exporting to [TensorRT](https://docs.ultralytics.com/integrations/tensorrt/) or [OpenVINO](https://docs.ultralytics.com/integrations/openvino/) for optimal inference speeds. 95 | 96 | ### Memory and Training Efficiency 97 | 98 | Ultralytics models are renowned for their **training efficiency**. Unlike transformer-based models (like [RT-DETR](https://docs.ultralytics.com/models/rtdetr/)) which can be memory-hungry and slow to converge, Ultralytics YOLO models utilize optimized data loaders and [Mosaic augmentation](https://docs.ultralytics.com/reference/data/augment/#ultralytics.data.augment.Mosaic) to deliver rapid training times with lower CUDA memory requirements. This allows developers to train state-of-the-art models on consumer-grade GPUs. 99 | 100 | [Learn more about YOLO11](https://docs.ultralytics.com/models/yolo11/){ .md-button } 101 | 102 | ## Ideal Use Cases 103 | 104 | Selecting the right model depends on the specific constraints of your project. 105 | 106 | ### Real-World Applications for YOLOv9 107 | 108 | - **Research & Benchmarking:** Ideal for academic studies requiring the absolute highest reported accuracy on the [COCO dataset](https://docs.ultralytics.com/datasets/detect/coco/). 
109 | - **High-Fidelity Surveillance:** In scenarios like [security alarm systems](https://docs.ultralytics.com/guides/security-alarm-system/) where a 1-2% accuracy gain justifies higher implementation complexity. 110 | 111 | ### Real-World Applications for YOLOv7 112 | 113 | - **Legacy Systems:** Projects already built on the Darknet or early PyTorch ecosystems that require a stable, known quantity without refactoring the entire codebase. 114 | 115 | ### Real-World Applications for Ultralytics YOLO11 116 | 117 | - **Smart Cities:** Using [object tracking](https://docs.ultralytics.com/modes/track/) for traffic flow analysis where speed and ease of deployment are paramount. 118 | - **Healthcare:** [Medical image analysis](https://www.ultralytics.com/solutions/ai-in-healthcare) where segmentation and detection are often needed simultaneously. 119 | - **Manufacturing:** Deploying [quality control](https://www.ultralytics.com/solutions/ai-in-manufacturing) systems on edge devices like NVIDIA Jetson or Raspberry Pi, benefiting from the straightforward export options to TFLite and ONNX. 120 | 121 | ## Conclusion 122 | 123 | Both YOLOv7 and YOLOv9 represent significant milestones in the history of computer vision. **YOLOv9** offers a compelling upgrade over v7 with its PGI architecture, delivering better efficiency and accuracy. However, for developers looking for a **versatile, easy-to-use, and well-supported solution**, **Ultralytics YOLO11** remains the recommended choice. Its balance of performance, comprehensive documentation, and multi-task capabilities (detect, segment, classify, pose) provide the fastest path from concept to production. 124 | 125 | ## Explore Other Models 126 | 127 | To find the perfect fit for your specific computer vision tasks, consider exploring these other comparisons: 128 | 129 | - [YOLOv8 vs. YOLOv9](https://docs.ultralytics.com/compare/yolov8-vs-yolov9/) - Compare the widely adopted v8 with the research-focused v9. 130 | - [YOLOv10 vs. YOLOv9](https://docs.ultralytics.com/compare/yolov10-vs-yolov9/) - See how the end-to-end YOLOv10 stacks up. 131 | - [YOLO11 vs. YOLOv8](https://docs.ultralytics.com/compare/yolo11-vs-yolov8/) - Understand the improvements in the latest Ultralytics release. 132 | - [RT-DETR vs. YOLOv9](https://docs.ultralytics.com/compare/rtdetr-vs-yolov9/) - A look at Transformer-based detection vs. CNNs. 133 | -------------------------------------------------------------------------------- /docs/en/compare/rtdetr-vs-damo-yolo.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Discover a detailed comparison of RTDETRv2 and DAMO-YOLO for object detection. Learn about their performance, strengths, and ideal use cases. 4 | keywords: RTDETRv2,DAMO-YOLO,object detection,model comparison,Ultralytics,computer vision,real-time detection,AI models,deep learning 5 | --- 6 | 7 | # RTDETRv2 vs. DAMO-YOLO: A Deep Dive into Real-Time Object Detection 8 | 9 | The landscape of [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) is rapidly evolving, with researchers constantly pushing the boundaries between inference speed and detection accuracy. Two prominent contenders in this arena are RTDETRv2, a transformer-based model from Baidu, and DAMO-YOLO, a highly optimized convolutional network from Alibaba. This technical comparison explores the distinct architectural philosophies of these models, their performance metrics, and ideal application scenarios. 
10 | 11 | 12 | 13 | 14 | 15 | 16 | ## Performance Benchmarks: Speed vs. Accuracy 17 | 18 | When selecting an [object detection](https://www.ultralytics.com/glossary/object-detection) model, the primary trade-off usually lies between Mean Average Precision (mAP) and latency. The following data highlights the performance differences between RTDETRv2 and DAMO-YOLO on the COCO validation dataset. 19 | 20 | | Model | size
<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>T4 TensorRT10<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>
(B) | 21 | | ---------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 22 | | RTDETRv2-s | 640 | 48.1 | - | 5.03 | 20 | 60 | 23 | | RTDETRv2-m | 640 | 51.9 | - | 7.51 | 36 | 100 | 24 | | RTDETRv2-l | 640 | 53.4 | - | 9.76 | 42 | 136 | 25 | | RTDETRv2-x | 640 | **54.3** | - | 15.03 | 76 | 259 | 26 | | | | | | | | | 27 | | DAMO-YOLOt | 640 | 42.0 | - | **2.32** | **8.5** | **18.1** | 28 | | DAMO-YOLOs | 640 | 46.0 | - | 3.45 | 16.3 | 37.8 | 29 | | DAMO-YOLOm | 640 | 49.2 | - | 5.09 | 28.2 | 61.8 | 30 | | DAMO-YOLOl | 640 | 50.8 | - | 7.18 | 42.1 | 97.3 | 31 | 32 | The data reveals a clear distinction in design philosophy. DAMO-YOLO prioritizes raw speed and efficiency, with the 'Tiny' variant achieving exceptionally low latency suitable for constrained [edge computing](https://www.ultralytics.com/glossary/edge-computing) environments. Conversely, RTDETRv2 pushes for maximum [accuracy](https://www.ultralytics.com/glossary/accuracy), with its largest variant achieving a notable **54.3 mAP**, making it superior for tasks where precision is paramount. 33 | 34 | ## RTDETRv2: The Transformer Powerhouse 35 | 36 | RTDETRv2 builds upon the success of the Detection Transformer (DETR) architecture, addressing the high computational cost typically associated with vision transformers while maintaining their ability to capture global context. 37 | 38 | - **Authors:** Wenyu Lv, Yian Zhao, Qinyao Chang, Kui Huang, Guanzhong Wang, and Yi Liu 39 | - **Organization:** [Baidu](https://www.baidu.com/) 40 | - **Date:** 2023-04-17 (Initial), 2024-07-24 (v2 Update) 41 | - **Arxiv:** [RT-DETRv2: Improved Baseline with Bag-of-Freebies](https://arxiv.org/abs/2304.08069) 42 | - **GitHub:** [RT-DETRv2 Repository](https://github.com/lyuwenyu/RT-DETR/tree/main/rtdetrv2_pytorch) 43 | 44 | ### Architecture and Capabilities 45 | 46 | RTDETRv2 employs a hybrid encoder that efficiently processes multi-scale features. Unlike traditional CNN-based YOLO models, RTDETR eliminates the need for [Non-Maximum Suppression (NMS)](https://www.ultralytics.com/glossary/non-maximum-suppression-nms) post-processing. This end-to-end approach simplifies the deployment pipeline and reduces latency variability in crowded scenes. 47 | 48 | The model utilizes an efficient hybrid encoder that decouples intra-scale interaction and cross-scale fusion, significantly reducing computational overhead compared to standard DETR models. This design allows it to excel in identifying objects in complex environments where [occlusion](https://www.ultralytics.com/glossary/object-tracking) might confuse standard convolutional detectors. 49 | 50 | !!! info "Transformer Memory Usage" 51 | 52 | While RTDETRv2 offers high accuracy, it is important to note that [Transformer](https://www.ultralytics.com/glossary/transformer) architectures generally consume significantly more CUDA memory during training compared to CNNs. Users with limited GPU VRAM may find training these models challenging compared to efficient alternatives like YOLO11. 53 | 54 | [Learn more about RTDETR](https://docs.ultralytics.com/models/rtdetr/){ .md-button } 55 | 56 | ## DAMO-YOLO: Optimized for Efficiency 57 | 58 | DAMO-YOLO represents a rigorous approach to architectural optimization, leveraging Neural Architecture Search (NAS) to find the most efficient structures for feature extraction and fusion. 
59 | 60 | - **Authors:** Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yilun Huang, Yuan Zhang, and Xiuyu Sun 61 | - **Organization:** [Alibaba Group](https://www.alibabagroup.com/en-US/) 62 | - **Date:** 2022-11-23 63 | - **Arxiv:** [DAMO-YOLO: A Report on Real-Time Object Detection Design](https://arxiv.org/abs/2211.15444v2) 64 | - **GitHub:** [DAMO-YOLO Repository](https://github.com/tinyvision/DAMO-YOLO) 65 | 66 | ### Key Architectural Innovations 67 | 68 | DAMO-YOLO integrates several advanced technologies to maximize the speed-accuracy trade-off: 69 | 70 | - **MAE-NAS Backbone:** It employs a backbone discovered via Method-Aware Efficient Neural Architecture Search, ensuring that every parameter contributes effectively to feature extraction. 71 | - **RepGFPN:** A specialized neck design that fuses features across scales with minimal computational cost, enhancing the detection of small objects without stalling [inference speeds](https://www.ultralytics.com/glossary/inference-latency). 72 | - **ZeroHead:** A simplified detection head that reduces the complexity of the final prediction layers. 73 | 74 | This model is particularly strong in scenarios requiring high throughput, such as industrial assembly lines or high-speed traffic monitoring, where milliseconds count. 75 | 76 | [Learn more about DAMO-YOLO](https://github.com/tinyvision/DAMO-YOLO/blob/master/README.md){ .md-button } 77 | 78 | ## Real-World Application Scenarios 79 | 80 | Choosing between these two models often comes down to the specific constraints of the deployment environment. 81 | 82 | ### When to Choose RTDETRv2 83 | 84 | RTDETRv2 is the preferred choice for applications where accuracy is non-negotiable and hardware resources are ample. 85 | 86 | - **Medical Imaging:** In [medical image analysis](https://www.ultralytics.com/glossary/medical-image-analysis), missing a detection (false negative) can have serious consequences. The high mAP of RTDETRv2 makes it suitable for detecting anomalies in X-rays or MRI scans. 87 | - **Detailed Surveillance:** For security systems requiring [facial recognition](https://www.ultralytics.com/glossary/facial-recognition) or identifying small details at a distance, the global context capabilities of the transformer architecture provide a distinct advantage. 88 | 89 | ### When to Choose DAMO-YOLO 90 | 91 | DAMO-YOLO shines in resource-constrained environments or applications requiring ultra-low latency. 92 | 93 | - **Robotics:** For autonomous mobile robots that process visual data on battery-powered [embedded devices](https://www.ultralytics.com/blog/show-and-tell-yolov8-deployment-on-embedded-devices), the efficiency of DAMO-YOLO ensures real-time responsiveness. 94 | - **High-Speed Manufacturing:** In [manufacturing automation](https://www.ultralytics.com/blog/manufacturing-automation), detecting defects on fast-moving conveyor belts requires the rapid inference speeds provided by the DAMO-YOLO-tiny and small variants. 95 | 96 | ## The Ultralytics Advantage: Why YOLO11 is the Optimal Choice 97 | 98 | While RTDETRv2 and DAMO-YOLO offer compelling features, [Ultralytics YOLO11](https://docs.ultralytics.com/models/yolo11/) provides a holistic solution that balances performance, usability, and ecosystem support, making it the superior choice for most developers and researchers. 99 | 100 | ### Unmatched Ecosystem and Usability 101 | 102 | One of the most significant barriers to adopting research models is the complexity of their codebase. 
Ultralytics eliminates this friction with a unified, user-friendly Python API. Whether you are performing [instance segmentation](https://docs.ultralytics.com/tasks/segment/), [pose estimation](https://docs.ultralytics.com/tasks/pose/), or [classification](https://docs.ultralytics.com/tasks/classify/), the workflow remains consistent and intuitive. 103 | 104 | ```python 105 | from ultralytics import YOLO 106 | 107 | # Load a model (YOLO11 offers various sizes: n, s, m, l, x) 108 | model = YOLO("yolo11n.pt") 109 | 110 | # Train the model with a single line of code 111 | results = model.train(data="coco8.yaml", epochs=100, imgsz=640) 112 | 113 | # Run inference on an image 114 | results = model("path/to/image.jpg") 115 | ``` 116 | 117 | ### Versatility Across Tasks 118 | 119 | Unlike DAMO-YOLO, which is primarily focused on detection, YOLO11 is a versatile platform. It supports a wide array of computer vision tasks out of the box, including [Oriented Bounding Box (OBB)](https://docs.ultralytics.com/tasks/obb/) detection, which is crucial for aerial imagery and document analysis. This versatility allows teams to standardize on a single framework for multiple project requirements. 120 | 121 | ### Training Efficiency and Memory Management 122 | 123 | YOLO11 is engineered for efficiency. It typically requires less GPU memory (VRAM) for training compared to transformer-based models like RTDETRv2. This efficiency lowers the hardware barrier, allowing developers to train state-of-the-art models on consumer-grade GPUs or effectively utilize cloud resources via the [Ultralytics ecosystem](https://www.ultralytics.com/). Furthermore, the extensive library of pre-trained weights ensures that transfer learning is fast and effective, significantly reducing the time-to-market for AI solutions. 124 | 125 | For those seeking a robust, well-maintained, and high-performance solution that evolves with the industry, **Ultralytics YOLO11** remains the recommended standard. 126 | 127 | ## Explore Other Comparisons 128 | 129 | To further understand how these models fit into the broader computer vision landscape, explore these related comparisons: 130 | 131 | - [YOLO11 vs. RTDETR](https://docs.ultralytics.com/compare/yolo11-vs-rtdetr/) 132 | - [YOLO11 vs. DAMO-YOLO](https://docs.ultralytics.com/compare/yolo11-vs-damo-yolo/) 133 | - [YOLOv8 vs. RTDETR](https://docs.ultralytics.com/compare/yolov8-vs-rtdetr/) 134 | - [YOLOv8 vs. DAMO-YOLO](https://docs.ultralytics.com/compare/yolov8-vs-damo-yolo/) 135 | - [EfficientDet vs. DAMO-YOLO](https://docs.ultralytics.com/compare/efficientdet-vs-damo-yolo/) 136 | - [PP-YOLOE vs. RTDETR](https://docs.ultralytics.com/compare/pp-yoloe-vs-rtdetr/) 137 | -------------------------------------------------------------------------------- /docs/en/compare/yolox-vs-yolov5.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Explore a detailed technical comparison of YOLOX vs YOLOv5. Learn their differences in architecture, performance, and ideal applications for object detection. 4 | keywords: YOLOX, YOLOv5, object detection, anchor-free model, real-time detection, computer vision, Ultralytics, model comparison, AI benchmark 5 | --- 6 | 7 | # YOLOX vs. YOLOv5: Exploring Anchor-Free Innovation and Proven Efficiency 8 | 9 | In the rapidly evolving landscape of [object detection](https://www.ultralytics.com/glossary/object-detection), selecting the right architecture is pivotal for project success. 
This comparison explores two influential models: **YOLOX**, an academic powerhouse known for its anchor-free design, and **YOLOv5**, the industry standard for speed and ease of deployment. Both models have shaped the field of [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv), yet they serve distinct needs depending on whether your priority lies in research-grade precision or production-ready efficiency. 10 | 11 | ## Performance Analysis: Speed, Accuracy, and Efficiency 12 | 13 | When evaluating YOLOX and YOLOv5, the distinction often comes down to the trade-off between raw accuracy and operational efficiency. YOLOX introduced significant architectural changes, such as a decoupled head and an [anchor-free](https://www.ultralytics.com/glossary/anchor-free-detectors) mechanism, which allowed it to achieve state-of-the-art [mAP (mean Average Precision)](https://www.ultralytics.com/glossary/mean-average-precision-map) scores upon its release. It excels in scenarios where every percentage point of accuracy counts, particularly on difficult benchmarks like COCO. 14 | 15 | Conversely, Ultralytics **YOLOv5** was engineered with a focus on "real-world" performance. It prioritizes [inference speed](https://www.ultralytics.com/glossary/inference-latency) and low latency, making it exceptionally well-suited for mobile apps, embedded systems, and [edge AI](https://www.ultralytics.com/glossary/edge-ai) devices. While YOLOX may hold a slight edge in mAP for specific large models, YOLOv5 consistently outperforms it in throughput (frames per second) and deployment flexibility, leveraging the comprehensive [Ultralytics ecosystem](https://www.ultralytics.com/). 16 | 17 | 18 | 19 | 20 | 21 | 22 | The table below provides a detailed side-by-side comparison of the models across various sizes. Note how YOLOv5 maintains competitive accuracy while offering significantly faster inference times, especially when optimized with [TensorRT](https://docs.ultralytics.com/integrations/tensorrt/). 23 | 24 | | Model | size
<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>T4 TensorRT10<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>
(B) | 25 | | --------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 26 | | YOLOXnano | 416 | 25.8 | - | - | **0.91** | **1.08** | 27 | | YOLOXtiny | 416 | 32.8 | - | - | 5.06 | 6.45 | 28 | | YOLOXs | 640 | 40.5 | - | 2.56 | 9.0 | 26.8 | 29 | | YOLOXm | 640 | 46.9 | - | 5.43 | 25.3 | 73.8 | 30 | | YOLOXl | 640 | 49.7 | - | 9.04 | 54.2 | 155.6 | 31 | | YOLOXx | 640 | **51.1** | - | 16.1 | 99.1 | 281.9 | 32 | | | | | | | | | 33 | | YOLOv5n | 640 | 28.0 | **73.6** | **1.12** | 2.6 | 7.7 | 34 | | YOLOv5s | 640 | 37.4 | 120.7 | 1.92 | 9.1 | 24.0 | 35 | | YOLOv5m | 640 | 45.4 | 233.9 | 4.03 | 25.1 | 64.2 | 36 | | YOLOv5l | 640 | 49.0 | 408.4 | 6.61 | 53.2 | 135.0 | 37 | | YOLOv5x | 640 | 50.7 | 763.2 | 11.89 | 97.2 | 246.4 | 38 | 39 | ## YOLOX: The Anchor-Free Contender 40 | 41 | YOLOX was developed by researchers at Megvii to bridge the gap between the YOLO series and the academic advancements in anchor-free detection. By removing the constraint of predefined anchor boxes, YOLOX simplifies the training process and reduces the need for heuristic tuning. 42 | 43 | - **Authors:** Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, and Jian Sun 44 | - **Organization:** [Megvii](https://www.megvii.com/) 45 | - **Date:** 2021-07-18 46 | - **Arxiv:** [https://arxiv.org/abs/2107.08430](https://arxiv.org/abs/2107.08430) 47 | - **GitHub:** [https://github.com/Megvii-BaseDetection/YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) 48 | - **Docs:** [https://yolox.readthedocs.io/en/latest/](https://yolox.readthedocs.io/en/latest/) 49 | 50 | ### Architecture and Innovations 51 | 52 | YOLOX incorporates a **Decoupled Head**, which separates classification and regression tasks into different branches. This design contrasts with the coupled heads of earlier YOLO versions and reportedly improves convergence speed and accuracy. Furthermore, it utilizes **SimOTA**, an advanced label assignment strategy that dynamically assigns positive samples, enhancing the model's robustness in dense scenes. 53 | 54 | ### Strengths and Weaknesses 55 | 56 | The primary strength of YOLOX lies in its **high accuracy ceiling**, particularly with its largest variants (YOLOX-x), and its clean, anchor-free design which appeals to researchers. However, these benefits come with trade-offs. The decoupled head adds computational complexity, often resulting in slower inference compared to YOLOv5. Additionally, as a research-focused model, it lacks the cohesive, user-friendly tooling found in the Ultralytics ecosystem, potentially complicating integration into commercial pipelines. 57 | 58 | ### Ideal Use Cases 59 | 60 | - **Academic Research:** Experimenting with novel detection architectures and label assignment strategies. 61 | - **High-Precision Tasks:** Scenarios where a 1-2% gain in mAP outweighs the cost of slower inference, such as offline video analytics. 62 | - **Dense Object Detection:** Environments with heavily cluttered objects where SimOTA performs well. 63 | 64 | [Learn more about YOLOX](https://yolox.readthedocs.io/en/latest/){ .md-button } 65 | 66 | ## YOLOv5: The Production Standard 67 | 68 | Since its release in 2020, Ultralytics **YOLOv5** has become the go-to model for developers worldwide. 
It strikes an exceptional balance between performance and practicality, supported by a platform designed to streamline the entire [machine learning operations (MLOps)](https://www.ultralytics.com/glossary/machine-learning-operations-mlops) lifecycle. 69 | 70 | - **Author:** Glenn Jocher 71 | - **Organization:** [Ultralytics](https://www.ultralytics.com/) 72 | - **Date:** 2020-06-26 73 | - **GitHub:** [https://github.com/ultralytics/yolov5](https://github.com/ultralytics/yolov5) 74 | - **Docs:** [https://docs.ultralytics.com/models/yolov5/](https://docs.ultralytics.com/models/yolov5/) 75 | 76 | ### Architecture and Ecosystem 77 | 78 | YOLOv5 utilizes a CSPNet backbone and a path aggregation network (PANet) neck, optimized for efficient feature extraction. While it originally popularized the anchor-based approach in PyTorch, its greatest asset is the surrounding ecosystem. Users benefit from automatic [export](https://docs.ultralytics.com/modes/export/) to formats like ONNX, CoreML, and TFLite, as well as seamless integration with [Ultralytics HUB](https://www.ultralytics.com/hub) for model training and management. 79 | 80 | !!! tip "Did You Know?" 81 | 82 | YOLOv5 is not limited to bounding boxes. It supports multiple tasks including [instance segmentation](https://docs.ultralytics.com/tasks/segment/) and [image classification](https://docs.ultralytics.com/tasks/classify/), making it a versatile tool for complex vision pipelines. 83 | 84 | ### Strengths and Weaknesses 85 | 86 | **Ease of Use** is the hallmark of YOLOv5. With a simple Python API, developers can load pre-trained weights and run inference in just a few lines of code. The model is highly optimized for **speed**, consistently delivering lower latency on both CPUs and GPUs compared to YOLOX. It also boasts **lower memory requirements** during training, making it accessible on standard hardware. While its anchor-based design requires anchor evolution for custom datasets (handled automatically by YOLOv5), its reliability and **well-maintained ecosystem** make it superior for production. 87 | 88 | ### Ideal Use Cases 89 | 90 | - **Real-Time Applications:** Video surveillance, autonomous driving, and robotics where low latency is critical. 91 | - **Edge Deployment:** Running on Raspberry Pi, NVIDIA Jetson, or mobile devices due to its efficient architecture. 92 | - **Commercial Products:** Rapid prototyping and deployment where long-term support and ease of integration are required. 93 | - **Multi-Task Vision:** Projects requiring detection, segmentation, and classification within a single framework. 94 | 95 | [Learn more about YOLOv5](https://docs.ultralytics.com/models/yolov5/){ .md-button } 96 | 97 | ### Code Example: Running YOLOv5 with Ultralytics 98 | 99 | The Ultralytics Python package makes utilizing YOLOv5 models incredibly straightforward. Below is an example of how to run inference using a pre-trained model. 100 | 101 | ```python 102 | from ultralytics import YOLO 103 | 104 | # Load a pre-trained YOLOv5 model (Nano version for speed) 105 | model = YOLO("yolov5nu.pt") 106 | 107 | # Run inference on an image 108 | results = model("https://ultralytics.com/images/bus.jpg") 109 | 110 | # Display the results 111 | results[0].show() 112 | ``` 113 | 114 | ## Conclusion: Making the Right Choice 115 | 116 | Both models represent significant achievements in computer vision, but they cater to different audiences. 
**YOLOX** is a formidable choice for researchers pushing the boundaries of anchor-free detection who are comfortable navigating a more fragmented toolset. 117 | 118 | However, for the vast majority of developers, engineers, and businesses, **Ultralytics YOLOv5** remains the superior option. Its winning combination of **unrivaled speed**, **versatility**, and a **robust, active ecosystem** ensures that you can move from concept to deployment with minimal friction. Furthermore, adopting the Ultralytics framework provides a clear upgrade path to next-generation models like [YOLO11](https://docs.ultralytics.com/models/yolo11/), which combines the best of anchor-free design with Ultralytics' signature efficiency. 119 | 120 | ## Other Model Comparisons 121 | 122 | Explore how these models stack up against other architectures to find the best fit for your specific needs: 123 | 124 | - [YOLO11 vs YOLOX](https://docs.ultralytics.com/compare/yolo11-vs-yolox/) 125 | - [YOLOv8 vs YOLOX](https://docs.ultralytics.com/compare/yolov8-vs-yolox/) 126 | - [YOLOv10 vs YOLOX](https://docs.ultralytics.com/compare/yolov10-vs-yolox/) 127 | - [RT-DETR vs YOLOX](https://docs.ultralytics.com/compare/rtdetr-vs-yolox/) 128 | - [EfficientDet vs YOLOX](https://docs.ultralytics.com/compare/efficientdet-vs-yolox/) 129 | - [YOLOv5 vs YOLOv8](https://docs.ultralytics.com/compare/yolov5-vs-yolov8/) 130 | -------------------------------------------------------------------------------- /docs/en/compare/pp-yoloe-vs-yolov9.md: -------------------------------------------------------------------------------- 1 | --- 2 | comments: true 3 | description: Explore the differences between PP-YOLOE+ and YOLOv9 with detailed architecture, performance benchmarks, and use case analysis for object detection. 4 | keywords: PP-YOLOE+, YOLOv9, object detection, model comparison, computer vision, anchor-free detector, programmable gradient information, AI models, benchmarking 5 | --- 6 | 7 | # PP-YOLOE+ vs. YOLOv9: A Technical Comparison 8 | 9 | Selecting the optimal architecture for [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) projects requires navigating a landscape of rapidly evolving models. This page provides a detailed technical comparison between Baidu's PP-YOLOE+ and [YOLOv9](https://docs.ultralytics.com/models/yolov9/), two sophisticated single-stage object detectors. We analyze their architectural innovations, performance metrics, and ecosystem integration to help you make an informed decision. While both models demonstrate high capabilities, they represent distinct design philosophies and framework dependencies. 10 | 11 | 12 | 13 | 14 | 15 | 16 | ## PP-YOLOE+: High Accuracy within the PaddlePaddle Ecosystem 17 | 18 | PP-YOLOE+ is an evolved version of PP-YOLOE, developed by Baidu as part of the [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection/) suite. It is engineered to provide a balanced trade-off between precision and inference speed, specifically optimized for the [PaddlePaddle](https://docs.ultralytics.com/integrations/paddlepaddle/) deep learning framework. 
19 | 20 | **Authors:** PaddlePaddle Authors 21 | **Organization:** [Baidu](https://www.baidu.com/) 22 | **Date:** 2022-04-02 23 | **Arxiv:** [https://arxiv.org/abs/2203.16250](https://arxiv.org/abs/2203.16250) 24 | **GitHub:** [https://github.com/PaddlePaddle/PaddleDetection/](https://github.com/PaddlePaddle/PaddleDetection/) 25 | **Docs:** [PaddleDetection PP-YOLOE+ README](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.8.1/configs/ppyoloe/README.md) 26 | 27 | ### Architecture and Key Features 28 | 29 | PP-YOLOE+ operates as an anchor-free, single-stage detector. It builds upon the CSPRepResNet backbone and utilizes a Task Alignment Learning (TAL) strategy to improve the alignment between classification and localization tasks. A key feature is the Efficient Task-aligned Head (ET-Head), which reduces computational overhead while maintaining accuracy. The model uses a Varifocal Loss function to handle class imbalance during training. 30 | 31 | ### Strengths and Weaknesses 32 | 33 | The primary strength of PP-YOLOE+ lies in its optimization for Baidu's hardware and software stack. It offers scalable models (s, m, l, x) that perform well in standard [object detection](https://www.ultralytics.com/glossary/object-detection) benchmarks. 34 | 35 | However, its heavy reliance on the PaddlePaddle ecosystem presents a significant hurdle for the broader AI community, which largely favors [PyTorch](https://www.ultralytics.com/glossary/pytorch). Migrating existing PyTorch workflows to PaddlePaddle can be resource-intensive. Additionally, compared to newer architectures, PP-YOLOE+ requires more parameters to achieve similar [accuracy](https://www.ultralytics.com/glossary/accuracy), impacting storage and memory on constrained devices. 36 | 37 | [Learn more about PP-YOLOE+](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.8.1/configs/ppyoloe/README.md){ .md-button } 38 | 39 | ## YOLOv9: Programmable Gradient Information for Enhanced Learning 40 | 41 | Ultralytics [YOLOv9](https://docs.ultralytics.com/models/yolov9/) introduces a paradigm shift in real-time object detection by addressing the "information bottleneck" problem inherent in deep neural networks. 42 | 43 | **Authors:** Chien-Yao Wang and Hong-Yuan Mark Liao 44 | **Organization:** [Institute of Information Science, Academia Sinica, Taiwan](https://www.iis.sinica.edu.tw/en/page/AboutUs/Introduction.html) 45 | **Date:** 2024-02-21 46 | **Arxiv:** [https://arxiv.org/abs/2402.13616](https://arxiv.org/abs/2402.13616) 47 | **GitHub:** [https://github.com/WongKinYiu/yolov9](https://github.com/WongKinYiu/yolov9) 48 | **Documentation:** [https://docs.ultralytics.com/models/yolov9/](https://docs.ultralytics.com/models/yolov9/) 49 | 50 | ### Architecture and Key Features 51 | 52 | YOLOv9 integrates two groundbreaking concepts: **Programmable Gradient Information (PGI)** and the **Generalized Efficient Layer Aggregation Network (GELAN)**. 53 | 54 | - **PGI:** As networks deepen, input data information is often lost during the feedforward process. PGI provides an auxiliary supervision branch that ensures reliable gradient generation, allowing the model to "remember" crucial features for [object tracking](https://docs.ultralytics.com/modes/track/) and detection tasks without adding inference cost. 55 | - **GELAN:** This architectural design optimizes parameter efficiency, allowing the model to achieve higher accuracy with fewer computational resources (FLOPs) compared to conventional backbones using depth-wise convolution. 
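These efficiency claims are straightforward to check empirically. As a minimal sketch (assuming the `ultralytics` package and the small COCO8 sample dataset it references), you can validate a pretrained YOLOv9 checkpoint and read back its mAP:

```python
from ultralytics import YOLO

# Load the COCO-pretrained YOLOv9-C checkpoint
model = YOLO("yolov9c.pt")

# Validate on the bundled COCO8 sample; use "coco.yaml" to reproduce the full benchmark
metrics = model.val(data="coco8.yaml", imgsz=640)
print(metrics.box.map)  # mAP50-95
```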
56 | 57 | !!! info "Did you know?" 58 | 59 | YOLOv9's PGI technique solves the information bottleneck issue that previously required cumbersome deep supervision methods. This results in models that are both lighter and more accurate, significantly improving **performance balance**. 60 | 61 | ### Strengths and Weaknesses 62 | 63 | YOLOv9 excels in **training efficiency** and parameter utilization. It achieves state-of-the-art results on the [COCO dataset](https://docs.ultralytics.com/datasets/detect/coco/), surpassing previous iterations in accuracy while maintaining real-time speeds. Its integration into the Ultralytics ecosystem means it benefits from a **well-maintained ecosystem**, including simple deployment via [export modes](https://docs.ultralytics.com/modes/export/) to formats like ONNX and TensorRT. 64 | 65 | A potential consideration is that the largest variants (YOLOv9-E) require significant GPU resources for training. However, the inference memory footprint remains competitive, avoiding the high costs associated with transformer-based models. 66 | 67 | [Learn more about YOLOv9](https://docs.ultralytics.com/models/yolov9/){ .md-button } 68 | 69 | ## Comparative Performance Analysis 70 | 71 | In a direct comparison, YOLOv9 demonstrates superior efficiency. For example, the YOLOv9-C model achieves a higher mAP (53.0%) than the PP-YOLOE+l (52.9%) while utilizing approximately **half the parameters** (25.3M vs 52.2M). This drastic reduction in model size without compromising accuracy highlights the effectiveness of the GELAN architecture. 72 | 73 | | Model | size
<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>T4 TensorRT10<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>
(B) | 74 | | ---------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | 75 | | PP-YOLOE+t | 640 | 39.9 | - | 2.84 | 4.85 | 19.15 | 76 | | PP-YOLOE+s | 640 | 43.7 | - | **2.62** | 7.93 | 17.36 | 77 | | PP-YOLOE+m | 640 | 49.8 | - | 5.56 | 23.43 | 49.91 | 78 | | PP-YOLOE+l | 640 | 52.9 | - | 8.36 | 52.2 | 110.07 | 79 | | PP-YOLOE+x | 640 | 54.7 | - | 14.3 | 98.42 | 206.59 | 80 | | | | | | | | | 81 | | YOLOv9t | 640 | 38.3 | - | 2.3 | **2.0** | **7.7** | 82 | | YOLOv9s | 640 | 46.8 | - | 3.54 | **7.1** | 26.4 | 83 | | YOLOv9m | 640 | 51.4 | - | 6.43 | **20.0** | 76.3 | 84 | | YOLOv9c | 640 | 53.0 | - | 7.16 | **25.3** | **102.1** | 85 | | YOLOv9e | 640 | **55.6** | - | 16.77 | **57.3** | **189.0** | 86 | 87 | The table illustrates that for similar accuracy targets, YOLOv9 consistently requires fewer computational resources. The YOLOv9-E model pushes the envelope further, achieving 55.6% mAP, a clear advantage over the largest PP-YOLOE+ variant. 88 | 89 | ## The Ultralytics Advantage 90 | 91 | While PP-YOLOE+ is a capable detector, choosing YOLOv9 through the Ultralytics framework offers distinct advantages regarding **ease of use** and **versatility**. 92 | 93 | ### Streamlined User Experience 94 | 95 | Ultralytics prioritizes a developer-friendly experience. Unlike the complex configuration files often required by PaddleDetection, Ultralytics models can be loaded, trained, and deployed with just a few lines of Python code. This significantly lowers the barrier to entry for engineers and researchers. 96 | 97 | ### Versatility and Ecosystem 98 | 99 | Ultralytics supports a wide array of tasks beyond simple detection, including [instance segmentation](https://docs.ultralytics.com/tasks/segment/), [pose estimation](https://docs.ultralytics.com/tasks/pose/), and [oriented bounding box (OBB)](https://docs.ultralytics.com/tasks/obb/) detection. This versatility allows developers to tackle diverse challenges using a single, unified API. Furthermore, the active community and frequent updates ensure that users have access to the latest optimizations and [integrations](https://docs.ultralytics.com/integrations/) with tools like TensorBoard and MLflow. 100 | 101 | ### Code Example: Using YOLOv9 102 | 103 | The following example demonstrates how effortlessly you can run inference with YOLOv9 using the Ultralytics Python API. This simplicity contrasts with the more verbose setup often required for PP-YOLOE+. 104 | 105 | ```python 106 | from ultralytics import YOLO 107 | 108 | # Load a pre-trained YOLOv9 model 109 | model = YOLO("yolov9c.pt") 110 | 111 | # Run inference on an image 112 | results = model("path/to/image.jpg") 113 | 114 | # Display results 115 | results[0].show() 116 | ``` 117 | 118 | ## Ideal Use Cases 119 | 120 | - **PP-YOLOE+:** Best suited for teams already deeply integrated into the Baidu/PaddlePaddle ecosystem, or for specific legacy industrial applications in regions where PaddlePaddle hardware support is dominant. 121 | - **YOLOv9:** Ideal for applications demanding the highest accuracy-to-efficiency ratio, such as [autonomous vehicles](https://www.ultralytics.com/solutions/ai-in-automotive), real-time video analytics, and edge deployment where **memory requirements** and storage are constraints. 
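For the edge-constrained deployments described above, the usual first step is exporting the trained model to an optimized runtime. The following is a minimal sketch using standard Ultralytics export options; the exact format should match your target hardware:

```python
from ultralytics import YOLO

# Load a trained or pre-trained YOLOv9 model
model = YOLO("yolov9c.pt")

# Export to ONNX for portable, lightweight edge runtimes
model.export(format="onnx", imgsz=640)

# For NVIDIA devices such as Jetson, TensorRT with FP16 halves the memory footprint
# model.export(format="engine", half=True)
```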
122 | 123 | ## Conclusion and Recommendations 124 | 125 | For most developers and organizations, **YOLOv9 represents the superior choice** due to its modern architecture (GELAN/PGI), superior parameter efficiency, and the robust support of the Ultralytics ecosystem. It offers a future-proof solution with readily available pre-trained weights and seamless export capabilities. 126 | 127 | If you are looking for even greater versatility and speed, we also recommend exploring **[YOLO11](https://docs.ultralytics.com/models/yolo11/)**, the latest iteration in the YOLO series. YOLO11 refines the balance between performance and latency even further, offering state-of-the-art capabilities for detection, segmentation, and classification tasks in a compact package. 128 | 129 | For those interested in a proven workhorse, **[YOLOv8](https://docs.ultralytics.com/models/yolov8/)** remains a highly reliable option with extensive community resources and third-party integrations. 130 | --------------------------------------------------------------------------------