├── docs_en
    ├── .nojekyll
    ├── imgs
    │   ├── code.jpg
    │   ├── logo.png
    │   ├── sb1.jpg
    │   ├── sb2.jpg
    │   ├── sb3.jpg
    │   ├── sb4.jpg
    │   ├── change1.png
    │   ├── change2.png
    │   ├── gitee_1.jpg
    │   ├── gitee_2.jpg
    │   ├── mixpage.jpg
    │   ├── webpage.jpg
    │   ├── color_logo.png
    │   ├── login_gitee1.jpg
    │   ├── login_gitee2.jpg
    │   ├── login_gitee3.jpg
    │   ├── 20230105105418.png
    │   └── find_browser_path.png
    ├── SessionPage
    │   ├── get_elements.md
    │   ├── introduction.md
    │   └── create_page_object.md
    ├── cooperation.md
    ├── ChromiumPage
    │   ├── get_elements.md
    │   ├── upload_files.md
    │   ├── introduction.md
    │   └── visit_web_page.md
    ├── features
    │   └── features_demos
    │   │   ├── download_file.md
    │   │   ├── get_element_attributes.md
    │   │   ├── switch_mode.md
    │   │   ├── compare_with_requests.md
    │   │   └── compare_with_selenium.md
    ├── get_start
    │   ├── installation.md
    │   ├── examples
    │   │   ├── control_browser.md
    │   │   ├── data_packets.md
    │   │   └── switch_mode.md
    │   ├── import.md
    │   └── before_start.md
    ├── WebPage
    │   ├── introduction.md
    │   ├── webpage_function.md
    │   ├── mode_switch.md
    │   └── create_page_object.md
    ├── MixPage
    │   └── introduction.md
    ├── demos
    │   ├── login_gitee.md
    │   ├── douban_book_pics.md
    │   ├── maoyan_TOP100.md
    │   ├── starbucks_pics.md
    │   └── multithreading_with_tabs.md
    ├── advance
    │   ├── settings.md
    │   ├── commands.md
    │   ├── errors.md
    │   ├── accelerate_reading.md
    │   ├── packaging.md
    │   └── tools.md
    ├── download
    │   └── introduction.md
    ├── get_elements
    │   ├── simplify.md
    │   ├── not_found.md
    │   ├── cheat_sheet.md
    │   └── introduction.md
    ├── history
    │   └── 1.x.md
    ├── usage_introduction.md
    └── Q&A.md
├── code2.jpg
├── requirements.txt
├── MANIFEST.in
├── DrissionPage
    ├── version.py
    ├── _functions
    │   ├── by.py
    │   ├── browser.pyi
    │   ├── cli.py
    │   ├── locator.pyi
    │   ├── settings.py
    │   ├── cookies.pyi
    │   ├── settings.pyi
    │   ├── keys.pyi
    │   ├── tools.pyi
    │   └── web.pyi
    ├── __init__.pyi
    ├── items.py
    ├── __init__.py
    ├── _elements
    │   ├── none_element.pyi
    │   └── none_element.py
    ├── _configs
    │   ├── configs.ini
    │   └── options_manage.pyi
    ├── _units
    │   ├── console.pyi
    │   ├── screencast.pyi
    │   ├── console.py
    │   ├── cookies_setter.py
    │   ├── cookies_setter.pyi
    │   ├── scroller.py
    │   ├── states.pyi
    │   └── clicker.pyi
    ├── errors.py
    ├── common.py
    ├── _pages
    │   ├── chromium_tab.py
    │   ├── chromium_tab.pyi
    │   └── chromium_page.py
    └── _base
    │   └── driver.pyi
├── .github
    └── FUNDING.yml
├── setup.py
├── .gitignore
├── LICENSE
└── README.md
/docs_en/.nojekyll:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/code2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/code2.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/code.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/code.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/logo.png
--------------------------------------------------------------------------------
/docs_en/imgs/sb1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/sb1.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/sb2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/sb2.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/sb3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/sb3.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/sb4.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/sb4.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/change1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/change1.png
--------------------------------------------------------------------------------
/docs_en/imgs/change2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/change2.png
--------------------------------------------------------------------------------
/docs_en/imgs/gitee_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/gitee_1.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/gitee_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/gitee_2.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/mixpage.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/mixpage.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/webpage.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/webpage.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/color_logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/color_logo.png
--------------------------------------------------------------------------------
/docs_en/imgs/login_gitee1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/login_gitee1.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/login_gitee2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/login_gitee2.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/login_gitee3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/login_gitee3.jpg
--------------------------------------------------------------------------------
/docs_en/imgs/20230105105418.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/20230105105418.png
--------------------------------------------------------------------------------
/docs_en/imgs/find_browser_path.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/g1879/DrissionPage/HEAD/docs_en/imgs/find_browser_path.png
--------------------------------------------------------------------------------
/docs_en/SessionPage/get_elements.md:
--------------------------------------------------------------------------------
1 | 🚄 Search for Elements
2 | ---
3 |
4 | Please refer to the "Search for Elements" section.
5 |
6 |
--------------------------------------------------------------------------------
/docs_en/cooperation.md:
--------------------------------------------------------------------------------
1 | ---
2 | id: cooper
3 | title: Business cooperation
4 | ---
5 |
6 | Custom automation projects are accepted on commission. Contact QQ: 178691442
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | requests
2 | lxml
3 | cssselect
4 | DownloadKit>=2.0.7
5 | websocket-client
6 | click
7 | tldextract>=3.4.4
8 | psutil
--------------------------------------------------------------------------------
/docs_en/ChromiumPage/get_elements.md:
--------------------------------------------------------------------------------
1 | 🚤 Find Elements
2 | ---
3 |
4 | Please refer to the "[Find Elements](https://g1879.gitee.io/drissionpagedocs/get_elements/get_ele_intro)" section.
5 |
6 |
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include DrissionPage/_configs/configs.ini
2 | include DrissionPage/_functions/suffixes.dat
3 | include DrissionPage/*.pyi
4 | include DrissionPage/*/*.py
5 | include DrissionPage/*/*.pyi
--------------------------------------------------------------------------------
/DrissionPage/version.py:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | """
3 | @Author : g1879
4 | @Contact : g1879@qq.com
5 | @Website : https://DrissionPage.cn
6 | @Copyright: (c) 2020 by g1879, Inc. All Rights Reserved.
7 | """
8 | __version__ = '4.1.1.2'
9 |
--------------------------------------------------------------------------------
/docs_en/features/features_demos/download_file.md:
--------------------------------------------------------------------------------
1 | # Download Files
2 |
3 | DrissionPage comes with a convenient downloader that allows you to easily download files with just one line of code.
4 |
5 | ```python
6 | from DrissionPage import WebPage
7 |
8 | url = 'https://www.baidu.com/img/flexible/logo/pc/result.png'
9 | save_path = r'C:\download'
10 |
11 | page = WebPage('s')
12 | page.download(url, save_path)
13 | ```
14 |
15 |
--------------------------------------------------------------------------------
/DrissionPage/_functions/by.py:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | """
3 | @Author : g1879
4 | @Contact : g1879@qq.com
5 | @Website : https://DrissionPage.cn
6 | @Copyright: (c) 2020 by g1879, Inc. All Rights Reserved.
7 | """
8 |
9 |
10 | class By:
11 |     ID = 'id'
12 |     XPATH = 'xpath'
13 |     LINK_TEXT = 'link text'
14 |     PARTIAL_LINK_TEXT = 'partial link text'
15 |     NAME = 'name'
16 |     TAG_NAME = 'tag name'
17 |     CLASS_NAME = 'class name'
18 |     CSS_SELECTOR = 'css selector'
19 |
--------------------------------------------------------------------------------
/DrissionPage/__init__.pyi:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | """
3 | @Author : g1879
4 | @Contact : g1879@qq.com
5 | @Website : https://DrissionPage.cn
6 | @Copyright: (c) 2020 by g1879, Inc. All Rights Reserved.
7 | """
8 | from ._base.chromium import Chromium
9 | from ._configs.chromium_options import ChromiumOptions
10 | from ._configs.session_options import SessionOptions
11 | from ._pages.chromium_page import ChromiumPage
12 | from ._pages.session_page import SessionPage
13 | from ._pages.web_page import WebPage
14 | from .version import __version__
15 |
16 |
17 | __all__ = ['WebPage', 'ChromiumPage', 'Chromium', 'ChromiumOptions', 'SessionOptions', 'SessionPage', '__version__']
18 |
--------------------------------------------------------------------------------
/docs_en/ChromiumPage/upload_files.md:
--------------------------------------------------------------------------------
1 | 🚤 File Upload
2 | ---
3 |
4 | There are two ways to upload a file:
5 |
6 | - Find the `<input>` element and insert the file path.
7 |
8 | - Intercept the file input box and automatically fill in the path.
9 |
10 | ## ✅️️ Traditional Method
11 |
12 | The first method is the traditional method, where developers need to find the file upload control in the DOM and use the `input()` method of the element object to insert the path.
13 |
14 | The file upload control is an `<input>` element with the `type` attribute set to `'file'`, and the file path can be entered into the element. Its usage is the same as entering text.
15 |
16 | The only difference is that
17 |
18 |
--------------------------------------------------------------------------------
/docs_en/get_start/installation.md:
--------------------------------------------------------------------------------
1 | 🌏 Installation
2 | ---
3 |
4 | ## ✅️️ System Requirements
5 |
6 | Operating System: Windows, Linux, or Mac.
7 |
8 | Python Version: 3.6 and above.
9 |
10 | Supported Browsers: Chromium-based browsers (such as Chrome and Edge).
11 |
12 | ---
13 |
14 | ## ✅️️ Installation
15 |
16 | Please use pip to install DrissionPage:
17 |
18 | ```shell
19 | pip install DrissionPage
20 | ```
21 |
22 | ---
23 |
24 | ## ✅️️ Upgrading
25 |
26 | ### 📌 Upgrade to the Latest Stable Version
27 |
28 | ```shell
29 | pip install DrissionPage --upgrade
30 | ```
31 |
32 | ---
33 |
34 | ### 📌 Upgrade to a Specific Version
35 |
36 | ```shell
37 | pip install DrissionPage==4.0.0b17
38 | ```
39 |
40 |
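A script can guard against an unsupported interpreter before it imports the library. The following is a minimal sketch of such a check against the minimum stated above (Python 3.6). The names `MIN_PYTHON` and `python_is_supported` are illustrative assumptions and are not part of DrissionPage.

```python
import sys

# Minimum interpreter version stated in the system requirements above.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=None):
    """Return True if the (major, minor) version meets the minimum.

    `version_info` defaults to the running interpreter; passing a tuple
    makes the check easy to exercise with other versions.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == '__main__':
    if not python_is_supported():
        raise SystemExit('Python 3.6 or newer is required')
    print('Python version is supported')
```

Comparing only the first two fields keeps the check independent of patch releases.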
--------------------------------------------------------------------------------
/DrissionPage/items.py:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | """
3 | @Author : g1879
4 | @Contact : g1879@qq.com
5 | @Website : https://DrissionPage.cn
6 | @Copyright: (c) 2020 by g1879, Inc. All Rights Reserved.
7 | """
8 | from ._elements.chromium_element import ChromiumElement, ShadowRoot
9 | from ._elements.none_element import NoneElement
10 | from ._elements.session_element import SessionElement
11 | from ._pages.chromium_frame import ChromiumFrame
12 | from ._pages.chromium_tab import ChromiumTab
13 | from ._pages.mix_tab import MixTab
14 | from ._pages.mix_tab import MixTab as WebPageTab
15 |
16 | __all__ = ['ChromiumElement', 'ShadowRoot', 'NoneElement', 'SessionElement', 'ChromiumFrame', 'ChromiumTab',
17 |            'MixTab', 'WebPageTab']
18 |
--------------------------------------------------------------------------------
/docs_en/features/features_demos/get_element_attributes.md:
--------------------------------------------------------------------------------
1 | ⭐ Get Element Attribute
2 | ---
3 |
4 | ```python
5 | # Continuing from previous code
6 | foot = page.ele('#footer-left') # Find element by id
7 | first_col = foot.ele('css:>div') # Find the first direct child div of the element using a css selector
8 | lnk = first_col.ele('text:命令学') # Find element using text content
9 | text = lnk.text # Get element text
10 | href = lnk.attr('href') # Get element attribute value
11 |
12 | print(text, href, '\n')
13 |
14 | # Concise chaining mode
15 | text = page('@id:footer-left')('css:>div')('text:命令学').text
16 | print(text)
17 | ```
18 |
19 | **Output:**
20 |
21 | ```shell
22 | Learn Git Command https://oschina.gitee.io/learn-git-branching/
23 |
24 | Learn Git Command
25 | ```
26 |
27 |
--------------------------------------------------------------------------------
/docs_en/WebPage/introduction.md:
--------------------------------------------------------------------------------
1 | 🛸 Overview
2 | ---
3 |
4 | The `WebPage` object integrates `SessionPage` and `ChromiumPage`, enabling communication between the two.
5 |
6 | It can control the browser and send/receive data packets, and synchronizes login information between the two.
7 |
8 | It has two modes: d and s, corresponding to controlling the browser and sending/receiving data packets, respectively.
9 |
10 | `WebPage` can flexibly switch between these two modes, allowing for interesting use cases.
11 |
12 | For example, if the website login code is very complex and using data packets is too complicated, we can use the browser to handle the login and then switch to the data packet mode to collect data.
13 |
14 | The logic for using both modes is the same and there is no difference compared to `ChromiumPage`, making it easy to get started.
15 |
16 | Diagram of the `WebPage` structure:
17 |
18 | 
19 |
20 |
--------------------------------------------------------------------------------
/docs_en/features/features_demos/switch_mode.md:
--------------------------------------------------------------------------------
1 | ⭐ Mode Switch
2 | ---
3 |
4 | Log in to the website using a browser and switch to reading the webpage with requests. They will share login information.
5 |
6 | ```python
7 | from DrissionPage import WebPage
8 | from time import sleep
9 |
10 | # Create a page object with the default d mode
11 | page = WebPage()
12 | # Visit the personal center page (not logged in, redirect to the login page)
13 | page.get('https://gitee.com/profile')
14 |
15 | # Enter the account password to log in
16 | page.ele('@id:user_login').input('your_user_name')
17 | page.ele('@id:user_password').input('your_password\n')
18 | page.wait.load_start()
19 |
20 | # Switch to the s mode
21 | page.change_mode()
22 | # Output of session mode after login
23 | print('Logged in title:', page.title, '\n')
24 | ```
25 |
26 | **Output:**
27 |
28 | ```shell
29 | Logged in title: Personal Information - Gitee.com
30 | ```
31 |
32 |
--------------------------------------------------------------------------------
/.github/FUNDING.yml:
--------------------------------------------------------------------------------
1 | # These are supported funding model platforms
2 |
3 | github: # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]
4 | patreon: # Replace with a single Patreon username
5 | open_collective: # Replace with a single Open Collective username
6 | ko_fi: # Replace with a single Ko-fi username
7 | tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
8 | community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
9 | liberapay: # Replace with a single Liberapay username
10 | issuehunt: # Replace with a single IssueHunt username
11 | lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry
12 | polar: # Replace with a single Polar username
13 | buy_me_a_coffee: # Replace with a single Buy Me a Coffee username
14 | thanks_dev: # Replace with a single thanks.dev username
15 | custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']
16 |
--------------------------------------------------------------------------------
/DrissionPage/__init__.py:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | """
3 | @Author : g1879
4 | @Contact : g1879@qq.com
5 | @Website : https://DrissionPage.cn
6 | @Copyright: (c) 2020 by g1879, Inc. All Rights Reserved.
7 |
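8 | Anyone may use or distribute the source code of this project in a personal capacity, but only for learning and lawful non-profit purposes.
9 | Without authorization from the copyright holder, no individual or organization may use this project, in source or binary form, for commercial activities.
10 |
11 | Use of this project is subject to the following terms; if any term is violated during use, the authorization is automatically void.
12 | * DrissionPage must not be used in any project that may violate local laws, regulations, or ethical constraints
13 | * DrissionPage must not be used in any project that may harm the interests of others
14 | * DrissionPage must not be used for attacks or harassment
15 | * Comply with the Robots protocol; DrissionPage must not be used to collect data that the law or a site's Robots protocol does not permit
16 |
17 | Users are solely responsible for all actions taken using DrissionPage.
18 | All disputes and consequences arising from any action taken using DrissionPage are unrelated to the copyright holder,
19 | and the copyright holder bears no risk or loss arising from the use of DrissionPage.
20 | The copyright holder accepts no liability for any loss caused by defects that may exist in DrissionPage.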
21 | """
22 | from ._base.chromium import Chromium
23 | from ._configs.chromium_options import ChromiumOptions
24 | from ._configs.session_options import SessionOptions
25 | from ._pages.chromium_page import ChromiumPage
26 | from ._pages.session_page import SessionPage
27 | from ._pages.web_page import WebPage
28 | from .version import __version__
29 |
--------------------------------------------------------------------------------
/DrissionPage/_elements/none_element.pyi:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | """
3 | @Author : g1879
4 | @Contact : g1879@qq.com
5 | @Website : https://DrissionPage.cn
6 | @Copyright: (c) 2020 by g1879, Inc. All Rights Reserved.
7 | """
8 | from typing import Any, Optional
9 |
10 | from .._base.base import BasePage
11 |
12 |
13 | class NoneElement(object):
14 |     _none_ele_value: Any = ...
15 |     _none_ele_return_value: Any = ...
16 |     method: Optional[str] = ...
17 |     args: Optional[dict] = ...
18 |
19 |     def __init__(self,
20 |                  page: BasePage = None,
21 |                  method: str = None,
22 |                  args: dict = None):
23 |         """
24 |         :param page: the page the element belongs to
25 |         :param method: the method used to find the element
26 |         :param args: the arguments used to find the element
27 |         """
28 |         ...
29 |
30 |     def __call__(self, *args, **kwargs) -> NoneElement: ...
31 |
32 |     def __repr__(self) -> str: ...
33 |
34 |     def __getattr__(self, item: str) -> str: ...
35 |
36 |     def __eq__(self, other: Any) -> bool: ...
37 |
38 |     def __bool__(self) -> bool: ...
39 |
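The stub above only declares signatures; the matching `none_element.py` is not shown here. The special methods it lists (`__call__`, `__getattr__`, `__eq__`, `__bool__`) are the ones typically used for a "null object" returned by a failed lookup. The sketch below illustrates that general pattern with a hypothetical `MissingElement` class. It is an assumption about how such an object might behave, not DrissionPage's actual implementation.

```python
class MissingElement:
    """Hypothetical null object for a failed element lookup.

    Illustrative only: this is NOT DrissionPage's NoneElement.
    """

    def __init__(self, method=None, args=None, default=''):
        self.method = method      # name of the lookup method that failed
        self.args = args          # arguments passed to that lookup
        self._default = default   # value returned for unknown attributes

    def __call__(self, *args, **kwargs):
        # A chained lookup on a missing element is still missing.
        return self

    def __getattr__(self, item):
        # Invoked only for attributes that are not found normally.
        return self._default

    def __eq__(self, other):
        # Compare equal to None so `ele == None` style checks work.
        return other is None

    def __bool__(self):
        # Falsy, so `if ele:` distinguishes found from not found.
        return False

    def __repr__(self):
        return f'<MissingElement method={self.method} args={self.args}>'


ele = MissingElement(method='ele', args={'locator': '#nope'})
if not ele:
    print('element not found:', ele)
```

The benefit of this pattern is that chained lookups do not have to be wrapped in `None` checks at every step; a single truthiness test at the end is enough.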
--------------------------------------------------------------------------------
/DrissionPage/_configs/configs.ini:
--------------------------------------------------------------------------------
1 | [paths]
2 | download_path =
3 | tmp_path =
4 |
5 | [chromium_options]
6 | address = 127.0.0.1:9222
7 | browser_path = chrome
8 | arguments = ['--no-default-browser-check', '--disable-suggestions-ui', '--no-first-run', '--disable-infobars', '--disable-popup-blocking', '--hide-crash-restore-bubble', '--disable-features=PrivacySandboxSettings4']
9 | extensions = []
10 | prefs = {'profile.default_content_settings.popups': 0, 'profile.default_content_setting_values': {'notifications': 2}}
11 | flags = {}
12 | load_mode = normal
13 | user = Default
14 | auto_port = False
15 | system_user_path = False
16 | existing_only = False
17 | new_env = False
18 |
19 | [session_options]
20 | headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.1.2 Safari/603.3.8', 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'connection': 'keep-alive', 'accept-charset': 'GB2312,utf-8;q=0.7,*;q=0.7'}
21 |
22 | [timeouts]
23 | base = 10
24 | page_load = 30
25 | script = 30
26 |
27 | [proxies]
28 | http =
29 | https =
30 |
31 | [others]
32 | retry_times = 3
33 | retry_interval = 2
34 |
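Several values in this file are written as Python literals (lists, dicts, booleans, integers), while others are plain strings such as `127.0.0.1:9222`. How DrissionPage itself parses the file is not shown here, so the following is only a sketch of one way to read such a file with the standard library. `INI_TEXT` is a shortened copy of the file above and `load_options` is a hypothetical helper.

```python
import ast
import configparser

# Shortened copy of the configuration shown above (illustrative subset).
INI_TEXT = """
[chromium_options]
address = 127.0.0.1:9222
browser_path = chrome
arguments = ['--no-first-run', '--disable-infobars']
prefs = {'profile.default_content_settings.popups': 0}
auto_port = False

[timeouts]
base = 10
page_load = 30
"""


def load_options(text):
    """Parse ini text, converting values that are Python literals."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    options = {}
    for section in parser.sections():
        options[section] = {}
        for key, raw in parser.items(section):
            try:
                # Lists, dicts, numbers and booleans become real objects.
                options[section][key] = ast.literal_eval(raw)
            except (ValueError, SyntaxError):
                # Anything else (addresses, paths, empty values) stays a string.
                options[section][key] = raw
    return options


opts = load_options(INI_TEXT)
print(opts['chromium_options']['arguments'])
print(opts['timeouts'])
```

`ast.literal_eval` is used instead of `eval` because it only accepts literal syntax and will not execute arbitrary code from the file.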
--------------------------------------------------------------------------------
/docs_en/features/features_demos/compare_with_requests.md:
--------------------------------------------------------------------------------
1 | ⭐ Comparison with requests
2 | ---
3 |
4 | The following snippets implement the same functionality with each library, so the amount of code required can be compared:
5 |
6 | 🔸 Get element content
7 |
8 | ```python
9 | import requests
10 | from lxml import etree
11 | from DrissionPage import WebPage
12 |
13 | url = 'https://baike.baidu.com/item/python'
14 |
15 | # Using requests:
16 | headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36'}
17 | response = requests.get(url, headers=headers)
18 | html = etree.HTML(response.text)
19 | element = html.xpath('//h1')[0]
20 | title = element.text
21 |
22 | # Using DrissionPage:
23 | page = WebPage('s')
24 | page.get(url)
25 | title = page('tag:h1').text
26 | ```
27 |
28 | :::tip Tips
29 | DrissionPage comes with default headers
30 | :::
31 |
32 | 🔸 Download file
33 |
34 | ```python
35 | url = 'https://www.baidu.com/img/flexible/logo/pc/result.png'
36 | save_path = r'C:\download'
37 |
38 | # Using requests:
39 | r = requests.get(url)
40 | with open(f'{save_path}\\img.png', 'wb') as fd:
41 |     for chunk in r.iter_content():
42 |         fd.write(chunk)
43 |
44 | # Using DrissionPage:
45 | page.download(url, save_path, 'img') # Supports renaming, handles filename conflicts
46 | ```
47 |
48 |
--------------------------------------------------------------------------------
/docs_en/MixPage/introduction.md:
--------------------------------------------------------------------------------
1 | 🛠 Old Version (MixPage)
2 | ---
3 |
4 | The versions of this repository prior to 3.0 were implemented by re-encapsulating selenium.
5 |
6 | The page objects for this version are `MixPage` and `DriverPage`, corresponding to `WebPage` and `ChromiumPage` of DrissionPage. The usage is basically the same as the new version.
7 |
8 | After years of use, the old version has become quite stable. However, due to reliance on selenium, the development of functions has been greatly restricted. Moreover, with the iteration of versions, the new version has surpassed the old version comprehensively, and it is time for the old version to retire.
9 |
10 | Therefore, starting from version 3.0, the old version code has been separated from this repository and developed into an independent library.
11 |
12 | This is to commemorate the achievements it has made.
13 |
14 | Currently, the development of the old version has been frozen. Except for bug fixes, there will be no more functional modifications for the old version.
15 |
16 | Interested readers can take a look.
17 |
18 | ---
19 |
20 | Project address: [MixPage](https://gitee.com/g1879/MixPage)
21 |
22 | Documentation: [MixPage User Manual](http://g1879.gitee.io/mixpage)
23 |
24 | ---
25 |
26 | The structure of `MixPage` is as follows:
27 |
28 | 
29 |
30 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | from setuptools import setup, find_packages
3 | from DrissionPage import __version__
4 |
5 | with open("README.md", "r", encoding='utf-8') as fh:
6 |     long_description = fh.read()
7 |
8 | setup(
9 |     name="DrissionPage",
10 |     version=__version__,
11 |     author="g1879",
12 |     author_email="g1879@qq.com",
13 |     description="Python based web automation tool. It can control the browser and send and receive data packets.",
14 |     long_description=long_description,
15 |     long_description_content_type="text/markdown",
16 |     # license="BSD",
17 |     keywords="DrissionPage",
18 |     url="https://DrissionPage.cn",
19 |     include_package_data=True,
20 |     packages=find_packages(),
21 |     zip_safe=False,
22 |     install_requires=[
23 |         'lxml',
24 |         'requests',
25 |         'cssselect',
26 |         'DownloadKit>=2.0.7',
27 |         'websocket-client',
28 |         'click',
29 |         'tldextract>=3.4.4',
30 |         'psutil'
31 |     ],
32 |     classifiers=[
33 |         "Programming Language :: Python :: 3.6",
34 |         "Development Status :: 4 - Beta",
35 |         "Topic :: Utilities",
36 |         # "License :: OSI Approved :: BSD License",
37 |     ],
38 |     python_requires='>=3.6',
39 |     entry_points={
40 |         'console_scripts': [
41 |             'dp = DrissionPage._functions.cli:main',
42 |         ],
43 |     },
44 | )
45 |
--------------------------------------------------------------------------------
/DrissionPage/_functions/browser.pyi:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | """
3 | @Author : g1879
4 | @Contact : g1879@qq.com
5 | @Website : https://DrissionPage.cn
6 | @Copyright: (c) 2020 by g1879, Inc. All Rights Reserved.
7 | """
8 | from typing import Union
9 |
10 | from .._configs.chromium_options import ChromiumOptions
11 |
12 |
13 | def connect_browser(option: ChromiumOptions) -> bool:
14 |     """Connect to or launch a browser
15 |     :param option: ChromiumOptions object
16 |     :return: whether an existing browser was taken over
17 |     """
18 |     ...
19 |
20 |
21 | def get_launch_args(opt: ChromiumOptions) -> list:
22 |     """Get the command-line launch arguments from ChromiumOptions
23 |     :param opt: ChromiumOptions
24 |     :return: list of launch arguments
25 |     """
26 |     ...
27 |
28 |
29 | def set_prefs(opt: ChromiumOptions) -> None:
30 |     """Handle the prefs items in the launch configuration; currently this only works for an existing folder
31 |     :param opt: ChromiumOptions
32 |     :return: None
33 |     """
34 |     ...
35 |
36 |
37 | def set_flags(opt: ChromiumOptions) -> None:
38 |     """Handle the flags items in the launch configuration
39 |     :param opt: ChromiumOptions
40 |     :return: None
41 |     """
42 |     ...
43 |
44 |
45 | def test_connect(ip: str, port: Union[int, str], timeout: float = 30) -> bool:
46 |     """Test whether the browser is available
47 |     :param ip: browser IP
48 |     :param port: browser port
49 |     :param timeout: timeout in seconds
50 |     :return: whether the connection succeeded
51 |     """
52 |     ...
53 |
54 |
55 | def get_chrome_path(ini_path: str) -> Union[str, None]:
56 |     """Get the path of the Chrome executable from the ini file or system variables
57 |     :param ini_path: path of the ini file
58 |     :return: file path
59 |     """
60 |     ...
61 |
--------------------------------------------------------------------------------
/docs_en/demos/login_gitee.md:
--------------------------------------------------------------------------------
1 | 🌠 Gitee Auto Login
2 | ---
3 |
4 | This example demonstrates how to log in to the Gitee website automatically by controlling the browser.
5 |
6 | ## ✅️️ Web Analysis
7 |
8 | URL: https://gitee.com/login
9 |
10 | 
11 |
12 | Press `F12` to view the code, and you can see that both input boxes can be located using the `id` attribute, as shown in the image.
13 |
14 | 
15 |
16 | ---
17 |
18 | ## ✅️️ Coding Idea
19 |
20 | Elements with the `id` attribute are easy to locate. Both input boxes can be directly located using the `id` attribute.
21 | The login button does not have an `id` attribute, but it can be observed that it is the first element with the `value` attribute set to `'登 录'`, so it can also be located using the Chinese text for better code readability.
22 |
23 | Since we are using a browser for logging in, we will use `ChromiumPage` to control the browser.
24 |
25 | ---
26 |
27 | ## ✅️️ Sample Code
28 |
29 | ```python
30 | from DrissionPage import ChromiumPage
31 |
32 | # Create a page object that controls the browser
33 | page = ChromiumPage()
34 | # Navigate to the login page
35 | page.get('https://gitee.com/login')
36 |
37 | # Locate the account input box and enter the account
38 | page.ele('#user_login').input('Your account')
39 | # Locate the password input box and enter the password
40 | page.ele('#user_password').input('Your password')
41 |
42 | # Click the login button
43 | page.ele('@value=登 录').click()
44 | ```
45 |
46 | ---
47 |
48 | ## ✅️️ Result
49 |
50 | Login successful.
51 |
52 | 
53 |
54 |
--------------------------------------------------------------------------------
/DrissionPage/_functions/cli.py:
--------------------------------------------------------------------------------
1 | # -*- coding:utf-8 -*-
2 | """
3 | @Author : g1879
4 | @Contact : g1879@qq.com
5 | @Website : https://DrissionPage.cn
6 | @Copyright: (c) 2020 by g1879, Inc. All Rights Reserved.
7 | """
8 | from click import command, option
9 |
10 | from .._functions.tools import configs_to_here as ch
11 | from .._configs.chromium_options import ChromiumOptions
12 | from .._pages.chromium_page import ChromiumPage
13 |
14 |
15 | @command()
16 | @option("-p", "--set-browser-path", help="Set the browser path")
17 | @option("-u", "--set-user-path", help="Set the user data path")
18 | @option("-c", "--configs-to-here", is_flag=True, help="Copy the default configuration file to the current path")
19 | @option("-l", "--launch-browser", default=-1, help="Launch the browser; pass a port number, 0 means use the value from the configuration file")
20 | def main(set_browser_path, set_user_path, configs_to_here, launch_browser):
21 |     if set_browser_path:
22 |         set_paths(browser_path=set_browser_path)
23 |
24 |     if set_user_path:
25 |         set_paths(user_data_path=set_user_path)
26 |
27 |     if configs_to_here:
28 |         ch()
29 |
30 |     if launch_browser >= 0:
31 |         port = f'127.0.0.1:{launch_browser}' if launch_browser else None
32 |         ChromiumPage(port)
33 |
34 |
35 | def set_paths(browser_path=None, user_data_path=None):
36 |     """Shortcut function for setting paths
37 |     :param browser_path: path of the browser executable
38 |     :param user_data_path: user data path
39 |     :return: None
40 |     """
41 |     co = ChromiumOptions()
42 |
43 |     if browser_path is not None:
44 |         co.set_browser_path(browser_path)
45 |
46 |     if user_data_path is not None:
47 |         co.set_user_data_path(user_data_path)
48 |
49 |     co.save()
50 |
51 |
52 | if __name__ == '__main__':
53 |     main()
54 |
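The `-l` / `--launch-browser` option above encodes three cases in one integer: the default `-1` means no launch, `0` means launch with the address from the configuration file, and any positive value is used as a port on `127.0.0.1`. The sketch below isolates that decision in a standalone function so it can be read and tested without `click` or a browser. `resolve_launch_address` is a hypothetical helper and not part of the package.

```python
def resolve_launch_address(launch_browser):
    """Mirror the -l option handling shown in main().

    Returns a (should_launch, address) tuple, where address is None
    when the configured default should be used.
    """
    if launch_browser < 0:
        # Default value of the option: do not launch anything.
        return False, None
    if launch_browser == 0:
        # Launch, but let the configuration file supply the address.
        return True, None
    # Launch on the given local port.
    return True, f'127.0.0.1:{launch_browser}'


for value in (-1, 0, 9333):
    print(value, resolve_launch_address(value))
```

On the command line, the positive-port case corresponds to `dp -l 9333`, using the `dp` entry point declared in `setup.py`.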
--------------------------------------------------------------------------------
/docs_en/demos/douban_book_pics.md:
--------------------------------------------------------------------------------
1 | 🌠 Download Douban Book Covers
2 | ---
3 |
4 | The Starbucks example uses the `download()` method to download images. This example shows how to save images directly from the browser.
5 |
6 | This feature is a highlight of this library. It does not require any UI operations or re-downloading of images. Instead, it directly reads and saves images from the cache, making it very convenient to use.
7 |
8 | ## ✅️️ Page Analysis
9 |
10 | Target URL: [https://book.douban.com/tag/小说](https://book.douban.com/tag/%E5%B0%8F%E8%AF%B4)
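
The link text and the link target above are the same address: the target is simply the percent-encoded form of the tag name. As a quick, stand-alone check using only the Python standard library (the variable names here are illustrative and not part of DrissionPage):

```python
from urllib.parse import quote, unquote

tag = '小说'

# quote() percent-encodes the UTF-8 bytes of the tag name
encoded = quote(tag)
url = f'https://book.douban.com/tag/{encoded}'
print(url)

# unquote() reverses the encoding and gives back the readable form
print(unquote(encoded) == tag)
```

Either form of the address refers to the same page, so it is fine to keep the readable form in source code, as the example code on this page does.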
11 |
12 | Pressing `F12` to open the developer tools shows that each book sits in an element whose `class` attribute is `subject-item`. You can retrieve these elements in a batch, then get the `<img>` element inside each one and save the image.
13 |
14 | ---
15 |
16 | ## ✅️️ Coding Approach
17 |
18 | In order to demonstrate the `save()` method of the element object, we will use browser operations to save the image files to the local `imgs` folder.
19 |
20 | ---
21 |
22 | ## ✅️️ Example Code
23 |
24 | The following code can be run directly.
25 |
26 | ```python
27 | from DrissionPage import ChromiumPage
28 |
29 | # Create a page object
30 | page = ChromiumPage()
31 | # Visit the target webpage
32 | page.get('https://book.douban.com/tag/小说?start=0&type=T')
33 |
34 | # Scrape 4 pages
35 | for _ in range(4):
36 |     # Iterate through all the books on a single page
37 |     for book in page.eles('.subject-item'):
38 |         # Get the cover image object
39 |         img = book('t:img')
40 |         # Save the image
41 |         img.save(r'.\imgs')
42 |
43 |     # Click the next page (the link text on the site is '后页>')
44 |     page('后页>').click()
45 |     page.wait.load_start()
46 | ```
47 |
48 | ---
49 |
50 | ## ✅️️ Result
51 |
52 | 
53 |
54 |
--------------------------------------------------------------------------------
/docs_en/advance/settings.md:
--------------------------------------------------------------------------------
1 | ⚙️ Global Settings
2 | ---
3 |
4 | There are some global settings at runtime that can control certain behaviors of the program.
5 |
6 | ## ✅️️ Usage
7 |
8 | The global settings object, `Settings`, is imported from `DrissionPage.common`.
9 |
10 | Use assignment to modify the properties of the `Settings` object.
11 |
12 | Usage:
13 |
14 | ```python
15 | from DrissionPage.common import Settings
16 |
17 | Settings.raise_when_wait_failed = True
18 | ```
19 |
20 | ---
21 |
22 | ## ✅️️ Settings Options
23 |
24 | ### 📌 `raise_when_ele_not_found`
25 |
26 | Sets whether or not to raise an exception when an element is not found. Default is `False`.
27 |
28 | ---
29 |
30 | ### 📌 `raise_when_click_failed`
31 |
32 | Sets whether or not to raise an exception when clicking fails. Default is `False`.
33 |
34 | ---
35 |
36 | ### 📌 `raise_when_wait_failed`
37 |
38 | Sets whether or not to raise an exception when waiting fails. Default is `False`.
39 |
40 | ---
41 |
42 | ### 📌 `singleton_tab_obj`
43 |
44 | Sets whether or not the Tab object should use the singleton pattern. Default is `True`.
45 |
46 | ---
47 |
48 | ## ✅️️ Examples
49 |
50 | This example makes the program raise an exception immediately when an element is not found (instead of returning a `NoneElement`).
51 |
52 | You can execute it directly to see the effect.
53 |
54 | ```python
55 | from DrissionPage import SessionPage
56 | from DrissionPage.common import Settings
57 |
58 | Settings.raise_when_ele_not_found = True
59 |
60 | page = SessionPage()
61 | page.get('https://www.baidu.com')
62 | ele = page('#abcd')
63 | ```
64 |
65 | **Output:**
66 |
67 | ```shell
68 | ...omitted...
69 | DrissionPage.errors.ElementNotFoundError:
70 | Element not found.
71 | method: ele()
72 | args: {'locator': '#abcd'}
73 | ```
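
The difference between the two behaviours can be pictured with a small stand-alone sketch. It does not import DrissionPage and is not how the library is implemented; `DemoSettings`, `DemoNoneElement`, `DemoElementNotFoundError` and `find_item` are hypothetical stand-ins that only illustrate the pattern of a class-level switch that is read each time a lookup runs.

```python
class DemoSettings:
    """Hypothetical stand-in for a global settings object."""
    raise_when_ele_not_found = False


class DemoNoneElement:
    """Hypothetical falsy placeholder returned when nothing matches."""

    def __bool__(self):
        return False


class DemoElementNotFoundError(Exception):
    """Hypothetical error type used only in this sketch."""


def find_item(items, key):
    """Look up key in a dict, checking the global switch at call time."""
    if key in items:
        return items[key]
    if DemoSettings.raise_when_ele_not_found:
        raise DemoElementNotFoundError(f'Element not found: {key!r}')
    return DemoNoneElement()


items = {'#kw': 'search box'}

# Default: a falsy placeholder comes back, so the caller can test it with `if`
missing = find_item(items, '#abcd')
print(bool(missing))

# After a plain assignment, the same call raises instead
DemoSettings.raise_when_ele_not_found = True
try:
    find_item(items, '#abcd')
except DemoElementNotFoundError as err:
    print(err)
```

Because the switch is a class attribute, a single assignment anywhere in the program changes the behaviour of every later lookup, which is why such settings are best set once, near the start of a script.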
74 |
75 |
--------------------------------------------------------------------------------
/docs_en/download/introduction.md:
--------------------------------------------------------------------------------
1 | ⤵️ Overview
2 | ---
3 |
4 | DrissionPage provides powerful file download management capabilities.
5 |
6 | It can initiate download tasks actively and also manage download tasks triggered by the browser.
7 |
8 | ## ✅️️ `download()` method
9 |
10 | This method actively initiates download tasks and provides features such as task management, multi-threading, large-file chunking, automatic reconnection, and file name conflict handling.
11 |
12 | This method is supported by page objects, tab objects, and `